Skip to content

Instantly share code, notes, and snippets.

@atoponce
Last active November 2, 2024 18:21
Show Gist options
  • Save atoponce/c594f6463cdf067ea8cabd899939c24f to your computer and use it in GitHub Desktop.
Save atoponce/c594f6463cdf067ea8cabd899939c24f to your computer and use it in GitHub Desktop.
Some solutions removing bias from loaded dice
#!/usr/bin/python3
import random
# Simple script to simulate biased throws of a single d6 die.
# bias should sum to 1
# pips ( 1, 2, 3, 4, 5, 6 )
BIAS = (0.125, 0.125, 0.25, 0.25, 0.125, 0.125)
def biasedRoll():
roll = random.random()
pips = 0
result = 1
for prob in BIAS:
pips += prob
if roll < pips:
return result
result += 1
if __name__ == "__main__":
throws = []
for i in range(1000):
throws.append(biasedRoll())
print(throws)

Efficiency comparison table

The tables below shows the biased dice rolls, and the efficiency of each method in returning unbiased results. Because multiple encodings exist to convert from binary to senary, multiple tables exist with the same rolls for comparison.

On average, encoding efficiency from least efficient to most efficient:

  1. Bit splitting (least efficient)
  2. Base conversion (most efficient)

On average, algorithmic efficiency from least efficient to most efficient:

  1. Parity (least efficient)
  2. Inequality 2
  3. Inequality 3
  4. Inequality 3-alt
  5. Hashing (most efficient)

Real world scenario

The die is biased as follows:

Pip Probability
1 1/8
2 1/8
3 1/4
4 1/4
5 1/8
6 1/8

Using Shannon entropy, I know that the die contains exactly 2.5-bits of entropy per roll. This is necessary for using cryptographic hashing as the randomness extractor.

For the table below, when reading the "Hashing (SHA-256)" column, the first n-bits are read from the resulting SHA-256 hash. For example:

65653 = 12.5-bits of entropy
SHA-256("65653")
= 729211836bff63aac924fe332bc9be837f19e125c8afa5ed11fc3dbddc14579e
= 0111001010010010000100011000001101101011111111110110001110101010\
  1100100100100100111111100011001100101011110010011011111010000011\
  0111111100011001111000010010010111001000101011111010010111101101\
  0001000111111100001111011011110111011100000101000101011110011110
= 011100101001 (truncated to 12 bits)

From there, "011100101001" is converted into senary either using bit splitting or base conversion.

Bit splitting

Each die toss here contains 2.5-bits of Shannon Entropy.

Biased Tosses (bits) Parity Ineq. 2 Ineq. 3 Ineq. 3 alt Hashing
65653 (12) None None None None 4562
433424 (15) None 6 None None 2226
3535361 (17) None 1 3 3 24454
32354341 (20) 5 2 6 None 15655
434634354 (22) None 5 42 2 113452
3263645464 (25) 6 None 323 64 2655315
14265464653 (27) None 2 26 5 24146362
166534415353 (30) 6 3 43 62 36312161
3544333341255 (32) None 3 21 55 14424136
43544533631561 (35) 33 6 33 13 5336516445
644361335114425 (37) 2 6 53 0 1546661246
2123344354336123 (40) 23 5 34 34 533416665661
44344654445441451 (42) 1 2 None 1 32445352332
645643343423313433 (45) 66 61 464 626 1414434166
4364345524352142541 (47) 3 2 3332 116 66122514316553
16336336323254534243 (50) 5 3 25 542 62534341112

Base conversion

Each die toss here contains 2.5-bits of Shannon Entropy.

Biased Tosses (bits) Parity Ineq. 2 Ineq. 3 Ineq. 3 alt Hashing
65653 (12) 1 4 None None 33364
433424 (15) 2 5 None 22 44624
3535361 (17) 2 1 3 5 411412
32354341 (20) 5 4 6 32 36162313
434634354 (22) 2 23 42 55 23363114
3263645464 (25) 6 62 323 222 523356324
14265464653 (27) 10 22 26 53 3256155644
166534415353 (30) 25 26 43 623 213116236316
3544333341255 (32) 1 3 21 511 2542144462445
43544533631561 (35) 41 236 33 52 23356216635321
644361335114425 (37) 2 252 53 1 5334246366664
2123344354336123 (40) 43 321 34 344 2241536442313536
44344654445441451 (42) 55 33 None 22 22311146366666521
645643343423313433 (45) 342 541 464 41156 4436335621561226
4364345524352142541 (47) 6 642 3332 4446 2113152414226635416
16336336323254534243 (50) 345 626 25 6224 22236641253211251442

How to convert a bitstring into base-6

There are a couple approaches that can be used for converting a bitstring into uniform d6 throws. One is via base conversion, explained here. The second is via bit splitting, and explained in another document.

Base Conversion

This will take place in two steps, for simplicity:

  1. Convert bitstring to decimal (base-10).
  2. Convert the decimal result to senary (base-6).

Binary to decimal conversion

Let's consider the following 8-bit string:

11010101

Each place in a bitstring represents a power-of-two. The smallest power of two, 2⁰ is on the furthest right, and the power increments by 1 for each place moving left. If a "1" is in that position, we multiply by that power-of-two.

1     1    0    1    0    1    0    1
2⁷  + 2⁶ + 2⁵ + 2⁴ + 2³ + 2² + 2¹ + 2⁰
--------------------------------------
128 + 64 + 0  + 16 + 0  + 4  + 0  + 1 = 213

Decimal to senary conversion

Now to convert to senary (base-6). We will divide by 6, taking the factor to the next step, while using the remainder to build our base-6 number.

213/6 = 35 rem. 3 -> ..3
 35/6 =  5 rem. 5 -> .53
  5/6 =  0 rem. 5 -> 553

Add 1 to each number, giving the result of "664" as your unbiased dice throws.

Acceptable bias

Astute mathematicians and cryptographers will note that a small bias exists in this method, as not every bitstring results in a range between 0 and 6₀6₁6₂...6ₙ.

For example, an 8-bit number has valid values 0 through 255 in base-10. However, in base-6, the valid range is 0 through 1103, not 0 through 5555. Thus 0, 1, 2, & 3 are more likely to be generated than 4 & 5. This may or may not be acceptable.

Just remember that a "fair d6" (non-precision die) purchased from a hobby shop is likely biased, in that not equal amounts of die material are removed from each face when drilling the pips, the density in the die may not be uniform, etc.

The bias exhibited in base conversion may be equivalent to non-precision dice purchased at hobby shops. However, if this makes you uncomfortable, use bit splitting instead, which is described in another document, and is unbiased.

Additional example

Another example, this time with a 32-bit string. First converting to base-10:

10111001111100101111101101111010

For compactness, we'll only sum the powers-of-two with a 1 in position, and ignore every 0:

2³¹+2²⁹+2²⁸+2²⁷+2²⁴+2²³+2²²+2²¹+2²⁰+2¹⁷+2¹⁵+2¹⁴+2¹³+2¹²+2¹¹+2⁹+2⁸+2⁶+2⁵+2⁴+2³+2¹
= 3119709050

Now converting to base-6:

3119709050/6 = 519951508 rem. 2 -> ............2
 519951508/6 =  86658584 rem. 4 -> ...........42
  86658584/6 =  14443097 rem. 2 -> ..........242
  14443097/6 =   2407182 rem. 6 -> .........5242
   2407182/6 =    401197 rem. 0 -> ........05242
    401197/6 =     66866 rem. 1 -> .......105242
     66866/6 =     11144 rem. 2 -> ......2105242
     11144/6 =      1857 rem. 2 -> .....22105242
      1857/6 =       309 rem. 3 -> ....322105242
       309/6 =        51 rem. 3 -> ...3322105242
        51/6 =         8 rem. 3 -> ..33322105242
         8/6 =         1 rem. 1 -> .133322105242
         1/6 =         0 rem. 1 -> 1133322105242

Add 1 to each digit, are your unbiased dice throws:

 1st dice throw: 1+1=2
 2nd dice throw: 1+1=2
 3rd dice throw: 3+1=4
 4th dice throw: 3+1=4
 5th dice throw: 3+1=4
 6th dice throw: 2+1=3
 7th dice throw: 2+1=3
 8th dice throw: 1+1=2
 9th dice throw: 0+1=1
10th dice throw: 5+1=6
11th dice throw: 2+1=3
12th dice throw: 4+1=5
13th dice throw: 2+1=3

You were able to get "2, 2, 4, 4, 4, 3, 3, 2, 1, 6, 3, 5, & 3" as unbiased throws.

How to convert a bitstring into base-6

There are a couple approaches that can be used for converting a bitstring into uniform d6 throws. One is via bit splitting, explained here. The second is via base conversion, and explained in another document.

Bit Splitting

This approach splits the bitstring into three consecutive non-overlapping bits until exhaustion, and each 3-bit string is handled separately. The trick to this approach is knowing when and how to accumulate bits when a 6 or 7 is encountered.

Example

Consider the following 8-bit string:

11010101

It is split as:

110 101 01

Binary to decimal conversion

Each place in a bitstring represents a power-of-two. The smallest power of two, 2⁰ is on the furthest right, and the power increments by 1 for each place moving left. If a "1" is in that position, we multiply by that power-of-two.

1    1    0
2² + 2¹ + 2⁰
------------
4  + 2  + 0 = 6

Thus, looking at the full string:

110 = 6
101 = 5
 01 (not enough bits)

0-7 are possible with three bits, but only 0-5 can be represented with a d6 (we'll add 1 later). As such, if a 6 or 7 is encountered, we need to handle those separately. So, because 6, 7 are only 2 possible exceptions, we'll assign 6 = 0 and 7 = 1. We will then append that new result to the end of our bitstring.

So, for example:

110 = 6. Convert 6 to 0.
101 = 5

Bit appending

The 01 at the end of the string are not enough bits for a d6, but we did just convert 6 to 0, so we can append the 0 to the end of 01, and use those three bits as our final dice throw:

010 = 2

Thus, our unbiased dice throws are:

1st dice throw: 5+1 = 6
2nd dice throw: 2+1 = 3

Another example

Another example, this time with a 32-bit string. First splitting it into 3-bit bitstrings:

10111001111100101111101101111010
(101)(110)(011)(111)(001)(011)(111)(011)(011)(110)(10)

Evaluating each 3-bit bitstring:

101 = 5
110 = 6 -> 0
011 = 3
111 = 7 -> 1
001 = 1
011 = 3
111 = 7 -> 1
011 = 3
011 = 3
110 = 6 -> 0
 10

We append our converted 6, 7 to our final 10, and continue:

100110 -> (100)(110)

100 = 4
110 = 6 -> 0

There aren't enough bits to append that final converted 6 to, so it's discarded. Thus, our unbiased dice throws are:

1st dice throw: 5+1=6
2nd dice throw: 3+1=4
3rd dice throw: 1+1=2
4th dice throw: 3+1=4
5th dice throw: 3+1=4
6th dice throw: 3+1=4
8th dice throw: 4+1=5

We were able to get "6, 4, 2, 4, 4, 4, & 5" as unbiased throws. Notice that we have fewer throws with this approach than the base conversion method mentioned above with the same bitstring (8 vs 12).

Cryptographic Hashing

Although uncommon, you may already be familiar with the bias of the die, and know the probability of each face. If the die bias is known, then we can using Shannon Entropy to calculate the entropy of the die. This is the most efficient of the approaches documented here.

The general approach would be:

  1. Use Shannon Entropy to calculate the entropy of the die.
  2. Make the necessary number of tosses.
  3. Hash with a cryptographic hashing funnction.

The drawback to this approach requires access to a computer with SHA-256 software installed. There are some websites online that can calculate SHA-256 hashes.

Shannon Entropy

The maximum Shannon Entropy of an unbiased 6-sided die in bits, is:

log₂(6) ≈ 2.58 bits.

Shannon Entropy is a measurement of unpredictability. As such, loaded dice will always have lower entropy, as their predictability is partially predictable.

If a loaded die has the following probabilities:

1: 12.5% (1/8)
2: 12.5% (1/8)
3: 25%   (1/4)
4: 25%   (1/4)
5: 12.5% (1/8)
6: 12.5% (1/8)

Then Shannon Entropy in bits is calculated as -Sum(pᵢ × log₂(pᵢ)):

  -1 × (1/8×log₂(1/8) + 1/8×log₂(1/8) + 1/4×log₂(1/4) + 1/4×log₂(1/4) + 1/8×log₂(1/8) + 1/8×log₂(1/8)) bits
= -1 × (-0.375 - 0.375 - 0.5 - 0.5 - 0.375 - 0.375) bits
= -1 × -2.5 bits
= 2.5 bits

Toss number

For 2.5 bits of entropy in the die, that means that each roll of the die is worth 2.5 bits of entropy. If you know you need 256-bits of entropy, then we can divide 256 by 2.5 to found out the necessary number of tosses to reach that required entropy:

256/2.5 = 102.4, or 103 tosses

Hashing the results

Use a 256-bit cryptographic hashing function, like SHA-256 or BLAKE2s to convert the biased dice throws in a binary bitstring (not hexadecimal, nor base-64). Do not add any additional characters (newlines, spaces, etc.).

For convenience, this CyberChef recipe is useful for hashing dice rolls with SHA-256 and splitting the string into bits. In the recipe, converting from binary to decimal is disabled, as not every required number of bits is divisible by 3. So, some manual inspection should be executed.

And this CyberChef recipe is useful for hashing dice rolls with SHA-256 and using base conversion.

Example

If the results are (biased to 3 and 4):

2,6,2,3,1,6,5,5,2,5,...,3,4,3,5,3,1,3,4,6,4 (103 total numbers)

Record SHA-256(results) as your uniform output.

SHA-256("2623165525...3435313464")
= 6bde1d162aea66e89814f6560398138e2153d6fe04fc738f53f6763b9a33d8c9
= 11010111101111000011101000101100010101011101010011001101110100010011\
  00000010100111101100101011000000011100110000001001110001110001000010\
  10100111101011011111110000001001111110001110011100011110101001111110\
  110011101100011101110011010001100111101100011001001

Use the encoding documentation on how to convert that bit string to senary (base-6) for usable dice throws.

Compare the equality of 2 face values

This is an application of John von Neumann randomness extraction. Credit to @Exaeta for this suggestion.

The approach compares the face values of a single die across two throws. If the first throw is less than the second, output a 0. If the second throw is greater than the second, output a 1. If the two throws are equal, try again.

Inequality mapping

Toss the loaded die twice: tosses "A" and "B". Compare the values of each toss. Results in binary.

A ? B result 1 die tossed twice
A = B discard 11, 22, 33, 44, 55, 66
A < B 0 12, 13, 14, 15, 16, 23, 24, 25, 26, 34, 35, 36, 45, 46, 56
A > B 1 21, 31, 32, 41, 42, 43, 51, 52, 53, 54, 61, 62, 63, 64, 65

Example

Using only a single loaded die (do not use multiple dice), we throw the die with the following results, with 20 tosses of a loaded die:

 1st toss = 3,  2nd toss = 4. 3 < 4 -> 0
 3rd toss = 6,  4th toss = 3. 6 > 3 -> 1
 5th toss = 3,  6th toss = 3. 3 = 3 -> discard
 7th toss = 3,  8th toss = 1. 3 > 1 -> 1
 9th toss = 5, 10th toss = 2. 5 > 2 -> 1
11th toss = 6, 12th toss = 3. 6 > 3 -> 1
13th toss = 2, 13th toss = 4. 2 < 4 -> 0
15th toss = 5, 16th toss = 5. 5 = 5 -> discard
17th toss = 4, 18th toss = 6. 4 < 6 -> 0
19th toss = 2, 20th toss = 4. 2 < 4 -> 0

The results in binary are 8 bits:

01111000

See the encoding documents on how to encode this number from binary to senary.

Compare the value of 3 face values

This is an application of John von Neumann randomness extraction, extended to three variables.

Toss the loaded die three times: tosses "A", "B", and "C". Compare the values of each toss. Results in senary.

Inequality mapping

A?B?C results 1 die tossed three times
A=B discard 11⋆, 22⋆, 33⋆, 44⋆, 55⋆, 66⋆
B=C discard ⋆11, ⋆22, ⋆33, ⋆44, ⋆55, ⋆66
A=C discard 1⋆1, 2⋆2, 3⋆3, 4⋆4, 5⋆5, 6⋆6
A<B<C 1 123, 124, 125, 126, 134, 135, 136, 145, 146, 156, 234, 235, 236, 245, 246, 256, 345, 346, 356, 456
A<C<B 2 132, 142, 143, 152, 153, 154, 162, 163, 164, 165, 243, 253, 254, 263, 264, 265, 354, 364, 365, 465
B<A<C 3 213, 214, 215, 216, 314, 315, 316, 324, 325, 326, 415, 416, 425, 426, 435, 436, 516, 526, 536, 546
B<C<A 4 312, 412, 413, 423, 512, 513, 514, 523, 524, 534, 612, 613, 614, 615, 623, 624, 625, 634, 635, 645
C<A<B 5 231, 241, 251, 261, 341, 342, 351, 352, 361, 362, 451, 452, 453, 461, 462, 463, 561, 562, 563, 564
C<B<A 6 321, 421, 431, 432, 521, 531, 532, 541, 542, 543, 621, 631, 632, 641, 642, 643, 651, 652, 653, 654

Example

Tossing a single biased die in sets of 3-tosses each:

 1st toss = 6,  2nd toss = 2,  3rd toss = 4. 624 (B < C < A): 4
 4th toss = 4,  5th toss = 3,  6th toss = 4. 434 (A = C):     discard
 7th toss = 2,  8th toss = 4,  9th toss = 6. 246 (A < B < C): 1
10th toss = 1, 11th toss = 6, 12th toss = 2. 162 (A < C < B): 2
13th toss = 1, 14th toss = 3, 15th toss = 4. 134 (A < B < C): 1
16th toss = 4, 17th toss = 2, 18th toss = 1. 421 (C < B < A): 6

The results in senary, and thus our unbiased die tosses are:

4, 1, 2, 1, 6

Further reading

This is an application of How to Turn Loaded Dice into Fair Coins (PDF) by Markus Jakobsson. Another application of this approach can be found at the Code crumbs blog by Clément Pit–Claudel.

Compare the equality of 3 face values - alternate approach

This is a more efficient application of John von Neumann randomness extraction with three variables. Credit to @int-e for this improvement over inequality-3.md. This approach will yield more entropy per toss, although it requires holding on to a larger mapping table.

Toss the loaded die three times: tosses "A", "B", and "C". Compare the values of each toss. Results are in binary.

Inequality mapping

A?B?C results 1 die tossed three times
A=B=C discard 111, 222, 333, 444, 555, 666
A=B<C X 112, 113, 114, 115, 116, 223, 224, 225, 226, 334, 335, 336, 445, 446, 556
A=B>C X 221, 331, 332, 441, 442, 443, 551, 552, 553, 554, 661, 662, 663, 664, 665
A=C<B Y 121, 131, 141, 151, 161, 232, 242, 252, 262, 343, 353, 363, 454, 464, 565
A=C>B Y 212, 313, 323, 414, 424, 434, 515, 525, 535, 545, 616, 626, 636, 646, 656
B=C<A Z 211, 311, 322, 411, 422, 433, 511, 522, 533, 544, 611, 622, 633, 644, 655
B=C>A Z 122, 133, 144, 155, 166, 233, 244, 255, 266, 344, 355, 366, 455, 466, 566
A<B<C 0,X 123, 124, 125, 126, 134, 135, 136, 145, 146, 156, 234, 235, 236, 245, 246, 256, 345, 346, 356, 456
A<C<B 1,X 132, 142, 143, 152, 153, 154, 162, 163, 164, 165, 243, 253, 254, 263, 264, 265, 354, 364, 365, 465
B<A<C 0,Y 213, 214, 215, 216, 314, 315, 316, 324, 325, 326, 415, 416, 425, 426, 435, 436, 516, 526, 536, 546
B<C<A 1,Y 312, 412, 413, 423, 512, 513, 514, 523, 524, 534, 612, 613, 614, 615, 623, 624, 625, 634, 635, 645
C<A<B 0,Z 231, 241, 251, 261, 341, 342, 351, 352, 361, 362, 451, 452, 453, 461, 462, 463, 561, 562, 563, 564
C<B<A 1,Z 321, 421, 431, 432, 521, 531, 532, 541, 542, 543, 621, 631, 632, 641, 642, 643, 651, 652, 653, 654

Results mapping:

result binary
XX 000
XY 001
XZ 010
YX 011
YY 100
YZ 101
ZX 110
ZY 111
ZZ discard

Example

Tossing a single biased die in sets of 3-tosses each:

 1st toss = 5,  2nd toss = 5,  3rd toss = 4. 554 (A = B > C) -> X
 4th toss = 3,  4th toss = 1,  6th toss = 1. 311 (B = C < A) -> Z
 7th toss = 6,  8th toss = 2,  9th toss = 5. 623 (B < C < A) -> 1,Y
10th toss = 2, 11th toss = 4, 12th toss = 2. 242 (A = C < B) -> Y
13th toss = 1, 14th toss = 4, 15th toss = 3. 143 (A < C < B) -> 1,X
16th toss = 2, 17th toss = 2, 18th toss = 1. 221 (A = B > C) -> X

Now that we have collected our results, we can convirt our bits 0, 1 and trits X, Y, Z using the results mapping table from above to create a binary bits and trits:

XZ1YY1X

Bits and trits parsing

The next step is to efficiently and uniformly convert our bits and trits into a bitstring for later encoding. To do this, maintaining order:

  1. Move every bit to the left
  2. Move every trit to the right

So continuing from our example above:

XZ1YY1X 

Maintaining order, move the bits to the left, and the trits to the right:

11XZYYX

Separate out the trit pairs per the mapping table above:

11 XZ YY X

Convert the trits to the binary bits:

11 010 100

The results in binary with 8-bits:

11010100

For one more example, this time without each step documented:

 XXZ11X0YXZ0X10X0000Z1XXZ1YXZ0Z01
 -> 110010000011001XXZXYXZXXZXXZYXZZ
 -> 110010000011001 XX  ZX  YX  ZX  XZ  XX  ZY  XZ  Z
 -> 110010000011001 000 110 011 110 010 000 111 010
 -> 110010000011001000110011110010000111010

See the encoding documents on how to encode this number from binary to senary.

Further reading

For even greater efficiency, @int-e extended this to 4 dice. The table becomes a bit unwieldy for manual parsing, but if those extra bits are that valuable to you, it may be worth looking into.

Compare the parity of 2 face values

This is the simplest, but least efficient application of John von Neumann randomness extraction algorithms documented here. The rest of the approaches use more complex algorithms to remove bias from a loaded die, but are more efficient in their approach.

The loss of efficiency will depend on how biased the die is, but it is not unreasonable to expect 80 initial die throws to get 7-8 unbiased results using this approach. If that's too inefficient for you, try some of the other approaches.

Toss the loaded die twice: tosses "A" and "B". Identify the parity of the toss- if it's an even number (2, 4, or 6) or odd (1, 3, or 5). Results are in binary.

Parity mapping

A, B result 1 die tossed twice
odd, odd discard 11, 13, 15, 31, 33, 35, 51, 53, 55
even, odd 0 21, 23, 25, 41, 43, 45, 61, 63, 65
odd, even 1 12, 14, 16, 32, 34, 36, 52, 54, 56
even, even discard 22, 24, 26, 42, 44, 46, 62, 64, 66

Further Reading

This approach came when reading crypto.stackexchange.com on John von Neumann randomness extraction. I stumbled upon this comment by fgrieu, and was curious about more efficient approaches, which started this whole series of documents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment