Last active
January 10, 2017 22:29
# Dumb Guy's Statistical Analysis of the Diehard RNG Suite's Craps Test

## Test Description

Craps is played with a series of rolls of two dice. You win on the first
roll if the sum of the dice is 7 or 11, and lose if it is 2, 3, or 12. If
you roll any other number on the first roll, that number becomes the point
you have to make on subsequent rolls. If you roll the point again before
rolling a 7, you win. If you roll a 7 first, you lose.

The total number of wins should approximately follow a normal
distribution, and the number of rolls it takes to end a game should
match the probabilities computed for fair dice. Both are tested
statistically, and the resulting p-values are reported.
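
The rules above, together with the die mapping the test describes in its own banner (float a 32-bit integer to [0,1), multiply by 6, add 1 to the integer part), can be sketched roughly as follows. This is only an illustration, not the Diehard source: `die_from_u32` and `play_game` are made-up names, and `next_u32` is assumed to be any callable that returns the next 32-bit integer from the stream under test.

```python
import random

def die_from_u32(x):
    """Map a 32-bit unsigned integer to a die face 1..6."""
    return 1 + int(6 * (x / 2**32))

def play_game(next_u32):
    """Play one game of craps; return (won, number_of_throws)."""
    def roll():
        # One throw is the sum of two dice, each taken from one 32-bit integer.
        return die_from_u32(next_u32()) + die_from_u32(next_u32())

    throws = 1
    first = roll()
    if first in (7, 11):
        return True, throws
    if first in (2, 3, 12):
        return False, throws

    point = first
    while True:
        throws += 1
        total = roll()
        if total == point:
            return True, throws
        if total == 7:
            return False, throws

# Hypothetical usage: Python's own generator stands in for the RNG under test.
rng = random.Random(0)
won, throws = play_game(lambda: rng.getrandbits(32))
```

Repeating `play_game` 200,000 times and tallying the wins and the throw counts gives the two sets of numbers that the test analyzes.
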

## Example Program Output

```
|-------------------------------------------------------------|
|This the CRAPS TEST.  It plays 200,000 games of craps, counts|
|the number of wins and the number of throws necessary to end |
|each game.  The number of wins should be (very close to) a   |
|normal with mean 200000p and variance 200000p(1-p), and      |
|p=244/495.  Throws necessary to complete the game can vary   |
|from 1 to infinity, but counts for all>21 are lumped with 21.|
|A chi-square test is made on the no.-of-throws cell counts.  |
|Each 32-bit integer from the test file provides the value for|
|the throw of a die, by floating to [0,1), multiplying by 6   |
|and taking 1 plus the integer part of the result.            |
|-------------------------------------------------------------|
RESULTS OF CRAPS TEST FOR bits.22
No. of wins:  Observed  Expected
                 98332  98585.858586
z-score=-1.135, pvalue=0.87190

Analysis of Throws-per-Game:

Throws  Observed  Expected   Chisq  Sum of (O-E)^2/E
     1     66910   66666.7   0.888     0.888
     2     37869   37654.3   1.224     2.112
     3     26834   26954.7   0.541     2.653
     4     19219   19313.5   0.462     3.115
     5     13753   13851.4   0.699     3.814
     6      9788    9943.5   2.433     6.247
     7      7137    7145.0   0.009     6.256
     8      5249    5139.1   2.351     8.608
     9      3604    3699.9   2.484    11.092
    10      2634    2666.3   0.391    11.483
    11      1968    1923.3   1.038    12.520
    12      1399    1388.7   0.076    12.596
    13      1027    1003.7   0.540    13.136
    14       712     726.1   0.275    13.412
    15       515     525.8   0.223    13.635
    16       335     381.2   5.588    19.223
    17       269     276.5   0.206    19.429
    18       216     200.8   1.146    20.575
    19       152     146.0   0.248    20.822
    20       106     106.2   0.000    20.823
    21       304     287.1   0.993    21.816

Chisq= 21.82 for 20 degrees of freedom, p= 0.35058

SUMMARY of craptest on bits.22
p-value for no. of wins: 0.871897
p-value for throws/game: 0.350580
_____________________________________________________________
```

## Testing the Distribution of the Number of Wins

The binomial distribution models the total number of wins that should
occur in 200,000 games. The probability of winning a game of craps,
based on simple probability and assuming true dice, is
`p=244/495` ([source](http://mathforum.org/library/drmath/view/56534.html)).

Since the number of games is large and the probability of winning is not
far from 0.5, the normal distribution should approximate the true
distribution of wins fairly well ([source](https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation)).
With normality assumed, the expected (population) mean `200000p` and
standard deviation `sqrt(200000p(1-p))` are calculated.

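The z-test measures how far the observed number of wins falls from the
expected mean, in units of standard deviation. If the RNG is unbiased
(random), the z-score should look like a draw from a standard normal
distribution. The p-value reported here, 0.87, matches the upper-tail
probability `1 - Φ(z)` for `z=-1.135`, so it is not a measure of how
strong the fit is. It says the observed win count sits a little more
than one standard deviation below the mean, which is unremarkable.
Because this p-value is one-sided, values very close to 0 or very close
to 1 both point to a problem.

In typical statistical analysis, you would usually conclude that the win
count is inconsistent with the expected distribution if the two-sided
p-value were less than 0.05 (0.05 is called the significance level, and
p<0.05 is called "statistically significant"). `p=0.05` means you'd
expect to see an outcome this wild or wilder from an unbiased RNG about
5% of the time.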
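
As a rough check, the numbers in the example output can be reproduced with the standard library alone. The sketch below assumes the observed win count from the example (98332). `normal_cdf` is a helper written for this illustration, not part of Diehard.

```python
import math

GAMES = 200_000
P_WIN = 244 / 495
observed_wins = 98_332  # taken from the example output

# Mean and standard deviation of the binomial, used for the normal approximation.
mean = GAMES * P_WIN
sd = math.sqrt(GAMES * P_WIN * (1 - P_WIN))
z = (observed_wins - mean) / sd

def normal_cdf(x):
    """Standard normal CDF, written in terms of the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

upper_tail = 1 - normal_cdf(z)                          # the form the output appears to report
two_sided = 2 * min(normal_cdf(z), 1 - normal_cdf(z))   # the usual two-sided p-value

print(f"mean={mean:.6f} z={z:.3f} upper_tail={upper_tail:.5f} two_sided={two_sided:.4f}")
```

The two-sided value comes out near 0.26, well above 0.05, so the win count gives no reason to suspect the RNG.
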

## Testing the Distribution of Individual Throws

The number of throws indicates how many rolls it took for the game to
end, *whether the game was won or lost*. Assuming true dice, the
probability of a game ending on throw N can be calculated using simple
probability.

For example, the probability of ending a game on the first roll is

```
P(7 or 11 or 2 or 3 or 12) = 6/36 + 2/36 + 1/36 + 2/36 + 1/36 = 1/3
```

With 200,000 games, we expect `200000/3=66666.7` games to finish on
the first roll. Again, see [here](http://mathforum.org/library/drmath/view/56534.html).
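
The same reasoning extends to later throws: the first roll must set a point, every roll after that must miss both the point and 7, and the final roll must hit one of them. A minimal sketch with exact fractions follows. `WAYS` and `p_end_on_throw` are names invented for this example, and the lumping of 21 or more throws into the last cell follows the test's own description.

```python
from fractions import Fraction

# Number of ways to roll each sum with two fair six-sided dice.
WAYS = {s: 6 - abs(s - 7) for s in range(2, 13)}

def p_end_on_throw(n):
    """Probability that a game ends (win or loss) on exactly throw n."""
    if n == 1:
        return sum(Fraction(WAYS[s], 36) for s in (2, 3, 7, 11, 12))
    total = Fraction(0)
    for point in (4, 5, 6, 8, 9, 10):
        p_point = Fraction(WAYS[point], 36)            # first roll sets this point
        p_stop = Fraction(WAYS[point] + WAYS[7], 36)   # a later roll hits the point or 7
        total += p_point * (1 - p_stop) ** (n - 2) * p_stop
    return total

GAMES = 200_000
expected = [float(GAMES * p_end_on_throw(n)) for n in range(1, 21)]
expected.append(GAMES - sum(expected))  # cell 21 lumps every game of 21 or more throws

for throws, e in enumerate(expected, start=1):
    print(f"{throws:2d} {e:9.1f}")
```

The printed values should line up with the "Expected" column in the example program output.
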

The chi-squared test measures the "goodness of fit" between the
observed results and the results expected from the probabilities
calculated above ([source](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test)).
Each throw count has its own contribution `(O-E)^2/E` calculated
(e.g. 0.888 for a throw count of 1), and those contributions are
summed to give the chi-squared test statistic for the entire set of
200,000 games ([source](https://en.wikipedia.org/wiki/Normal_distribution#Combination_of_two_or_more_independent_random_variables)).
With 21 cells, the test has 20 degrees of freedom.

If doing this by hand, you would take the chi-squared test statistic
and the degrees of freedom, look them up in a chi-squared table, and
get a p-value. The p-value is already given here.

Again, p<0.05 indicates a problem.
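
The table lookup can also be done in code. The sketch below recomputes the statistic from the observed and expected columns of the example output and turns it into an upper-tail p-value. It relies on a closed form for the chi-squared tail that only holds for an even number of degrees of freedom, which happens to fit the 20 used here. Both function names are invented for this example, and because the expected counts are copied at one decimal place, the result only approximately matches the printed 21.82 and 0.35058.

```python
import math

# (observed, expected) for throw counts 1..21, copied from the example output.
CELLS = [
    (66910, 66666.7), (37869, 37654.3), (26834, 26954.7), (19219, 19313.5),
    (13753, 13851.4), (9788, 9943.5), (7137, 7145.0), (5249, 5139.1),
    (3604, 3699.9), (2634, 2666.3), (1968, 1923.3), (1399, 1388.7),
    (1027, 1003.7), (712, 726.1), (515, 525.8), (335, 381.2),
    (269, 276.5), (216, 200.8), (152, 146.0), (106, 106.2), (304, 287.1),
]

def chi_squared_statistic(cells):
    """Pearson's statistic: the sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in cells)

def chi_squared_upper_tail(x, df):
    """P(X >= x) for a chi-squared variable; this closed form needs even df."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2
    term, total = 1.0, 1.0          # i = 0 term of the series
    for i in range(1, df // 2):     # remaining terms i = 1 .. df/2 - 1
        term *= half / i
        total += term
    return math.exp(-half) * total

chisq = chi_squared_statistic(CELLS)
p_value = chi_squared_upper_tail(chisq, df=len(CELLS) - 1)
print(f"Chisq={chisq:.2f}, p={p_value:.3f}")
```

For an odd number of degrees of freedom, a statistics library's chi-squared survival function would be the more practical choice.
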