drfloob · January 10, 2017 22:29
diff --git a/Analysis of the Craps Test from the Diehard test suite b/Analysis of the Craps Test from the Diehard test suite
 # Dumb Guy's Statistical Analysis of the Diehard RNG Suite's Craps Test

 ## Test Description

 Craps is played with repeated rolls of two dice. You win on the first
 roll if the sum of the dice is 7 or 11, and lose if it is 2, 3, or 12.
 Any other sum on the first roll becomes the point you have to make on
 subsequent rolls. You keep rolling until you either hit the point
 again (you win) or roll a 7 (you lose).
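The rules above can be sketched as a small Python function. This is only an illustration of the rules, not the Diehard source; the names `play_craps` and `roll_die` are made up for this example.

```python
import random

def play_craps(roll_die):
    """Play one game of craps. roll_die() must return an int in 1..6.

    Returns (won, throws): whether the game was won, and how many
    two-dice throws it took to end.
    """
    def throw():
        return roll_die() + roll_die()

    total = throw()
    throws = 1
    if total in (7, 11):
        return True, throws
    if total in (2, 3, 12):
        return False, throws

    point = total
    while True:
        total = throw()
        throws += 1
        if total == point:
            return True, throws
        if total == 7:
            return False, throws

# Example with Python's own generator standing in for the RNG under test.
rng = random.Random(1)
won, throws = play_craps(lambda: rng.randint(1, 6))
print(won, throws)
```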

 The total number of wins should be approximately normally
 distributed, and the number of rolls it takes to end a game should
 follow the probabilities calculated for fair dice. Both are tested
 statistically, and the resulting p-values are given.


 ## Example Program Output

 ```
 	|-------------------------------------------------------------|
 	|This the CRAPS TEST.  It plays 200,000 games of craps, counts|
 	|the number of wins and the number of throws necessary to end |
 	|each game.  The number of wins should be (very close to) a   |
 	|normal with mean 200000p and variance 200000p(1-p), and      |
 	|p=244/495.  Throws necessary to complete the game can vary   |
 	|from 1 to infinity, but counts for all>21 are lumped with 21.|
 	|A chi-square test is made on the no.-of-throws cell counts.  |
 	|Each 32-bit integer from the test file provides the value for|
 	|the throw of a die, by floating to [0,1), multiplying by 6   |
 	|and taking 1 plus the integer part of the result.            |
 	|-------------------------------------------------------------|

 		RESULTS OF CRAPS TEST FOR bits.22 
 	No. of wins:  Observed	Expected
 	                 98332        98585.858586
 		z-score=-1.135, pvalue=0.87190

 	Analysis of Throws-per-Game:

 	Throws	Observed	Expected	Chisq	 Sum of (O-E)^2/E
 	1	66910		66666.7		0.888		0.888
 	2	37869		37654.3		1.224		2.112
 	3	26834		26954.7		0.541		2.653
 	4	19219		19313.5		0.462		3.115
 	5	13753		13851.4		0.699		3.814
 	6	9788		9943.5		2.433		6.247
 	7	7137		7145.0		0.009		6.256
 	8	5249		5139.1		2.351		8.608
 	9	3604		3699.9		2.484		11.092
 	10	2634		2666.3		0.391		11.483
 	11	1968		1923.3		1.038		12.520
 	12	1399		1388.7		0.076		12.596
 	13	1027		1003.7		0.540		13.136
 	14	712		726.1		0.275		13.412
 	15	515		525.8		0.223		13.635
 	16	335		381.2		5.588		19.223
 	17	269		276.5		0.206		19.429
 	18	216		200.8		1.146		20.575
 	19	152		146.0		0.248		20.822
 	20	106		106.2		0.000		20.823
 	21	304		287.1		0.993		21.816

 	Chisq=  21.82 for 20 degrees of freedom, p= 0.35058

 		SUMMARY of craptest on bits.22
 	 p-value for no. of wins: 0.871897
 	 p-value for throws/game: 0.350580
 	_____________________________________________________________

 ```
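The header of the output describes how each 32-bit integer becomes one die throw. A minimal sketch of that mapping, assuming unsigned 32-bit inputs and Python's double-precision floats (the original program's exact floating-point behaviour is not shown here, and `die_from_u32` is a made-up name):

```python
def die_from_u32(x):
    """Map an unsigned 32-bit integer to a die face in 1..6."""
    u = x / 2**32          # "floating to [0,1)"
    return 1 + int(6 * u)  # 1 plus the integer part of 6*u

print([die_from_u32(x) for x in (0, 2**31, 2**32 - 1)])
```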

 ## Testing the Distribution of the Number of Wins

 The binomial distribution models the total number of wins that should
 occur in 200,000 games. The total probability of winning a game of
 craps, based on simple probability and assuming true dice, is
 `p=244/495` ([source](http://mathforum.org/library/drmath/view/56534.html)).
 Since the number of games is large and the probability of winning is
 not far from 0.5, the normal distribution should approximate the true
 distribution of wins fairly well ([source](https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation)).
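The value `p=244/495` can be checked with exact fractions. This sketch assumes fair dice and uses the fact that, once a point is set, only the point or a 7 ends the game, so the chance of making the point is `P(point) / (P(point) + P(7))`:

```python
from fractions import Fraction

# Probability of each two-dice sum: (6 - |s - 7|) ways out of 36.
prob = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

# Win outright on the first roll with 7 or 11 ...
p_win = prob[7] + prob[11]
# ... or set a point and make it before rolling a 7.
for point in (4, 5, 6, 8, 9, 10):
    p_win += prob[point] * prob[point] / (prob[point] + prob[7])

print(p_win)          # 244/495
print(float(p_win))
```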

 With normality assumed, the expected (population) mean is
 `200000p ≈ 98585.86` and the standard deviation is
 `sqrt(200000p(1-p)) ≈ 223.58`.

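 The z-test measures how far the observed number of wins is from the
 expected mean, in standard deviations. If the RNG is unbiased
 (random), that z-score should look like a draw from a standard normal
 distribution. Here `z=-1.135`, which is an unremarkable result. Note
 that the reported p-value of 0.87 equals `1 - Phi(z)`, the upper-tail
 probability, so it does not measure how strong the fit is: for an
 unbiased RNG it is uniformly distributed on [0,1), and values very
 close to 0 or 1 are both suspicious. The usual two-sided p-value for
 `z=-1.135` is about 0.26. In typical statistical analysis, you would
 conclude that the RNG is biased if the two-sided p-value were less
 than 0.05 (called the significance level, where p<0.05 is called
 "statistically significant"). `p=0.05` means you'd expect to see an
 outcome this wild or wilder from an unbiased RNG about 5% of the time.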
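The numbers in the "No. of wins" part of the example output can be reproduced with the standard library alone. This is a sketch of the calculation, not the test suite's own code; `phi` is a helper defined here for the standard normal CDF.

```python
import math

n = 200_000
p = 244 / 495
observed = 98332          # wins, from the example output

mean = n * p
sd = math.sqrt(n * p * (1 - p))
z = (observed - mean) / sd

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

upper_tail = 1 - phi(z)                       # matches the reported "pvalue"
two_sided = 2 * min(phi(z), 1 - phi(z))       # conventional two-sided p-value

print(f"mean={mean:.6f} sd={sd:.2f} z={z:.3f}")
print(f"1-Phi(z)={upper_tail:.5f} two-sided={two_sided:.3f}")
```

The upper-tail value agrees with the `pvalue=0.87190` line in the output, while the two-sided value is the one to compare against a 0.05 significance level.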

 ## Testing the Distribution of Individual Throws

 The number of throws indicates how many rolls it took for the game to
 end, *whether the game was won or lost*. Assuming true dice, the
 probability of a game ending on throw N can be calculated using
 simple probability.

 For example, the probability of ending a game on the first roll is

 ```
 P(7 or 11 or 2 or 3 or 12) = 6/36 +2/36 +1/36 +2/36 +1/36 = 1/3
 ```

 With 200,000 games, we expect `200000/3=66666.7` games to finish on
 the first roll. Again, see [here](http://mathforum.org/library/drmath/view/56534.html).
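The same reasoning extends to every throw count. A sketch, assuming fair dice: a game ends on throw N>1 if a point was set on the first throw, the next N-2 throws were neither the point nor 7, and throw N was one of the two. The function name `p_end_on` is made up for this example.

```python
from fractions import Fraction

GAMES = 200_000
prob = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

def p_end_on(n):
    """Probability that a game ends on exactly throw n (n >= 1)."""
    if n == 1:
        return sum(prob[s] for s in (2, 3, 7, 11, 12))
    total = Fraction(0)
    for point in (4, 5, 6, 8, 9, 10):
        stop = prob[point] + prob[7]   # chance that a later throw ends the game
        total += prob[point] * (1 - stop) ** (n - 2) * stop
    return total

probs = [p_end_on(n) for n in range(1, 21)]
probs.append(1 - sum(probs))           # cell 21 lumps all games of 21+ throws
expected = [float(GAMES * q) for q in probs]

for n, e in enumerate(expected, start=1):
    print(n, round(e, 1))
```

The printed values should line up with the "Expected" column of the example output.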

 The chi-squared test measures the "goodness of fit" of the observed
 counts to the expected counts based on the above calculated
 probabilities ([source](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test)).

 Each "Throw count" row contributes `(O-E)^2/E` to the chi-squared test
 statistic (e.g. `(66910-66666.7)^2/66666.7 ≈ 0.888` for Throw
 count=1), and those contributions are summed to give the test
 statistic for the entire set of 200,000 games as a whole
 ([source](https://en.wikipedia.org/wiki/Normal_distribution#Combination_of_two_or_more_independent_random_variables)).
 With 21 cells, there are 20 degrees of freedom.

 If doing this by hand, you would take the chisq test statistic value
 and degrees of freedom, look them up on a chisq table, and get a
 p-value. The p-value is given here already.
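Instead of a table lookup, the statistic and its p-value can be recomputed from the Observed and Expected columns. This sketch uses a closed form for the chi-squared upper tail that only holds for an even number of degrees of freedom (which 20 is); `chi2_upper_tail` is a helper written for this example. Because the Expected column is rounded to one decimal, the result differs slightly from the printed 21.82.

```python
import math

observed = [66910, 37869, 26834, 19219, 13753, 9788, 7137, 5249, 3604, 2634,
            1968, 1399, 1027, 712, 515, 335, 269, 216, 152, 106, 304]
expected = [66666.7, 37654.3, 26954.7, 19313.5, 13851.4, 9943.5, 7145.0,
            5139.1, 3699.9, 2666.3, 1923.3, 1388.7, 1003.7, 726.1, 525.8,
            381.2, 276.5, 200.8, 146.0, 106.2, 287.1]

# Sum of (O-E)^2/E over all 21 cells.
chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

def chi2_upper_tail(x, k):
    """P(X > x) for a chi-squared variable with even k degrees of freedom."""
    if k % 2 != 0:
        raise ValueError("closed form used here needs even k")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(k // 2))
    return math.exp(-half) * terms

p_value = chi2_upper_tail(chisq, dof)
print(f"chisq={chisq:.2f} dof={dof} p={p_value:.3f}")
```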

 Again, p<0.05 indicates a problem. Here `p=0.35`, so the throw counts
 are consistent with what fair dice would produce.