Everything you need to know for testing ADAM with different popular RNG test suites.
Click on a test to learn more.
Status | Name | Description |
---|---|---|
✅ PASS | ent | Calculates various values for a supplied pseudorandom sequence, like entropy, arithmetic mean, Monte Carlo value for pi, and more. Included with the CLI tool. |
✅ PASS | NIST STS | Set of statistical tests of randomness for generators intended for cryptographic applications. The general standard. |
✅ PASS | Dieharder | Cleaned-up version of George Marsaglia's DIEHARD suite, plus the STS tests from above and the RGB tests. Mostly of historical interest. |
✅ PASS | testunif - gjrand | Extensive tests not within DIEHARD or NIST that analyze the randomness of bit sequences. Very particular and supposedly pretty tough! |
✅ PASS | testfunif - gjrand | Another series of tests specifically for analyzing the uniformity of double precision floating point numbers in [0.0, 1.0). |
✅ PASS | CACert | Cryptographic certificate provider that maintains test results for various RNGs. ADAM is among the top 8 crypto RNG entries, each of which had a flawless test run. |
✅ PASS | Rabbit - TestU01 | For analyzing bitstreams. Treats input as one large binary sequence and applies 40 different statistical tests. |
✅ PASS | SmallCrush - TestU01 | 10 different tests. Small but effective at spotting deficiencies missed by other test suites. Lots of good generators fail this one. |
✅ PASS | Crush - TestU01 | 96 different tests. Very thorough and stringent, requires around 2³⁵ numbers in total. |
✅ PASS | BigCrush - TestU01 | 106 different tests. Even more stringent, requires "close to" 2³⁸ numbers in total. |
✅ PASS | PractRand | Final Boss - The most rigorous and effective test suite for PRNGs available. Can test sequences of unlimited length - ADAM has been tested up to 16TB! |
This is a collection of miscellaneous heuristics and empirical tests I implemented for reporting some notable properties of an output sequence. It's nowhere near as important as the feedback from the tried-and-true testing suites above, but it's good for quick information about the sequences you generate.
Additionally, the ENT framework is integrated into this collection, for a total of 23 pieces of information returned to the user (a sketch of one of these checks follows the list):
- Total sequences generated (also reported in terms of numbers generated per width)
- Monobit frequency (including bit runs and longest runs of 0s/1s)
- Seed & Nonce
- Range distribution
- Max and min values
- Parity: even and odd number ratio
- Runs: total # of runs and longest run (increasing/decreasing)
- Average gap length
- Most and least common bytes
- 64-bit floating point frequency + average FP value + distribution
- 64-bit floating point permutations + total permutations + standard deviation
- 32-bit floating point Max-of-8 test + total sets + most/least common position
- Entropy (ENT)
- Chi-square (ENT)
- Arithmetic mean (ENT)
- Monte Carlo value for pi (ENT)
- Serial correlation (ENT)
- 4-bit Saturation Point Test
- 8-bit Maurer Universal Test
- 16-bit Topological Binary Test
- 32-bit Von Neumann Successive Difference Test
- 64-bit SAC Test
- 128-bit Walsh-Hadamard Transform Test
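For illustration, here's a minimal sketch of the monobit idea: count the proportion of one-bits in a chunk of raw output, which should hover near 0.5 for a random sequence. This is not ADAM's actual implementation; the buffer size and the GCC/Clang popcount builtin are just choices for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Proportion of set bits across a buffer of 64-bit words.
   A random sequence should hover around 0.5. */
static double ones_ratio(const uint64_t *buf, size_t words)
{
    uint64_t ones = 0;
    for (size_t i = 0; i < words; ++i)
        ones += (uint64_t)__builtin_popcountll(buf[i]);
    return (double)ones / (double)(words * 64);
}

int main(void)
{
    const size_t words = 1 << 16; /* 512KB of input - arbitrary for the example */
    uint64_t *buf = malloc(words * sizeof *buf);
    if (!buf)
        return 1;

    /* Read raw binary piped in, e.g.: adam -b | ./monobit */
    if (fread(buf, sizeof *buf, words, stdin) != words) {
        fprintf(stderr, "short read\n");
        free(buf);
        return 1;
    }
    printf("ones ratio: %f (expect ~0.5)\n", ones_ratio(buf, words));
    free(buf);
    return 0;
}
```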
The STS can be downloaded here. This is a good little quick-start guide.
Results for testing 10MB, 100MB, 500MB, and 1GB are available in the tests subdirectory. Sequences above 1GB weren't tested because, based on my reading, the NIST STS runs into performance and accuracy issues with very large sequences.
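For reference, a rough sketch of how a run might look. The assess prompts vary slightly by STS version, and the head byte count here is just an illustrative size:

```
adam -b | head -c 125000000 > adam.bin   # ~10^9 bits of raw output
./assess 1000000                         # stream length of 1,000,000 bits
# At the prompts: choose [0] Input File and point it at adam.bin,
# select the tests to apply, set the number of bitstreams,
# and pick the binary input mode.
```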
Some results for the NonOverlappingTemplate tests may be borderline. This is expected: the output should be random, which means some runs will, statistically, be weaker than others. The overall result is what matters most, and to confirm it for yourself you can run the Dieharder test suite, which includes the STS tests and retries any WEAK results until a conclusive verdict can be reached.
NIST has created a manual detailing each of the different tests, available for free download.
adam -b | dieharder -a -Y 1 -k 2 -g 200
An explanation of the options can be found in Dieharder's man page. I couldn't get the Dieharder binary working on macOS, so unfortunately this test may be limited to Linux users, unless you try using a Docker container (if I'm wrong, please let me know).
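If you do go the container route, an untested sketch along these lines might work, assuming Docker is installed and Debian's dieharder package is available:

```
# Generate on the host, test inside a Linux container
adam -b | docker run --rm -i debian:stable sh -c \
  'apt-get -qq update && apt-get -qq install -y dieharder >/dev/null && \
   dieharder -a -Y 1 -k 2 -g 200'
```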
First download gjrand here, and then come back to this section. I tested with gjrand.4.3.0.0, so that's the version I'll proceed with.
Using gjrand is easy, but navigating the distribution and figuring out what to build and how to build it is annoying, so here's a quick little guide.
I recommend reading or skimming the README.txt file at the top level of this directory. It helps you understand the results.
On Unix systems building the binary should just work, so run:
cd gjrand.4.3.0.0/testunif/src
./compile
cd ..
adam -b | ./mcp --<SIZE_OPTION>
where SIZE_OPTION is one of the following (see the example after the list):
--tiny 10MB
--small 100MB
--standard 1GB
--big 10GB
--huge 100GB
--tera 1TB
--ten-tera 10TB
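For example, a standard 1GB run:

```
adam -b | ./mcp --standard
```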
Results for sizes up to 100GB are available in the tests subdirectory.
Again, I recommend reading or skimming the README.txt file at the top level of this directory.
To build and run:
cd gjrand.4.3.0.0/testfunif/src
./compile
cd ..
adam -f -a
File name: ftest
Output type (1 = ASCII, 2 = BINARY): 2
Sequence Size (x 1000): 1500
Scaling factor: 1
./mcpf --tiny < ftest
The size options are as follows (see the example after the list):
--tiny 1M doubles
--small 10M doubles
--standard 100M doubles (default)
--big 1G doubles
--huge 10G doubles
--even-huger 100G doubles
--tera 1T doubles.
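For example, a --small run needs at least 10 million doubles, so answer the Sequence Size prompt with something like 15000 (x 1000) to keep the same 1.5x headroom as the run above, then:

```
./mcpf --small < ftest
```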
Results for sizes up to 10G (10 billion) doubles are available in the tests subdirectory.
Finally, here is a summary of the meaning of gjrand's P-values.
The one-sided P-value is supposed to be a "single figure of merit" that
summarises a test result for people in a hurry. It doesn't usually encapsulate
all the information found by a particular test. Instead it usually focuses on
the most common or most serious failure modes. Some tests produce several
other results before the P-value, and these are worth reading and
understanding if you really want to know what's going on. In many cases the
actual tests are two-sided. But the P-value reported will always be one-sided.
Some are not very accurate, but should be close enough to support the rough
guide to interpretation below.
Interpretation of one-sided P-values:

- P > 0.1: This P-value is completely OK. Don't try to compare P-values in this range. "0.9 is higher than 0.4, so 0.9 must be better." Wrong. "0.5 is better than 0.999 because 0.5 is right in the middle of the range and therefore more random." Wrong. Each one-sided P-value test was designed to return an almost arbitrary value P > 0.1 if no bad behaviour was detected. If P > 0.1, you know no bad behaviour was detected, and you shouldn't try to interpret the result as anything more than that. Of course, a good P-value like this doesn't prove the generator is good. It is always possible trouble will be spotted with a different test, or more data.
- P <= 0.1: There might be some cause for concern. In this range, the lower P is, the worse it is.
- 0.001 < P <= 0.1: Not really proof that anything is wrong, but reason for vague suspicion. Further testing with a different random seed and perhaps a larger data size is recommended. If a generator ALWAYS produces P-values in this range or lower for a particular test, after many runs with different seeds, then it's probably broken.
- 1.0e-6 < P <= 0.001: Like the above, but more so. Here you should start to worry.
- 1.0e-12 < P <= 1.0e-6: There is a very remote possibility that the generator is OK, but you should be seriously worried. After you see a P-value this low, you can only rehabilitate the generator if you run the test that failed for many different seeds and it repeatedly produces good P-values with no more seriously bad ones.
- P <= 1.0e-12: You are most unlikely to see a good generator do this on a good test even once in your lifetime. You would be safe to reject any generator that does this on a properly calibrated test.
Note: For the next two test suites, I assume you've cloned the testingRNG repository.
To build from the root of testingRNG:
cd testu01
make
You might get a linker error when trying to install the examples, but just scroll up and make sure you see something like "Libraries have been installed in ". There should now be a folder named build present.
Next, cd back to the build folder in the ADAM directory:
cp ./libadam.a <TESTU01_PATH>/build/lib/libadam.a
cp ./adam.h <TESTU01_PATH>/build/include/adam.h
where <TESTU01_PATH> is the path of the top-level TestU01 folder on your system. Once you've done that, create a new C source file in <TESTU01_PATH>/build with the following:
#include "include/TestU01.h"
#include "include/adam.h"
static adam_data data;
unsigned int adam_get(void)
{
return adam_int(data, UINT32);
}
int main()
{
data = adam_setup(NULL, NULL);
// Create TestU01 PRNG object for our generator
unif01_Gen *gen = unif01_CreateExternGenBits("adam", adam_get);
// Run the tests.
bbattery_Crush(gen);
// Clean up.
unif01_DeleteExternGenBits(gen);
adam_cleanup(data);
return 0;
}
TestU01 works with 32-bit data, so we need to set the width parameter accordingly. Now compile and run (swap crush.c with the name of your file):
gcc -std=c99 -Wall -O3 -o crush crush.c -Iinclude -Llib -ltestu01 -lprobdist -lmylib -lm -ladam -march=native
./crush
Crush should start running if you did everything correctly. It will take a while. You can easily swap out Crush for other batteries by changing the bbattery_Crush function call to bbattery_SmallCrush, bbattery_Rabbit, etc. For more details about the test suite, consult the guide, which is also available in the doc folder.
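One caveat when swapping in Rabbit: per the TestU01 guide, bbattery_Rabbit takes the number of bits to test as a second argument, so the call would look something like:

```c
bbattery_Rabbit(gen, 1e9); // test 10^9 bits from the generator
```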
To build from the root of testingRNG:
cd practrand
make
Then, assuming adam is in your path (if not, copy it to the build folder and replace the adam call below with ./adam):
adam -b | ./RNG_test stdin8
Since ADAM's default work unit is 64 bits, stdin8 is the right size value to pass according to the documentation. If you'd like to try testing at other integer widths, though, go ahead!
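If you'd rather cap the run instead of letting it go indefinitely, RNG_test accepts (to my knowledge) a -tlmax option that sets the maximum amount of data tested, e.g.:

```
adam -b | ./RNG_test stdin8 -tlmax 512GB
```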
PractRand also includes a wealth of great documentation about its test results and thought processes. See any of the text files with the Test_ prefix to learn more about how the tests work, how the author determines results, the specific grading system, and the author's thoughts on the other test suites listed here.