
Overview of the Use of Cryptography in the PRNG
-----------------------------------------------

There are two main design points to take into consideration with this
pseudo-random generator.  The first is that it relies as much on
cryptography as on the quality of its inputs for its security and for the
quality of its data.  The second is that the output mechanism and the entropy
pool are kept as distinct as possible.

The operation of the PRNG is as follows:

Data with some arbitrary amount of entropy is passed to the input routine.  
This data is then hashed into the entropy pool, which is simply an SHA1 
hashing context.  Since every output bit relies on every input bit in this 
hash, we can say that the total entropy of the pool is equal to the sum of 
the entropies of the inputs, up to 160 bits, which is the size of the pool.
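
As a minimal sketch of the pool described above (the class and attribute
names are hypothetical and not taken from the actual implementation), the
pool can be modelled as a running SHA1 context together with an entropy
estimate that saturates at 160 bits:

```python
import hashlib

POOL_BITS = 160  # size of a SHA1 digest, and therefore of the pool, in bits


class EntropyPool:
    """Hypothetical pool: a running SHA1 context plus an entropy estimate."""

    def __init__(self):
        self._ctx = hashlib.sha1()
        self.entropy_bits = 0

    def add(self, data: bytes, estimate_bits: int) -> None:
        # Hash the input into the pool.
        self._ctx.update(data)
        # The estimate is the sum of the input estimates, capped at 160 bits.
        self.entropy_bits = min(POOL_BITS, self.entropy_bits + estimate_bits)


pool = EntropyPool()
pool.add(b"keyboard timing sample", 12)
pool.add(b"mouse movement sample", 200)  # the estimate now saturates
```

After the second call the estimate stays at 160 bits, no matter how much
more input is credited to the pool.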

Data is fed into this pool from a variety of sources, which can include
but are not limited to: keyboard event timing, mouse movement and mouse
click timing, system usage statistics, and network usage statistics.  Each
source of entropy has its own counters for total input bytes and total bits
of entropy.  It should be noted that the total bits of entropy is calculated
in two different ways.  The first is simply the sum of the estimates
provided by the user.  The second is obtained by compressing the inputs,
where it is assumed that the maximum entropy is approximately half the size
of the compressed output.
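
The per-source bookkeeping might look like the following sketch.  The text
does not say which compressor is used; zlib is assumed here purely for
illustration, and SourceStats and its fields are hypothetical names:

```python
import zlib


class SourceStats:
    """Hypothetical counters kept for one entropy source."""

    def __init__(self):
        self.total_bytes = 0   # total input bytes from this source
        self.user_bits = 0     # sum of the caller-supplied estimates
        self._compressor = zlib.compressobj()  # assumption: zlib
        self._compressed_bytes = 0

    def record(self, data: bytes, user_estimate_bits: int) -> None:
        self.total_bytes += len(data)
        self.user_bits += user_estimate_bits
        self._compressed_bytes += len(self._compressor.compress(data))

    def compression_bits(self) -> int:
        # Flush a copy so the real compressor can keep accepting input.
        pending = len(self._compressor.copy().flush())
        compressed_bits = (self._compressed_bytes + pending) * 8
        # Assume at most about half of the compressed size is entropy.
        return compressed_bits // 2


stats = SourceStats()
stats.record(b"\x00" * 1000, user_estimate_bits=8000)  # overly optimistic
```

For this highly repetitive input the compression-based estimate comes out
far below the 8000 bits claimed by the caller, which is the point of keeping
both figures.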

Reseeding will take place when allowed by the user and when there is enough
entropy.  A very conservative estimate of the total entropy is calculated by
taking the smaller of the two estimates calculated as above for each source,
discarding a configurable number of the largest per-source values, and then
summing the rest.  When this value is above some threshold, reseeding is
allowed.  Both the threshold and the number of sources discarded can be set
by the programmer at compile time.
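
The conservative estimate can be sketched as follows.  The function names,
the default of one discarded source, and the sample figures are all
illustrative assumptions; only the 100-bit threshold is taken from later in
this document:

```python
THRESHOLD_BITS = 100  # suggested threshold value
DISCARD = 1           # assumption: number of largest sources to ignore


def conservative_entropy(sources, discard=DISCARD):
    """sources: list of (user_bits, compression_bits), one pair per source."""
    # Take the smaller of the two estimates for each source.
    per_source = sorted(min(user, comp) for user, comp in sources)
    # Drop the `discard` largest values, then sum whatever remains.
    kept = per_source[:max(0, len(per_source) - discard)]
    return sum(kept)


def may_reseed(sources, user_allows, threshold=THRESHOLD_BITS,
               discard=DISCARD):
    # Reseed only if the user allows it and the estimate exceeds the threshold.
    return user_allows and conservative_entropy(sources, discard) > threshold


sources = [(40, 60), (80, 30), (200, 150), (25, 25)]
estimate = conservative_entropy(sources)        # 25 + 30 + 40 = 95
allowed = may_reseed(sources, user_allows=True)  # False: 95 is not above 100
```

Here the largest source (150 bits) is thrown away, so the remaining 95 bits
fall just short of the threshold and no reseed takes place.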

The user can also set the amount of time that they want the reseed to take.  
While this does not increase the complexity of an attack (unless you count 
the reseed time as a part of the state), it does make each attack more
time-consuming.

The reseed is done by first churning the pool with output from the PRNG
(described later). The hash of the pool is then completed, and the digest
value is fed back into the pool. The digest of this hash is also used to
reseed the output generator.

This is done trivially by using the hash as a 20 byte secret IV. The first 20
bytes of output are simply the SHA1 digest of the IV.
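
A rough sketch of the reseed steps follows.  The function name and the churn
argument are hypothetical, and the text does not say whether the pool
context is reinitialised before the digest is fed back; this sketch simply
keeps hashing into the same context:

```python
import hashlib


def reseed(pool, churn: bytes):
    """pool: a hashlib SHA1 object; churn: output taken from the PRNG."""
    pool.update(churn)          # churn the pool with PRNG output
    digest = pool.digest()      # complete the hash of the pool (20 bytes)
    pool.update(digest)         # feed the digest back into the pool
    iv = digest                 # the digest becomes the 20 byte secret IV
    first_output = hashlib.sha1(iv).digest()  # first 20 bytes of output
    return iv, first_output


pool = hashlib.sha1(b"collected entropy")
iv, first_output = reseed(pool, b"churn bytes from the generator")
```

Only iv and first_output cross over to the output generator; nothing else
from the pool is visible on the output side.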

Note that this is the only connection between the entropic inputs and output.

The output generator itself is relatively simple.  It consists simply of the
above secret state (IV) and a 20 byte output buffer (initially created as
above). When those 20 bytes have been exhausted, the next 20 bytes are
generated by taking the digest of the concatenation of the IV and the previous
output buffer.  Backtracking is avoided by recreating the internal
state based on the generator's own outputs after a fixed number of bytes
has been output.  This number is currently 500, and can be set by the
programmer.  After the generator has output this number of bytes, it produces
20 more bytes, which it then uses to create a new state for itself in the
same way as a reseed does: the 20 bytes become the new secret IV.  Note that
to backtrack through this change-over, an attacker would have to find the
previous state of the generator based solely on the current state, which
would require inverting the hash. This is completely out of the question with
current published techniques.
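
The generator loop can be sketched as below.  This is an illustration under
the assumptions stated in the comments, not the actual implementation; the
class, method, and constant names are hypothetical:

```python
import hashlib

BLOCK = 20        # SHA1 digest size in bytes
GATE_BYTES = 500  # bytes of output between state changes (current default)


class OutputGenerator:
    def __init__(self, iv: bytes, gate_bytes: int = GATE_BYTES):
        if gate_bytes < 1:
            raise ValueError("gate_bytes must be positive")
        self.gate_bytes = gate_bytes
        self._rekey(iv)

    def _rekey(self, iv: bytes) -> None:
        self._iv = iv                             # secret state
        self._block = hashlib.sha1(iv).digest()   # first output buffer
        self._pos = 0
        self._since_gate = 0

    def _raw(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            if self._pos == BLOCK:
                # Next buffer = SHA1(IV || previous output buffer).
                self._block = hashlib.sha1(self._iv + self._block).digest()
                self._pos = 0
            take = min(n - len(out), BLOCK - self._pos)
            out += self._block[self._pos:self._pos + take]
            self._pos += take
        return bytes(out)

    def read(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            take = min(n - len(out), self.gate_bytes - self._since_gate)
            out += self._raw(take)
            self._since_gate += take
            if self._since_gate == self.gate_bytes:
                # Generate 20 more bytes and use them as the new IV.
                self._rekey(self._raw(BLOCK))
        return bytes(out)


gen = OutputGenerator(b"\x01" * 20)
data = gen.read(600)  # crosses one state change after 500 bytes
```

The 20 bytes consumed at each state change are never handed to the caller,
so the new IV cannot be read off the output stream.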

The strength of the output relies on the quality of SHA1. It is known
that SHA1 produces high quality "random-looking" outputs, and an attacker
wanting to find the secret state would have to find the inverse to an
arbitrary output from the one way function, which is fairly implausible.
Note that any dictionary attacks are made more time-consuming by the fact
that the initial SHA1 state is not the standard one in almost all cases.

An attacker who somehow discovers the secret state of the generator will be
able to do some damage. The backtracking range is limited as stated above,
and users can themselves set the rate at which backtracking is prevented,
allowing them to make the usual time/security trade-offs. However, all of
the future values of the output will be known to the attacker, up to the
next reseed.

The biggest concern in the current design is the frequency with which
reseeding will be possible. For the suggested threshold value of 100 bits,
only 12 bytes of output are guaranteed to be absolutely secure under this
system.  If this much entropy cannot be acquired quickly enough (remembering
that we are using a very conservative estimate of our entropy), the output
keys, hashes, etc. could possibly be attacked more efficiently by brute-force
cracking the generator state.  The current assumption (read: hope) is that
those who are demanding values from the PRNG at a high rate are also
producing entropy at a similar rate, or will be willing to wait longer for
their values and allow a slow poll. This will need to be examined in light
of the results of the testing of the quality of our current entropy sources,
which is still underway (more details upon request).
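
The 12 byte figure above follows from simple arithmetic: 100 bits of entropy
cover 12 whole bytes (96 bits), with 4 bits left over.  A small sketch, using
a hypothetical helper name:

```python
def guaranteed_secure_bytes(threshold_bits: int) -> int:
    # Whole bytes of output that can be backed by threshold_bits of entropy.
    return threshold_bits // 8


secure_bytes = guaranteed_secure_bytes(100)  # 100 // 8 == 12
```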