Random number generator: Difference between revisions

From Citizendium
Jump to navigation Jump to search
imported>Sandy Harris
imported>Sandy Harris
(→‎Cryptography: remove simulation info)
Line 99: Line 99:
Cryptographic [[one-time pad]]s preferably are generated from physical random phenomena. For high-volume [[cipher]]s, however, reproducibility is needed, so, while the initialization variables tend to be truly random, a key generator for a [[stream cipher|stream]] or [[block cipher]] uses a demonstrably strong pseudo-random number generator.
Cryptographic [[one-time pad]]s preferably are generated from physical random phenomena. For high-volume [[cipher]]s, however, reproducibility is needed, so, while the initialization variables tend to be truly random, a key generator for a [[stream cipher|stream]] or [[block cipher]] uses a demonstrably strong pseudo-random number generator.


If it is desired to do repeated runs of a simulation, either for software testing or to evaluate variable components under repeatable condition, a PRNG that will reuse the same seed to produce the same sequence is useful.
==References==
==References==
{{reflist|2}}
{{reflist|2}}

Revision as of 23:01, 12 October 2008

This article is developing and not approved.
Main Article
Discussion
Related Articles  [?]
Bibliography  [?]
External Links  [?]
Citable Version  [?]
 
This editable Main Article is under development and subject to a disclaimer.

While defining true randomness can drift from computer science and mathematics into philosophy, Bruce Schneier has a reasonable criterion for a source of random numbers: "It cannot be reliably reproduced."[1] This article focuses first on the idea of randomness, rather than on how the random or pseudo-random sequence is produced. For properly selected applications, it may be possible to be adequately random with a technique that does not depend on a true random physical process. In other cases, it may be practical to use a combination of physical and computational methods.

Choosing random quantities to foil a resourceful and motivated adversary is surprisingly difficult. — IETF Best Current Practice 106[2]

A truly random one-time pad may be generated with a combination of measurement of a random physical phenomenon (e.g., thermal noise or radioactive disintegrations), which is then captured by a computer and put into high-density storage. An expert in the physical phenomenon being measured needs to be consulted to determine if postprocessing is needed.

There are a wide range of applications for random or pseudo-random numbers, with various degrees of randomness. A computer game can give an apparently different scenario whenever played, with some simple fairly random physical inputs, such as selected bits from the computer's time of day clock, and, perhaps, the time between the last several keystrokes or mouse movements. In other cases, such as extremely critical cryptography, only the best uncorrelated physical random sequences will do. See related articles for examples of applications.

Random sequences from physical phenomena

Real-world physical phenomena often exhibit randomness, although some it is possible to be random in the short term, yet may exhibit some unexpected patterns or self-similarity. The number of cosmic rays striking an object at a given second is as random a process as is known. The number of disintegration events, in a radioactive material, is often considered random, but consider that if the amount of radioisotope is finite, as more and more half-lives pass, there will be fewer disintegrations. There is no simple answer to the question of whether a downward trend in disintegrations makes that sequence nonrandom. "Many natural sources of random bits may be defective in that they produce biased output bits (so that the probability of a one is different than the probability of a zero), or bits which are correlated with each other." [3]

There are, however, methods that can be used to postprocess such a source to remove the correlation.

One widespread technique is due to Von Neumann. Take the input bits in pairs, discard 11 or 00 pairs, output one if the input is 10 and zero if the input is 01. As long as the input bits are independent, this eliminates bias. For example, suppose the input is 90% ones. The chance of a 10 sequence is .9*.1 while that of a 01 sequence is .1*.9; the two are exactly equal so the output is unbiased. However, this does not hold if the input bits are correlated. Also, in the presence of large bias, the technique can be inefficient. In our example with 90% bias, 81% of pairs are 11 and 1% are 00; these are thrown away.

Another common technique is to apply a cryptographic hash to the input. Suppose we again have 90% bias and we estimate that there is at least .2 bits of randomness per input byte. Hash 800 input bytes and you should have 800*.2=160 bits of randomness. If your hash is SHA with 160-bit output, the whole hash output should be adequately random. In practice, one might hash 1000 bytes to give a safety margin, and use only part of the hash output as random output to avoid giving an enemy enough information to determine the internal state of the hash.

Manual generation

In the past, one-time pads were generated by typists who were told to type randomly, but, in practice, had patterns. Limited to manual methods, a not-unreasonable way is to put numbered balls in a bowl, having a blindfolded person take it out, the number recorded, the ball returned to the bowl, and the bowl shaken thorougly before taking the next ball.

This is a point mostly of historical interest, but typist patterns did cause weakness in some one-time pads. It is hard to imagine why manual generation would be useful today.

Pseudorandom number generators

In practical computing, it is convenient to use a pseudorandom number generator (PRNG), which produces a sufficiently varying sequence that, for example, a computer game driven with that generator appears to produce different characteristics each time that it is played. Nevertheless, most PRNGs that are operating system utilities, such as UNIX random(), will eventually repeat the sequence over a sufficiently long period of time. In addition, the programmer may have the choice of giving an explicit initialization value, which may be called a seed or a nonce, to the PRNG.

Most general-purpose PRNGs will produce the same sequence as long as they are given the same seed, which can even be useful for some software development purposes (see pitfalls in computer simulation, or for such things as being sure that a series of psychological research volunteers all see the same set of events. A given PRNG, however, may be told to use some reasonably random physical event in the computer as the seed, such that the seed is unpredictable.

Donald Knuth, an authority on PRNGs, quoted John von Neumann in a tongue-in-cheek but realistic view of the limits of PRNGs[4]:

Anyone who considers arithmetical method of producing random digits is, of course, in a state of sin — John von Neumann (1951)

It is possible to write very good and very bad pseudorandom number generators. Known pitfalls need to be avoided, both in initialization and in computing the next number.

Known bad ideas

Several common types of PRNG — such as linear congruential generators or linear feedback shift registers — cannot be used directly in cryptographic applications. With well-chosen parameters, they have reasonable statistical properties and are useful for such things as controlling Monte Carlo simulations. However, in the presence of an adversary — someone who wants to break your cryptosystem or cheat at your game — they are inadequate, easily broken. They might be used as components in some more complex generator, but they cannot safely be used alone in such applications.

"Cryptanalytic Attacks on Pseudorandom Number Generators" [1] by Kelsey, Schneier, Wagner, and Hall enumerates attacks and assesses vulnerabilities in widely used generators.

Some PRNGs

Linear congruential

Knuth illustrates the most commonly used PRNG, which he attributes to D.H. Lehmer in 1949[5].

A linear congurental generator creates successive outputs on using the formula:

on = a * on-1 + b  modulo c

Choice of the constants a, b, c and the seed o0 is critical; bad choices can give very weak generators.

OFB and CTR Sequences

Encryption devices can be used as practical and strong PRNGs, if the output applies some type of masking so an adversary cannot know the full internal state of the genertor.[6]

Blum Blum Shub

This is the strongest known generator, with the advantage of relative simplicity, but the disadvantage of being computationally intensive. If it is used to generate numbers that are not needed frequently, it may be useful.[7] [8]

The possibility also exists that one or more processors could be dedicated to BBS computation.

The possibility of pseudo-random methods that are adequately random

Some research indicates that in carefully selected situations, a pseudo-random number generator, which meets a number of cryptographic criteria, may be adequately random for security. The proofs are complex and not universally accepted, but may be true for a useful number of situations.[9] [3]

Random numbers that follow particular statistical distributions

In many applications, the exact sequence is desired to be as random as possible, but the sequence may be under additional constraints to meet specific statistical requirements.

See generating values for normal random variables for an example of generating random numbers that fit a desired distribution.

Testing for Randomness

It is entirely reasonable to have individual numbers recur in a random sequence — a random sequence may even have a "run" of a number such as 100 354 972 972 972 155 579. Indeed, when typists have been asked to type randomly, they may intuitively avoid runs and produce a less random sequence.

Knuth presents an extensive range of tests, of which a few are mentioned here. It is entirely reasonable to have individual numbers recur in a random sequence — a random sequence may even have a "run" of a number such as 100 354 972 972 972 155 579. Indeed, when typists have been asked to type randomly, they may intuitively avoid runs and produce a less random sequence.It is entirely reasonable to have individual numbers recur in a random sequence — a random sequence may even have a "run" of a number such as 100 354 972 972 972 155 579. Indeed, when typists have been asked to type randomly, they may intuitively avoid runs and produce a less random sequence.

Chi-squared statistic

Frequency test

Serial test

Gap test

Partition test

Applications

Computer simulation

Cryptography

Cryptographic one-time pads preferably are generated from physical random phenomena. For high-volume ciphers, however, reproducibility is needed, so, while the initialization variables tend to be truly random, a key generator for a stream or block cipher uses a demonstrably strong pseudo-random number generator.

References

  1. Schneier, Bruce (Second Edition, 1996), Applied Cryptography: Protocols, Algorithms, and Source Code in C, John Wiley & Sons p. 46
  2. Eastlake, D. 3rd; J. Schiller & S. Crocker (June 2005), Randomness Requirements for Security, RFC 4086; IETF Best Current Practice 106 p. 1
  3. 3.0 3.1 Goldwasser, Shafi & Mihir Bellare (July 2008), Chapter 3, Pseudo-random bit generators, Lecture Notes on Cryptography p. 39 Cite error: Invalid <ref> tag; name "Goldwasser2008" defined multiple times with different content
  4. Knuth, Donald (3rd Edition, 1998), Chapter III, Random Numbers, The Art of Computer Programming, Volume II: Seminumerical Algorithms, Addison-Wesley
  5. Knuth, p. 10
  6. Eastlake et al. p. 24
  7. Eastlake, p. 25
  8. Goldwasser & Ballare, p. 47
  9. Eastlake et al., pp. 23-27