The library provides a large collection of random number generators which can be accessed through a uniform interface. Environment variables allow you to select different generators and seeds at runtime, so that you can easily switch between generators without needing to recompile your program. Each instance of a generator keeps track of its own state, allowing the generators to be used in multi-threaded programs. Additional functions are available for transforming uniform random numbers into samples from continuous or discrete probability distributions such as the Gaussian, log-normal or Poisson distributions.
These functions are declared in the header file `gsl_rng.h'.
In 1988, Park and Miller wrote a paper entitled "Random number generators: good ones are hard to find." [Commun. ACM, 31, 1192--1201]. Fortunately, some excellent random number generators are available, though poor ones are still in common use. You may be happy with the system-supplied random number generator on your computer, but you should be aware that as computers get faster, requirements on random number generators increase. Nowadays, a simulation that calls a random number generator millions of times can often finish before you can make it down the hall to the coffee machine and back.
A very nice review of random number generators was written by Pierre L'Ecuyer, as Chapter 4 of the book: Handbook on Simulation, Jerry Banks, ed. (Wiley, 1997). The chapter is available in postscript from L'Ecuyer's ftp site (see references). Knuth's volume on Seminumerical Algorithms (originally published in 1968) devotes 170 pages to random number generators, and has recently been updated in its 3rd edition (1997). It is brilliant, a classic. If you don't own it, you should stop reading right now, run to the nearest bookstore, and buy it.
A good random number generator will satisfy both theoretical and statistical properties. Theoretical properties are often hard to obtain (they require real math!), but one prefers a random number generator with a long period, low serial correlation, and a tendency not to "fall mainly on the planes." Statistical tests are performed with numerical simulations. Generally, a random number generator is used to estimate some quantity for which the theory of probability provides an exact answer. Comparison to this exact answer provides a measure of "randomness".
It is important to remember that a random number generator is not a "real" function like sine or cosine. Unlike real functions, successive calls to a random number generator yield different return values. Of course that is just what you want for a random number generator, but to achieve this effect, the generator must keep track of some kind of "state" variable. Sometimes this state is just an integer (sometimes just the value of the previously generated random number), but often it is more complicated than that and may involve a whole array of numbers, possibly with some indices thrown in. To use the random number generators, you do not need to know the details of what comprises the state, and besides that varies from algorithm to algorithm.
The random number generator library uses two special structs, gsl_rng_type which holds static information about each type of generator and gsl_rng which describes an instance of a generator created from a given gsl_rng_type.
The functions described in this section are declared in the header file `gsl_rng.h'.
gsl_rng * r = gsl_rng_alloc (gsl_rng_taus);
If there is insufficient memory to create the generator then the function returns a null pointer and the error handler is invoked with an error code of GSL_ENOMEM.
The generator is automatically initialized with the default seed, gsl_rng_default_seed. This is zero by default but can be changed either directly or by using the environment variable GSL_RNG_SEED (see section Random number environment variables).
The details of the available generator types are described later in this chapter.
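When seeding with gsl_rng_set, a seed s of zero selects the standard seed of the original implementation. For example, the original ranlux generator used a seed of 314159265, and so choosing s equal to zero reproduces this when using gsl_rng_ranlux.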
The following functions return uniformly distributed random numbers, either as integers or double precision floating point numbers. To obtain non-uniform distributions see section Random Number Distributions.
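gsl_rng_get returns a random integer from the generator r. The minimum and maximum values it can return depend on the algorithm used and can be determined using the auxiliary functions gsl_rng_max (r) and gsl_rng_min (r).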
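gsl_rng_uniform returns a double precision floating point number in the range [0,1), including 0.0 but excluding 1.0. The value is typically obtained by dividing the result of gsl_rng_get(r) by gsl_rng_max(r) + 1.0 in double precision. Some generators compute this ratio internally so that they can provide floating point numbers with more than 32 bits of randomness (the maximum number of bits that can be portably represented in a single unsigned long int).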
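The division described above can be sketched in plain C without the library. The helper name scale_to_unit_interval is hypothetical and is not a GSL function; the sketch also assumes that max fits in 32 bits, so that the division is exact in double precision.

```c
/* Sketch only: scale_to_unit_interval is a hypothetical helper, not part
   of GSL.  It assumes max <= 2^32 - 1. */

/* Map a raw integer x in [0, max] to a double in [0, 1).
   Dividing by max + 1.0 (rather than by max) keeps 1.0 out of the range. */
double
scale_to_unit_interval (unsigned long int x, unsigned long int max)
{
  return x / (max + 1.0);
}
```

With max = 4294967295 the smallest input maps to 0.0 and the largest to a value just below 1.0, which is why the upper end of the range is excluded.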
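gsl_rng_uniform_pos returns a positive double precision floating point number in the range (0,1), excluding both 0.0 and 1.0. The number is obtained by sampling the generator with the algorithm of gsl_rng_uniform until a non-zero value is obtained. You can use this function if you need to avoid a singularity at 0.0.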
If n is larger than the range of the generator then the function calls the error handler with an error code of GSL_EINVAL and returns zero.
The following functions provide information about an existing generator. You should use them in preference to hard-coding the generator parameters into your own code.
printf("r is a '%s' generator\n", gsl_rng_name (r));
would print something like r is a 'taus' generator.
gsl_rng_max returns the largest value that gsl_rng_get can return.
gsl_rng_min returns the smallest value that gsl_rng_get can return. Usually this value is zero. There are some generators with algorithms that cannot return zero, and for these generators the minimum value is 1.
    void * state = gsl_rng_state (r);
    size_t n = gsl_rng_size (r);
    fwrite (state, n, 1, stream);
    const gsl_rng_type **t, **t0;

    t0 = gsl_rng_types_setup ();

    printf("Available generators:\n");

    for (t = t0; *t != 0; t++)
      {
        printf("%s\n", (*t)->name);
      }
The library allows you to choose a default generator and seed from the environment variables GSL_RNG_TYPE and GSL_RNG_SEED and the function gsl_rng_env_setup. This makes it easy to try out different generators and seeds without having to recompile your program.
The function gsl_rng_env_setup reads the environment variables GSL_RNG_TYPE and GSL_RNG_SEED and uses their values to set the corresponding library variables gsl_rng_default and gsl_rng_default_seed. These global variables are defined as follows,
    extern const gsl_rng_type *gsl_rng_default;
    extern unsigned long int gsl_rng_default_seed;
The environment variable GSL_RNG_TYPE should be the name of a generator, such as taus or mt19937. The environment variable GSL_RNG_SEED should contain the desired seed value. It is converted to an unsigned long int using the C library function strtoul.
If you don't specify a generator for GSL_RNG_TYPE then gsl_rng_mt19937 is used as the default. The initial value of gsl_rng_default_seed is zero.
Here is a short program which shows how to create a global generator using the environment variables GSL_RNG_TYPE and GSL_RNG_SEED,
    #include <stdio.h>
    #include <gsl/gsl_rng.h>

    gsl_rng * r;  /* global generator */

    int
    main (void)
    {
      const gsl_rng_type * T;

      /* read GSL_RNG_TYPE and GSL_RNG_SEED from the environment */
      gsl_rng_env_setup();

      T = gsl_rng_default;
      r = gsl_rng_alloc (T);

      /* the seed and the generator output are unsigned long int, hence %lu */
      printf("generator type: %s\n", gsl_rng_name (r));
      printf("seed = %lu\n", gsl_rng_default_seed);
      printf("first value = %lu\n", gsl_rng_get (r));
      return 0;
    }
Running the program without any environment variables uses the initial defaults, an mt19937 generator with a seed of 0,
    bash$ ./a.out
    generator type: mt19937
    seed = 0
    first value = 2867219139
By setting the two variables on the command line we can change the default generator and the seed,
    bash$ GSL_RNG_TYPE="taus" GSL_RNG_SEED=123 ./a.out
    GSL_RNG_TYPE=taus
    GSL_RNG_SEED=123
    generator type: taus
    seed = 123
    first value = 2720986350
The above methods ignore the random number `state' which changes from call to call. It is often useful to be able to save and restore the state. To permit these practices, a few somewhat more advanced functions are supplied. These include:
gsl_rng_print_state prints a hex-dump of the state of the generator to stdout. At the moment its only use is for debugging.
The functions described above make no reference to the actual algorithm used. This is deliberate so that you can switch algorithms without having to change any of your application source code. The library provides a large number of generators of different types, including simulation quality generators, generators provided for compatibility with other libraries and historical generators.
The following generators are recommended for use in simulation. They have extremely long periods, low correlation and pass most statistical tests.
gsl_rng_set reproduces this.
For more information see,
The generator gsl_rng_mt19937 uses the corrected version of the seeding procedure published later by the two authors above. The original seeding procedure suffered from low-order periodicity, but can be used by selecting the alternate generator gsl_rng_mt19937_1998.
The generator ranlxs0 is a second-generation version of the RANLUX algorithm of Lüscher, which produces "luxury random numbers". This generator provides single precision output (24 bits) at three luxury levels ranlxs0, ranlxs1 and ranlxs2. It uses double-precision floating point arithmetic internally and can be significantly faster than the integer version of ranlux, particularly on 64-bit architectures. The period of the generator is about 10^171. The algorithm has mathematically proven properties and can provide truly decorrelated numbers at a known level of randomness. The higher luxury levels provide additional decorrelation between samples as an additional safety margin.
These generators produce double precision output (48 bits) from the RANLXS generator. The library provides two luxury levels ranlxd1 and ranlxd2.
The ranlux generator is an implementation of the original algorithm developed by Lüscher. It uses a lagged-fibonacci-with-skipping algorithm to produce "luxury random numbers". It is a 24-bit generator, originally designed for single-precision IEEE floating point numbers. This implementation is based on integer arithmetic, while the second-generation versions RANLXS and RANLXD described above provide floating-point implementations which will be faster on many platforms.
The period of the generator is about 10^171. The algorithm has mathematically proven properties and it can provide truly decorrelated numbers at a known level of randomness. The default level of decorrelation recommended by Lüscher is provided by gsl_rng_ranlux, while gsl_rng_ranlux389 gives the highest level of randomness, with all 24 bits decorrelated. Both types of generator use 24 words of state per generator.
For more information see,
z_n = (x_n - y_n) mod m_1
where the two underlying generators x_n and y_n are,
x_n = (a_1 x_{n-1} + a_2 x_{n-2} + a_3 x_{n-3}) mod m_1
y_n = (b_1 y_{n-1} + b_2 y_{n-2} + b_3 y_{n-3}) mod m_2
with coefficients a_1 = 0, a_2 = 63308, a_3 = -183326, b_1 = 86098, b_2 = 0, b_3 = -539608, and moduli m_1 = 2^31 - 1 = 2147483647 and m_2 = 2145483479.
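The combined recurrence can be sketched in portable C as follows. The names cmrg_state, mod_pos and cmrg_next are hypothetical and this is not the library's source code; it is a direct transcription of the formulas and coefficients quoted above, using 64-bit intermediates so that the negative coefficients and the subtraction x_n - y_n are reduced into the range [0, m).

```c
#include <stdint.h>

/* Sketch only: cmrg_state, mod_pos and cmrg_next are hypothetical names. */
typedef struct
{
  int64_t x1, x2, x3;   /* x_{n-1}, x_{n-2}, x_{n-3} */
  int64_t y1, y2, y3;   /* y_{n-1}, y_{n-2}, y_{n-3} */
} cmrg_state;

/* Reduce v into [0, m), also when v is negative. */
static int64_t
mod_pos (int64_t v, int64_t m)
{
  int64_t r = v % m;
  return (r < 0) ? r + m : r;
}

/* Advance both component generators by one step and return z_n. */
uint32_t
cmrg_next (cmrg_state *s)
{
  const int64_t m1 = 2147483647, m2 = 2145483479;
  const int64_t a2 = 63308, a3 = -183326;   /* a_1 = 0 */
  const int64_t b1 = 86098, b3 = -539608;   /* b_2 = 0 */

  int64_t x = mod_pos (a2 * s->x2 + a3 * s->x3, m1);
  int64_t y = mod_pos (b1 * s->y1 + b3 * s->y3, m2);

  /* Shift the histories so the new values become x_{n-1} and y_{n-1}. */
  s->x3 = s->x2; s->x2 = s->x1; s->x1 = x;
  s->y3 = s->y2; s->y2 = s->y1; s->y1 = y;

  return (uint32_t) mod_pos (x - y, m1);
}
```

The six history values in cmrg_state correspond to the 6 words of state per generator. How a seed is expanded into that initial state is not covered by this sketch.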
The period of this generator is approximately 2^185 (about 10^56). It uses 6 words of state per generator. For more information see,
x_n = (a_1 x_{n-1} + a_5 x_{n-5}) mod m
with a_1 = 107374182, a_2 = a_3 = a_4 = 0, a_5 = 104480 and m = 2^31 - 1.
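As an illustration, one step of this fifth-order recurrence can be written with a five-element history array. The function name mrg_next and the array layout are assumptions made for this sketch, not the library's implementation.

```c
#include <stdint.h>

/* Sketch only: mrg_next is a hypothetical name.  x[0] holds x_{n-1} and
   x[4] holds x_{n-5}; all entries are assumed to lie in [0, m). */
uint32_t
mrg_next (uint32_t x[5])
{
  const uint64_t m = 2147483647;               /* 2^31 - 1 */
  const uint64_t a1 = 107374182, a5 = 104480;  /* a_2 = a_3 = a_4 = 0 */

  /* 64-bit arithmetic keeps the products from overflowing. */
  uint64_t next = (a1 * x[0] + a5 * x[4]) % m;

  /* Shift the history: the oldest value x_{n-5} drops out. */
  x[4] = x[3]; x[3] = x[2]; x[2] = x[1]; x[1] = x[0];
  x[0] = (uint32_t) next;
  return x[0];
}
```

Only the newest and the oldest of the five stored values enter each step, but all five must be kept because each one becomes the oldest value in turn.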
The period of this generator is about 10^46. It uses 5 words of state per generator. More information can be found in the following paper,
x_n = (s1_n ^^ s2_n ^^ s3_n)
where,
s1_{n+1} = (((s1_n&4294967294)<<12)^^(((s1_n<<13)^^s1_n)>>19))
s2_{n+1} = (((s2_n&4294967288)<< 4)^^(((s2_n<< 2)^^s2_n)>>25))
s3_{n+1} = (((s3_n&4294967280)<<17)^^(((s3_n<< 3)^^s3_n)>>11))
computed modulo 2^32. In the formulas above ^^ denotes "exclusive-or". Note that the algorithm relies on the properties of 32-bit unsigned integers and has been implemented using a bitmask of 0xFFFFFFFF to make it work on 64 bit machines.
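The three update formulas translate almost literally into C when the state is held in uint32_t, since unsigned 32-bit arithmetic is already reduced modulo 2^32. The names taus_state and taus_next are hypothetical, and returning the exclusive-or of the freshly updated components is an indexing choice made for this sketch.

```c
#include <stdint.h>

/* Sketch only: taus_state and taus_next are hypothetical names. */
typedef struct
{
  uint32_t s1, s2, s3;
} taus_state;

/* Advance the three components and return their exclusive-or.
   uint32_t keeps every intermediate result modulo 2^32, so no explicit
   0xFFFFFFFF mask is needed here. */
uint32_t
taus_next (taus_state *t)
{
  t->s1 = ((t->s1 & 4294967294u) << 12) ^ (((t->s1 << 13) ^ t->s1) >> 19);
  t->s2 = ((t->s2 & 4294967288u) << 4) ^ (((t->s2 << 2) ^ t->s2) >> 25);
  t->s3 = ((t->s3 & 4294967280u) << 17) ^ (((t->s3 << 3) ^ t->s3) >> 11);
  return t->s1 ^ t->s2 ^ t->s3;
}
```

On a platform where the state is stored in a wider type such as unsigned long int, the shifts would carry bits above position 31, which is why a mask is needed there.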
The period of this generator is 2^88 (about 10^26). It uses 3 words of state per generator. For more information see,
The gfsr4 generator is like a lagged-fibonacci generator, and produces each number as an xor'd sum of four previous values.
r_n = r_{n-A} ^^ r_{n-B} ^^ r_{n-C} ^^ r_{n-D}
Ziff (ref below) notes that "it is now widely known" that two-tap registers (such as R250, which is described below) have serious flaws, the most obvious one being the three-point correlation that comes from the definition of the generator. Nice mathematical properties can be derived for GFSR's, and numerics bears out the claim that 4-tap GFSR's with appropriately chosen offsets are as random as can be measured, using the author's test.
This implementation uses the values suggested in the example on p392 of Ziff's article: A=471, B=1586, C=6988, D=9689.
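A four-tap recurrence with these offsets can be sketched with a circular history buffer. The names gfsr4_state and gfsr4_next are hypothetical, and the buffer length of 16384 words is an assumption of this sketch (any power of two at least as large as D lets the indices wrap with a bitmask). Filling the first D entries from a seed is left out.

```c
#include <stdint.h>

#define GFSR4_SIZE 16384              /* assumed: a power of two >= D */
#define GFSR4_MASK (GFSR4_SIZE - 1)

/* Sketch only: gfsr4_state and gfsr4_next are hypothetical names. */
typedef struct
{
  uint32_t r[GFSR4_SIZE];   /* circular history of past outputs */
  unsigned int n;           /* index of the most recent output */
} gfsr4_state;

/* Compute r_n = r_{n-A} ^ r_{n-B} ^ r_{n-C} ^ r_{n-D} and store it. */
uint32_t
gfsr4_next (gfsr4_state *g)
{
  const unsigned int A = 471, B = 1586, C = 6988, D = 9689;
  unsigned int n = (g->n + 1) & GFSR4_MASK;

  /* Unsigned subtraction wraps, and the mask maps it back into the buffer. */
  g->r[n] = g->r[(n - A) & GFSR4_MASK] ^ g->r[(n - B) & GFSR4_MASK]
          ^ g->r[(n - C) & GFSR4_MASK] ^ g->r[(n - D) & GFSR4_MASK];
  g->n = n;
  return g->r[n];
}
```

The caller is expected to fill the first 9689 entries of r and set n to 9688 before the first call, so that every tap refers to an initialized word.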
If the offsets are appropriately chosen (such as the ones in this implementation), then the sequence is said to be maximal. I'm not sure what that means, but I would guess that means all states are part of the same cycle, which would mean that the period for this generator is astronomical; it is (2^K)^D \approx 10^{93334} where K=32 is the number of bits in the word, and D is the longest lag. This would also mean that any one random number could easily be zero; i.e. 0 <= r < 2^32.
Ziff doesn't say so, but it seems to me that the bits are completely independent here, so one could use this as an efficient bit generator; each number supplying 32 random bits. The quality of the generated bits depends on the underlying seeding procedure, which may need to be improved in some circumstances.
For more information see,
The standard Unix random number generators rand, random and rand48 are provided as part of GSL. Although these generators are widely available individually, often they aren't all available on the same platform. This makes it difficult to write portable code using them and so we have included the complete set of Unix generators in GSL for convenience. Note that these generators don't produce high-quality randomness and aren't suitable for work requiring accurate statistics. However, if you won't be measuring statistical quantities and just want to introduce some variation into your program then these generators are quite acceptable.
This is the rand() generator. Its sequence is
x_{n+1} = (a x_n + c) mod m
with a = 1103515245, c = 12345 and m = 2^31. The seed specifies the initial value, x_1. The period of this generator is 2^31, and it uses 1 word of storage per generator.
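A single step of this recurrence is short enough to write out in full. The function name rand_lcg_next is hypothetical; the sketch uses a 64-bit intermediate so that the product a x_n cannot overflow before the reduction modulo 2^31.

```c
#include <stdint.h>

/* Sketch only: rand_lcg_next is a hypothetical name.
   One step of x_{n+1} = (a x_n + c) mod m with the constants quoted above. */
uint32_t
rand_lcg_next (uint32_t x)
{
  const uint64_t a = 1103515245, c = 12345;
  const uint64_t m = (uint64_t) 1 << 31;

  return (uint32_t) ((a * x + c) % m);
}
```

Because the whole state is the single value x, feeding each result back into the function walks through the sequence, which is the sense in which the generator uses 1 word of storage.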
These generators implement the random() family of functions, a set of linear feedback shift register generators originally used in BSD Unix. There are several versions of random() in use today: the original BSD version (e.g. on SunOS4), a libc5 version (found on older GNU/Linux systems) and a glibc2 version. Each version uses a different seeding procedure, and thus produces different sequences.
The original BSD routines accepted a variable length buffer for the generator state, with longer buffers providing higher-quality randomness. The random() function implemented algorithms for buffer lengths of 8, 32, 64, 128 and 256 bytes, and the algorithm with the largest length that would fit into the user-supplied buffer was used. To support these algorithms additional generators are available with the following names,
    gsl_rng_random8_bsd
    gsl_rng_random32_bsd
    gsl_rng_random64_bsd
    gsl_rng_random128_bsd
    gsl_rng_random256_bsd
where the numeric suffix indicates the buffer length. The original BSD random function used a 128-byte default buffer and so gsl_rng_random_bsd has been made equivalent to gsl_rng_random128_bsd. Corresponding versions of the libc5 and glibc2 generators are also available, with the names gsl_rng_random8_libc5, gsl_rng_random8_glibc2, etc.
This is the rand48 generator. Its sequence is
x_{n+1} = (a x_n + c) mod m
defined on 48-bit unsigned integers with a = 25214903917, c = 11 and m = 2^48. The seed specifies the upper 32 bits of the initial value, x_1, with the lower 16 bits set to 0x330E. The function gsl_rng_get returns the upper 32 bits from each term of the sequence. This does not have a direct parallel in the original rand48 functions, but forcing the result to type long int reproduces the output of mrand48. The function gsl_rng_uniform uses the full 48 bits of internal state to return the double precision number x_n/m, which is equivalent to the function drand48. Note that some versions of the GNU C Library contained a bug in the mrand48 function which caused it to produce different results (only the lower 16 bits of the return value were set).
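The seeding rule, the 48-bit recurrence and the two ways of reading the state can be sketched with a 64-bit integer. All four function names below are hypothetical and are not the library's or the C library's API; the sketch only restates the arithmetic described above.

```c
#include <stdint.h>

#define RAND48_MASK (((uint64_t) 1 << 48) - 1)   /* reduces modulo 2^48 */

/* Sketch only: all four function names are hypothetical. */

/* The seed fills the upper 32 bits; the lower 16 bits are set to 0x330E. */
uint64_t
rand48_seed (uint32_t seed)
{
  return ((uint64_t) seed << 16) | 0x330E;
}

/* x_{n+1} = (a x_n + c) mod 2^48.  The 64-bit product may wrap modulo
   2^64, which does not change the low 48 bits. */
uint64_t
rand48_next (uint64_t x)
{
  return (25214903917ULL * x + 11) & RAND48_MASK;
}

/* Integer output: the upper 32 bits of the 48-bit state. */
uint32_t
rand48_upper32 (uint64_t x)
{
  return (uint32_t) (x >> 16);
}

/* Floating point output: x / m with m = 2^48, a value in [0, 1). */
double
rand48_uniform (uint64_t x)
{
  return x / 281474976710656.0;
}
```

Reading the upper 32 bits of a freshly seeded state gives back the seed itself, which shows how the seed and the fixed constant 0x330E share the 48-bit word.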
The following generators are provided for compatibility with Numerical Recipes. Note that the original Numerical Recipes functions used single precision while we use double precision. This will lead to minor discrepancies, but only at the level of single-precision rounding error. If necessary you can force the returned values to single precision by storing them in a volatile float, which prevents the value being held in a register with double or extended precision. Apart from this difference the underlying algorithms for the integer part of the generators are the same.
ran0 implements Park and Miller's MINSTD algorithm with a modified seeding procedure.
ran1 implements Park and Miller's MINSTD algorithm with a 32-element Bays-Durham shuffle box.
ran2 implements a L'Ecuyer combined recursive generator with a 32-element Bays-Durham shuffle box.
ran3 implements Knuth's portable subtractive generator.
The generators in this section are provided for compatibility with existing libraries. If you are converting an existing program to use GSL then you can select these generators to check your new implementation against the original one, using the same random number generator. After verifying that your new program reproduces the original results you can then switch to a higher-quality generator.
Note that most of the generators in this section are based on single linear congruence relations, which are the least sophisticated type of generator. In particular, linear congruences have poor properties when used with a non-prime modulus, as several of these routines do (e.g. with a power of two modulus, 2^31 or 2^32). This leads to periodicity in the least significant bits of each number, with only the higher bits having any randomness. Thus if you want to produce a random bitstream it is best to avoid using the least significant bits.
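The periodicity of the low bits is easy to observe directly. The sketch below uses the constants a = 69069, c = 1 and m = 2^32 quoted for one of the generators in this section, and counts how many steps it takes for the lowest k bits to return to their starting value. The function name low_bits_period is hypothetical.

```c
#include <stdint.h>

/* Sketch only: low_bits_period is a hypothetical name.
   Count the steps until the lowest k bits (1 <= k <= 31) of the sequence
   x_{n+1} = (69069 x_n + 1) mod 2^32 return to their starting value. */
unsigned long
low_bits_period (uint32_t seed, unsigned int k)
{
  const uint32_t mask = ((uint32_t) 1 << k) - 1;
  const uint32_t start = seed & mask;
  uint32_t x = seed;
  unsigned long steps = 0;

  do
    {
      x = 69069u * x + 1u;   /* uint32_t arithmetic wraps modulo 2^32 */
      steps++;
    }
  while ((x & mask) != start);

  return steps;
}
```

For these constants the lowest bit simply alternates, and the lowest k bits repeat after 2^k steps whatever the seed, so the lowest 8 bits of each number cycle through only 256 values in a fixed order.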
This is the CRAY random number generator RANF. Its sequence is
x_{n+1} = (a x_n) mod m
defined on 48-bit unsigned integers with a = 44485709377909 and m = 2^48. The seed specifies the lower 32 bits of the initial value, x_1, with the lowest bit set to prevent the seed taking an even value. The upper 16 bits of x_1 are set to 0. A consequence of this procedure is that the pairs of seeds 2 and 3, 4 and 5, etc produce the same sequences.
The generator is compatible with the CRAY MATHLIB routine RANF. It produces double precision floating point numbers which should be identical to those from the original RANF.
There is a subtlety in the implementation of the seeding. The initial state is reversed through one step, by multiplying by the modular inverse of a mod m. This is done for compatibility with the original CRAY implementation.
Note that you can only seed the generator with integers up to 2^32, while the original CRAY implementation uses non-portable wide integers which can cover all 2^48 states of the generator.
The function gsl_rng_get returns the upper 32 bits from each term of the sequence. The function gsl_rng_uniform uses the full 48 bits to return the double precision number x_n/m.
The period of this generator is 2^46.
x_n = x_{n-103} ^^ x_{n-250}
where ^^ denotes "exclusive-or", defined on 32-bit words. The period of this generator is about 2^250 and it uses 250 words of state per generator.
For more information see,
This is the MTH$RANDOM generator. Its sequence is,
x_{n+1} = (a x_n + c) mod m
with a = 69069, c = 1 and m = 2^32. The seed specifies the initial value, x_1. The period of this generator is 2^32 and it uses 1 word of storage per generator.
x_{n+1} = (a x_n) mod m
with a = 1664525 and m = 2^32. The seed specifies the initial value, x_1.
This is the RANDU generator. Its sequence is
x_{n+1} = (a x_n) mod m
with a = 65539 and m = 2^31. The seed specifies the initial value, x_1. The period of this generator is only 2^29. It has become a textbook example of a poor generator.
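One way to see why this generator is considered poor is to note that a = 2^16 + 3, so that a^2 = 2^32 + 6*2^16 + 9, which equals 6a - 9 modulo 2^31. Every three consecutive values therefore satisfy x_{n+2} = 6 x_{n+1} - 9 x_n (mod 2^31). The sketch below checks this relation numerically; the function names are hypothetical.

```c
#include <stdint.h>

/* Sketch only: both function names are hypothetical. */

/* One step of x_{n+1} = (65539 x_n) mod 2^31. */
uint32_t
randu_next (uint32_t x)
{
  return (uint32_t) ((65539ULL * x) % 2147483648ULL);
}

/* Return 1 when x_{n+2} = 6 x_{n+1} - 9 x_n (mod 2^31) holds for the
   three consecutive values starting at x0.  The relation is tested as
   x_{n+2} + 9 x_n = 6 x_{n+1} to avoid negative intermediate values. */
int
randu_triple_is_correlated (uint32_t x0)
{
  const uint64_t m = 2147483648ULL;
  uint64_t x1 = randu_next (x0);
  uint64_t x2 = randu_next ((uint32_t) x1);

  return (x2 + 9 * (uint64_t) x0) % m == (6 * x1) % m;
}
```

The check succeeds for every starting value, which means that triples of successive outputs are tied together by a fixed linear relation with small coefficients rather than being independent.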
x_{n+1} = (a x_n) mod m
with a = 16807 and m = 2^31 - 1 = 2147483647. The seed specifies the initial value, x_1. The period of this generator is about 2^31.
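This recurrence needs care in C because the product a x_n does not fit in 32 bits. A minimal sketch using a 64-bit intermediate is shown below; the function names are hypothetical and the seed is assumed to lie between 1 and m - 1.

```c
#include <stdint.h>

/* Sketch only: both function names are hypothetical. */

/* One step of x_{n+1} = (16807 x_n) mod (2^31 - 1). */
uint32_t
minstd_next (uint32_t x)
{
  return (uint32_t) ((16807ULL * x) % 2147483647ULL);
}

/* Apply the recurrence n times starting from x. */
uint32_t
minstd_advance (uint32_t x, unsigned long n)
{
  while (n-- > 0)
    x = minstd_next (x);
  return x;
}
```

A commonly quoted check for an implementation of this recurrence is that starting from 1 and applying it 10000 times gives 1043618065. A seed of zero must be avoided, since zero maps to itself.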
This generator is used in the IMSL Library (subroutine RNUN) and in MATLAB (the RAND function). It is also sometimes known by the acronym "GGL" (I'm not sure what that stands for).
For more information see,
gsl_rng_uni32. The original source code is available from NETLIB.
t = u_{n-273} + u_{n-607}
u_n = t - floor(t)
The original source code is available from NETLIB. For more information see,
The following table shows the relative performance of a selection of the available random number generators. The simulation quality generators which offer the best performance are taus, gfsr4 and mt19937.
    1754 k ints/sec,    870 k doubles/sec, taus
    1613 k ints/sec,    855 k doubles/sec, gfsr4
    1370 k ints/sec,    769 k doubles/sec, mt19937
     565 k ints/sec,    571 k doubles/sec, ranlxs0
     400 k ints/sec,    405 k doubles/sec, ranlxs1
     490 k ints/sec,    389 k doubles/sec, mrg
     407 k ints/sec,    297 k doubles/sec, ranlux
     243 k ints/sec,    254 k doubles/sec, ranlxd1
     251 k ints/sec,    253 k doubles/sec, ranlxs2
     238 k ints/sec,    215 k doubles/sec, cmrg
     247 k ints/sec,    198 k doubles/sec, ranlux389
     141 k ints/sec,    140 k doubles/sec, ranlxd2

    1852 k ints/sec,    935 k doubles/sec, ran3
     813 k ints/sec,    575 k doubles/sec, ran0
     787 k ints/sec,    476 k doubles/sec, ran1
     379 k ints/sec,    292 k doubles/sec, ran2
The subject of random number generation and testing is reviewed extensively in Knuth's Seminumerical Algorithms.
Further information is available in the review paper written by Pierre L'Ecuyer,
On the World Wide Web, see the pLab home page (http://random.mat.sbg.ac.at/) for a lot of information on the state-of-the-art in random number generation, and for numerous links to various "random" WWW sites.
The source code for the DIEHARD random number generator tests is also available online.
Thanks to Makoto Matsumoto, Takuji Nishimura and Yoshiharu Kurita for making the source code to their generators (MT19937, MM&TN; TT800, MM&YK) available under the GNU General Public License. Thanks to Martin Lüscher for providing notes and source code for the RANLXS and RANLXD generators.