PRAND: A Parallel Random Number Generator

Jason Main and Amit Jain
Boise State university
Boise, Idaho, USA 83725

This material is based upon work supported by the National Science Foundation under Grant No. 0321233. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

The prand library is meant to implement a block parallelization of glibc's standard random function. Please see the download section on how to download and install the library. It works on both 32-bit and x86_64 bit Linux systems.

Once the random()'s internal state has been set by the functions of the prand library, the regular random() function can be called to generate random() numbers. The idea is to break up the numbers so that that the union of all random numbers used by the process is the same as it would be for 1 process generating a contiguous sequence of random numbers. Say that letters represent the return values from successive calls to random(). Then the situation we wish to duplicate is:
The serial process calls random() 50 times receiving the random numbers....

SERIAL: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWX

Each process calls random() 10 times receiving the values...

PROCESS0: abcdefghij
PROCESS1: klmnopqrst
PROCESS2: uvwxyzABCD
PROCESS3: EFGHIJKLMN
PROCESS4: OPQRSTUVWX

Leapfrog parallelization is not currently supported. Neither is independent sequence parallelization. The principle function used for this is called unrankRand(). unrankRand() permutes the state so that each process effectively starts with its random() function so that the numbers it generates correspond to its block of random numbers. It takes a parameter called stride that represents how many random() calls from the current state you want to simulate. In other words the following code snippets are functionally equivalent (although unrankRand() is faster):

    ITERATIVE               
    for(i=0;i<1000000;i++)
        random();

    USING UNRANK
    unrankRand(1000000);

To use this in a parallel program we would use the following format:

SERIAL:
srandom(SEED);    //Consume the whole range of random numbers
for    (i=0;i<n;i++) {
    tmp = random();
    ...

PARALLEL:        //Each process uses a fraction of the total range...
srandom(SEED)
unrankRand( myProcessID * (n/numberOfProcessors) );
for    (i=0;i < (n/numberOfProcessors);i++) {
    tmp = random();
    ...

Some care must be taken if n does not evenly divide by the number of processors. Functions are provided that simplify this. In addition unrankRandom() works on the number of randoms consumed not the number of loop iterations so if each process consumes say 2 random numbers for every loop, then the call to unrank must be adjusted so that n is 2 times the number of loop iterations.
NOTE: unrankRandom()'s runtime is typically less than 10 milliseconds for a gigahertz machine or better. For problems that take less than a second to solve this may be significant overhead. In this case you can set up random() with a linear congruent generator with something like:

    initstate(SEED, malloc(8), 8);    //Force random() to use Linear congruent
    unrankRand(wherever);

unrankRand() for linear congruent is very quick (on the order of 2 microseconds for gigahertz machines). The bigger question would be 'What are you doing that you have to parallelize something that only takes a second?'

How to download and install

There are three ways of installing: