Alias method

In computing, the alias method is a family of efficient algorithms for sampling from a discrete probability distribution, due to A. J. Walker.^[1]^[2] That is, it returns integer values $1 \leq i \leq n$ according to some arbitrary probability distribution $p i$ . The algorithms typically use $O (n log n)$ or $O (n)$ preprocessing time, after which random values can be drawn from the distribution in $O (1)$ time.^[3]

Operation

Internally, the algorithm consults two tables, a probability table $U i$ and an alias table $K i$ (for $1 \leq i \leq n$ ). To generate a random outcome, a fair die is rolled to determine an index into the two tables. Based on the probability stored at that index, a biased coin is then flipped, and the outcome of the flip is used to choose between a result of $i$ and $K i$ .^[4]

More concretely, the algorithm operates as follows:

Generate a uniform random variate $0 \leq x < 1$ .
Let $i = ⌊ nx ⌋ + 1$ and $y = nx + 1 - i$ . (This makes $i$ uniformly distributed on ${1, 2, \dots, n}$ and $y$ uniformly distributed on $[0, 1)$ .)
If $y < U i$ , return $i$ . This is the biased coin flip.
Otherwise, return $K i$ .

An alternative formulation of the probability table, proposed by Marsaglia et. al.^[5] as the “square histogram” method, uses the condition $x < V i$ in the third step (where $V i = (U i + i - 1)/ n$ ) instead of computing $y$ .

Table generation

The distribution may be padded with additional probabilities $p i = 0$ to increase $n$ to a convenient value, such as a power of two.

To generate the table, first initialize $U i = np i$ . While doing this, divide the table entries into three categories:

The “overfull” group, where $U i > 1$ ,
The “underfull” group, where $U i < 1$ and $K i$ has not been initialized, and
The “exactly full” group, where $U i = 1$ or $K i$ has been initialized.

If $U i = 1$ , the corresponding value $K i$ will never be consulted and is unimportant, but a value of $K i = i$ is sensible.

As long as not all table entries are exactly full, repeat the following steps:

Arbitrarily choose an overfull entry $U i > 1$ and an underfull entry $U j < 1$ . (If one of these exists, the other must, as well.)
Allocate the unused space in entry $j$ to outcome $i$ , by setting $K j = i$ .
Remove the allocated space from entry $i$ by changing $U i = U i - (1 - U j) = U i + U j - 1$ .
Entry $j$ is now exactly full.
Assign entry $i$ to the appropriate category based on the new value of $U i$ .

Each iteration moves at least one entry to the “exactly full” category (and the last moves two), so the procedure is guaranteed to terminate after at most $n -1$ iterations. Each iteration can be done in $O (1)$ time, so the table can be set up in $O (n)$ time.

Vose^[3]^:974 points out that floating-point rounding errors may cause the guarantee referred to in step 1 to be violated. If one category empties before the other, the remaining entries may have $U i$ set to 1 with negligible error.

As the lookup procedure is slightly faster if $y < U i$ (because $K i$ does not need to be consulted), one goal during table generation is to maximize the sum of the $U i$ . Doing this optimally turns out to be NP hard,^[5]^:6 but a “Robin Hood” heuristic comes reasonably close: rob from the richest and give to the poorest. That is, at each step choose the largest $U i$ and the smallest $U j$ . Because this requires sorting the $U i$ , it requires $O (n log n)$ time.

Efficiency

Although the alias method is very efficient if generating a uniform deviate is itself fast, there are cases where it is far from optimal in terms of random bit usage. This is because it uses a full-precision random variate $x$ each time, even when only a few random bits are needed.

One case arises when the probabilities are particularly well balanced, so many $U i = 1$ and $K i$ is not needed. Generating $y$ is a waste of time. For example if $p 1 = p 2 = 1 ⁄ 2$ , then a 32-bit random variate $x$ could be used to make 32 choices, but the alias method will only generate one.

Another case arises when the probabilities are strongly unbalanced, so many $U i \approx 0$ . For example if $p 1 = 0.999$ and $p 2 = 0.001$ , then the great majority of the time, only a few random bits are required to determine that case 1 applies.

In such cases, the table method described by Marsaglia et al.^[5]^:1–4 is more efficient.

Literature

Knuth, Art of Computer Programming, Vol 2: Seminumerical Algorithms: Sect. 3.4.1.

Implementations

http://www.keithschwarz.com/darts-dice-coins/ Keith Schwarz: Detailed explanation, numerically stable version of Vose’s algorithm, and link to Java implementation
http://apps.jcns.fz-juelich.de/ransampl Joachim Wuttke: Implementation as a small C library.
http://oroboro.com/non-uniform-random-numbers Rafael Baptista’s Implementation in C++

References

↑ Walker, A. J. (April 1974). "New fast method for generating discrete random numbers with arbitrary frequency distributions". Electronics Letters. 10 (8): 127. doi:10.1049/el:19740097.
↑ Walker, A. J. (September 1977). "An Efficient Method for Generating Discrete Random Variables with General Distributions". ACM Transactions on Mathematical Software. 3 (3): 253. doi:10.1145/355744.355749.
1 2 Vose, Michael D. (September 1991). "A linear algorithm for generating random numbers with a given distribution" (PDF). IEEE Transactions on Software Engineering. 17 (9): 972–975. doi:10.1109/32.92917. Archived from the original (PDF) on 2013-10-29.
↑ Darts, Dice, and Coins: Sampling from a Discrete Distribution. KeithSchwarz.com. Retrieved on 2011-12-27.
1 2 3 Marsaglia, George; Tsang, Wai Wan; Wang, Jingbo (2004-07-12), "Fast Generation of Discrete Random Variables", Journal of Statistical Software, 11 (3): 1–11, retrieved 2012-07-14

This article is issued from Wikipedia - version of the 6/14/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.