An interesting phenomenon of naturally occurring numbers is that the leading digit ‘1’ occurs with surprising frequency: about 30% of the time. This is known as Benford’s Law and is discussed in a number of places (Wikipedia, Wolfram, Cut-the-Knot, NY Times). Statisticians can use Benford’s Law to try to detect fabricated data, which people often generate with a simple Uniform(0,1) function such as the rand() found in so many programming languages.

What I wanted to do was generate random numbers that complied with Benford’s Law. Impatient? Generate some random Benford numbers now.


You mean, “Why am I trying to cheat more effectively?” No, but if I am trying to generate sample datasets for pedagogical purposes, I would like to use the most realistic fake numbers that I can.


My script generates one digit at a time, and the likelihood of a particular digit 0..9 occurring depends on its place in the number. For example, in generating a four-digit integer, the first digit will be a ‘1’ about 30% of the time, but the second digit will be a ‘1’ only about 11% of the time. After the second digit, the digits occur (in the script) with equal probability.

I use the following table as the basis for my calculations:

Digit   First Place   Second Place
0       0             0.1197
1       0.3010        0.1139
2       0.1761        0.1088
3       0.1249        0.1043
4       0.0969        0.1003
5       0.0792        0.0967
6       0.0669        0.0934
7       0.0580        0.0904
8       0.0512        0.0876
9       0.0458        0.0850

Benford’s Law Probabilities (source: Simon Newcomb)
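The values in the table follow from the standard closed forms for Benford’s Law: a first digit d occurs with probability log10(1 + 1/d), and a second digit d occurs with probability equal to the sum, over first digits k = 1..9, of log10(1 + 1/(10k + d)). The Python sketch below recomputes both columns as a sanity check. It is only an illustration and is not taken from the PHP script; the function names are made up for the example.

```python
import math

def first_digit_prob(d):
    """Probability that the leading digit is d, for d in 1..9."""
    return math.log10(1 + 1 / d)

def second_digit_prob(d):
    """Probability that the second digit is d, for d in 0..9."""
    # Sum over every possible first digit k.
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))

if __name__ == "__main__":
    print("Digit  First   Second")
    for d in range(10):
        # A leading digit is never zero.
        first = first_digit_prob(d) if d > 0 else 0.0
        print(f"{d:>5}  {first:.4f}  {second_digit_prob(d):.4f}")
```

Rounded to four places, the output should agree with the table.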

The simplest case is when I am generating a fixed number of digits. I know that the first digit is never a zero, so I can use the tables exclusively.
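As a rough illustration of the fixed-length case, here is a minimal Python sketch (the hosted script is PHP, and this is not its source). It assumes a format string in which each X stands for a digit, as with the script's 'format' option; the names draw_digit and benford_format are hypothetical.

```python
import random

# Probabilities from the table above, indexed by digit 0..9.
FIRST_PLACE = [0.0, 0.3010, 0.1761, 0.1249, 0.0969,
               0.0792, 0.0669, 0.0580, 0.0512, 0.0458]
SECOND_PLACE = [0.1197, 0.1139, 0.1088, 0.1043, 0.1003,
                0.0967, 0.0934, 0.0904, 0.0876, 0.0850]
DIGITS = list(range(10))

def draw_digit(place, rng=random):
    """Draw one digit for the given zero-based place in the number."""
    if place == 0:
        return rng.choices(DIGITS, weights=FIRST_PLACE)[0]
    if place == 1:
        return rng.choices(DIGITS, weights=SECOND_PLACE)[0]
    return rng.randrange(10)  # third place onward: all digits equally likely

def benford_format(fmt, rng=random):
    """Replace each X in fmt with a digit; echo every other character."""
    out = []
    place = 0
    for ch in fmt:
        if ch == "X":
            out.append(str(draw_digit(place, rng)))
            place += 1
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    for _ in range(5):
        print(benford_format("X.XXX"))
```

Because the first-place weight for zero is 0, the leading digit can never be a zero, which is why the table alone is enough here.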

In the case where I want to generate all integers up to a certain point, I have to be a bit more sneaky. Suppose I want to generate integers in the range [1..35]. I will begin by generating a digit, say, 4. I check to see if 4 is the largest number ≤ 35 that I can generate that starts with a 4, and sure enough 4×10=40 is greater than 35, so I stop there. Voilà: a single-digit number.

Suppose that in generating integers in the range [1..35], I first generate a 2. It is possible that I could generate a second digit and end up with, say, 27, so the above test will not suffice. Instead, I compute the probability that a uniformly distributed integer in [1..35] has a single digit (9 out of 35) and draw a random number: if the draw falls within that probability, I simply return the value 2 and leave it at that. Otherwise I go on to generate a second digit.
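The two checks described above can be sketched in Python as follows. This is a hedged illustration, not the PHP script itself. Two parts are assumptions that the text does not spell out: the stopping probability is generalized past one digit as (count of numbers with exactly this many digits) / (count of numbers in [1..upto] with at least this many digits), which gives 9/35 in the example, and a number that overshoots the limit (such as 37 when the limit is 35) is discarded and redrawn. The name benford_upto is hypothetical.

```python
import random

# Probabilities from the table above, indexed by digit 0..9.
FIRST_PLACE = [0.0, 0.3010, 0.1761, 0.1249, 0.0969,
               0.0792, 0.0669, 0.0580, 0.0512, 0.0458]
SECOND_PLACE = [0.1197, 0.1139, 0.1088, 0.1043, 0.1003,
                0.0967, 0.0934, 0.0904, 0.0876, 0.0850]
DIGITS = list(range(10))

def draw_digit(place, rng):
    """Draw one digit for the given zero-based place in the number."""
    if place == 0:
        return rng.choices(DIGITS, weights=FIRST_PLACE)[0]
    if place == 1:
        return rng.choices(DIGITS, weights=SECOND_PLACE)[0]
    return rng.randrange(10)

def benford_upto(upto, rng=random):
    """Return an integer in [1..upto], built one digit at a time."""
    if upto < 1:
        raise ValueError("upto must be at least 1")
    while True:
        n = draw_digit(0, rng)
        length = 1  # digits generated so far
        while n <= upto:
            if n * 10 > upto:
                return n  # no longer number with this prefix fits
            # Chance that a uniform integer with at least `length`
            # digits has exactly `length` digits (9/35 in the example).
            exactly = 9 * 10 ** (length - 1)
            at_least = upto - 10 ** (length - 1) + 1
            if rng.random() < exactly / at_least:
                return n
            n = n * 10 + draw_digit(length, rng)
            length += 1
        # Assumption: an overshoot (n > upto) is discarded and redrawn.

if __name__ == "__main__":
    print([benford_upto(35) for _ in range(10)])
```

Redrawing on an overshoot keeps every result inside the range, at the cost of slightly reshaping the digit frequencies near the upper limit.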

The Script

I am hosting the script on my SourceForge pages. I had started with a JavaScript version, but I thought a PHP-based script would be more useful.

PURPOSE: Generates random numbers that comply with Benford's Law.

 help         Display this help message (default behavior).

 source       Echoes the source code for this script.

 count        The number of numbers to generate (default is 100)
              ex: .../benford.php?count=200

 format       Instead of upto generate numbers with the given
              format, where X signifies a digit and any other
              character is simply echoed back.
              ex: .../benford.php?format=X.XXX

 upto         Generate numbers from 1 to this value [1..upto]
              instead of fixed length numbers, as with 'format'.
              ex: .../benford.php?upto=150

 includeZero  When used with upto the number zero will be
              included in the random numbers [0..upto].

LICENSE: This code is released as Public Domain.
AUTHOR: Robert Harder, rob _
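The options above are passed as ordinary query-string parameters. As a small sketch, the Python below assembles such a request URL. The base address is a placeholder (the script's real location is not given here), and benford_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder address, NOT the script's real location.
BASE_URL = "http://example.com/benford.php"

def benford_url(**params):
    """Build a request URL from option names such as count, upto, format."""
    if not params:
        return BASE_URL  # no options: the script shows its help message
    return f"{BASE_URL}?{urlencode(params)}"

if __name__ == "__main__":
    print(benford_url(count=200, upto=150))
    print(benford_url(format="X.XXX"))
```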


To generate random house numbers for fake addresses, try generating numbers from 1 to 9999 with upto=9999 (1-, 2-, 3-, and 4-digit house numbers).

To generate random car prices, try