Why is Benford's Law true? Suppose not: suppose the first digits of some list, say the lengths of rivers in meters, were distributed evenly (which is the intuitive guess). Now let's express the lengths in feet (let's approximate 1 m = 3 ft). If the first digit *in meters* was four or five (or in some cases three or six) then the first digit *in feet* is 1. On the other hand, for the first digit *in feet* to be 9, the first two digits in meters *must* be between 30 and 33. If we started with even distributions of first and second digits *in meters*, it follows that, *in feet*, the lengths begin with 1 about 37% of the time, and with 9 about 3.7% of the time. This makes no sense at all because there's nothing special about meters -- the distribution of first digits ought to be the same no matter which units you measure in.
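
The arithmetic in this thought experiment can be checked with a short sketch. It assumes an idealized list whose leading digits in meters are spread evenly over [1, 10), and it uses the same rough conversion of 3 ft per meter; the function names and the grid size are illustrative choices, not anything from a real dataset. Under those assumptions, roughly 37% of the lengths in feet start with 1 and only about 3.7% start with 9.

```python
from collections import Counter


def first_digit(x: float) -> int:
    """Return the leading nonzero decimal digit of a positive number."""
    if x <= 0:
        raise ValueError("first_digit expects a positive number")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def feet_first_digit_shares(factor: float = 3.0, n: int = 90_000) -> dict[int, float]:
    """Share of each first digit after multiplying evenly spread values by `factor`."""
    counts = Counter()
    for i in range(n):
        # Midpoint grid on [1, 10): every leading digit in meters is equally common.
        meters = 1 + 9 * (i + 0.5) / n
        counts[first_digit(meters * factor)] += 1
    return {d: counts[d] / n for d in range(1, 10)}


shares = feet_first_digit_shares()
for digit, share in shares.items():
    print(digit, f"{share:.3f}")
```

Swapping in the exact factor (`factor=3.28084`) shifts the numbers somewhat, but the first digits in feet remain far from even.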

Benford's distribution is a sort of fractal: it's the *only* distribution of first digits that is invariant under conversion from meters to feet, or pounds to kilograms, or people to families, or geese to gaggles. What's special about Benford's distribution is that the fractional parts of the base-10 *logarithms* are distributed evenly, so the first digit is d with probability log (1 + 1/d): about 30.1% for 1, falling to about 4.6% for 9. Since **log (ab) = log a + log b**, and even distributions (taken modulo 1) stay even whatever you add to them, it follows that a logarithmically even distribution (i.e. Benford's) stays logarithmically even upon rescaling.
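
A quick numerical check of this invariance, as a sketch rather than a proof: build a list whose base-10 logarithms are evenly spaced over one decade, rescale it by a few unit-conversion factors, and compare the first-digit shares with the Benford probabilities log10(1 + 1/d). The helper names and the particular factors (feet per meter, kilograms per pound) are illustrative choices.

```python
import math
from collections import Counter


def first_digit(x: float) -> int:
    """Return the leading nonzero decimal digit of a positive number."""
    if x <= 0:
        raise ValueError("first_digit expects a positive number")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def benford(d: int) -> float:
    """Benford probability that the first digit equals d."""
    return math.log10(1 + 1 / d)


def first_digit_shares(values) -> dict[int, float]:
    """Observed share of each first digit in an iterable of positive numbers."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Logarithmically even list: log10 of the values is evenly spaced on [0, 1).
n = 100_000
log_even = [10 ** ((i + 0.5) / n) for i in range(n)]

# Largest deviation from Benford's probabilities after each rescaling.
worst_gap = {}
for factor in (1.0, 3.28084, 0.45359237):
    shares = first_digit_shares(x * factor for x in log_even)
    worst_gap[factor] = max(abs(shares[d] - benford(d)) for d in range(1, 10))
    print(factor, f"{worst_gap[factor]:.6f}")
```

The gaps stay tiny for every factor, which is the rescaling argument in numerical form: multiplying by a constant only shifts the logarithms, and an even spread shifted around the unit interval is still even.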

It follows that we shouldn't expect Benford's Law to hold whenever we don't expect our descriptions to be scale-invariant: e.g. telephone numbers, *last* digits in random lists of integers, or any variable with a normal distribution, which implies a characteristic size -- e.g. heights.
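
The heights example can be illustrated with simulated data. The sketch below draws heights from a normal distribution with an assumed mean of 170 cm and standard deviation of 10 cm (made-up illustrative parameters, not measurements) and tallies first digits in centimeters and in inches.

```python
import random
from collections import Counter


def first_digit(x: float) -> int:
    """Return the leading nonzero decimal digit of a positive number."""
    if x <= 0:
        raise ValueError("first_digit expects a positive number")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def first_digit_shares(values) -> dict[int, float]:
    """Observed share of each first digit in an iterable of positive numbers."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


random.seed(0)  # fixed seed so the run is repeatable
# Assumed parameters for illustration only: mean 170 cm, standard deviation 10 cm.
heights_cm = [random.gauss(170, 10) for _ in range(100_000)]

cm_shares = first_digit_shares(heights_cm)
inch_shares = first_digit_shares(h / 2.54 for h in heights_cm)

for d in range(1, 10):
    print(d, f"cm: {cm_shares[d]:.3f}", f"in: {inch_shares[d]:.3f}")
```

In centimeters nearly every height starts with 1; in inches almost none do, and most start with 6 or 7. The first-digit distribution depends entirely on the unit, which is exactly what having a characteristic size means.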
