Comments on: Always Watch Out for Number One https://www.damninteresting.com/always-watch-out-for-number-one/ Fascinating true stories from science, history, and psychology since 2005 Mon, 04 Jul 2022 18:23:55 +0000

By: jarvisloop https://www.damninteresting.com/always-watch-out-for-number-one/#comment-74468 Mon, 04 Jul 2022 18:23:55 +0000 https://www.damninteresting.com/?p=210#comment-74468 Some persons cheat on their taxes? Perish the thought!

By: Rabid Burrito Man https://www.damninteresting.com/always-watch-out-for-number-one/#comment-71589 Mon, 06 Jun 2016 23:45:19 +0000 https://www.damninteresting.com/?p=210#comment-71589 Seemed nonsensical until I thought in terms of something no one seems to have mentioned: the range of the data set, the ballpark estimate of your highest and lowest values:

If your numbers seem to be randomly distributed between 10,000 and 100,000 we have approximately 10k numbers that start with each digit. As such we shouldn’t see bias one way or another.

If your numbers are randomly distributed between 10k and 200k, you have around 110k that start with 1 and 10k that start with each of the other digits.
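A quick way to check the two counts above is to enumerate every integer in each range and tally leading digits. This is only a sketch: the helper name `leading_digit_counts` is made up for illustration, and it assumes "randomly distributed" means every integer in the range is equally likely.

```python
from collections import Counter

def leading_digit_counts(lo, hi):
    """Count the leading digit of every integer in the half-open range [lo, hi)."""
    return Counter(int(str(n)[0]) for n in range(lo, hi))

# 10,000 up to (not including) 100,000: every leading digit is equally common.
narrow = leading_digit_counts(10_000, 100_000)

# 10,000 up to (not including) 200,000: the extra 100,000-199,999 block all starts with 1.
wide = leading_digit_counts(10_000, 200_000)

for d in range(1, 10):
    print(d, narrow[d], wide[d])
```

Under that assumption the upper bound of the range, not anything special about the digit 1, is what tilts the tally.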

In practice we’re probably going to see a heavy concentration in some ranges while others become a lot rarer.

I don’t feel inclined to double check file sizes or anything. I will point out that if you have something like a middle school student who’s required to religiously follow a five paragraph essay format, all those essays will have approximately equal file sizes, and as such are likely to share a common first digit. The same is true of pictures loaded off a specific digital camera. Or any number of specialized cases.

We could easily end up with several files that are, say, 500 megabytes or whatever.

Everything depends on what kind of trends are prevalent in the dataset.

By: Museful https://www.damninteresting.com/always-watch-out-for-number-one/#comment-27537 Tue, 11 Jun 2013 09:32:06 +0000 https://www.damninteresting.com/?p=210#comment-27537 It is clever and DI but not so mysterious.

Imagine designing an electronic number display to display some “real life quantity”. If the display currently has 3 digits, you can display from 0 to 999, which is 1000 numbers.

If you now add the ability to display a preceding “1”, you have doubled the range of numbers (0-1999, which is 2000 numbers). If you enable that extra digit to optionally be a “2”, you have only increased your range of numbers by 50%. For 3 it is 33%, for 4 it is 25%, etc. All of this holds no matter how many digits the display had originally. In general, incrementing the upper bound of the first digit yields diminishing returns compared to the range that could already be expressed before the increment.

Since “real life numbers” follow exponential distributions (as opposed to “completely random numbers” which follow uniform distributions over an artificial interval), the percentage by which a digit extends the range is an indication of how likely the digit is to be needed.
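The diminishing returns described above (100%, 50%, 33%, 25%, ...) can be put next to the usual statement of Benford's law, where the share of leading digit d is log10(1 + 1/d). The sketch below is only an illustration of that comparison: the function names are made up here, and the link between "range growth" and digit frequency is this comment's heuristic, not a proof.

```python
import math

def range_growth(d):
    """Fractional growth in displayable range when the leading digit's
    upper bound rises from d - 1 to d (e.g. 0-999 -> 0-1999 for d = 1)."""
    return ((d + 1) - d) / d

def benford(d):
    """Benford's law: expected share of values whose leading digit is d."""
    return math.log10(1 + 1 / d)

for d in range(1, 10):
    print(f"{d}: range grows by {range_growth(d):6.1%}, Benford share {benford(d):6.1%}")
```

Both columns fall as d rises, and the Benford share is exactly the base-10 logarithm of one plus the range growth.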

By: monstermac77 https://www.damninteresting.com/always-watch-out-for-number-one/#comment-26151 Fri, 20 Aug 2010 17:23:17 +0000 https://www.damninteresting.com/?p=210#comment-26151 Has anyone heard of the principle of least action? It’s seen when calculating various natural phenomena, such as the trajectory of light through a plane made of multiple substances. It’s also a common example problem in calculus: if you have two planes, and you travel at 10 meters per second in plane 1 and 20 in plane 2, what is the trajectory that requires the least time? It’s a minimization problem, and is quite simple. You’ll notice that as the velocities in the two planes become more disparate, the distance traveled in the slower plane becomes smaller (the path gets closer to perpendicular to plane 2). And light refracts along a path that allows for the least amount of action. The equation for action is complicated. I don’t remember it, but it takes into account potential and kinetic energy. Anyway, this may be the reason for Benford’s law. Imagine the possible paths of light through a substance being quantified in terms of action, with a possible range of 1,000-10,000. Based on the principle of least action, one might say it is most probable that the light will travel with an action of 1,000-1,999, and least probable with 9,000-9,999, with the probability sliding down through the digits 1-9. This lines up perfectly with Benford’s law.

By: allduerespect88 https://www.damninteresting.com/always-watch-out-for-number-one/#comment-22988 Fri, 03 Oct 2008 12:51:27 +0000 https://www.damninteresting.com/?p=210#comment-22988 A1c: You had ice-cream for dinner?
You’re my hero

By: a1c https://www.damninteresting.com/always-watch-out-for-number-one/#comment-22480 Sun, 10 Aug 2008 01:50:55 +0000 https://www.damninteresting.com/?p=210#comment-22480 Many universities accept term papers only electronically, and share them with other universities to pattern-match for cheaters.

Also, anyone forced to give a number between 1 and 10 often gives 7, with 3 being the 2nd most likely statistic.

In other news, I had 1,234 scoops of ice cream for dinner last night.

By: bc5431 https://www.damninteresting.com/always-watch-out-for-number-one/#comment-20510 Mon, 17 Mar 2008 22:00:32 +0000 https://www.damninteresting.com/?p=210#comment-20510 [quote]rexodus said: “For instance, look at average height (in feet) for 20-year-old males. It’s a bell curve distribution (or close to one) with 5 as the most common first digit. Sure, you could convert this to meters, and the most common digit would be 1. But that’s just a matter of shifting the curve. If you shifted it again into inches, the most common first digit is 6.”[/quote]

I think the problem with your logic here is that your data set is too constrained, which will, as Alan pointed out at the beginning of the article, cause it to fail Benford’s Law. It is too constrained not because the data set is too small, but because the range between the smallest and largest numbers is too small. According to Wikipedia, the shortest fully grown man is 2’11” while the tallest man in recorded history was 8’11”. Thus the tallest is only about three times the height of the shortest, regardless of what unit system you use. Compare this to the range in river lengths or yearly incomes, to go back to the two examples so heavily cited above. To run the full range from starting-digit-1 to starting-digit-9 requires roughly a ninefold increase in value, again regardless of what unit system is used. Benford’s Law requires that the numbers involved traverse this range at least once; otherwise at least one digit will always have a 0% occurrence in the opening position and the number set will immediately fail the test.

So, I think bell curve distributions will still follow Benford’s Law, but only when the distribution covers a sufficiently wide range of numbers, not just a sufficiently large set. That, it appears, is a key distinction within “too constrained” that does not seem to have been made above.
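To see which opening digits a narrow range can never produce, here is a small sketch that lists the leading digits reachable anywhere between a lowest and highest value. The helper name `reachable_leading_digits` is hypothetical, and the heights are simply the 2'11" and 8'11" figures quoted above, converted to inches and feet.

```python
import math

def reachable_leading_digits(lo, hi):
    """Return the leading digits taken by at least one value in [lo, hi], for 0 < lo <= hi."""
    digits = set()
    k_min = math.floor(math.log10(lo))
    k_max = math.floor(math.log10(hi))
    for k in range(k_min, k_max + 1):
        for d in range(1, 10):
            # Values with leading digit d at this magnitude lie in [start, end).
            start, end = d * 10 ** k, (d + 1) * 10 ** k
            if start <= hi and end > lo:
                digits.add(d)
    return digits

shortest_in, tallest_in = 35, 107  # 2'11" and 8'11" in inches
print("inches:", sorted(reachable_leading_digits(shortest_in, tallest_in)))
print("feet:  ", sorted(reachable_leading_digits(shortest_in / 12, tallest_in / 12)))
```

In both units at least one digit is unreachable, which is the "0% occurrence" problem described above.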

By: rexodus https://www.damninteresting.com/always-watch-out-for-number-one/#comment-19737 Tue, 29 Jan 2008 19:32:25 +0000 https://www.damninteresting.com/?p=210#comment-19737 [quote]ahecht said: “The reason it works for something natural like rivers (or mountains), but not for truly random numbers like all the digits from 1 to 100 is that Benford’s Law only applies to data sets that have a normal “bell curve” distribution. In other words, if higher numbers are more rare than lower numbers, you can apply the law.”[/quote]

Actually, this Law does NOT work for bell curve distributions. In a perfect bell curve, the median and the mode are the same value. This means high numbers and low numbers are equally rare, with numbers somewhere in between being most common. For instance, look at average height (in feet) for 20-year-old males. It’s a bell curve distribution (or close to one) with 5 as the most common first digit. Sure, you could convert this to meters, and the most common digit would be 1. But that’s just a matter of shifting the curve. If you shifted it again into inches, the most common first digit is 6.

As long as the curve is bell-shaped, the most common first digit will be whatever the first digit of the median is.
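A small simulation can illustrate the height example. Everything numeric here is an assumption made for the sketch: heights are drawn from a normal distribution with a mean of 69 inches and a standard deviation of 3 inches, which are illustrative values rather than measured data, and the helper names are made up.

```python
import random
from collections import Counter

def first_digit(x):
    """Leading nonzero digit of a positive number, read off its scientific notation."""
    return int(f"{x:e}"[0])

def most_common_first_digit(values):
    return Counter(first_digit(v) for v in values).most_common(1)[0][0]

rng = random.Random(0)          # fixed seed so the run is repeatable
MEAN_IN, SD_IN = 69.0, 3.0      # assumed mean and spread, in inches
heights_in = [rng.gauss(MEAN_IN, SD_IN) for _ in range(10_000)]
heights_ft = [h / 12 for h in heights_in]
heights_m = [h * 0.0254 for h in heights_in]

print("feet:  ", most_common_first_digit(heights_ft))
print("meters:", most_common_first_digit(heights_m))
print("inches:", most_common_first_digit(heights_in))
```

With these assumed parameters the winning digit changes with the unit, following the first digit of the median each time.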

But river length is not a bell curve distribution. Neither are the amounts on a tax form. In both of those distributions, the median and the mode are not the same. If the distribution were unimodal, which it may not be, the mode would not be the same as the median. There would be a large number of rivers a little bit shorter than the median, and a few rivers far longer than the median. The distribution is much more likely to look like this: http://www.gatsby.ucl.ac.uk/~turner/Benford's%20law/Benford%20Tea%20Talk_files/Benfords%20law3.png
This is because short rivers are much more common than long ones. And, similarly, small dollar amounts are much more likely than large ones.

Take a look at that distribution and adjust the scale however you like. The most common data points will begin with 1 in every scale.
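The rescaling claim can be tried on synthetic data. This sketch does not use the linked distribution; it assumes a log-uniform sample spanning six orders of magnitude as a stand-in for "many small values, few large ones", and the scale factors are arbitrary unit-conversion-style multipliers chosen for illustration.

```python
import random
from collections import Counter

def first_digit(x):
    """Leading nonzero digit of a positive number, read off its scientific notation."""
    return int(f"{x:e}"[0])

def digit_shares(values, scale):
    """Share of each leading digit after multiplying every value by scale."""
    counts = Counter(first_digit(v * scale) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed stand-in data: log-uniform between 1 and 1,000,000.
samples = [10 ** rng.uniform(0, 6) for _ in range(20_000)]

SCALES = {"as is": 1.0, "x 0.3048": 0.3048, "x 1.609344": 1.609344, "x 12": 12.0}
for label, scale in SCALES.items():
    shares = digit_shares(samples, scale)
    top = max(shares, key=shares.get)
    print(f"{label:>11}: most common digit {top}, share of 1s {shares[1]:.1%}")
```

For this kind of skewed sample the digit 1 stays on top whichever multiplier is applied, unlike the bell-curve case.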

By: Alchemist https://www.damninteresting.com/always-watch-out-for-number-one/#comment-17935 Tue, 23 Oct 2007 21:12:29 +0000 https://www.damninteresting.com/?p=210#comment-17935 [quote]Alan Bellows said: “Good point.

0.00000164 lightyears = 9,640,739.7 miles… that’s one LONG river.”[/quote]

which is about 1/10 (0.1) of the approximate distance from the Earth to the Sun.

By: mb799 https://www.damninteresting.com/always-watch-out-for-number-one/#comment-16020 Thu, 28 Jun 2007 11:44:23 +0000 https://www.damninteresting.com/?p=210#comment-16020 There are a number of articles on the use of Benford’s law, in addition to software and tutorials on the use of the law. It has extensive applicability in auditing and forensic accounting.
