How many piano tuners are there in Chicago?
Questions like that were once a staple of the interview process for high-paying corporate jobs, though fortunately, this practice has since been discontinued. They were designed to test how adeptly the interviewee could break down a problem with few, if any, known quantities and formulate a strategy to solve it. Essentially, interviewers wanted to see how well the interviewee could work through a Fermi problem.
Fermi estimations
A Fermi problem, also known as a Fermi estimate, is an estimation problem. The solution to such a problem is a rough, high-level estimate of a quantitative figure based on either data points that are already known or reasonable estimates of those data points. The process of estimating the number of piano tuners in Chicago might involve considering how large the city is, how many pianos there are (this could be its own Fermi problem), how many types of events need a piano, how often pianos need to be tuned, or any number of other potential data points. There are numerous ways to tackle a Fermi problem, all of which should give reasonably accurate estimations if done correctly.
Estimation accuracy
How accurate is a Fermi estimation? If the estimation process is conducted reasonably well, the result will almost certainly be within an order of magnitude of the actual value.
An order of magnitude is a way of roughly describing the scale of a number in terms of powers of a base, usually powers of 10. If two numbers differ by one order of magnitude, one of them is roughly ten times larger than the other. For example, let’s compare the number of people in a restaurant to the number of people at a large music concert. Today, there are 10 people in the restaurant and 10,000 attendees at the concert. This is a difference of three orders of magnitude, as the concert has 1,000 times as many people as the restaurant (10^3 = 1,000).
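To make the comparison concrete, here is a minimal Python sketch that measures the gap between two quantities as a difference of base-10 logarithms. The function name `orders_of_magnitude_apart` is my own, chosen for illustration; the two counts are the restaurant and concert figures from the example.

```python
import math


def orders_of_magnitude_apart(a: float, b: float) -> float:
    """Return how many powers of 10 separate two positive quantities."""
    if a <= 0 or b <= 0:
        raise ValueError("both quantities must be positive")
    return abs(math.log10(a) - math.log10(b))


restaurant_guests = 10       # people in the restaurant
concert_attendees = 10_000   # people at the concert

gap = orders_of_magnitude_apart(restaurant_guests, concert_attendees)
print(f"{gap:.1f} orders of magnitude apart")
```

A result below 1 means the two numbers are within an order of magnitude of each other, which is the bar a Fermi estimate is usually trying to clear.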
Orders of magnitude are useful for comparing high-level differences in sizes between figures, either when exact values are not necessary, or when the differences in magnitude are so large that they totally eclipse any smaller, more precise differences between them. Estimations that fall within the correct order of magnitude are immensely useful for predictive purposes.
Fermi estimation in action
I’m currently writing a series of short stories centered on two modern-day private investigators who, unlike their predecessors, have the benefit of modern science and methodology to aid their investigations. In one of the stories, one of the investigators uses the process of Fermi estimation to help plan a course of action for finding a missing person. Consider the following problem, a simplified version of the Fermi problem the protagonist must solve:
How many restaurants are there in Woodbridge, NJ?
Let’s say you have access to the internet to find basic facts about Woodbridge, NJ, but you can’t find a reliable figure on the number of restaurants in Woodbridge. We’ll make our estimation using three points of data:
- Population: 100,450
- Land Size: 23.26 square miles
- Population Density: 4,319 people per square mile (this figure was derived from the first two data points)
Now, let’s make a series of plausible assumptions that should be, on average, close to being correct. These assumptions describe the “average restaurant.” Most restaurants should roughly match them, and those that deviate toward one extreme or the other should still average out to these values if our assumptions are close enough to the truth:
- The average restaurant can fit 20 people comfortably.
- On average, 40% of its capacity will be filled at any given time.
- It will be open an average of 12 hours a day.
- The average customer visit will last 40 minutes.
- People order from a restaurant an average of 3 times every 2 weeks.
- The number of restaurants scales with the population density.
  - Let’s assume that a density of 5,000 people per square mile balances out supply and demand for restaurants. This density has enough people to create demand for service without being too crowded or burdensome.
  - 4,319 / 5,000 = 0.86.
Now, we’ll do some quick calculations based on our data points and assumptions:
- The average restaurant will have 144 people a day.
  - 40% * 20 * (12 * 60 / 40) = 144
- There will be 21,525 people eating in restaurants on any given day.
  - 100,450 * 3/14 = 21,525
- We can conclude that there are roughly 129 restaurants in Woodbridge.
  - 21,525 / 144 * 0.86 ≈ 129
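The same arithmetic can be written as a short Python sketch, which makes it easy to change one assumption and see how the estimate moves. The function and parameter names are mine, chosen for illustration; the default values are the data points and assumptions listed above, and the density factor is rounded to two decimals to match the 0.86 used in the text.

```python
def estimate_restaurants(
    population: int = 100_450,
    land_area_sq_mi: float = 23.26,
    seats: int = 20,                            # assumed comfortable capacity
    fill_rate: float = 0.40,                    # assumed average share of seats filled
    hours_open: float = 12,                     # assumed opening hours per day
    visit_minutes: float = 40,                  # assumed length of one visit
    visits_per_person_per_day: float = 3 / 14,  # 3 orders every 2 weeks
    balanced_density: float = 5_000,            # assumed "balanced" people per sq mi
) -> dict:
    """Reproduce the restaurant estimate step by step."""
    density = population / land_area_sq_mi
    density_factor = round(density / balanced_density, 2)

    seatings_per_day = hours_open * 60 / visit_minutes
    customers_per_restaurant = fill_rate * seats * seatings_per_day
    diners_per_day = population * visits_per_person_per_day

    restaurants = diners_per_day / customers_per_restaurant * density_factor
    return {
        "density": density,
        "density_factor": density_factor,
        "customers_per_restaurant": customers_per_restaurant,
        "diners_per_day": diners_per_day,
        "restaurants": restaurants,
    }


result = estimate_restaurants()
print(f"Customers per restaurant per day: {result['customers_per_restaurant']:.0f}")
print(f"Diners per day: {result['diners_per_day']:,.0f}")
print(f"Estimated restaurants: {result['restaurants']:.0f}")
```

Calling `estimate_restaurants(seats=30)`, for example, shows how sensitive the final figure is to a single assumption, which is a quick way to see which guesses deserve the most scrutiny.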
Intuitively, 129 sounds like a reasonable number. I did some quick searches online to see how close this approximation is to the real value, but I wasn’t able to find a definitive answer. Tripadvisor shows 91 results, Google Maps shows 111, and other sources show other numbers, all of which are slightly lower than the estimate of 129. Still, the estimate is well within an order of magnitude of every figure I found.
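As a rough check, the sketch below compares the estimate against the two listing counts quoted above. Keep in mind that these are search-result counts, not verified totals, so this only shows how far the estimate sits from each source. The variable names are my own.

```python
import math

estimate = 129
# Result counts quoted in the text; not a definitive count of restaurants.
listing_counts = {"Tripadvisor": 91, "Google Maps": 111}

gaps = {}
for source, count in listing_counts.items():
    ratio = estimate / count
    gaps[source] = abs(math.log10(ratio))  # gap in orders of magnitude
    print(f"{source}: estimate is {ratio:.2f}x the listed count "
          f"({gaps[source]:.2f} orders of magnitude apart)")
```

Both gaps come out far below 1, meaning the estimate is much closer to these figures than a full factor of ten.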
How are Fermi estimates so accurate?
In the example above, we used only three data points, one of which was derived from the other two. However, I selected those particular data points with care. I reasoned that, in a capitalistic society, supply and demand tend to catch up with each other: if demand increases, such as through a surge in population in this example, then supply will catch up in the form of new restaurants opening. If supply increases, such as through construction projects that add more food, sights, and amenities to a town or city, more people will move there to take advantage of them. This holds on average, at least, barring unusual circumstances. If that presumption is correct, then the data points we picked essentially have the information we need encoded within them, allowing us to take advantage of the real-world forces that drive the restaurant industry and let them do the heavy lifting.
In general, Fermi estimations are accurate because:
- Errors cancel out: when you make multiple assumptions that meet a certain level of reasonableness, the errors within those assumptions will tend to cancel each other out.
- Experience helps: if you have prior experience with that which you’re trying to estimate, that experience will drive your estimations and will make them much more accurate.
- Known data helps: the more you can use readily available data, the more accurate your estimate will be. If you don’t have data available, close-enough approximations will have nearly the same positive impact.
- Large numbers eclipse small differences: when dealing with large numbers, the impact of individual errors diminishes.
- Data points encode other data points: in the real world, data points aren’t just dots on a screen; they’re the results of natural and artificial forces at work in a complex, interconnected system. Using any data point implicitly brings the forces that drive it into the estimation.
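The first point in the list can be illustrated with a small simulation. This is only a sketch under assumptions of my own: every guess is off by a random factor of up to 2 in either direction, the errors are independent, and overestimates are as likely as underestimates. Under those conditions, the typical error of the multiplied-out estimate grows much more slowly than the worst case, in which every guess is wrong in the same direction.

```python
import math
import random
import statistics


def typical_combined_error(n_factors: int, max_factor: float = 2.0,
                           trials: int = 10_000, seed: int = 0) -> float:
    """Median multiplicative error when n_factors noisy guesses are multiplied."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    log_max = math.log10(max_factor)
    log_errors = []
    for _ in range(trials):
        # Each guess is off by a factor drawn log-uniformly
        # between 1/max_factor and max_factor.
        total = sum(rng.uniform(-log_max, log_max) for _ in range(n_factors))
        log_errors.append(abs(total))
    return 10 ** statistics.median(log_errors)


for n in (1, 3, 6):
    typical = typical_combined_error(n)
    worst = 2.0 ** n  # every guess off by 2x in the same direction
    print(f"{n} assumptions: typical error {typical:.2f}x, worst case {worst:.0f}x")
```

If the guesses share a bias, for instance if every one of them is optimistic, the errors compound instead of cancelling, so the result depends on the independence assumption stated above.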
When should we use Fermi estimation?
Fermi estimation is a great mechanism for sketching out solutions when you’re conducting high-level research or planning, creating a hypothesis before you’ve collected substantial data, making decisions under time pressure, and much more. It can give you a surprising level of insight into a field from a disproportionately small amount of input data and computational precision. Fermi estimation is not just a mathematical exercise; it’s a powerful tool for navigating uncharted territory and potentially gaining deep insights in areas of low visibility.