If this is your first time reading, please check out the overview for Poor Ash’s Almanack, a free, vertically-integrated resource including a latticework of mental models, reviews/notes/analysis on books, guided learning journeys, and more.
Nonlinearity Mental Model: Executive Summary
If you only have three minutes, this introductory section will get you up to speed on the nonlinearity mental model.
The concept in one sentence: our experience of the world is generally linear (i.e. occurring in straight lines), but many important phenomena are nonlinear.
Key takeaways/applications: An understanding of some of the most important forms of nonlinearity – critical thresholds, dose-dependency, exponential growth / compounding, and power laws – can help us make more appropriate decisions.
Four brief examples of nonlinearity:
Give this monkey what she wants! Roughly 96% of the human genome is shared with chimpanzees, but the remaining 4% – as well as differences in gene expression – create profoundly nonlinear differences in our abilities. In “The Up Side of Down” (UpD review + notes), for example, Megan McArdle explores how chimpanzees are unable to cooperate at scale.
Similarly, as explored in books like Sam Kean’s “The Violinist’s Thumb” (TVT review + notes), the mutation or deletion of merely one or two nucleotides (“letters”) of the three billion total in our genome can result in profound disabilities.
Come on in, the water’s fine… or fatal, take your pick. Severely nonlinear situations can make feedback difficult to comprehend. In “Seeking Wisdom” (SW review + notes), Peter Bevelin provides the example of the farmer who feeds a chicken every day until, one day, he cuts its head off. It’s often difficult for us to comprehend the magnitude of forces outside our artificially controlled environments, as Laurence Gonzales shows in “Deep Survival” (DpSv review + notes) using examples as varied as the physics of falling climbers and the undertow on Hawaii beaches.
Two is a treat; twelve is pledge hazing. One extremely common mental error among even intelligent, highly educated people is ignoring dose-dependency – assuming, in other words, that if some is good, more is better. Jordan Ellenberg riffs on this hilariously (“how Swedish is too Swedish?”) in “How Not To Be Wrong” (HNW review + notes), and examples pop up everywhere. For example, a belief in agency (free will / willpower) is one of the most profoundly adaptive and success-predicting mindsets around, but as I discuss in that mental model, it can be taken way too far – extreme willpower, or “grit,” is vastly overrated and usually a sign that something’s going very wrong, not right.
If this sounds interesting/applicable in your life, keep reading for deeper understanding and unexpected applications.
However, if this doesn’t sound like something you need to learn right now, no worries! There’s plenty of other content on Poor Ash’s Almanack that might suit your needs. Instead, consider checking out our discussion of the storytelling, association bias, or empathy mental models, or our reviews of great books like “The Signal and the Noise” (SigN review + notes), “Internal Time” (IntTm review + notes), or “10% Happier” (10H review + notes).
A Deeper Look At The Nonlinearity Mental Model:
“When you’re starting a company it never goes at the pace you want or the pace you expect.
You imagine everything to be linear… you start, you build it, and you think everyone’s going to care. But no one cares, not even your friends.”
I love this quote because it highlights the challenge of nonlinearity interacting with our schema. Today, it’s obvious that AirBnb is a phenomenally successful business that *everyone* cares about, but early on, it was not at all obvious. Similar lessons can be found in the Starbucks origin story – Howard Schultz’s “Pour Your Heart Into It” (PYH review + notes).
Now, that’s not to say that AirBnb was guaranteed to work; “The Upstarts” does a good job of overviewing the luck and path-dependency elements by highlighting how many competitors with similar business models ended up going nowhere. Nonetheless, the example illustrates the need to incorporate a thorough understanding of the various forms of nonlinearity into our latticework of mental models.
The Upstarts reveals how even seasoned, highly respected venture capitalists passed on AirBnb. Fred Wilson of Union Square Ventures noted:
“We made the classic mistake that all investors make… we focused too much on what they were doing at the time and not enough on what they could do, would do, and did do.”
Indeed, AirBnb almost didn’t get into Y Combinator. During their interview, as the founders described the then-novel home-sharing concept, Paul Graham asked them:
“People are actually doing this? Why? What’s wrong with them?”
When we’re looking at a website that feels like it’s held together with fishing twine and duct tape, or nurturing into existence a coffee shop that only six people visited yesterday (one of whom was our mom), it can be very hard to visualize the requirements – and challenges – that would unfold if it scaled up.
This challenge of visualization is one of the reasons that, as we’ll examine, even highly intelligent, educated people often fail to save enough for retirement: even if you mathematically understand compound interest, it is difficult to apply it intuitively in everyday life, which often looks pretty linear.
Nonlinearity pops up all over the place in business and life. As explored in Dr. Matthew Walker’s “Why We Sleep” (Sleep review + notes), the difference in health and productivity between a full 8+ hours of sleep and 7 hours is not just 10 – 15%, thanks to compounding. The same applies to other situations: for example, Foundation Principle #1 at The Container Store (as discussed in Kip Tindell’s “Uncontainable” – UCT review + notes) is:
“1 Great Person = 3 Good People.”
It won’t surprise you to learn that The Container Store invests an insane amount of employee time in training, an activity with a highly nonlinear payoff.
Relative to most mental models, nonlinearity bears a bit more direct conceptual exploration, so I’ll depart from the usual format here and focus on various forms of nonlinearity. You’ll find interactions between these forms of nonlinearity and the other mental models in the latticework scattered all over the site… I’ll provide some links at the end.
Nonlinearity Type 1: Critical Thresholds / Phase Changes
You may remember from chemistry class – I know, I know, you may not want to remember anything from chemistry class, but you do – that matter typically occurs in three “phases” – solid, liquid, and gas. (There are technically more; for example, there’s plasma, and there are actually ~seventeen distinct forms of solid ice, but we’ll ignore that for now.)
This is the easiest visualization of the idea of critical thresholds: sometimes the world doesn’t work in smooth curves. It is difficult, for example, to extrapolate anything about the behavior of steam from the behavior of ice. Here is how water behaves at various temperatures:
| Temperature | Phase | Behavior |
| --- | --- | --- |
| below 32 F / 0 C | solid (ice) | Is a very good dog. Stays exactly where you tell it to. Does not move. There was this weird rumor about callously sinking an entire ship, but she’s a sweet girl and I know she would never do such a thing. Kind of chilly to cuddle, though… |
| 32 F – 212 F / 0 C – 100 C | liquid (water) | Is a good dog, mostly. Finds the lowest point of the container you put it in and chills out there (unless there’s a local vs. global optimization hill-climbing problem), but doesn’t try to escape the container. May cause damage to furniture and flooring if left alone for too long. Is occasionally featured on science-fair petitions to ban dangerous chemicals. |
| above 212 F / 100 C | gas (steam) | IS A VERY BAD DOG. Acts unruly and disruptive, “manspreading” to the extreme, filling any container you put it in and escaping your yard no matter how high your fence is. |
These phase changes are an example of critical thresholds, in which you have dramatically different, step-change type behavior once you cross a certain point. For example, in “Deep Survival” (DpSv review + notes), Laurence Gonzales discusses how 35 degrees is a “critical angle” for avalanches to occur: on steeper slopes, snow slides down rather than consolidating; on shallower ones, the snow remains stable.
The most famous example of a critical threshold, of course, is the critical mass required for a nuclear chain reaction. As is vividly discussed on pages 439 – 440 of Richard Rhodes’ “The Making of the Atomic Bomb” (TMAB review + notes), the line between a chain reaction occurring and not occurring was incredibly fine: the control rod was pulled a mere 12 inches out of what was, up until that point, a ~million-pound pile of uninteresting graphite, uranium oxide, and uranium metal.
Critical thresholds translate to business and life as well. For a business, for example, breakeven profitability – or debt covenant coverage – can be critical thresholds above or below which behavior is very different.
The same can apply to physical systems: Jonathan Waldman discusses in “Rust: The Longest War” (RUST review + notes) how the Trans-Alaska pipeline has to be kept above a certain temperature: if it cools below that threshold, the hydrocarbons inside will gel – permanently – and it’s good-night for the pipeline.
Similarly, Geoffrey West’s “Scale” (SCALE review + notes), which we’ll revisit in more depth later, brings up the “Dunbar Number” – the idea that, thanks to the limits of human memory and other cognitive functions, we’re only able to maintain roughly 150 stable relationships at any given time.
The exact number is less important than the fact that, generally, there is some critical threshold at which behavior changes dramatically: in small tribes or groups, social proof and reciprocity bias can allow socialistic behavior to work well. In larger groups, when nobody can keep track of everyone else face to face, incentives-driven local vs. global optimization problems start to pop up.
This is one reason why most entrepreneur stories, such as the aforementioned “ Uncontainable” ( UCT review + notes), include a phase where the company has to adjust to new challenges because management policies and cultural practices that were efficient and effective at 20, 50, and 100 employees start to break down at the “critical threshold” where everyone no longer knows everyone else’s name.
Critical thresholds have also been used to explain violent and antisocial behavior (here’s a good, albeit perhaps emotionally challenging to read, long-form exploration in The New Yorker). This concept is often expressed with the metaphor of riots: each person has their own “riot threshold” at which they’ll join in a riot – i.e., “I’ll join in a riot if X other people are rioting.”
So, in a group of 10 people, if we assume riot thresholds of 0 – 9 sequentially, then we’ll have a 10-person riot by the end of the night. But if we simply remove the person with a “0” threshold – and, perhaps, a “1” threshold for margin of safety – then nobody will start rioting, because there’s nobody there to start the riot.
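This threshold cascade is easy to simulate. Here’s a minimal sketch (my own illustration, in the spirit of Granovetter-style threshold models, not code from the New Yorker piece):

```python
def riot_size(thresholds):
    """Given each person's threshold ("I'll join if at least this many
    others are already rioting"), return how many end up rioting."""
    rioting = 0
    while True:
        # Everyone whose threshold is met by the current crowd joins in.
        joined = sum(1 for t in thresholds if t <= rioting)
        if joined == rioting:   # no one new joined: the crowd is stable
            return rioting
        rioting = joined

print(riot_size(list(range(10))))     # thresholds 0-9: prints 10 (full riot)
print(riot_size(list(range(2, 10))))  # drop the 0 and 1 thresholds: prints 0
```

Removing just the one or two lowest-threshold individuals flips the outcome from a full riot to no riot at all – a step-change driven by the distribution of thresholds, not the average.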
Application/impact: critical thresholds / phase changes mean that extrapolating from a sample size on one side of the critical threshold may yield woefully inadequate results; similarly, behavior that is adaptive on one side of a critical threshold may be maladaptive on the other side. Be on the lookout for situations where critical thresholds exist.
Dose-Dependency: Nonlinearity x Critical Threshold (x Margin of Safety)
“The curve of a missile’s flight is emphatically not a line; it’s a parabola. [But] just like Archimedes’s circle, it looks like a line close up, [so] the linear regression will do a great job telling you where the missile is five seconds after [you last tracked it].
But an hour later? Forget it. Your model says the missile is in the lower stratosphere, when, in fact, it is probably approaching your house.
[…] You can do linear regression without thinking about whether the phenomenon you’re modeling is actually close to linear. But you shouldn’t […] the results can be gruesome.”
I slightly rearranged that quote from Jordan Ellenberg’s “How Not To Be Wrong” (HNW review + notes) to make it clearer in this context. Hopefully it illustrates the concept of dose-dependency: in the real world, most trees don’t grow to the sky, and most missiles eventually fall onto your house – especially if you’re a Colorado fourth-grader named Tweek.
Dose-dependency is thus an important type of nonlinearity. Ellenberg, in his fantastic book on mathematical thinking, provides other examples along the lines of the missile-arc trajectory. This sort of inverted U – where some is better than none, but more is not better than some – pops up all the time.
Let’s see how it interacts with critical thresholds in the context of medicine. Say you have a gnarly headache that’s making you miserable, so you decide to take some Tylenol (acetaminophen, also known as paracetamol). How many should you take if you just want your headache to go away?
Well, the easy answer is “at least a couple” – at first, there’s probably a linear-ish relationship between your Tylenol dosage and your pain level, so if we’re talking about ~300 mg pills, two is twice as good as one.
But three (900 mg) may not be thrice as good, and four (1200 mg) may not be any better than three. Here the curve starts to level off.
Why? Pain is a multicausal phenomenon, and Tylenol has only a limited mechanism of action, blocking certain channels of pain. Past a certain point it’s doing all it can, and (at least in my experience when I got my wisdom teeth pulled) that channel of pain is no longer the bottleneck – so taking more Tylenol yields no additional benefit; your pain simply settles at a lower, hopefully tolerable level. It’s a bit like caffeine: at some point, no amount of additional caffeine will keep you awake; it’ll just give you nasty side effects.
If you keep taking more Tylenol, well, guess what, the direction of the line changes again, going pretty much straight vertical: YOUR LIVER FAILS. Tylenol is generally considered very safe below a certain threshold, but above 3,000 mg per day (merely six extra-strength 500 mg pills), you risk irreversible liver damage. It’s a bit like contracting dysentery on the Oregon Trail: one minute you’re merrily massacring bison, the next you’re dying in agony.
The dose-dependency on this end is even more important, almost approaching a critical threshold: it’s not clear exactly where your Tylenol dosage will go from “perfectly safe” to “very dangerous and maybe even fatal,” but it happens pretty quickly and is in fact one of the leading causes of poisonings seen by hospitals.
This, again, is multicausal: there is a very low margin of safety between the maximum “safe” dose and the minimum “unsafe” dose, and acetaminophen is often an ingredient in combination medicines. My dad, for example, turns to Theraflu when he has a cold, and Theraflu (at least the type he takes) contains a significant amount of acetaminophen – so don’t mix Theraflu and Tylenol. Other pain pills, such as Percocet, contain both an opioid and acetaminophen.
This is one of the reasons I prefer Advil (ibuprofen) on the rare occasion I need a pain pill – it’s much harder to overdose on. I am not a doctor and this is not medical advice, but I believe you’d need to take something like a hundred standard 200 mg ibuprofen pills before you really had to worry. This, of course, isn’t to say you should exceed the daily recommended maximum (usually about 1,200 mg over the counter) – but there’s a lot more margin of safety between the amount you’d want to take, the maximum you’re recommended to take (and are likely to take if you’re not paying attention), and the amount you can take without serious harm. Considering how common both medicines are, ibuprofen toxicity is comparatively exceedingly rare.
(The other reason I prefer Advil is that it’s a general anti-inflammatory, so it treats sources of pain that Tylenol doesn’t and can’t. There are circumstances, however, where Tylenol is preferable – NSAIDs are generally viewed as blood thinners, for example.)
“Two Jelly Donuts are an indulgent breakfast.
Twelve Jelly Donuts is fraternity pledge hazing.”
Dose-dependency shows up in our psychology, too. Many people with a high “need for cognition” and desire for intellectual stimulation are drawn to investing as a career, and some of this is definitely necessary and valuable.
Too much, however, can be bad: a lot of investors fall into the trap of solving hard problems for the sake of solving hard problems – jumping over ten-foot hurdles when there are half-foot ones nearby – when there’s no utility in doing so. As Buffett and Munger would put it,
“There are no points for difficulty.”
There are, of course, many other applications of dose-dependency: another one could be oversight/verification in the context of business management. With none at all, cheaters and thieves will run rampant; with too much, motivation will be stamped out and progress will be stifled. There’s a happy medium somewhere in the middle, and finding it is (literally) the million-dollar question.
Application/impact: be aware that more of a good thing is not always better, whether in personal or professional applications.
Exponential Growth / Compounding
“Fermi allowed himself a grin. He would tell the technical council the next day that the pile achieved a k of 1.0006. Its neutron intensity was then doubling every two minutes.
Left uncontrolled for an hour and a half, the rate of increase would have carried it to a million kilowatts. Long before so extreme a runaway it would have killed anyone left in the room and melted down.”
The Making of the Atomic Bomb is a fantastic book that highlights nonlinearity in many of its forms: the obvious critical thresholds of radioactive materials, the exponential growth example above and elsewhere, and the step-change in military power that accompanied the development of the atomic bomb.
Here, the boring million-pound pile of raw materials went critical after the control rod was pulled out merely twelve inches. Thereafter, what started as merely a few neutrons doing their thing could have caused massive damage within the relatively short period of an hour and a half, thanks to the power of exponential growth.
There are a lot of famous anecdotes about exponential growth: one involves grains of rice doubling on each square of a chessboard; another, a penny that doubles in value every day for a month.
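Both anecdotes are a few lines of arithmetic. A quick sketch (the exact numbers depend on which version of the story you’ve heard – here I assume a 30-day month and one grain on the first chessboard square):

```python
# Penny doubling daily: $0.01 on day 1, doubling each day through day 30.
cents = 1
for day in range(2, 31):
    cents *= 2
print(f"Day 30: ${cents / 100:,.2f}")   # Day 30: $5,368,709.12

# Chessboard rice: 1 grain on square 1, doubling on each of the 64 squares.
grains = sum(2**square for square in range(64))
print(f"{grains:,} grains")             # 18,446,744,073,709,551,615 grains
```

One penny becomes over five million dollars; one grain of rice becomes more than eighteen quintillion.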
My favorite example of exponential growth outside the context of finance/investing comes from Meredith Wadman’s lovely “The Vaccine Race” (TVR review + notes). How could the finite-lived cells from a single, rigorously safety-tested culture – think one petri dish’s worth of cells – be enough to manufacture decades’ worth of vaccines? Wadman explains, paraphrasing Leonard Hayflick’s 1961 paper “The Serial Cultivation of Human Diploid Cell Strains”:
“Suppose […] you began with just one small glass bottle […] measuring a mere 5.5 inches by not quite 3 inches. Such a bottle held roughly ten million cells when those cells had grown to confluence […] on its side.
[…] if at this point you split these newly planted cells into two bottles, and split the bottles again when the floors of those two bottles were covered, yielding four bottles; and if you then kept splitting the bottles […] until the original cell population had doubled fifty times […] the cells in that one original bottle would therefore produce twenty-two million tons of cells […] a potential ten sextillion cells.”
Even if you doubled the cell population only twenty times rather than fifty – since older cells start to display some potentially unwanted behaviors – Wadman notes you would still:
“produce 87,000 times more vaccine than is made by a typical vaccine-making company, setting out today to make one year’s worth of a typical childhood vaccine that it will ship to more than forty countries.”
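Wadman’s headline number checks out with one line of arithmetic: ten million cells doubled fifty times is on the order of ten sextillion (10^22):

```python
cells_per_bottle = 10_000_000            # ~1e7 cells at confluence, per the quote
total_cells = cells_per_bottle * 2**50   # fifty doublings
print(f"{total_cells:.3e} cells")        # 1.126e+22 cells -- "ten sextillion"
```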
Pretty powerful, huh? It turns out this concept has utility in everyday life – let’s talk about bacteria.
“Imagine that a colony of bacteria created at 8:00 AM doubles in size every minute until noon. At what time will the colony of bacteria reach half its final size?”
Many people, using automatic “System 1” thinking, will take the midway point: 10:00 AM. The actual answer is 11:59 AM, which is easy to figure out if you think about it – the bacteria double every minute, so by inversion, at any given time, there were half as many bacteria a minute ago. Scale drives this point home with plenty more explorations, such as post-Industrial Revolution GDP growth and population growth.
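The inversion is trivial to express in code – if the colony doubles every minute, then one doubling period before the end it was at half its final size:

```python
from datetime import datetime, timedelta

end = datetime(2024, 1, 1, 12, 0)        # noon (the date is arbitrary)
doubling_period = timedelta(minutes=1)

# One doubling period before the end, the colony was half its final size.
half_size_time = end - doubling_period
print(half_size_time.strftime("%I:%M %p"))   # 11:59 AM
```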
But while the “theoretical” example is easy, real-world behavior still doesn’t take this into account. We’ll get into finance examples momentarily, but first, food: have you ever noticed that leftovers that were totally fine last night can smell horrible the next morning?
How and why does food go from “fresh, fresh, fresh” to “inedible” in what seems like a short span of time? The answer is exponential growth. Various sources, like the USDA, note that:
Bacteria grow most rapidly in the range of temperatures between 40°F and 140°F, doubling in number in as little as 20 minutes.
Other sources note that there are various kinds of bacteria, some of which grow just fine even below 40°F. Of course, as Geoffrey West notes in “Scale” (SCALE review + notes), higher temperatures generally accelerate growth.
What’s the result of this? Well, even if we assume – arbitrarily – that the bacterial load of food in your refrigerator doubles every two to four hours rather than every twenty minutes, there could be anywhere between 16x and 256x as many bacteria in your food by Friday lunch as there were at Thursday dinner.
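The 16x–256x range falls straight out of the assumed doubling times. Assuming roughly 16 hours between Thursday dinner and Friday lunch:

```python
hours_in_fridge = 16              # Thursday ~8 PM dinner to Friday ~noon (assumed)
for hours_per_doubling in (4, 2):          # the assumed doubling-time range
    doublings = hours_in_fridge // hours_per_doubling
    print(f"doubling every {hours_per_doubling}h -> {2**doublings}x the bacteria")
# doubling every 4h -> 16x the bacteria
# doubling every 2h -> 256x the bacteria
```

At the USDA’s worst case of 20-minute doublings, the same 16 hours would allow 48 doublings – which is why the danger zone matters so much.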
The corollary is that leftover food shouldn’t sit out on the counter while you’re eating just in case you want to heat up more: what seems to us like an unnoticeable degree of warming (hey, it’s still kinda cold to my hand) can massively accelerate the bacterial growth rate – and thus speed up spoilage.
Similarly, if you’re having fajitas or burritos, don’t leave the whole container of sour cream or packet of cheese on the table for topping – take out as much as you think you’ll need, and put the rest back in the fridge!
In the context of finance and business, exponential growth is often known as “compounding,” and is one of the easiest-to-grasp but most-unintuitive concepts.
Here, for example, is a basic personal-finance scenario: let’s say that you want to retire at age 65 with $2 million in the bank. Let’s further assume that you can earn a ~7.2% annual real, after-tax rate of return – a convenient number, since by the rule of 72 it implies your money doubles roughly every ten years.
How much do you need to have saved at any given point in your life such that, if you never save another dollar, you’ll reach your retirement goals? The answer:
| Age | Savings Amount Equivalent to $2MM @ age 65, Assuming ~7.2% Real, After-Tax Returns |
| --- | --- |
| 25 | ~$125K |
| 35 | ~$250K |
| 45 | ~$500K |
| 55 | ~$1MM |
| 65 | $2MM |
Unless you expect your income to go up by a factor of 4x from your early 30s to your early 50s (which would not align with published base rates on lifetime income trends for the vast majority of career fields), the implication is that saving more earlier makes a huge difference.
If you can set aside $25K/yr from age 25 – 35 – certainly tough, but doable for highly-educated professionals earning a good salary – then even if you don’t save another penny the rest of your life, you don’t have to worry about going hungry or homeless in your old age.
This analysis is highly oversimplified and ignores a lot of important factors, but the point is to illustrate the general power of compounding and exponential growth.
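The required savings at any age is just the goal discounted back at the assumed return – a one-line formula you can sanity-check yourself (same assumptions as above: $2MM goal, ~7.2% real after-tax returns):

```python
goal = 2_000_000
real_return = 0.072       # doubles your money roughly every 10 years (rule of 72)
retirement_age = 65

for age in (25, 35, 45, 55, 65):
    # Money needed today = goal / (1 + r)^(years of compounding remaining)
    needed = goal / (1 + real_return) ** (retirement_age - age)
    print(f"age {age}: ~${needed:,.0f}")
```

Each decade of delay roughly doubles the amount you need to have banked – which is the whole argument for saving early.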
On the other hand, if you’ve saved little to nothing by the time you’re in your 40s or 50s (a surprisingly common challenge), you’re facing a much bigger uphill battle: accumulating a million or two over the course of ten years is well beyond the reach of most people who don’t earn well into the six figures.
Astonishingly, though, this kind of thinking doesn’t always translate even to those who should know it professionally. Megan McArdle’s “The Up Side of Down” (UpD review + notes) and Turney Duff’s “The Buy Side” (TBS review) both touch on this in various ways, and Thaler/Sunstein’s “Nudge” (NDGE review + notes) – particularly the “Save More Tomorrow” plan – provides a great intersection between compounding, structural problem solving, and hyperbolic discounting.
It’s worth noting, by the way, that even many financial professionals fail to truly appreciate the power of compounding. In the fantastic “The Signal and the Noise” (SigN review + notes), Nate Silver (of FiveThirtyEight fame) notes that one of the causes of the financial crisis was inadequate margin of safety in ratings-agency models due to a poor understanding of nonlinearity. An abridged version of Silver’s explanation:
“Leverage can make the error in a forecast compound many times over, and introduce… nonlinear mistakes. Moody’s 50 percent adjustment was like applying sunscreen and claiming it protected you from a nuclear meltdown.” – Nate Silver
Application/impact: although things growing exponentially (like a population of bacteria, or a company’s revenues) can look relatively linear over the short term, they are anything but over the long term, which can have important implications. Using a calculator (or Excel) to work out the math can serve as a powerful behavioral-modification aid by making us more aware of things we don’t “see.”
Power Laws / Economies of Scale
I’ll keep this section short (really!), first because I’m still learning about this topic, and second because I don’t feel like I have much value to add beyond Geoffrey West’s fantastic “Scale” (SCALE review + notes).
The final form of nonlinearity we’ll discuss is power laws, which can lead to physical economies of scale. What is a power law? It’s a relationship in which one variable scales as a fixed power (exponent) of another: with a sublinear exponent like 3/4ths, doubling the first variable less than doubles the second; with a superlinear exponent like 5/4ths, doubling the first more than doubles the second.
For example, surface area scales at the square of length, while volume (and weight) scales at the cube, such that surface area tends to scale at two-thirds the rate of volume.
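This is easy to verify with a cube: double the side and surface area goes up 4x (2²) while volume goes up 8x (2³), which is exactly what “surface area scales as volume to the two-thirds power” means:

```python
for side in (1, 2, 4):
    surface_area = 6 * side**2   # a cube has six square faces
    volume = side**3
    print(f"side={side}: surface area={surface_area}, volume={volume}")

# Surface area scales as volume^(2/3): 6 * s^2 == 6 * (s^3)^(2/3)
assert all(abs(6 * s**2 - 6 * (s**3) ** (2 / 3)) < 1e-9 for s in (1, 2, 4))
```

This same square-cube mismatch is why big animals shed heat differently than small ones – and why, as West discusses next, dosing by weight alone can go badly wrong.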
Yeah yeah, boring math, why does this matter? Well, wake up: it matters because we need to know how much LSD to give an elephant. This is a real thing that scientists did once, and as West explains vividly, it went horribly wrong. (The elephant died.)
The researchers mistakenly assumed that drug dosage scales linearly: i.e., if there’s an appropriate mg/kg dosage for, say, a human, you just multiply that dosage by the weight of an elephant. Except there’s a problem: West notes that drugs are typically absorbed relative to surface area, such that the mg/kg dosage approach only works over very limited intervals.
This has practical consequences, of course. West discusses Tylenol dosage for children (I swear I came up with my example before reading his) on pages 54 – 55 of Scale:
“I recall many years ago being quite surprised when trying to console a screaming infant struggling in the middle of the night with a high fever to discover that the recommended dose of baby Tylenol, printed on the label of the bottle, scaled linearly with body weight […]
For example, for a 6-pound baby, the recommended dose was ¼ teaspoon (40 mg), whereas for a baby of 36 pounds (six times heavier), the dose was 1½ teaspoons (240 mg), exactly six times larger.
However, if the nonlinear ⅔ power scaling law is followed, the dosage should only have been increased by a factor of 6^(⅔) = 3.3, corresponding to 132 milligrams, which is just over half of the recommended dose!
So if the ¼ teaspoon recommended for the 6-pound baby is correct, then the dose of 1.5 teaspoons for the 36-pound baby was almost twice too large.”
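West’s arithmetic, in code form – the linear label dose versus the ⅔-power scaled dose for a baby six times heavier:

```python
base_dose_mg = 40        # recommended for a 6-pound baby (from West's example)
weight_ratio = 36 / 6    # the 36-pound baby is six times heavier

linear_dose = base_dose_mg * weight_ratio             # what the label implied
scaled_dose = base_dose_mg * weight_ratio ** (2 / 3)  # 2/3-power scaling

print(round(linear_dose), "mg vs", round(scaled_dose), "mg")  # 240 mg vs 132 mg
print(f"{linear_dose / scaled_dose:.1f}x the scaled dose")    # 1.8x the scaled dose
```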
West explores the fascinating way that these sorts of scaling laws apply to everything from our heartbeats and capillaries to the number of gas stations needed to serve a city and why Godzilla can’t actually exist (thank goodness).
In other contexts, power laws apply to architecture, ranging from HVAC requirements to the necessary structural strength for supporting buildings (see Henry Petroski’s “To Engineer is Human” – TEIH review + notes – for more on this).
This is speculative on my part and certainly the issue is multicausal, but these sorts of scaling laws may suggest why many “box” businesses like restaurants and retailers seem to overestimate their saturation point (linear thinking).
In any event, power laws are a topic I’d like to learn more about, because they clearly have lots of applications but aren’t something I’ve historically thought about a lot. Recommendations welcome!
Application/impact: while “linear” relationships like “GDP per capita” are popular and easy to understand, they may fail to account for scaling relationships.