
The Signal and the Noise
by Nate Silver
Most experts aren't just wrong—they're confidently, systematically wrong in ways that reveal a deeper flaw in how humans process data.
In Brief
The Signal and the Noise examines why most predictions fail and how to make better ones. Drawing on fields from weather forecasting to poker, it teaches readers to distinguish meaningful patterns from random noise, apply probabilistic thinking, and update beliefs systematically, so they can evaluate expert forecasts more critically and reason more accurately under uncertainty.
Key Ideas
Calibration History Trumps Expert Credentials
When evaluating any expert forecast, check their calibration history—not their credentials. An expert whose 70% predictions come true 70% of the time is more valuable than one with a prestigious title whose confident calls are barely better than chance.
Explicit Priors Before Evidence Analysis
State your priors explicitly before examining new evidence. Ask: what did I already believe about this, and what's the base rate? A positive test result means very little without knowing how rare the underlying condition is.
Demand Uncertainty Distributions Over Point Estimates
Treat single-point forecasts with suspicion. A prediction of '49 feet' without an uncertainty range is a trap—ask for the distribution, not just the mean, especially when the downside of being wrong is asymmetric.
Foxes With Multiple Frameworks Beat Hedgehogs
Look for 'foxes' rather than 'hedgehogs' when seeking advice. The most trustworthy analysts combine multiple frameworks, openly revise their views, and are comfortable saying 'it depends'—the least trustworthy have one Big Idea they apply to everything.
Simple Models Outpredict Complex Overfitted Ones
Resist the urge to add more variables to a predictive model. More parameters fit past data better and predict future data worse. Ask whether you have a strong causal theory before adding complexity—otherwise you're fitting noise, not signal.
Incremental Probability Updates, Not Complete Replacements
In fast-moving situations, update incrementally. New evidence should shift your probability estimate, not replace it entirely. The question is never 'does this prove my hypothesis?' but 'how much should this move my estimate, given what I already knew?'
Who Should Read This
Science-curious readers interested in Behavioral Economics and Cognitive Psychology who want to go beyond the headlines.
Why does it matter? Because the data explosion is making you worse at predicting, not better.
We have more data than any civilization in history—and our best experts predict the future about as well as a coin flip. That's not a technology problem. It's not even a knowledge problem. It's a confusion between two things that feel identical but aren't: how confident a forecast sounds and how likely it is to be right. The people who appear most certain on television turn out, when you actually track their predictions, to be the least accurate. The models that look most precise—decimal points, elaborate equations, impressive machinery—fail most catastrophically when reality stops cooperating. More information, counterintuitively, often makes this worse. What fixes it isn't more data or better computers. It's treating every conclusion as provisional—a working estimate, not a verdict. That skill has a name, a testable method, and a track record in the domains where it's actually been tried. And it's entirely learnable.
More Data First Made Us More Wrong—and It's Happening Again
Imagine a library that doubles in size overnight. Sounds like a gift—until you learn that half the new books are wrong, and there's no catalog to tell you which half. That is roughly what happened after Johannes Gutenberg's press began churning out pages in 1440.
Before Gutenberg, copying a manuscript ran about $200 per five pages in today's money. A single book-length text cost something like $20,000. Scribes made errors, errors multiplied through generations of copying, and knowledge decayed faster than it could be preserved. Then the press arrived, and the cost of a book collapsed to around $70—a 300-fold drop. Production exploded, growing roughly 30 times over the next century.
The result was not the Enlightenment. Not yet. The runaway bestsellers of the new information age were heretical tracts and pseudoscientific pamphlets. One edition of the Bible—the so-called Wicked Bible—went to press with a typo so spectacular it became infamous: 'Thou shalt commit adultery.' Errors that had once been handcrafted and rare could now be mass-produced and distributed continent-wide. The flood of new ideas didn't create consensus; it created the Thirty Years' War, which killed a third of Germany's population. The Enlightenment and the Industrial Revolution did eventually follow—but they arrived roughly 330 years after the press. Millions of deaths later.
Silver's point is that this is a pattern, not an accident. When computers spread through research labs and businesses in the 1970s and '80s, R&D spending per new patent application doubled rather than falling—rising from about $1.5 million to $3 million. More processing power, less useful output.
The Experts You Trust Most Are Specifically the Ones Most Likely to Be Wrong
The experts you see most on television are, statistically speaking, the worst people to ask about the future. That isn't a cynical guess—it's the finding of Philip Tetlock, a psychologist who spent fifteen years tracking roughly 28,000 predictions made by academics, government analysts, and professional commentators. His conclusion: these credentialed experts performed barely better than random chance (one audit of TV pundits found 338 predictions correct and 338 wrong—a coin flip wearing a suit). And the more famous the expert, measured by how often journalists called them for quotes, the worse their predictions tended to be. Media visibility and predictive accuracy don't just fail to correlate; they run in opposite directions.
The reason runs deeper than laziness. It's a thinking style. Tetlock divided his experts into two types, borrowing a line from the ancient Greek poet Archilochus: the fox knows many small things, the hedgehog knows one big thing. Hedgehogs—think of commentator Dick Morris, who predicted a McCain victory on the eve of Obama's ten-million-vote landslide—organize everything through a single governing idea. For Morris, Republican momentum. For a Marxist economist, class dynamics. For a free-market ideologue, incentives. Whatever the lens, it bends every piece of incoming evidence toward the same conclusion. Foxes, by contrast, hold multiple frameworks simultaneously, treat each new fact as a potential challenge to their existing view, and are genuinely comfortable saying 'it depends.'
The Soviet collapse is where you can see the difference most clearly. Liberals understood that Mikhail Gorbachev's reforms were sincere—he was genuinely loosening the system. Conservatives understood that the Soviet economy was rotting from within, shrinking by roughly 5 percent a year while the CIA was wildly overestimating its size. Each camp had a real piece of the truth. The foxes in Tetlock's study combined both pieces and got closer to forecasting the USSR's sudden end. The hedgehogs, locked inside their single ideological framework, missed it almost entirely.
The Rating Agencies Weren't Stupid—They Made One Lethal Assumption
October 2008. Deven Sharma, the head of Standard & Poor's, sits before the House Oversight Committee and explains why his agency missed the housing collapse. Nobody saw it coming, he tells Congress.
This was false. Google searches for "housing bubble" had increased tenfold between 2004 and 2005, with the heaviest traffic in the states that would soon see the worst crashes. Robert Shiller had been warning about a housing bubble since 2000. The Economist had called it the biggest bubble in history in 2005. S&P's own internal simulation from 2005 modeled a 20 percent national decline in home prices and concluded their models could handle it. They saw the bubble. They decided it didn't matter.
The actual error ran deeper than negligence or fraud, and this is what makes it so instructive. It lived inside a single mathematical assumption that looked reasonable and turned out to be catastrophic.
Here are the mechanics. Imagine five mortgages, each carrying a 5 percent default risk. You bundle them into a pool structured so that you only lose your investment if all five default simultaneously. What are the odds of that happening? The answer depends entirely on one thing: whether those defaults are connected to each other or not.
Assume the mortgages are independent—a carpenter in Cleveland losing his job has no bearing on a dentist in Denver falling behind on payments—and the math produces something almost miraculous. Five separate 5 percent risks multiplied together yields roughly one chance in three million. Rating agencies loved this number. It let them stamp AAA on products built from subprime debt.
Now change the assumption. Suppose instead that a single force—say, a nationwide housing collapse—can pull all five borrowers under at once. Now you don't get five separate dice rolls; you get one. The risk of total loss jumps to 5 percent, making the same bet roughly 160,000 times more dangerous than the first calculation suggested. Same mortgages. Same math. Different assumption.
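The two calculations are easy to check. What follows is a minimal sketch in Python using the passage's own figures (five mortgages, a 5 percent default risk each). The function names are illustrative, and the "perfectly correlated" case is the extreme assumption that one shared shock sinks all five loans together.

```python
def p_all_default_independent(p_default: float, n_loans: int) -> float:
    """Chance that every loan defaults if the defaults are unrelated."""
    return p_default ** n_loans


def p_all_default_perfectly_correlated(p_default: float) -> float:
    """Chance that every loan defaults if one shared shock sinks them all."""
    return p_default


independent = p_all_default_independent(0.05, 5)
correlated = p_all_default_perfectly_correlated(0.05)

print(f"independent: about 1 in {1 / independent:,.0f}")
print(f"correlated:  about 1 in {1 / correlated:,.0f}")
print(f"ratio:       about {correlated / independent:,.0f} times riskier")
```

The independent case works out to about 1 in 3.2 million, which the text rounds to three million, and the ratio between the two cases is the 160,000 quoted above. Real mortgage pools sit somewhere between these two extremes; the sketch only shows how much the answer depends on which end you assume.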
This is exactly what happened. S&P's models predicted roughly 1-in-850 odds that their top-rated securities would fail. The actual default rate came in at 28 percent, more than 200 times worse than forecast. The models weren't sloppy; they were precise to the second decimal place. They were just measuring something that didn't exist: a world where housing markets in Florida and California and Ohio could fail independently of one another. When the bubble burst, every market fell together, and errors of that order followed as if they'd been bolted to the floor all along.
When You Strip Out Uncertainty, You Turn a Forecast Into a Trap
There's an old statistician's joke: a man drowns crossing a river that was only three feet deep on average. Averages don't get your furniture wet. Distributions do.
In April 1997, Grand Forks, North Dakota made that joke literal. Residents had known for months that the Red River was going to flood. The National Weather Service predicted a crest of forty-nine feet. The town's levees stood at fifty-one feet. With two feet of margin, most people figured they were fine—so few bought flood insurance, no one piled sandbags, and no one moved to divert flow toward farmland. When the river crested at fifty-four feet, it overtopped the levees and swallowed the city. Nearly all 50,000 residents evacuated. Three-quarters of the homes were damaged or destroyed.
Here is the part that makes the disaster structural rather than just unlucky: the forecast was actually pretty good. A five-foot miss, two months out, was roughly in line with the Weather Service's historical accuracy. The real margin of error on that prediction was plus or minus nine feet, which meant there was about a 35 percent chance the levees would be overtopped. The forecasters knew this. They chose not to say it, because they worried that expressing any uncertainty would undermine public confidence in the prediction.
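To see how a forty-nine-foot forecast can carry a one-in-three chance of topping fifty-one-foot levees, here is a rough sketch in Python. The forecast, levee height and nine-foot margin come from the passage. The statistical model does not: it assumes the crest is normally distributed around the forecast and that "plus or minus nine feet" is a 90 percent interval. Both are assumptions made for illustration, not the Weather Service's actual method.

```python
from statistics import NormalDist

FORECAST_CREST_FT = 49.0  # from the passage
LEVEE_HEIGHT_FT = 51.0    # from the passage
MARGIN_FT = 9.0           # from the passage: "plus or minus nine feet"

# ASSUMPTION (not in the source): the +/- 9 ft margin is a 90% interval
# of a normal distribution centred on the forecast.
z_90 = NormalDist().inv_cdf(0.95)  # two-sided 90% interval, about 1.645
sigma_ft = MARGIN_FT / z_90

crest = NormalDist(mu=FORECAST_CREST_FT, sigma=sigma_ft)
p_overtop = 1.0 - crest.cdf(LEVEE_HEIGHT_FT)

print(f"chance the crest exceeds the levees: {p_overtop:.0%}")
```

Under those assumptions the result lands in the mid-thirties, in line with the 35 percent figure above. A different interval width or distribution shape would move the number, but the two-foot margin never comes close to a guarantee.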
The effect was the opposite of what they intended. Residents heard forty-nine feet and processed it not as a probability distribution but as a fact—and because the levees stood at fifty-one, some people actually treated forty-nine as the maximum the river could possibly reach. Strip out the uncertainty and you don't give people a number. You give them a false ceiling.
Weather Forecasters Cracked the Code—Here's What Everyone Else Gets Wrong
What would it actually look like to forecast well? Not just to be right sometimes, but to demonstrate, systematically, that your confidence tracks reality? Weather forecasting gives you the answer, and it's more demanding than most experts in other fields would be comfortable with.
The standard is called calibration. Every time the National Weather Service says there's a 40 percent chance of rain, does it actually rain about 40 percent of those times? Check the records: yes, it does. When they say 20 percent, it rains roughly 20 percent of the time. They make tens of thousands of forecasts a year, which means the feedback loop is tight—errors surface quickly and get corrected. That's the benchmark: not sounding confident, not being right on the dramatic calls, but having your stated probabilities match outcomes across hundreds of repetitions.
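A calibration check is simple enough to sketch. The Python below groups forecasts by the probability that was stated and compares it with how often the event actually happened. The function name and the toy track record are hypothetical, invented purely to show the mechanics; they are not real forecast data.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (stated_probability, happened) pairs by stated probability.

    Returns {stated_probability: (observed_frequency, count)}.
    """
    buckets = defaultdict(list)
    for stated_probability, happened in forecasts:
        buckets[stated_probability].append(1 if happened else 0)
    return {
        p: (sum(outcomes) / len(outcomes), len(outcomes))
        for p, outcomes in sorted(buckets.items())
    }


# Hypothetical track record, invented for illustration only.
toy_record = (
    [(0.2, True)] * 2 + [(0.2, False)] * 8
    + [(0.4, True)] * 4 + [(0.4, False)] * 6
)

table = calibration_table(toy_record)
for stated, (observed, n) in table.items():
    print(f"said {stated:.0%}, happened {observed:.0%} of {n} times")
```

A well-calibrated forecaster produces rows where the two percentages roughly agree. Ten forecasts per bucket, as in this toy record, is far too few to judge anyone; the test only means something with a long record, which is why weather forecasting is such a good proving ground.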
Now look at what happens when incentives point somewhere else. The Weather Channel's meteorologists understand calibration—they're serious scientists—but they've made a deliberate choice at the low end of the probability scale. When their models show roughly a 5 percent chance of rain, they'll tell viewers 20 percent instead. The reasoning is explicit: if it rains unexpectedly, viewers blame the forecaster; if the forecaster cries rain and the sun comes out, viewers shrug and enjoy the day. So accuracy gets quietly traded away to manage reputation. They're not lying about the 70 percent calls—those hold up fine—but they've built a systematic distortion into the range where it's hardest to notice.
That distortion tells you everything about what goes wrong in forecasting outside meteorology. The failure mode isn't usually incompetence; it's misaligned incentives dressed up as precision. A forecaster who wants to protect their brand, please an audience, or avoid looking wishy-washy will adjust their stated confidence away from their actual belief—and once that gap opens, the whole enterprise drifts from information toward performance. Calibration is the test that catches it.
Bayes's Theorem Is Just Common Sense Written in Math—But It Changes Everything
Think of your beliefs as a running estimate rather than a verdict. You don't flip a coin and declare the result permanent—you update it as new information arrives.
Here's the test case that makes the logic undeniable. You come home from a trip and find an unfamiliar pair of women's underwear in your dresser drawer. Your gut says: caught. But work through it carefully. Roughly 4 percent of married partners cheat in any given year—that's your prior, before you found anything. Now you update. If your partner were cheating, how likely would you be to find direct evidence like this? Maybe 50 percent—cheaters are sometimes careless. But if he's innocent, could the underwear have appeared anyway? Mixed-up luggage, a friend who stayed over, a forgotten gift—collectively maybe a 5 percent chance. Run those three numbers through Bayes's formula and the posterior probability—your updated estimate after finding the underwear—comes out to 29 percent. Not 4 percent, but nowhere near the near-certainty your instincts declared. The underwear is real evidence. It just isn't overwhelming evidence, because you started from a low base.
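The arithmetic behind that 29 percent is short enough to write out. This is a minimal sketch of Bayes's theorem in Python using the three numbers from the example; the function and argument names are illustrative.

```python
def posterior(prior: float,
              p_evidence_if_true: float,
              p_evidence_if_false: float) -> float:
    """Bayes's theorem for a yes/no hypothesis and one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Numbers from the example: 4% prior, 50% if cheating, 5% if innocent.
p = posterior(prior=0.04, p_evidence_if_true=0.50, p_evidence_if_false=0.05)
print(f"after the first find: {p:.0%}")

# ASSUMPTION for illustration: a second, independent find of equal strength.
# Yesterday's posterior becomes today's prior.
after_second_find = posterior(p, 0.50, 0.05)
print(f"after a second find:  {after_second_find:.0%}")
```

The first call reproduces the 29 percent in the text (0.02 divided by 0.068). The second call shows the incremental habit the book recommends: each new piece of evidence moves the estimate from wherever it already stood, rather than starting over.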
R.A. Fisher, the statistician whose methods still govern most published research, built his framework to exclude prior probabilities entirely. The practical result: when a researcher tests whether, say, the bank prime loan rate predicts Alabama unemployment, any positive result looks like signal, because there's no accounting for how unlikely a real discovery was to begin with. Most potential relationships don't exist. Strip out the prior and you can't tell noise from finding. That's why John Ioannidis could argue that most published research findings are false, and why Bayer Laboratories, when it tried to replicate published results, could not confirm roughly two-thirds of them. The cause was not fraud, just a framework that systematically mistakes coincidence for discovery.
Bayes's fix isn't to stop looking at evidence. It's to bring two things to every piece of it: an honest estimate of how probable your hypothesis was before you looked, and a willingness to update incrementally rather than leap to conclusions. Sports bettor Bob Voulgaris did exactly this when he put his entire $80,000 savings on the Lakers winning the 1999–2000 NBA title. The bookmakers had them at 6.5-to-1, implying roughly a 13 percent chance. Voulgaris thought the real probability was closer to 25 percent—not certain, not even likely, just substantially better than the market price. He had watched the games. He knew Kobe Bryant was injured and would return. He knew Phil Jackson's system took time to install. The slow start was noise; the underlying team was elite. That gap between 13 percent and 25 percent, multiplied by the payout, was the whole bet. He wasn't predicting a championship. He was saying the odds were mispriced, and by how much.
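The gap between the market's price and Voulgaris's estimate can be put in numbers. This Python sketch converts the 6.5-to-1 odds into an implied probability and then computes the average profit per dollar staked if the true chance were 25 percent. The function names are illustrative, and the sketch ignores the bookmaker's margin, which is a simplifying assumption.

```python
def implied_probability(odds_to_one: float) -> float:
    """Break-even win probability for a bet paying odds_to_one per unit staked."""
    return 1.0 / (odds_to_one + 1.0)


def expected_profit_per_unit(p_win: float, odds_to_one: float) -> float:
    """Average profit per unit staked: win odds_to_one, or lose the stake."""
    return p_win * odds_to_one - (1.0 - p_win)


market = implied_probability(6.5)             # what the odds imply
edge = expected_profit_per_unit(0.25, 6.5)    # at Voulgaris's own estimate

print(f"market-implied chance: {market:.1%}")
print(f"expected profit per $1 staked at 25%: ${edge:.3f}")
```

At the market's own 13.3 percent the bet breaks even on average. At 25 percent it returns about 87.5 cents per dollar staked in expectation, even though it still loses three times out of four.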
The Overfitting Trap: Why Your Model Gets Smarter on Paper and Dumber in Reality
A model that fits the data better is always a better model. This feels like logic. It is actually one of the most reliable paths to being spectacularly wrong.
John von Neumann, who co-invented game theory and helped build the first computers, put his finger on the problem: given four parameters, he could fit a curve to an elephant. Give him five, and he could make the trunk wiggle. He meant it as a warning. The more adjustable parts you add to a model, the more perfectly it can trace the historical record—and the more useless it becomes for predicting anything you haven't already seen. You're no longer finding the structure in the data. You're memorizing the noise.
Japanese seismologists before 2011 demonstrated what this costs. No earthquake of magnitude 8.0 or higher had struck the region near the eventual Tohoku epicenter in roughly forty-five years. So planners used what seismologists call a characteristic fit—a model contoured tightly to recent history—and concluded that a magnitude 9-class event was essentially impossible. The Fukushima plant was built to survive an 8.6. The earthquake that hit registered 9.1. The characteristic fit had matched the recent record beautifully. It just treated a forty-five-year silence as evidence of a structural ceiling rather than a run of ordinary luck. A once-per-thirty-year event failing to appear in forty-five years is no more remarkable than a solid hitter going zero for five. The simpler Gutenberg-Richter law, which ignores local quirks and follows the global pattern, estimated the catastrophic quake at roughly a one-in-three-hundred-year event. That's uncommon, not impossible—close enough that a wealthy nation might reasonably prepare.
The honest complication is that simpler models can fail too, and complexity is sometimes genuinely warranted. The problem is that forecasters tend to reach for it regardless, because overfit models score better on standard statistical tests and look more impressive in journals. They win the audition and fail the performance.
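The audition-versus-performance gap can be reproduced on toy data. The Python sketch below is an illustration under stated assumptions, not anything from the book: the data is synthetic (a straight-line signal plus random noise), and the "complex" model is a nearest-neighbour lookup that simply memorizes every past point, standing in for a model with far too many parameters.

```python
import random


def fit_line(points):
    """Ordinary least-squares line through (x, y) points: two parameters."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in points)
        / sum((x - mean_x) ** 2 for x, _ in points)
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(points):
    """Nearest-neighbour lookup: effectively one parameter per data point."""
    return lambda x: min(points, key=lambda point: abs(point[0] - x))[1]


def mean_squared_error(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def sample(rng, n):
    """Synthetic data (hypothetical): y = 2x + 1 plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))
    return data


rng = random.Random(0)  # fixed seed so the run is repeatable
past, future = sample(rng, 300), sample(rng, 300)

line, memorizer = fit_line(past), fit_memorizer(past)
for name, model in [("line", line), ("memorizer", memorizer)]:
    print(f"{name:9s} error on past: {mean_squared_error(model, past):.2f}"
          f"  on future: {mean_squared_error(model, future):.2f}")
```

The memorizer scores a perfect fit on the past because it has stored the noise along with the signal, and that stored noise is exactly what hurts it on new data. The two-parameter line fits the past less well and predicts the future better.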
Deep Blue Beat Kasparov Because Kasparov Was Too Smart
Garry Kasparov is reviewing a single chess move in a hotel room at the Plaza, and he is getting scared. It's the evening after Game 1 of the 1997 rematch against IBM's Deep Blue, and Kasparov can't explain what the computer did on its forty-fourth turn. Losing badly, Deep Blue had slid a rook to an apparently useless square—not checking the king, not threatening any piece, not advancing any visible plan. Then it resigned a turn later. Why waste a move? The only explanation Kasparov could accept was that the machine had seen something he couldn't: some calculation twenty moves deep that made surrendering that moment worthwhile. He went to bed convinced he was facing an intelligence beyond his comprehension.
The move was a bug. Deep Blue's code had glitched, failed to select any legal move, and defaulted to picking one at random. The engineers knew immediately and patched the program the next morning. The move meant nothing.
But because Kasparov was exactly the kind of mind that finds patterns in everything, he could not entertain the null explanation. In Game 2, still haunted by the phantom depth he'd seen in that rook move, he resigned a position his own analysts later showed he could have drawn. He never beat Deep Blue again.
The lesson isn't that computers are better than humans. The qualities that make human forecasters powerful—the drive to find meaning, to read an opponent, to construct a narrative from sparse evidence—can turn lethal when the signal you're reading is noise. Kasparov's pattern recognition didn't fail because it was weak. It failed because it was strong enough to manufacture a pattern where there was none.
The frontier in forecasting isn't picking a side in the human-versus-machine debate. It's knowing which tool to pick up when—and recognizing that the same instinct pushing you toward a confident read may be the one most worth questioning when the data refuses to cooperate.
The Signal Was There Before 9/11—The Problem Was Imagination, Not Data
What if the failure before Pearl Harbor wasn't that we lacked the signal—but that we were listening for the wrong one? The U.S. had decoded 97 percent of Japan's diplomatic traffic. Analysts knew the carrier fleet had gone silent, which was itself alarming. But rather than sit with the discomfort of not knowing where those ships were, intelligence officials reached for a familiar explanation: the carriers were near home waters, using land-based communication. The disturbing possibility—that an entire fleet had slipped into attack position—went unconsidered. The evidence was there. The mental model refused to hold it.
Silver calls this mistaking the unfamiliar for the improbable. Thomas Schelling, the Nobel-winning economist, identified the pattern: when something lies outside our experience, we don't evaluate it as unlikely—we fail to register it as a possibility at all. Rumsfeld put a name to it in a different context: unknown unknowns. Not the gaps we know we have, but the gaps we don't know we have. The first kind you can work on. The second kind is the one that kills you.
Aaron Clauset's finding about terrorism completes the picture. Plotted on a log-log scale, with the size of attacks on one axis and their frequency on the other, terrorist attacks form a straight line, the same power-law signature as earthquakes. A log scale compresses an axis so that each step up represents ten times the magnitude; a straight line on such a plot means small attacks and catastrophic ones follow the same underlying distribution. Which means a 9/11-scale event wasn't a black swan lurking beyond statistical imagination. Using data from before 2001, the power law suggests such an attack was due roughly once in a lifetime. The math knew what our mental models refused to hold.
The disposition Silver lands on isn't a formula. It's a stance: carry more hypotheses than feel comfortable, assign small but real probabilities to scenarios that seem alien, and update when the evidence shifts. The signal is almost always already there.
The Forecast That Stays Honest
The Grand Forks forecasters knew the river might reach fifty-four feet. They said forty-nine, cleanly, because a clean number felt more trustworthy than an honest one. What they got was a city underwater and residents who'd had every reason to prepare and none of the information they needed to do it. That's the cost of false precision—not just a bad number, but a number that closes off the very questions that might have saved you.
The foxes, the calibrated ones, the forecasters who tell you forty percent and actually mean it—they're not less certain than the hedgehogs. They're certain about something harder: not the answer, but the process of holding it loosely enough that new evidence can actually change it.
Voulgaris understood this. When he bet on a title he'd assessed at twenty-five percent, he wasn't pretending the odds were better than they were. He knew he'd lose that bet three times out of four. He placed it anyway, because the market was paying as if the chance were only thirteen percent, and because a bettor who flinches at honest uncertainty will only ever bet on sure things, which in practice means never betting at all. The discipline wasn't confidence in the outcome. It was confidence in the number.
That's the disposition the book is really describing. You won't always be right. But if you state your priors, name your uncertainty, and stay genuinely open to being wrong, reality at least has a way in.
Notable Quotes
“The fox knows many little things, but the hedgehog knows one big thing.”
“nothing new under the sun”
“putting opponents on a hand”
Frequently Asked Questions
- What is The Signal and the Noise about?
- The Signal and the Noise examines why most predictions fail and how to make better ones. Drawing on fields from weather forecasting to poker, it teaches readers to distinguish meaningful patterns from random noise, apply probabilistic thinking, and update beliefs systematically. By mastering these approaches, readers can evaluate expert forecasts more critically and reason more accurately under uncertainty. The book's central premise is that learning to separate signal from noise—genuine patterns from meaningless fluctuation—is essential to making sound decisions in an information-rich world.
- What are the key takeaways from The Signal and the Noise?
- The book teaches six core principles for better prediction. First, check calibration history, not credentials, when evaluating forecasts. Second, state your priors explicitly and understand base rates before examining evidence. Third, treat single-point forecasts with suspicion: a prediction of '49 feet' without an uncertainty range is a trap, so ask for the distribution, not just the mean. Fourth, look for analysts who combine multiple frameworks and openly revise their views rather than apply one idea universally. Fifth, avoid adding unnecessary variables to models. Sixth, update incrementally as new evidence arrives.
- How should you evaluate expert predictions according to The Signal and the Noise?
- Calibration history trumps credentials. An expert whose 70% predictions come true 70% of the time is more valuable than one with a prestigious title whose confident calls are barely better than chance. Additionally, always identify your priors before analyzing new evidence—ask what you already believed and what the base rate is. "A positive test result means very little without knowing how rare the underlying condition is." This systematic approach prevents overconfidence and anchoring bias, enabling you to distinguish between truly skilled forecasters and those who simply benefit from favorable circumstances or persuasive communication styles.
- Why does The Signal and the Noise warn against single-point forecasts?
- Single-point forecasts mask the true distribution of uncertainty. When someone predicts '49 feet' without an uncertainty range, you lose critical information about possible outcomes and their probabilities. The book advises asking for the distribution, not just the mean, because decisions often involve asymmetric risk—where downside and upside carry different consequences. Understanding the full range of likely scenarios matters especially when being wrong costs more in one direction than another. By thinking probabilistically about distributions rather than point estimates, you can make decisions that account for worst-case scenarios and opportunities alike.


