Cerebras CEO on the Future of Data Centres, Token Costs & Memory | Should US Companies Sell to China

The Twenty Minute VC

1h 8m episode

In Brief

The CEO behind the largest semiconductor IPO says AI infrastructure isn't creating demand, it's chasing it, while a three-company memory oligopoly quietly prints 85% margins.

Key Ideas

1. Demand drives AI infrastructure, not bubbles

AI infrastructure is chasing demand, not creating it: the opposite of every historical bubble.

2. Three firms control HBM profit machine

Three companies control HBM memory; they're printing 85% margins with no relief for years.

3. Inference speed creates multiplying competitive advantages

Speed in AI inference compounds: 6.7x faster doesn't mean 6.7x better; it means you get smoked.

4. Lawyers structurally block enterprise AI adoption

Lawyers block enterprise AI more than bad data; their incentives structurally punish adoption.

5. Recent layoffs correct COVID excess hiring

Most AI layoffs are COVID hiring corrections; real AI displacement is just starting.

Why does it matter? Because the AI infrastructure 'bubble' has cause and effect exactly backwards.

Andrew Feldman, the Cerebras CEO who just completed the largest semiconductor IPO in history, walked in with a thesis that inverts the dominant narrative: AI infrastructure isn't racing ahead of demand, it's sprinting to catch up. That single distinction reshapes every investment assumption, enterprise strategy, and policy debate in the space.

• The $3–4 trillion infrastructure projection is demand-pull, not speculation: Cerebras alone carries $25 billion in backlog today
• Three companies control the HBM memory every GPU needs; they're printing 85% gross margins with no supply response possible for years
• Speed in AI inference compounds catastrophically. Feldman's verdict: "there will be zero market for slow"
• The biggest enterprise blocker isn't messy data; it's lawyers operating on an incentive structure that makes "no" the only rational answer

The infrastructure buildout is chasing demand, not creating it — and that inverts the entire bubble thesis

Building is behind demand. That's the sentence that reframes everything.

Every historical bubble — fiber optics in the '90s, railroads before that — was characterized by infrastructure racing ahead of speculative future demand. AI is the exact opposite. Cerebras has a $25 billion backlog. Nvidia has a backlog. AMD has a backlog. The constraint isn't weak demand; it's the physical inability to build data centers fast enough to serve customers who want compute today, not in the future.

Construction delays aren't signals of a collapsing thesis — they're normal friction in large projects. Anyone who's renovated a kitchen understands contractors miss timelines. Data centers involve regulated utilities, local municipalities, and transformers that fall off trucks. Of course they're late. That's not a demand problem.

Sam Altman contracted for infrastructure before anyone believed the numbers. The ability to act on exponential demand projections "out a year or two or three is a superpower." The competitors who missed it weren't looking at different data. They just flinched.

Three companies make the memory every GPU needs — and they're earning software margins on hardware

Micron is posting 80–85% gross margins. On memory chips. That's what a structural oligopoly looks like.

HBM — the high-bandwidth memory every GPU depends on — is produced by exactly three companies: Samsung, SK Hynix, and Micron. When demand exploded, they couldn't keep up. Prices shot through the roof. And because a new fab costs $40 billion and takes five years to build, there's no fast supply response possible. "If demand stays high, we are going to continue to see memory shortages for at least the next several years."

Cerebras's SRAM-based architecture sidesteps this entirely. TSMC etches SRAM directly into the logic chip: no HBM vendor collecting 80% margins, no CoWoS packaging constraints, no congested 3nm node. The memory shortage hammering GPU costs is structurally irrelevant to Cerebras's supply chain. Feldman calls it, plainly, an "advantage."

Any business model projecting rapidly declining AI compute costs needs to account for this bottleneck. The fab economics make it a multi-year problem at minimum, and model efficiency gains don't move it.

There will be zero market for slow — speed advantage in AI inference is winner-take-all

Solve a hard problem in 3 minutes while your competitor takes 20. "Imagine over a day or a week — you get smoked."

The 6.7x speed advantage Cerebras demonstrated on Kimi K2 — posted in real time while an analyst was on television claiming they couldn't do it — doesn't mean 6.7x more value in isolation. In iterative work, the gap widens every cycle. "For hard problems, there is no upper bound to how much faster you want to be nor the value of speed."

How big is the market for slow search? Zero. How much would you accept to use dial-up at home — even $1,000 a month? You'd refuse. "There will be zero market for slow." The same logic extends to inference: for workloads where iteration speed matters — coding agents, agentic flows, research — hardware selection is a competitive capability decision, not a cost optimization. The slower option doesn't compete on price. It just loses.
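A rough sketch of the arithmetic behind that claim, using the 3-minute and 20-minute figures quoted above. The 8-hour day and 5-day week are assumptions made for illustration, not numbers from the episode, and the model assumes iterations run back to back with no other overhead.

```python
# Hypothetical illustration: how many solve-and-retry cycles fit in a time budget.
FAST_MINUTES = 3   # from the text: the fast system solves the problem in 3 minutes
SLOW_MINUTES = 20  # from the text: the competitor takes 20 minutes

WORKDAY = 8 * 60        # assumption: an 8-hour working day, in minutes
WORKWEEK = 5 * WORKDAY  # assumption: a 5-day working week, in minutes


def iterations(budget_minutes: int, minutes_per_iteration: int) -> int:
    """Whole iterations that fit in the budget when run back to back."""
    return budget_minutes // minutes_per_iteration


for label, budget in [("day", WORKDAY), ("week", WORKWEEK)]:
    fast = iterations(budget, FAST_MINUTES)
    slow = iterations(budget, SLOW_MINUTES)
    print(f"{label}: fast={fast} slow={slow} lead={fast - slow} iterations")
```

Under these assumptions the fast system completes 160 iterations in a day against 24 for the slow one. The ratio stays near 20/3, or about 6.7x, but the absolute lead in completed attempts keeps growing with the time budget, which is one way to read "you get smoked."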

2025 was the year AI crossed the usefulness threshold — and that single event drives everything downstream

Before 2025, it was "sort of a novelty. Cool and then nobody used it." The models simply weren't smart enough.

Early 2025 changed that. Once models crossed a genuine usefulness threshold, inference demand took off across demographics: Feldman's 85-year-old father, his 11-year-old niece, people deploying AI every day on harder and harder problems. "It is sweeping through different demographic groups." That breadth is what gives the demand curve its exponential shape; this isn't a Silicon Valley phenomenon anymore.

The one thing to watch isn't capex cycles or data center timelines. It's whether frontier models keep improving: "If we continue to find ways to make the frontier models smarter and more useful, the demand will continue on this sort of exponential curve." That's the actual demand driver. Everything else is downstream of it.

The biggest enterprise AI blocker is lawyers — their incentive structure makes 'no' the only rational answer

Not messy data. Lawyers.

The payoff structure for legal and security teams: "No credit, no credit, failure, blame. No credit, no credit, no failure, blame. That's their life and it's brutal." Confronted with novel technology and no established precedent, saying no is the only rational move. Lawyers live in "a business of backward-looking precedent." Ask one to work without any, and they freeze.

The organizational consequence: legal and data quality are sequential constraints, not parallel ones. Clean data only becomes the bottleneck once legal opens the gate. Jensen Huang reportedly had to mandate Cursor adoption directly when his own lawyers pushed back. That's the pattern Feldman expects to generalize — CEO-level decree clears the first obstacle, then data quality becomes the actual problem to solve.

Selling leading-edge chips to China is indefensible — and TSMC plus ASML are the viable choke points

"If we sell leading-edge technology to China, will their military use it? Everybody says yes. There is no debate on that point." The second question — will their government use it through industry to outcompete the US in an advantaged way? — also lands as yes. Feldman holds this position explicitly against his own economic interest.

The "they'll build their own anyway" counterargument fails structurally. The chip industry runs through TSMC, which runs through ASML. Those are real choke points. China's industrial playbook in solar, batteries, and EVs — government-subsidized competition at scale — shows the results: you travel the world and see Chinese cars, fewer American ones. Enabling the same dynamic in semiconductors isn't a trade-off worth debating. Feldman's preferred position: don't sell leading-edge, maintain the choke points, accept smaller revenue.

Most AI layoffs are COVID hiring corrections — and real displacement looks nothing like what we've seen

"I think to date most of the layoffs were AI washed." Feldman puts 90–95% of recent cuts on pandemic over-hiring and long-delayed productivity harvesting — middle management, information-gatherer roles automated by tools that predate AI. "Now AI is starting just now to have meaningful enterprise impact."

The tell is organizational appetite. At Cerebras, the engineering to-do list is "50 times as much as we have engineers." More productive engineers means more things get built — not fewer positions. What determines the outcome isn't AI capability — it's whether your organization has enough work to fill the capacity that productivity creates. Companies with constrained ambition will cut. Companies with expanding scope will hire.

Nvidia funded the neo-clouds to pressure hyperscalers — and customers pay compounded margins for it

Nvidia sells GPUs at 70–80% gross margins. Neo-clouds buy that hardware, then stack their own margin on top. First-party providers — Google with TPUs, Cerebras in its own data centers — skip the first layer entirely.

"I think it has been Nvidia's strategy to try and create competitors for the traditional hyperscalers. They have funded and backstopped and overallocated to the neo-clouds. They have created a dependence which is probably not healthy."

CoreWeave gets credit for genuine financial engineering — debt structures, rapid deployment, real innovation under pressure. But the arithmetic is the arithmetic. When mature first-party options exist, customers who want cheap compute without the bundled services will find them. The neo-cloud moat is thinner than it looks once you model the stacked margins.
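The stacked-margin arithmetic can be sketched as follows. This is a simplified model, not Feldman's own calculation: the 75% vendor margin is the midpoint of the 70–80% range quoted above, while the 30% service margin, the normalized unit cost, and the function names are hypothetical. It also ignores power, facilities, and chip design costs.

```python
def price_from_margin(cost: float, gross_margin: float) -> float:
    """Selling price implied by a cost and a gross margin, where margin = (price - cost) / price."""
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    return cost / (1 - gross_margin)


def stacked_price(cost: float, margins: list[float]) -> float:
    """Apply each seller's gross margin in turn; each seller's price is the next one's cost."""
    for margin in margins:
        cost = price_from_margin(cost, margin)
    return cost


HARDWARE_COST = 1.0    # normalized cost to manufacture the accelerator (assumption)
VENDOR_MARGIN = 0.75   # midpoint of the 70-80% GPU gross margin cited in the text
SERVICE_MARGIN = 0.30  # hypothetical cloud-provider margin, not from the source

# First-party provider: builds its own hardware, so only its service margin applies.
first_party_price = stacked_price(HARDWARE_COST, [SERVICE_MARGIN])
# Neo-cloud: pays the vendor's margin first, then adds its own on top.
neo_cloud_price = stacked_price(HARDWARE_COST, [VENDOR_MARGIN, SERVICE_MARGIN])

print(f"first-party price per unit of hardware cost: {first_party_price:.2f}")
print(f"neo-cloud price per unit of hardware cost:   {neo_cloud_price:.2f}")
print(f"ratio: {neo_cloud_price / first_party_price:.1f}x")
```

In this toy model a 75% vendor margin alone multiplies the hardware cost by 4 before the reseller adds anything, so the gap between the two prices is set by the vendor layer regardless of which service margin is assumed.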

The entire AI era rests on one variable — and it isn't infrastructure

Every argument Feldman makes (the demand explosion, the memory shortage, compounding speed advantages, adoption sweeping across demographics) assumes frontier model capability keeps improving. He names the dependency explicitly: demand peaks if "AI stops improving in usefulness." Data center delays, HBM bottlenecks, legal friction: all of it is friction on a curve that either bends upward or doesn't.

Watch the capability curve. Everything else is downstream.


Topics: AI infrastructure, semiconductors, Cerebras, HBM memory, data centers, inference speed, China chip policy, enterprise AI adoption, neo-clouds, Nvidia strategy, TSMC, AI bubble, IPO, AI employment, energy

Frequently Asked Questions

Is AI infrastructure creating demand or chasing it?
AI infrastructure is chasing demand, not creating it, which is the opposite of every historical technology bubble. In past cycles such as 1990s fiber and the railroads, infrastructure raced ahead of speculative future demand; in AI, demand already outpaces supply, as the vendor backlogs show. That makes the buildout a response to existing customers rather than a bet on hypothetical ones, and lowers the risk of a bubble-style collapse.
What are the key dynamics of the AI memory market?
Three companies (Samsung, SK Hynix, and Micron) control HBM and are earning gross margins of up to 85%, with no relief expected for years. A new fab costs $40 billion and takes five years to build, so supply cannot respond quickly. AI system builders have little negotiating leverage and must absorb the elevated costs, which concentrates profit with the memory makers and spreads the cost across the rest of the AI infrastructure ecosystem.
How much does inference speed matter in AI competition?
Inference speed compounds: being 6.7x faster doesn't mean being 6.7x better, it means slower competitors "get smoked." In iterative work such as coding agents, agentic flows, and research, the faster system completes more cycles in the same time, so its lead grows with every iteration. For those workloads, speed is a competitive capability rather than a cost optimization.
What's the biggest barrier to enterprise AI adoption?
Lawyers block enterprise AI adoption more than bad data does, because their incentives structurally punish adoption. Legal and security teams get no credit when a new tool works and the blame when it fails, so "no" is the rational answer to unprecedented technology, even when a deployment is technically sound. In practice, a CEO-level mandate clears the legal gate first, and only then does data quality become the binding constraint.
