
Anj Midha on Investing $300M into Anthropic & How 21 of 22 VCs Rejected It | China is Winning in AI?
The Twenty Minute VC
21 of 22 VCs passed on Anthropic because they'd never heard of GPT-3 — a miss that may define venture capital's most embarrassing chapter.
Key Ideas
VCs missed foundational model importance
21 of 22 VCs passed on Anthropic's seed because they'd never heard of GPT-3.
Full-stack systems offset hardware gaps
China wins AI without better chips by distilling Western models and competing on full-stack systems design.
GPU wastage, not AI overvaluation
There's no AI bubble — there's a GPU wastage bubble from non-fungible, unstandardized compute.
Culture is the master bottleneck
Culture is the master bottleneck: solve it and algorithms, data, and recruiting solve themselves.
Supply access determines inference winners
The inference winner is whoever secures compute supply — not whoever builds the best product.
Why does it matter? Because the biggest AI bet in history got 21 rejections from the people whose job was to see it coming.
Anj Midha introduced Anthropic to 22 investors on Sand Hill Road and walked away with 21 nos — from VCs who had never heard of GPT-3. What this conversation reveals is how completely the venture establishment missed the defining opportunity of their careers, why China is winning the AI race without needing better chips, and what's actually causing the "AI bubble" talk (hint: it's not the capabilities).
- The 22 VCs who passed on Anthropic's seed aren't an anomaly — the same pattern of institutional blindness is repeating right now in whatever the current misunderstood frontier is
- China has already figured out how to reach AI parity without matching US chips, making export controls largely beside the point
- There is no AI capabilities bubble — there is a GPU wastage bubble caused by non-fungible, unstandardized compute sitting idle
- Inference is a supply chain competition disguised as a product competition, and most investors funding inference startups are destroying capital
21 VCs asked 'where's the proof?' — and didn't know what GPT-3 was
"What's GPT-3?" That's what Midha heard after pitching the founders who invented it. He introduced Dario Amodei, Tom Brown, and the rest of the Anthropic team to 22 investors up and down Sand Hill Road in early 2021. Twenty-one said no.
The asks kept shrinking. Anthropic originally tried to raise $500 million. They had to reanchor to $100 million — a seed round that "at the time felt like a lot" but was tiny compared to the billion OpenAI had already raised. In the end, zero traditional VC firms participated in the seed round. The investors who did get it were either ML researchers with effective altruist overlap (SBF), or corporations watching competitive threats to their own business (Amazon, which eventually committed $4 billion in exchange for AWS hosting).
Midha's read: the gatekeepers of capital were systematically blind to the most important technology wave of their careers. The pattern isn't unique to Anthropic — it's structural. VCs optimized for pattern-matching against the last decade are constitutionally bad at the earliest stages of frontier technology cycles. The Anthropic seed didn't get funded because of VC conviction. It got funded despite the lack of it.
The implication, which Midha doesn't state directly but the data makes obvious: stop outsourcing AI conviction to consensus venture opinion. The firms that passed on Anthropic are the same firms LPs are still allocating to. "How many venture capital firms were in the seed round of Anthropic? Oh, none. That's the answer for you."
China is winning the AI race through full-stack systems design — not better chips
Sixty billion dollars of US chip export controls, and China found the backdoor: they're not trying to win the chip race. They're winning the systems race.
Midha lays out the strategy with unusual precision. Chinese labs co-design their available chips — primarily Huawei — with the entire infrastructure stack, squeezing performance gains at every layer that partially offset the hardware gap. Simultaneously, they run adversarial distillation at scale: hitting Western model endpoints from various access points, extracting state-of-the-art capabilities, and feeding that data into their own training runs. They release the outputs as open-source models, collect global feedback, and iterate. When they get close enough to the frontier, the open-sourcing stops — it was always just a bootstrapping mechanism.
The result: "Huawei chips are able to produce capabilities improvements today in China that rival some of the best chips here when integrated up and down the stack." It's the Google strategy — vertically integrated from land-power-shell through training runs to deployment — replicated with adversarial data acquisition substituting for proprietary research.
Midha is explicit that this concerns him deeply. His proposed defense is an "Iron Dome" for Western inference: a shared proxy layer across all frontier labs that can detect and coordinate responses to distillation attacks in real time. Right now, he's doing this informally — group chats with founders comparing distillation spike patterns from suspicious regions. "It's very informal right now, but what we need is" something far more systematic. Export controls are fighting the supply battle. The real vulnerability is inference endpoints being harvested, and no institution is coordinating the defense.
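As a rough illustration of what "comparing distillation spike patterns" could look like if it were systematized, here is a minimal sketch. It is not Midha's proposal or any lab's actual system. The region names, the request counts, the z-score rule, and the threshold of 3.0 are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_spikes(history, current, threshold=3.0):
    """Return regions whose latest request count is far above their own baseline.

    history: dict mapping region -> list of past hourly request counts (hypothetical data)
    current: dict mapping region -> request count for the latest hour
    threshold: z-score above which a region is flagged (an assumed cutoff)
    """
    flagged = []
    for region, counts in history.items():
        if len(counts) < 2:
            continue  # a standard deviation needs at least two samples
        mu = mean(counts)
        sigma = stdev(counts)
        if sigma == 0:
            continue  # flat baseline: a z-score is undefined, so skip
        z = (current.get(region, 0) - mu) / sigma
        if z > threshold:
            flagged.append(region)
    return sorted(flagged)


# Hypothetical traffic: region-b jumps roughly tenfold in the latest hour.
history = {
    "region-a": [100, 110, 95, 105, 102],
    "region-b": [40, 42, 38, 41, 39],
}
current = {"region-a": 104, "region-b": 400}

suspicious = flag_spikes(history, current)
print(suspicious)
```

A shared proxy layer of the kind Midha describes would presumably pool signals like this across labs instead of leaving each one to inspect its own logs. A simple per-region baseline is only a starting point, since a determined actor can spread requests across many access points.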
There is no AI bubble — there's a GPU wastage crisis caused by compute nobody can share
Set aside the panic headlines about $40 server-hours. Midha draws a sharp line: "We are not in an AI crisis. We are not in an AI bubble for sure. We are definitely in a GPU wastage bubble."
Billions of dollars of compute is sitting idle right now, and the reason is almost embarrassingly mundane: chips don't talk to each other. H100s can't interoperate with GB200s. Within a single manufacturer — Nvidia — chip generations are completely incompatible. If you provisioned a cluster with H100s two to three years ago and now want to run workloads for newer generation models, you're memory-bound. The newer architecture's benefits are locked out. You can't unlock them without buying an entirely new cluster, so the old one just sits there, stranded.
Midha's analogy: compute today is where electricity was in 1885. It is pre-standardization, before the AC-versus-DC question was settled, before megawatts were fungible, before anyone could plug a shoe factory and a steel factory into the same grid. Every infrastructure cycle has gone through this: electricity, steel, railroads. Each produced boom-bust cycles, corporate backstabbing, and resource hoarding before standards emerged.
The "AI bubble" conversation is a misdiagnosis. The capabilities are extraordinary across every domain. What's broken is the infrastructure layer beneath them — no open protocols for flops to flow across chip types, no institutions enforcing standardization. The company that creates the TCP/IP of compute infrastructure captures enormous value. Nobody has done it yet.
Culture — not algorithms, data, or compute — is the master bottleneck for AI progress
Four bottlenecks drive AI progress: context feedback loops, compute, capital, and culture. Midha thinks the last one controls all the others.
His argument: algorithmic innovation is a downstream output of culture, not an independent variable. If you build a mission-driven culture that attracts truth-seeking researchers, those researchers aren't locked into defending one architecture over another. They're not "all in on transformers versus diffusion models" — they just want to solve the problem. The best scientists pursue the mission, and the algorithmic breakthroughs fall out of that. Two to three years ago, figuring out which architectures scaled was a genuine bottleneck. Midha's current view: solve the culture problem, and you've solved the research and algorithmic problem.
This reframes how to evaluate AI companies. The benchmarks and technical evals everyone obsesses over are lagging indicators. The leading indicator is whether an organization has built a culture resilient enough to hold its mission under commercial pressure — and whether that culture continues to attract researchers who won't compromise on truth-seeking.
His description of what makes Dario Amodei exceptional hits the same three notes: world-class technical ability, obsessive empiricism ("he's a physicist at heart — he derives general laws of reality by looking at data"), and mission alignment with no drift. "We won't take shortcuts. We are willing to make huge trade-offs to hit this mission." That combination attracts talent that compounds. The institutions that hold this under pressure are the ones worth betting on long-term. "AI alignment, don't get me wrong, is hard but not the hardest problem. Human misalignment is really the problem right now."
AI was 'terrible' at physics and chemistry a year ago — the moat in science AI is who controls the physical data loop
A year ago, as a visiting scientist at Stanford's applied physics department, Midha started benchmarking frontier models on scientific analysis. Claude, Gemini, the rest. "Surprise — they sucked. They were so bad."
The marketing said AI was revolutionizing science. The reality: models were starting to get good at code, and completely falling apart on physics and chemistry. The gap wasn't a model architecture problem. It was a data problem with a physical lock on it. Physics and chemistry data doesn't live on the internet. It lives in national labs, academic labs, semiconductor manufacturing plants — institutions that don't publish training data. The internet is mostly blogs and code. You cannot scrape your way to materials science.
Midha's solution at Periodic Labs: build a 30,000-square-foot physical facility in Menlo Park where LLMs predict new materials and superconductors, robots synthesize them, X-ray diffraction machines physically validate whether the predicted properties hold, and the verification data pipes back into the training run. The model improves on real experimental outcomes, not internet text.
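The predict, synthesize, validate, retrain loop described above can be sketched as a toy program. This is only an illustration of the closed-loop idea, not Periodic Labs' pipeline. The `measure` function stands in for synthesis plus X-ray diffraction, `propose` stands in for the model, and the one-dimensional "material parameter" with its optimum at 0.7 is a made-up assumption.

```python
def measure(x):
    """Hypothetical lab measurement: property quality peaks at x = 0.7."""
    return -(x - 0.7) ** 2


def propose(dataset, step):
    """Hypothetical model: suggest candidates next to the best validated sample."""
    best_x = max(dataset, key=lambda pair: pair[1])[0]
    return [best_x - step, best_x + step]


def closed_loop(rounds=20, step=0.1):
    """Run the predict -> measure -> feed-back loop and return the best sample."""
    # Start from a single validated data point.
    dataset = [(0.0, measure(0.0))]
    for _ in range(rounds):
        for candidate in propose(dataset, step):
            # The measured outcome, not a guess, is what enters the dataset.
            dataset.append((candidate, measure(candidate)))
    return max(dataset, key=lambda pair: pair[1])


best_x, best_y = closed_loop()
print(best_x, best_y)
```

The point of the sketch is the data flow: every proposal is checked against a physical measurement, and only measured results shape the next round of proposals. In the real setting the measurement step is slow and expensive, which is why owning it is the scarce asset.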
Scaling laws are not saturating here. Midha says throwing more compute at superconductor discovery is producing "super exponential gains" per iteration right now. The bottleneck wasn't compute or algorithms — it was that the ground-truth data was locked behind physical experiments nobody was running at scale.
The vertical AI moat isn't fine-tuning. It isn't prompt engineering. It's controlling the physical feedback loop that generates data no one else can access or purchase.
Venture capital is reverting to its founding model — and firms that don't co-found will get commoditized by software
Midha describes a "back to the future" moment for venture. Not metaphorically — he means the Intel/Arthur Rock model, the Genentech/Bob Swanson model, where the investor was embedded in the company daily, wrote the stock incentive plan, ran all-hands, and functioned as a co-founder with a checkbook.
His evidence that this is his actual operating model: he works at Periodic Labs three days a week. Every morning from 8:00 to 8:30 a.m. for the past year, he and CEO Liam Fedus have held a daily standup to work through company priorities. The AMP compute team sits upstairs procuring infrastructure for Periodic. This isn't a board seat; it's closer to a co-founder arrangement.
The reason the old check-writing model is breaking down is both structural and technological. Structurally, the frontier technology cycle requires the kind of capital-plus-operational partnership that only co-founders provide. Technologically, software is absorbing the coordination functions that justified the traditional VC role. "When you have software that can play many of the coordinating roles of venture capital firms, why do you need somebody who's just a pure wrapper on LPs?"
Midha's verdict on firms that don't adapt: "I think if they don't evolve themselves for what entrepreneurs of this era need, then I think they should get out of the venture capital business." The firms that survive are the ones building alongside companies. The ones that don't are being slowly replaced by platforms like Robinhood Ventures and whatever comes after it.
The inference race is already decided — it goes to whoever controls compute supply, full stop
Every inference company is calling Midha. Not because he's good at picking inference products — because he has compute.
"Supply access to supply. It's that simple. Compute supply. If you don't have compute, how do you do inference, man? If you're making a steam engine, you need coal." Inference is being framed by investors as a product competition — better latency, lower prices, smarter routing. It's actually a supply chain competition. The hyperscalers hoarded compute. Independent inference providers can't get enough of it to actually serve their product. The best teams are calling Midha because their product is, at base, reselling compute — and the compute has been locked up by players who aren't innovating.
The VC response to this dynamic has been to fund more inference companies, which makes the problem worse. Fifty inference startups all competing for the same scarce compute means the companies actually innovating can't get what they need for their next product cycle. "It is not clear to me that we need 50 inference companies. And it's not clear to me that VCs are smart enough to realize that they're just lighting hundreds of millions of dollars on fire."
The correct number is probably four or five. The determining variable between winners and losers isn't team quality, pricing strategy, or product design. It's who secured compute access before it got hoarded. That race is largely over.
The real signal from this episode: the next Anthropic is already being missed
Every pattern Midha describes — the 21 rejections, the China distillation strategy, the compute hoarding, the culture bottleneck — points to the same underlying dynamic: institutional actors consistently misread frontier technology cycles until the window has closed. The VCs who passed on Anthropic aren't embarrassed anomalies. They're the modal response. The same blindness is operating right now in whatever category currently looks too capital-intensive, too technically opaque, or too far from consensus.
The one thing this episode didn't fully spell out: the entity that builds the fungibility layer for compute — the TCP/IP of the AI infrastructure stack — captures more value than most frontier model companies. Nobody has built it yet. That's the open territory.
Topics: AI investing, Anthropic, venture capital, compute infrastructure, China AI, scaling laws, frontier AI, inference, AI safety, sovereign AI, open source models, AI for science
Frequently Asked Questions
- Why did 21 of 22 VCs reject Anthropic's seed funding?
- According to Midha, 21 of the 22 investors he pitched passed on Anthropic's seed, and they had never heard of GPT-3. The miss reflects a knowledge gap about foundation models among the people whose job was to spot them. In the end, no traditional VC firm joined the round; the capital came from investors with ML research or effective altruist ties and from corporations watching competitive threats. The episode shows how limited technical understanding can cause venture capital to overlook transformative opportunities, particularly in emerging fields where cutting-edge knowledge is essential.
- How is China winning in AI without better chips?
- China wins AI without better chips by distilling Western models and competing on full-stack systems design. Rather than attempting to compete on raw computational power or novel architecture, Chinese companies extract value from existing Western AI models through knowledge distillation techniques and then outcompete on integrated system design. This strategy leverages operational excellence and systems integration capabilities. By focusing on end-to-end optimization rather than individual components, China demonstrates that AI leadership isn't solely determined by hardware capabilities, but by how effectively all elements work together.
- Is there an AI bubble or a GPU wastage problem?
- There's no AI bubble in the fundamental sense—there's a GPU wastage bubble from non-fungible, unstandardized compute. The core issue isn't AI adoption or capability limitations, but rather inefficient allocation of computing resources. Companies purchase GPUs that aren't interchangeable or optimized for their specific workloads, leading to widespread waste across the industry. This fragmentation in compute standards creates artificial scarcity and inefficient capital deployment. Understanding this distinction is critical: the actual value creation in AI remains solid, but infrastructure deployment lacks standardization and optimization.
- What determines who wins in AI inference?
- The inference winner is whoever secures compute supply—not whoever builds the best product. This counterintuitive insight suggests that competitive advantage in AI inference depends primarily on access to and control over computational resources rather than superior algorithms or model quality. Product differentiation becomes secondary to supply chain control and GPU availability. This reframes AI competition from a technology race to a logistics and supply chain competition, where companies that can reliably access and efficiently deploy computing hardware gain decisive advantages over those with technically superior but compute-limited solutions.
