Somewhere in the Netherlands, a machine the size of a school bus fires a laser at a droplet of tin, vaporizing it into a plasma that emits light at precisely 13.5 nanometers — a wavelength so extreme it can only travel through a vacuum. This process repeats 50,000 times per second. The machine costs $400 million. There are only about 300 of them in the world. And everything — the $600 billion in Big Tech capital expenditure this year, the trillion-dollar AI infrastructure buildout, the race to artificial general intelligence — bottlenecks through this single, impossibly complex device.
Dylan Patel, CEO of SemiAnalysis, the research firm that has become the go-to source for granular data on AI infrastructure, sat down with Dwarkesh Patel on the Dwarkesh Podcast to lay out a comprehensive picture of what's actually constraining the scaling of AI compute. The conversation ranged from Anthropic's compute crunch to the physics of EUV lithography, from memory pricing destroying smartphone markets to why space data centers don't make sense this decade. What emerged is a remarkably detailed map of the chokepoints that will determine the pace of AI progress for the rest of the decade.
The Capex Tsunami: Where Is All That Money Going?
The numbers are staggering. Amazon, Meta, Google, and Microsoft are forecast to spend a combined $600 billion in capital expenditure this year. Across the rest of the supply chain, that figure approaches a trillion dollars. But as Patel explains, this money isn't all going to compute that comes online this year.
"When we're talking about 20 gigawatts this year in America, roughly incremental added capacity, a portion of this is not spent this year. A portion of that capex was actually spent the prior year. And so when you look at, hey, Google's got 180 billion dollars, actually, a big chunk of that is spent on turbine deposits for 28 and 29. A chunk of that is spent on data center construction for 27."
The picture that emerges is one of companies making massive forward bets — purchasing turbines, signing power agreements, and breaking ground on facilities that won't be operational for years. It's a supply chain operating on multiple time horizons simultaneously, with today's spending aimed at building the infrastructure for tomorrow's AI explosion.
The Lab Wars: Anthropic's Conservative Bet and OpenAI's Aggression
Perhaps the most revealing section of the conversation concerned the contrasting strategies of the two leading AI labs. OpenAI, under Sam Altman, has been aggressively signing compute deals with every provider it can find — from Microsoft and Google to CoreWeave, Oracle, and even SoftBank Energy, a company that had never built a data center. Anthropic, under Dario Amodei, took a more conservative approach.
"Dario, when he was on your podcast was very, very, like, conservative. He's like, 'You know, I'm not gonna go crazy on compute, because if my revenue inflects at a different rate, at a different point, I don't wanna go bankrupt.' But in reality, you know, he's definitely missed the pooch in terms of, like, going like OpenAI, which was, 'Let's just sign these crazy fucking deals.'"
The consequences of this conservatism are now biting. Anthropic's revenue has been exploding, adding roughly $4 billion of annualized revenue in January and another $6 billion in February, but the company doesn't have enough compute to serve all the demand. Patel estimates Anthropic needs to get to five or six gigawatts by year's end, well above its initial plans, and will have to go to lower-quality providers or rely on revenue-sharing arrangements through Amazon's Bedrock, Google's Vertex, and Microsoft's Foundry.
The financial implications are significant. Patel has seen deals where AI labs are paying as much as $2.40 per GPU-hour for H100s on two-to-three-year contracts — far above the original $1.40 cost to deploy. Companies that locked in long-term contracts early, as OpenAI did, have secured a massive margin advantage.
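To make that spread concrete, here is a minimal back-of-envelope sketch in Python. The $2.40 contract price and $1.40 deployment cost are the figures Patel cites; the three-year term and the assumption of full utilization are simplifications for illustration.

```python
# Rough margin on a long-term H100 rental at the prices cited above.
# Assumes full utilization over the whole term, which no real fleet achieves.
contract_price = 2.40          # $/GPU-hour paid by the lab (Patel's figure)
deploy_cost    = 1.40          # $/GPU-hour all-in cost to deploy (Patel's figure)
hours_per_year = 24 * 365
years          = 3             # assumed two-to-three-year contract term

margin_per_hour = contract_price - deploy_cost
lifetime_margin = margin_per_hour * hours_per_year * years

print(f"Margin per GPU-hour: ${margin_per_hour:.2f}")
print(f"Margin per GPU over {years} years: ${lifetime_margin:,.0f}")
# Roughly $26,000 of gross margin per GPU before utilization losses.
```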
The Paradox of GPU Depreciation
One of the conversation's most counterintuitive insights concerns how GPU values are evolving. The conventional wisdom — articulated by bears like Michael Burry — holds that rapid hardware improvements should cause GPU values to depreciate quickly. But Patel argues the opposite is happening.
"An H100 is worth more today than it was three years ago."
The reasoning is elegant: as models improve, they extract more value from the same hardware. GPT-5.4 can serve more tokens per GPU than GPT-4 could, and those tokens are from a vastly superior model. In a supply-constrained world, what prices a GPU isn't the next-best alternative you could buy — it's the value you can derive from it today. And that value keeps going up as models get better.
Patel connects this to an economics concept called the Alchian-Allen effect: when fixed costs increase across the board, consumers shift toward higher-quality goods because the relative price difference shrinks. Applied to AI, as GPU costs rise, users become even more willing to pay premiums for the very best models, concentrating revenue at the frontier labs.
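A toy numerical example shows how the effect works; the prices here are hypothetical and are not from the conversation. Adding the same fixed cost to both a cheap model and a premium model makes the premium model relatively cheaper.

```python
# Alchian-Allen in miniature: a uniform cost increase shrinks the *relative*
# premium of the better good. Prices below are hypothetical, for illustration.
cheap_price   = 1.00   # hypothetical $/M tokens, budget model
premium_price = 3.00   # hypothetical $/M tokens, frontier model
fixed_add     = 2.00   # hypothetical per-unit cost increase hitting both tiers

before = premium_price / cheap_price                              # 3.0x more expensive
after  = (premium_price + fixed_add) / (cheap_price + fixed_add)  # ~1.7x more expensive

print(f"Relative price of the frontier model: {before:.1f}x -> {after:.1f}x")
# As the gap narrows, demand tilts toward the best model available.
```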
Bottleneck #1: Logic — The TSMC and NVIDIA Lock-Up
The conversation then shifted to the semiconductor supply chain itself, which Patel argues is becoming the binding constraint on AI scaling — more than power, more than data centers.
NVIDIA has secured a dominant position in TSMC's three-nanometer capacity, projected to control over 70% of N3 wafer production by 2027. How did this happen? The same dynamic that played out with compute contracts: NVIDIA committed early and aggressively, while Google and Amazon were slower to signal demand.
Patel revealed a telling sequence of events. In early Q3 of last year, Anthropic's compute team — staffed by former Google engineers — spotted a dislocation in TPU availability and negotiated a deal for over a million TPUs before Google's own leadership realized the opportunity. Google then had to go to TSMC and explain a sudden increase in capacity requests.
"Anthropic saw it before Google. And then Google had NanoBanano and Gemini 3, which caused their user metrics to skyrocket, and leadership at Google was like, 'Oh.'"
Since then, Google has gotten "absurdly AGI-pilled" — buying an energy company, putting deposits on turbines, acquiring powered land at massive scale. But the damage was done. TSMC's capacity for 2026 was largely spoken for, and meaningful expansion wouldn't come until 2027.
Bottleneck #2: Memory — The Crunch That's Killing Smartphones
If logic wafers are the first bottleneck, memory is the second — and its effects are already rippling through the consumer electronics industry in dramatic fashion.
The core problem is that HBM (High Bandwidth Memory), which AI accelerators require, uses three to four times more wafer area per bit than commodity DRAM. Every byte of HBM produced for AI is effectively wiping out three to four bytes of capacity that could have gone into phones and laptops.
Patel's numbers on the consumer impact are striking:
"Our projections are we maybe get down to like 800 million this year. And next year, like 600 or 500 million... Xiaomi and OPPO are cutting low-end and mid-range smartphone volumes by half."
From 1.4 billion smartphones sold annually at peak to potentially 500 million — that's a civilizational shift in consumer electronics driven entirely by the reallocation of memory production to AI. The iPhone might see a $150-250 price increase as DRAM costs triple, but Apple's volumes will hold relatively steady. It's the low-end and mid-range phones — the ones bought by billions of people in developing countries — that get decimated.
Why can't we just build more memory? Because memory manufacturers spent 2022-2023 losing money and didn't build new fabs. Now that demand has exploded, those fabs take two years to construct. The industry is scrambling — Micron bought a fab from a Taiwanese company making lagging-edge chips, and Hynix and Samsung are doing "pretty crazy things" to expand existing facilities — but meaningful new capacity won't arrive until late 2027 or 2028.
Dwarkesh Patel raised an interesting question: could AI accelerators simply use commodity DRAM instead of HBM? Patel's answer was illuminating. The bandwidth difference is roughly an order of magnitude — HBM4 delivers about 2.5 terabytes per second per stack, while DDR5 in the same physical footprint delivers around 128 gigabytes per second. Switch to DDR, and you might have four times the memory capacity, but all your expensive compute sits idle waiting for data.
"The metric that you actually care about is bandwidth per wafer, not bits per wafer."
Bottleneck #3: EUV Tools — The Ultimate Chokepoint
As the conversation progressed, Patel built toward what he considers the ultimate constraint on AI scaling: ASML's EUV lithography tools.
The math is remarkably simple once you understand it. To build one gigawatt of AI data center capacity using NVIDIA's Rubin chips requires about 55,000 three-nanometer wafers, 6,000 five-nanometer wafers, and 170,000 DRAM wafers. Each wafer requires multiple EUV passes. Total it up, and you need roughly 3.5 EUV tools per gigawatt.
"So three and a half EUV tools satisfies a gigawatt. So it's funny to think about the numbers, right? Because we're talking about, oh, what's a gigawatt cost? It costs like $50 billion roughly, right? Whereas, what does three and a half EUV tools cost? That's like 1.2, right?"
A $50 billion gigawatt of AI infrastructure, hanging on $1.2 billion worth of tooling that simply cannot be manufactured fast enough.
ASML currently produces about 70 EUV tools per year. Under aggressive expansion scenarios, they might reach just over 100 by the end of the decade. The installed base will grow to roughly 700 tools. At 3.5 tools per gigawatt, that works out to a theoretical maximum of about 200 gigawatts per year: enough to cover Sam Altman's stated goal of 52 gigawatts per year (roughly a gigawatt a week, or about a quarter of that ceiling), but nowhere near enough if every major player gets what they want.
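It's possible to roughly reconstruct the 3.5-tools-per-gigawatt figure from the wafer counts above. The EUV layer counts and the effective tool throughput in this sketch are illustrative assumptions on my part, not numbers Patel gave, but they land in the same neighborhood.

```python
# Hedged reconstruction of "~3.5 EUV tools per gigawatt" from the wafer counts
# quoted above. Layer counts and tool throughput are assumptions, not Patel's.
wafers_per_gw = {"N3 logic": 55_000, "N5 logic": 6_000, "DRAM": 170_000}
euv_layers    = {"N3 logic": 22,     "N5 logic": 14,    "DRAM": 4}   # assumed

exposures_per_gw = sum(wafers_per_gw[k] * euv_layers[k] for k in wafers_per_gw)

# Assume a tool averages ~1,500 exposed wafers per day after downtime.
exposures_per_tool_year = 1_500 * 365

tools_per_gw        = exposures_per_gw / exposures_per_tool_year
installed_base_2030 = 700          # figure from the conversation
ceiling_gw_per_year = installed_base_2030 / tools_per_gw

print(f"EUV exposures per gigawatt: {exposures_per_gw:,}")
print(f"Tools per gigawatt: {tools_per_gw:.1f}  (Patel's figure: ~3.5)")
print(f"Ceiling with ~700 tools: ~{ceiling_gw_per_year:,.0f} GW of new capacity per year")
```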
Patel provided a mesmerizing description of why ASML can't simply scale production. The EUV tool has four major subsystems, each a marvel of engineering:
- The source (made by Cymer in San Diego): drops tin droplets, hits them with a laser three times in precise sequence, generating EUV light at 13.5 nanometers
- The projection optics (made by Carl Zeiss in Germany): 18 multilayer mirrors per tool, each with alternating layers of molybdenum and silicon, polished to sub-nanometer accuracy
- The reticle stage (made in Wilton, Connecticut): moves at nine Gs, carrying the chip design pattern
- The wafer stage: moves in the opposite direction at similar accelerations, with sub-nanometer positioning accuracy
"Each of these things is, like, a wonder and marvel of, like, chemistry, fabrication, you know, mechanical engineering, optical engineering because you have to align all these things and make sure they're perfect."
ASML's supply chain includes over 10,000 individual suppliers. Carl Zeiss probably employs fewer than 1,000 people working on the optics. These are deeply specialized roles that can't be filled by training random people. And critically, ASML hasn't tried to massively expand because, like most of the semiconductor supply chain, they're simply not "AGI-pilled" enough to believe demand will reach the levels the AI labs are projecting.
Power: The Bottleneck That Isn't
In contrast to the semiconductor constraints, Patel is notably optimistic about power. While it was the binding constraint last year, he argues it's fundamentally solvable because the supply chains are simpler and more diverse.
Beyond the three major combined-cycle gas turbine manufacturers, there are aeroderivative turbines (essentially repurposed airplane engines), medium-speed reciprocating engines from companies like Cummins, ship engines, fuel cells from Bloom Energy, and solar-plus-battery systems. Patel's firm tracks 16 different power generation vendors and sees hundreds of gigawatts of orders going to data centers.
Perhaps most intriguingly, Patel argues that 20% of the existing US grid could be unlocked for data centers simply by adding enough peaker plants and utility-scale batteries to handle the handful of peak demand days per year that currently require all that idle capacity.
"The US grid is terawatt level, not hundreds of gigawatts level, right? So, we can add a lot more energy. It's not easy. I'm not saying it's easy... But the supply chains are just way more simple than chips."
This is why Patel is skeptical of space data centers in this decade. Energy is only about 10-15% of total cluster cost. Even if you doubled the price of power, an H100's hourly cost goes from $1.40 to $1.50 — trivial compared to the value of the tokens it produces. The real constraint is chips, and putting them in space adds months of deployment delay in a world where every moment of compute is precious.
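A quick sketch shows why the power price barely registers at the level of a single GPU-hour. The wattage, overhead factor, and electricity rate here are assumptions; the $1.40 all-in hourly cost is the figure from the conversation.

```python
# Electricity as a share of an H100 GPU-hour, under assumed (not quoted) inputs.
gpu_watts       = 700      # assumed H100 board power
overhead_factor = 1.5      # assumed: rest of server, networking, cooling overhead
usd_per_kwh     = 0.10     # assumed industrial electricity rate

kw_per_gpu       = gpu_watts * overhead_factor / 1000
energy_cost_hour = kw_per_gpu * usd_per_kwh          # $ of electricity per GPU-hour
all_in_cost_hour = 1.40                              # $/GPU-hour cited in the talk

print(f"Electricity: ~${energy_cost_hour:.2f} of a ${all_in_cost_hour:.2f} GPU-hour "
      f"({energy_cost_hour / all_in_cost_hour:.0%})")
print(f"If power prices double: ~${all_in_cost_hour + energy_cost_hour:.2f}/GPU-hour")
# Doubling power moves the hourly cost by about a dime; chips dominate the bill.
```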
The Geopolitical Dimension: Fast Timelines Favor the West
The conversation closed with a fascinating geopolitical analysis. China is working to indigenize its semiconductor supply chain, and Patel believes they'll have fully indigenous DUV capability by 2030 and working (though not mass-production) EUV tools. The question is whether this matters.
Patel's framework is elegant: fast AI timelines favor the West; slow timelines favor China. If AI capabilities continue accelerating — with Anthropic and OpenAI scaling to 10+ gigawatts by end of next year, generating tens of billions in revenue, and reinvesting in ever-more-capable models — the compounding advantages become insurmountable. China hasn't built the infrastructure to deploy AI at comparable scale.
"I don't think you have to believe in AGI to have the timelines where the US wins."
But if progress slows, if the massive capex investments yield middling returns, China's ability to build a fully vertical, indigenized supply chain — potentially with a million-wafer-per-month fab — could eventually give it the edge.
One remarkable data point: Huawei, had it not been banned from TSMC in 2019, might today be producing better AI accelerators than NVIDIA. They had the first seven-nanometer AI chip, world-class networking technology, their own fabs, and access to China's deep talent pool.
The View From 30,000 Feet
What makes this conversation so valuable is that it connects the abstract — trillion-dollar capex numbers, AGI timelines, geopolitical competition — to the brutally concrete: the number of EUV passes needed per gigawatt, the overlay accuracy required in nanometers, the bandwidth difference between HBM and DDR in terabytes per second.
The picture that emerges is one where the AI revolution is real, the money is flowing, and the demand is genuine — but the physical world imposes hard constraints that no amount of capital can instantly overcome. You can't snap your fingers and produce more EUV tools. You can't build a fab in less than two years. You can't train up Carl Zeiss mirror-polishers overnight.
The race to scale AI compute is, ultimately, a race against the physics of manufacturing, the inertia of supply chains, and the conservatism of industries that have survived too many boom-bust cycles to believe that this time is truly different. The companies and countries that recognized the scale of demand earliest — and committed to securing capacity before others — will hold decisive advantages for years to come. Everyone else is scrambling for what's left.