20VC with Harry Stebbings.
Cerebras and the AI Chip Revolution

Aired: 03.24.2025

Host: Harry Stebbings

Guest: Andrew Feldman

Key Insights

  • Cerebras’s Origin: Founded in 2015 to address AI’s unique computational demands, Cerebras leverages wafer-scale integration and on-chip SRAM to optimize data movement, outpacing traditional GPUs in inference tasks.

  • Training vs. Inference: GPUs excel in training but falter in inference due to off-chip memory inefficiencies; Cerebras targets both, with a focus on inference’s growing market.

  • Memory Innovation: Using SRAM over HBM, Cerebras achieves faster, more efficient inference, solving SRAM’s capacity limits with a groundbreaking wafer-scale approach.

  • Market Dynamics: NVIDIA dominates now, but its share may drop to 50-60% within five years as challengers like Cerebras exploit inference weaknesses and customer frustration over delivery delays.

  • AI’s Future: Expect AI to rival cell phone ubiquity within years, driven by faster, cheaper hardware and algorithms, unlocking new applications and societal benefits.
1. Cerebras’s Big Bet: Redefining AI Hardware

In 2015, Cerebras’s founders spotted a gap in traditional processors’ ability to handle AI’s data-intensive workloads, sparking the company’s creation. Unlike GPUs, which were optimized for graphics, Cerebras built a chip that prioritizes data movement over raw computation, using wafer-scale integration for efficiency.
Feldman underestimated the AI market’s scale: Cerebras is his fifth startup, yet this is the first time he has misjudged growth this vastly.
Quote: Andrew Feldman “We saw the rise of a new workload, and this is every computer architect’s dream.”
2. GPUs Under Fire: The Inference Challenge

  • GPUs, built for training with off-chip HBM memory, hit bottlenecks in inference, where rapid data access trumps computation volume.
  • Cerebras’s wafer-scale design with SRAM slashes power use and speeds up inference, outperforming GPUs in benchmarks since its August 26th launch.
  • NVIDIA’s dominance (near 100% now) faces threats as its architecture lags in inference efficiency, opening doors for competitors.
Quote: Andrew Feldman “And if you have an architecture like we saw in the GPU, that is your fundamental limitation. It's a fundamental architectural limitation.”
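The off-chip memory bottleneck described above can be made concrete with back-of-the-envelope arithmetic. This is a sketch with assumed numbers, not figures from the episode: autoregressive generation streams every model weight through the chip for each new token, so single-stream token rate is capped by memory bandwidth rather than compute.

```python
# Rough arithmetic behind the inference bottleneck described above.
# All hardware numbers are illustrative assumptions, not from the episode.

PARAMS = 70e9          # 70B-parameter model
BYTES_PER_PARAM = 2    # FP16 weights
HBM_BW = 3.35e12       # ~3.35 TB/s, roughly one modern GPU's HBM bandwidth

# Generating each token reads all weights from off-chip memory, so the
# best-case single-stream rate is bandwidth divided by weight size.
weights_bytes = PARAMS * BYTES_PER_PARAM
max_tokens_per_s = HBM_BW / weights_bytes

print(f"Weights to move per token: {weights_bytes / 1e9:.0f} GB")
print(f"Bandwidth-limited ceiling: ~{max_tokens_per_s:.0f} tokens/s")
```

Under these assumptions the ceiling is about 24 tokens per second per device, regardless of how many FLOPs the chip can theoretically deliver.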
3. SRAM vs. HBM: A Memory Revolution

  • SRAM’s fast access beats HBM’s high capacity but slow access, which is critical for inference, where data moves constantly (e.g., roughly 140GB of weights moved per word generated in a 70B-parameter model).
  • Wafer-scale integration solves SRAM’s capacity limits, reducing chip count from thousands to a handful, cutting complexity and power.
  • Current AI algorithms waste 93-95% of GPU capacity during inference, signaling vast room for hardware-algorithm synergy.
Quote: Andrew Feldman “In a GPU, most of the time it’s doing inference, it’s 5 or 7% utilized. That means it’s 95 or 93% wasted.”
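The 5-7% utilization figure above can be approximated with a toy roofline-style calculation. The peak-FLOPs, bandwidth, and batch-size numbers below are assumptions chosen for illustration, not measurements from the episode:

```python
# Toy roofline-style estimate of GPU compute utilization during
# token generation. All numbers are illustrative assumptions.

peak_flops = 1e15       # ~1 PFLOP/s dense FP16 on a modern GPU
hbm_bw = 3e12           # ~3 TB/s off-chip HBM bandwidth
bytes_per_param = 2     # FP16 weights

batch = 20              # concurrent sequences sharing each weight read
# Each fetched weight byte is reused for ~2 FLOPs (multiply + add)
# per sequence in the batch, giving a low arithmetic intensity:
flops_per_byte = 2 * batch / bytes_per_param

achievable_flops = hbm_bw * flops_per_byte     # bandwidth-limited rate
utilization = min(achievable_flops / peak_flops, 1.0)

print(f"Estimated compute utilization: {utilization:.0%}")
```

With these assumptions utilization lands around 6%, in the 5-7% band Feldman cites: the compute units idle while weights stream in from off-chip memory.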
4. NVIDIA’s Moat and the Coming Shift

  • NVIDIA’s market share may shrink from nearly 100% to 50-60% within five years as inference-focused challengers like Cerebras gain traction.
  • CUDA’s lock-in is negligible in inference—users switch platforms easily—unlike training, where NVIDIA excels.
  • Delivery delays could sour NVIDIA’s customers, offering Cerebras a chance to capitalize with faster, readily available alternatives.
Quote: Andrew Feldman “I think in five years from now, NVIDIA is going to have 60. Somewhere between 50% and 60% of the market.”
5. AI Everywhere: The Next Ubiquity

  • AI’s penetration could match cell phones in 1-2 years as hardware gets faster and cheaper, echoing transitions from DVDs to streaming.
  • Inference demand will soar—more users, frequent use, and heavier compute per use—driving a market over 100x larger in five years.
  • Future apps, powered by firms like Cerebras, may solve societal issues (e.g., disease cures), becoming invisible yet essential.
Quote: Andrew Feldman “I think within a year or two, AI’s penetration will be approximately the same as telephones, cell phones.”
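The "over 100x larger" claim rests on multiplicative growth: more users, more queries per user, and more compute per query compound rather than add. As a hedged illustration with made-up multipliers (none of these are Feldman's figures), three modest factors push past 100x:

```python
# Toy model: inference demand grows as a product of independent factors.
# All multipliers below are illustrative assumptions, not from the episode.

more_users = 5        # 5x more people using AI regularly
more_frequency = 4    # each user queries 4x as often
more_compute = 6      # each query burns 6x the compute (longer reasoning)

demand_growth = more_users * more_frequency * more_compute
print(f"Compound inference-demand growth: {demand_growth}x")
```

Even if each individual factor seems conservative, their product (here 120x) illustrates how total inference demand can outrun any single-axis forecast.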