
DeepMind Gemini 3 Lead: What Comes After "Infinite Data"
18/12/2025 | 54 mins.
Gemini 3 was a landmark frontier model launch in AI this year, but the story behind its performance isn't just about adding more compute. In this episode, I sit down with Sebastian Borgeaud, a pre-training lead for Gemini 3 at Google DeepMind and co-author of the seminal RETRO paper. In his first-ever podcast interview, Sebastian takes us inside the lab mindset behind Google's most powerful model: what actually changed, and why the real work today is no longer "training a model," but building a full system.

We unpack the "secret recipe" idea, the notion that big leaps come from better pre-training and better post-training, and use it to explore a deeper shift in the industry: moving from an "infinite data" era to a data-limited regime, where curation, proxies, and measurement matter as much as web-scale volume. Sebastian explains why scaling laws aren't dead but evolving; why evals have become one of the hardest and most underrated problems (including benchmark contamination); and why frontier research is increasingly a full-stack discipline that spans data, infrastructure, and engineering as much as algorithms.

From the intuition behind Deep Think, to the rise (and risks) of synthetic data loops, to the future of long context and retrieval, this is a technical deep dive into the physics of frontier AI. We also get into continual learning, what it would take for models to keep updating with new knowledge over time, whether via tools, expanding context, or new training paradigms, and what that implies for where foundation models are headed next. If you want a grounded view of pre-training in late 2025 beyond the marketing layer, this conversation is a blueprint.

Google DeepMind
Website - https://deepmind.google
X/Twitter - https://x.com/GoogleDeepMind

Sebastian Borgeaud
LinkedIn - https://www.linkedin.com/in/sebastian-borgeaud-8648a5aa/
X/Twitter - https://x.com/borgeaud_s

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) - Cold intro: "We're ahead of schedule" + AI is now a system
(00:58) - Oriol's "secret recipe": better pre- + post-training
(02:09) - Why AI progress still isn't slowing down
(03:04) - Are models actually getting smarter?
(04:36) - Two to three years out: what changes first?
(06:34) - AI doing AI research: faster, not automated
(07:45) - Frontier labs: same playbook or different bets?
(10:19) - Post-transformers: will a disruption happen?
(10:51) - DeepMind's advantage: research × engineering × infra
(12:26) - What a Gemini 3 pre-training lead actually does
(13:59) - From Europe to Cambridge to DeepMind
(18:06) - Why he left RL for real-world data
(20:05) - From Gopher to Chinchilla to RETRO (and why it matters)
(20:28) - "Research taste": integrate or slow everyone down
(23:00) - Fixes vs moonshots: how they balance the pipeline
(24:37) - Research vs product pressure (and org structure)
(26:24) - Gemini 3 under the hood: MoE in plain English
(28:30) - Native multimodality: the hidden costs
(30:03) - Scaling laws aren't dead (but scale isn't everything)
(33:07) - Synthetic data: powerful, dangerous
(35:00) - Reasoning traces: what he can't say (and why)
(37:18) - Long context + attention: what's next
(38:40) - Retrieval vs RAG vs long context
(41:49) - The real boss fight: evals (and contamination)
(42:28) - Alignment: pre-training vs post-training
(43:32) - Deep Think + agents + "vibe coding"
(46:34) - Continual learning: updating models over time
(49:35) - Advice for researchers + founders
(53:35) - "No end in sight" for progress + closing

What's Next for AI? OpenAI's Łukasz Kaiser (Transformer Co-Author)
26/11/2025 | 1h 5 mins.
We're told that AI progress is slowing down, that pre-training has hit a wall, that scaling laws are running out of road. Yet we're releasing this episode in the middle of a wild couple of weeks that saw GPT-5.1, GPT-5.1 Codex Max, fresh reasoning modes, and long-running agents ship from OpenAI, on top of a flood of new frontier models elsewhere. To make sense of what's actually happening at the edge of the field, I sat down with someone who has literally helped define both of the major AI paradigms of our time.

Łukasz Kaiser is one of the co-authors of "Attention Is All You Need," the paper that introduced the Transformer architecture behind modern LLMs, and is now a leading research scientist at OpenAI working on reasoning models like those behind GPT-5.1. In this conversation, he explains why AI progress still looks like a smooth exponential curve from inside the labs, why pre-training is very much alive even as reinforcement-learning-based reasoning models take over the spotlight, how chain-of-thought actually works under the hood, and what it really means to "train the thinking process" with RL on verifiable domains like math, code, and science. We talk about the messy reality of low-hanging fruit in engineering and data, the economics of GPUs and distillation, interpretability work on circuits and sparsity, and why the best frontier models can still be stumped by a logic puzzle from his five-year-old's math book.

We also go deep into Łukasz's personal journey: from logic and games in Poland and France, to Ray Kurzweil's team, Google Brain, and the inside story of the Transformer, to joining OpenAI and helping drive the shift from chatbots to genuine reasoning engines. Along the way we cover GPT-4 → GPT-5 → GPT-5.1, post-training and tone, GPT-5.1 Codex Max and long-running coding agents with compaction, alternative architectures beyond Transformers, whether foundation models will "eat" most agents and applications, what the translation industry can teach us about trust and human-in-the-loop, and why he thinks generalization, multimodal reasoning, and robots in the home are where some of the most interesting challenges still lie.

OpenAI
Website - https://openai.com
X/Twitter - https://x.com/OpenAI

Łukasz Kaiser
LinkedIn - https://www.linkedin.com/in/lukaszkaiser/
X/Twitter - https://x.com/lukaszkaiser

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) - Cold open and intro
(01:29) - "AI slowdown" vs a wild week of new frontier models
(08:03) - Low-hanging fruit: infra, RL training and better data
(11:39) - What is a reasoning model, in plain language?
(17:02) - Chain-of-thought and training the thinking process with RL
(21:39) - Łukasz's path: from logic and France to Google and Kurzweil
(24:20) - Inside the Transformer story and what "attention" really means
(28:42) - From Google Brain to OpenAI: culture, scale and GPUs
(32:49) - What's next for pre-training, GPUs and distillation
(37:29) - Can we still understand these models? Circuits, sparsity and black boxes
(39:42) - GPT-4 → GPT-5 → GPT-5.1: what actually changed
(42:40) - Post-training, safety and teaching GPT-5.1 different tones
(46:16) - How long should GPT-5.1 think? Reasoning tokens and jagged abilities
(47:43) - The five-year-old's dot puzzle that still breaks frontier models
(52:22) - Generalization, child-like learning and whether reasoning is enough
(53:48) - Beyond Transformers: ARC, LeCun's ideas and multimodal bottlenecks
(56:10) - GPT-5.1 Codex Max, long-running agents and compaction
(1:00:06) - Will foundation models eat most apps? The translation analogy and trust
(1:02:34) - What still needs to be solved, and where AI might go next

Open Source AI Strikes Back: Inside Ai2's OLMo 3 "Thinking"
20/11/2025 | 1h 28 mins.
In this special release episode, Matt sits down with Nathan Lambert and Luca Soldaini from Ai2 (the Allen Institute for AI) to break down one of the biggest open-source AI drops of the year: OLMo 3. At a moment when most labs are offering "open weights" and calling it a day, Ai2 is doing the opposite: publishing the models, the data, the recipes, and every intermediate checkpoint that shows how the system was built. It's an unusually transparent look into the inner machinery of a modern frontier-class model.

Nathan and Luca walk us through the full pipeline, from pre-training and mid-training to long-context extension, SFT, preference tuning, and RLVR. They also explain what a thinking model actually is, why reasoning models have exploded in 2025, and how distillation from DeepSeek and Qwen reasoning models works in practice. If you've been trying to truly understand the "RL + reasoning" era of LLMs, this is the clearest explanation you'll hear.

We widen the lens to the global picture: why Meta's retreat from open source created a "vacuum of influence," how Chinese labs like Qwen, DeepSeek, Kimi, and Moonshot surged into that gap, and why so many U.S. companies are quietly building on Chinese open models today. Nathan and Luca offer a grounded, insider view of whether America can mount an effective open-source response, and what that response needs to look like.

Finally, we talk about where AI is actually heading. Not the hype, not the doom, but the messy engineering reality behind modern model training, the complexity tax that slows progress, and why the transformation between now and 2030 may be dramatic without ever delivering a single "AGI moment." If you care about the future of open models and the global AI landscape, this is an essential conversation.

Allen Institute for AI (Ai2)
Website - https://allenai.org
X/Twitter - https://x.com/allen_ai

Nathan Lambert
Blog - https://www.interconnects.ai
LinkedIn - https://www.linkedin.com/in/natolambert/
X/Twitter - https://x.com/natolambert

Luca Soldaini
Blog - https://soldaini.net
LinkedIn - https://www.linkedin.com/in/soldni/
X/Twitter - https://x.com/soldni

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) - Cold Open
(00:39) - Welcome & today's big announcement
(01:18) - Introducing the OLMo 3 model family
(02:07) - What "base models" really are (and why they matter)
(05:51) - Dolma 3: the data behind OLMo 3
(08:06) - Performance vs Qwen, Gemma, DeepSeek
(10:28) - What true open source means (and why it's rare)
(12:51) - Intermediate checkpoints, transparency, and why Ai2 publishes everything
(16:37) - Why Qwen is everywhere (including U.S. startups)
(18:31) - Why Chinese labs go open source (and why U.S. labs don't)
(20:28) - Inside ATOM: the U.S. response to China's model surge
(22:13) - The rise of "thinking models" and inference-time scaling
(35:58) - The full OLMo pipeline, explained simply
(46:52) - Pre-training: data, scale, and avoiding catastrophic spikes
(50:27) - Mid-training (tail patching) and avoiding test leakage
(52:06) - Why long-context training matters
(55:28) - SFT: building the foundation for reasoning
(1:04:53) - Preference tuning & why DPO still works
(1:10:51) - The hard part: RLVR, long reasoning chains, and infrastructure pain
(1:13:59) - Why RL is so technically brutal
(1:18:17) - Complexity tax vs AGI hype
(1:21:58) - How everyone can contribute to the future of AI
(1:27:26) - Closing thoughts

Intelligence Isn't Enough: Why Energy & Compute Decide the AGI Race - Eiso Kant
06/11/2025 | 1h 6 mins.
Frontier AI is colliding with real-world infrastructure. Eiso Kant (Co-CEO & Co-Founder, Poolside) joins the MAD Podcast to unpack Project Horizon, a multi-gigawatt West Texas build, and why frontier labs must own energy, compute, and intelligence to compete. We map token economics, cloud-style margins, and the staged 250 MW rollout using 2.5 MW modular skids.

Then we get operational: the CoreWeave anchor partnership, environmental choices (SCR, renewables + gas + batteries), community impact, and how Poolside plans to bring capacity online quickly without renting away margin, plus the enterprise motion (defense to Fortune 500) powered by forward-deployed research engineers.

Finally, we go deep on training. Eiso lays out RL2L (Reinforcement Learning to Learn), aimed at reverse-engineering the web's thoughts and actions. We cover why intelligence may commoditize, what that means for agents, and how coding served as a proxy for long-horizon reasoning before expanding to broader knowledge work.

Poolside
Website - https://poolside.ai
X/Twitter - https://x.com/poolsideai

Eiso Kant
LinkedIn - https://www.linkedin.com/in/eisokant/
X/Twitter - https://x.com/eisokant

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
Blog - https://www.mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Cold open - "Intelligence becomes a commodity"
(00:23) Host intro - Project Horizon & RL2L
(01:19) Why Poolside exists amid frontier labs
(04:38) Project Horizon: building one of the largest US data center campuses
(07:20) Why own infra: scale, cost, and avoiding "cosplay"
(10:06) Economics deep dive: $8B for 250 MW, capex/opex, margins
(16:47) CoreWeave partnership: anchor tenant + flexible scaling
(18:24) Hiring the right tail: building a physical infra org
(30:31) RL today - agentic RL and long-horizon tasks
(37:23) RL2L revealed: reverse-engineering the web's thoughts & actions
(39:32) Continuous learning and the "hot stove" limitation
(43:30) Agents debate: thin wrappers, differentiation, and model collapse
(49:10) "Is AI plateauing?" - chip cycles, scale limits, and new axes
(53:49) Why software was the proxy; expanding to enterprise knowledge work
(55:17) Model status: Malibu → Laguna (small/medium/large)
(57:31) Poolside's commercial reality today: defense, Fortune 500, FDRE
(1:02:43) Global team, avoiding the echo chamber
(1:04:34) Next 12-18 months: frontier models + infra scale
(1:05:52) Closing

State of AI 2025 with Nathan Benaich: Power Deals, Reasoning Breakthroughs, Real Revenue
30/10/2025 | 1h 3 mins.
Power is the new bottleneck, reasoning got real, and the business finally caught up. In this wide-ranging conversation, I sit down with Nathan Benaich, Founder and General Partner at Air Street Capital, to discuss the newly published 2025 State of AI report: what's actually working, what's hype, and where the next edge will come from. We start at the physical layer: energy procurement, PPAs, off-grid builds, and why water and grid constraints are turning power, not GPUs, into the decisive moat.

From there, we move into capability: reasoning models acting as AI co-scientists in verifiable domains, and the "chain-of-action" shift in robotics that's taking us from polished demos to dependable deployments. Along the way, we examine the market reality: who's making real revenue, how margins actually behave once tokens and inference meet pricing, and what all of this means for builders and investors.

We also zoom out to the ecosystem: NVIDIA's position vs. custom silicon, China's split stack, and the rise of sovereign AI (and the "sovereignty washing" that comes with it). The policy and security picture gets a hard look too: regulation's vibe shift, data-rights realpolitik, and what agents and MCP mean for cyber risk and adoption.

Nathan closes with where he's placing bets (bio, defense, robotics, voice) and three predictions for the next 12 months.

Nathan Benaich
Blog - https://www.nathanbenaich.com
X/Twitter - https://x.com/nathanbenaich

Source: State of AI Report 2025 (9/10/2025)

Air Street Capital
Website - https://www.airstreet.com
X/Twitter - https://x.com/airstreet

Matt Turck (Managing Director)
Blog - https://www.mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

(00:00) - Cold Open: "Gargantuan money, real reasoning"
(00:40) - Intro: State of AI 2025 with Nathan Benaich
(02:06) - Reasoning got real: from chain-of-thought to verified math wins
(04:11) - AI co-scientist: hypotheses, wet-lab validation, fewer "dumb stochastic parrots"
(04:44) - Chain-of-action robotics: plan → act you can audit
(05:13) - Humanoids vs. warehouse reality: where robots actually stick first
(06:32) - The business caught up: who's making real revenue now
(08:26) - Adoption & spend: Ramp stats, retention, and the shadow-AI gap
(11:00) - Margins debate: tokens, pricing, and the thin-wrapper trap
(14:02) - Bubble or boom? Wall Street vs. SF vibes (and circular deals)
(19:54) - Power is the bottleneck: $50B/GW capex and the new moat
(21:02) - PPAs, gas turbines, and off-grid builds: the procurement game
(23:54) - Water, grids, and NIMBY: sustainability gets political
(25:08) - NVIDIA's moat: 90% of papers, Broadcom/AMD, and custom silicon
(28:47) - China split-stack: Huawei, Cambricon, and export zigzags
(30:30) - Sovereign AI or "sovereignty washing"? Open source as leverage
(40:40) - Regulation & safety: from Bletchley to "AI Action" - the vibe shift
(44:06) - Safety budgets vs. lab spend; models that game evals
(44:46) - Data rights realpolitik: $1.5B signals the new training cost
(47:04) - Cyber risk in the agent era: MCP, malware LMs, state actors
(50:19) - Agents that convert: search → commerce and the demo flywheel
(54:18) - VC lens: where Nathan is investing (bio, defense, robotics, voice)
(58:29) - Predictions: power politics, AI neutrality, end-to-end discoveries
(1:02:13) - Wrap: what to watch next & where to find the report (stateof.ai)



The MAD Podcast with Matt Turck