NVIDIA's CEO handed Stanford students the playbook for the next 10 years

NVIDIA CEO Jensen Huang told Stanford students that his company achieved a 1,000,000x improvement in computing performance over the past decade through co-design — simultaneously optimizing hardware, software, and networking — while Moore's Law would have delivered only 10x. Huang argued that this million-fold acceleration, not a single algorithmic breakthrough, enabled modern AI by making it feasible for researchers to train on the entire internet instead of curated datasets. The executive warned that the people spreading fear about AI misunderstand the technology, and he outlined a design philosophy of vertical integration that he said will define the next decade for founders, operators, and investors.

Jensen Huang on design philosophy, energy math, agent architecture, and why the people scaring you about AI are wrong

There is a number Jensen Huang keeps using, and most people hear it wrong. 1,000,000x. That is how much faster NVIDIA made computing over the last 10 years, against the 10x that Moore’s Law https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi would have handed you. That gap is the reason AI https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi exists at all. It is why researchers trained on the entire internet instead of curating datasets, and why everything is getting disrupted now instead of in 2040.

I watched the full Stanford CS153 lecture so you do not have to. Here are the 10 things that matter for any founder, operator, or investor https://www.the-ai-corner.com/t/business-and-investing?r=1krivi paying attention.

1. The 1,000,000x number holds, and the reason behind it is the only strategy lesson you need this decade

Moore’s Law is over. It has been for a decade.

“In the case of NVIDIA and co-design, we got 1,000,000x over 10 years. Somewhere between 100,000x and 1,000,000x. When you’re talking about numbers that big, it really doesn’t matter.”

Dennard scaling collapsed around 2015, and without it Moore’s Law compounds to maybe 10x over a decade. Semiconductor physics ran out of road, and NVIDIA delivered the full million anyway. The mechanism is co-design: optimizing CPUs, GPUs, networking, switches, storage, software, and compilers https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi at the same time, against the same objective, instead of leaving each layer to its own team. It is the same vertical integration https://www.thevccorner.com/p/elon-musk-xai-spacex-vertical-integration?r=1krivi logic that turns a hardware company into a compounding machine.

Jensen made it concrete with RISC. John Hennessy at Stanford proved that a simpler instruction set co-designed with a compiler beats two separately optimized systems every time, because the whole outperforms the sum of the parts. Scale that across an entire compute stack and a million-x follows.
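To put the decade-long figures on a per-year footing, here is a small sketch of the arithmetic. It assumes the gain compounds at a constant yearly rate, which is a simplification rather than something the lecture states, and the function names are illustrative.

```python
import math


def annual_factor(total_gain: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_gain over years."""
    return total_gain ** (1.0 / years)


def doubling_time_years(total_gain: float, years: float) -> float:
    """Years needed to double performance at that constant yearly rate."""
    return years * math.log(2) / math.log(total_gain)


# Figures quoted in the article: 1,000,000x (co-design) vs 10x (Moore's Law),
# both over 10 years. The 100,000x case is the low end of Jensen's range.
for label, gain in [("co-design, high", 1_000_000), ("co-design, low", 100_000), ("Moore's Law", 10)]:
    print(
        f"{label}: {annual_factor(gain, 10):.2f}x per year, "
        f"doubling every {doubling_time_years(gain, 10):.2f} years"
    )
```

Under that assumption, 1,000,000x in 10 years is roughly 4x per year, or a doubling about every six months, while 10x in 10 years is about 1.26x per year, a doubling roughly every three years. Even the low end of the quoted range, 100,000x, works out to about 3.2x per year.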
The downstream effect is not a faster computer, it is a different category of possibility. When compute gets a million times faster, researchers stop asking which data to use and reach for all of it, the entire internet. That abundance is what unlocked modern AI https://www.the-ai-corner.com/t/claude-and-anthropic?r=1krivi , not a breakthrough algorithm but a design philosophy.

Try this → before your next architecture decision https://www.the-ai-corner.com/p/founder-mental-models-ai-agent-claude-chatgpt-openclaw-2026?r=1krivi , ask whether you are optimizing a layer or co-designing across layers. The answer predicts your ceiling.

2. NVIDIA built a multi-billion-dollar system with zero potential customers, and first principles is how you do that without going insane

The most expensive supercomputer ever sold cost $350 million.

“You would have precisely zero customers. The reason for that is because the most expensive thing that has ever been sold was $350 million. And you’re building something that’s multiple billions of dollars. So you’re building for a precisely marketplace of zero.”

Jensen did not survey the market, he reasoned through the problem. Pre-training was going to be enormous, the systems to run it at scale would cost more than anything ever built, and no customer existed yet, so they built it anyway. Hopper was designed for that bet, and the architecture landed exactly when the demand arrived.

The lesson is narrower than “be contrarian.” Market research tells you what customers want today. First-principles reasoning https://www.the-ai-corner.com/p/founder-mental-models-ai-agent-claude-chatgpt-openclaw-2026?r=1krivi tells you what they will need to want for the world to work the way the evidence says it is heading. One builds the future, the other optimizes the present, and the best founders https://www.thevccorner.com/p/what-top-vcs-look-for-2026-founder-playbook?r=1krivi know which question they are answering.
Try this → take your most important product bet and ask whether it came from customer interviews or first-principles reasoning. If you cannot tell the difference, that is the problem to solve first.

3. Inference is where the money gets made, and the bottleneck is not what most engineering teams optimize for

Training builds intelligence, inference https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi delivers it, and the entire business of AI https://www.the-ai-corner.com/t/business-and-investing?r=1krivi runs on inference.

“The speed up over the previous generation: 50 times. In two years, we improved something by 50 times. Moore’s law would have improved it by 2x.”

Here is the constraint most engineers miss. Generating tokens https://www.the-ai-corner.com/t/prompting-and-context-engineering?r=1krivi is bandwidth-constrained, not compute-constrained. The pre-fill phase processes context, the decode phase generates every token, and decode dominates because it needs more memory bandwidth than a single chip can provide.

NVIDIA’s answer was Grace Blackwell NVLink72 https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi : 72 chips ganged into the world’s first rack-scale computer, solving for the actual bottleneck and landing a 50x gain in two years. The co-design philosophy https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi compounds, because a million-x over a decade is 50x here and 50x there, stacking.

Teams optimizing MFU (model FLOPs utilization) are measuring the wrong thing. Jensen would rather run low MFU and be over-provisioned than hit 100% and fight Amdahl’s Law on every workload. Optimize for the constraint that limits you, not the metric that is easy to track.

Try this → name one metric your team optimizes that does not map to the constraint actually limiting your output, and retire it.

4. Stanford has a $40 billion endowment and no supercomputer, and Jensen says that is Stanford's fault

There is no chip shortage, there is a budget structure problem.

“It is not true that people are giving me orders, placing orders, and we’re not delivering chips. It is just not true. You’ve got to place orders.”

Every research department at Stanford raises its own grants, no grant is large enough to justify shared compute infrastructure https://www.the-ai-corner.com/t/ai-agents?r=1krivi , nobody pools resources, and the result is a campus of laptops and zero supercomputers.

Jensen’s fix is blunt. Stanford has $40 billion, so cut $1 billion, contract a cloud provider https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi , and give every student access to AI supercomputers this year. The chips exist, the money exists, the structure is the problem.

His accountability framing is the sharpest insight here. When you tell someone it is not their fault, you take away their ability to fix it, because fault and agency are the same thing. Assign the fault correctly and you assign the power to solve it. He is not criticizing Stanford, he is refusing to let it off the hook.

The same pattern sits inside every large company https://www.the-ai-corner.com/t/business-and-investing?r=1krivi : fragmented budgets, siloed compute, everyone underpowered. The organizations that aggregate first will produce the work that defines the next decade.

Try this → map every compute and data resource your team uses, find what is fragmented that could be pooled, and treat the gap as a budget structure problem rather than a resource problem.

5. The last time computing changed this fundamentally, it was 1964, and that date tells you exactly what to audit

This is a reinvention, not a technology cycle.
“For the first time, the way you write the software, how you process the neural network versus the software, and what the applications can do has now dramatically changed. Everything is fundamentally different.”

The IBM System/360 defined the computing model in 1964, and everything since (PCs, internet, mobile, cloud) was built on that foundation of prerecorded software, compiled binaries, and explicit instructions. The model held for 60 years.

Neural networks https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi run differently than compiled code, and that one fact breaks every assumption the old model made:

▫️ Software is no longer written and compiled, it is trained https://www.the-ai-corner.com/t/prompting-and-context-engineering?r=1krivi
▫️ Output is generated as it happens, rather than retrieved from storage
▫️ The computer responds to intention, not only to instruction
▫️ Applications that needed human-level perception are now buildable

Alpamayo, NVIDIA’s self-driving system, is the proof. Thirteen years of self-driving work, nothing good enough, then deep learning arrived and the entire application category unlocked. That is what a genuine reinvention looks like: a different category of possible, not a faster version of the old thing.

Try this → audit your product’s core assumptions, find the ones shaped by the 1964 computing model, and ask whether they still hold. The ones that do not are either your biggest risk or your biggest opportunity.

6. The moment GPT shipped, agentic systems were obvious, and the founders who saw it built the infrastructure

Generative AI did not make images. It made thinking visible.

“Thinking is generating tokens that you consume internally. Generating tokens that you consume externally would be called tool use. And so the idea that after GPT happened two years ago, that we would be at this moment was fairly easy to predict.”

When GPT shipped, Jensen did not see a chatbot, he saw that the mechanism generating text can generate thoughts. Internal token generation is reasoning, external token generation is tool use https://www.the-ai-corner.com/t/ai-agents?r=1krivi , and the agentic https://www.the-ai-corner.com/t/ai-agents?r=1krivi trajectory was a direct mechanical consequence, a derivation rather than a guess.

The engineering between GPT and today was hard work by brilliant people: training models to reason step by step https://www.the-ai-corner.com/p/ai-agent-reliability-playbook?r=1krivi at scale, fine-tuning for reliability, building the tooling. The destination was legible in 2023 to anyone reasoning carefully about the mechanism.

The current signals read just as plainly. Agentic systems https://www.the-ai-corner.com/t/ai-agents?r=1krivi are here, continuous compute is replacing on-demand compute, and everything built for on-demand will get rebuilt. The founders who saw it early are building the infrastructure https://www.thevccorner.com/p/ai-agent-startup-ideas-2025?r=1krivi everyone else now rents.

Try this → name the next step in the trajectory from where AI is today, write it down, and build against it. If you cannot name it, that is the gap to close first.

7. Open models are not about the PR war with OpenAI, and the three reasons that matter more than the headline

NVIDIA burns more Anthropic and OpenAI https://www.the-ai-corner.com/p/anthropic-30b-arr-passed-openai-revenue-2026?r=1krivi tokens than almost any company, and that is not a contradiction.

“There are too many societies where the scale of their language is not big enough for somebody else to decide to make it a high priority. Unless you deeply care, it’s never going to be great.”

Three reasons, ordered by how underappreciated they are. First, 230 languages where no major lab will prioritize fine-tuning.
Nemotron, NVIDIA’s near-frontier open model https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi , exists so any researcher anywhere can fine-tune https://www.the-ai-corner.com/t/prompting-and-context-engineering?r=1krivi for any language without starting from scratch. That is infrastructure for human intelligence at global scale.

Second, human priors. Alpamayo is a language model fused with a world model, and that fusion means the self-driving system needs a few million training miles rather than billions. The data requirement collapses when the model reasons from human experience instead of learning every edge case from zero.

Third, you cannot secure a black box. A closed-model arms race in cybersecurity https://www.the-ai-corner.com/p/saas-defense-playbook-ai-era-survival-guide-2026?r=1krivi is just version numbers climbing while everyone stays exposed in between. Transparency is how you defend, and Nemotron Nano, deployed in swarms https://www.the-ai-corner.com/p/20-agent-ai-script-factory-10m-revenue?r=1krivi , already runs this way.

Try this → pick one domain-specific model you build or use, and ask whether fusing it with a language model https://www.the-ai-corner.com/t/claude-and-anthropic?r=1krivi that carries human priors would cut your training data requirement. If yes, that is a compounding advantage sitting unused.

8. Vera Rubin is not a faster training chip, it is the first computer designed around how agents actually work

Agents https://www.the-ai-corner.com/t/ai-agents?r=1krivi do not compute the way training clusters compute.

“The AI is this multi-billion dollar system and it sends off an instruction to use a tool and that tool is gonna run on the CPU. Meanwhile, this GPU supercomputer, this multi-billion dollar system is waiting for this one CPU.”

The agent compute pattern https://www.the-ai-corner.com/t/ai-agents?r=1krivi has three requirements current cloud infrastructure was not built for. Long-term memory https://www.the-ai-corner.com/p/context-engineering-guide-2026?r=1krivi has to live in storage wired directly to the processor fabric, because copying data off network storage kills latency. Tool use https://www.the-ai-corner.com/t/ai-agents?r=1krivi runs on CPUs, and cloud CPUs were designed for parallel throughput, 200 cores each doing independent work. Agent workloads need the opposite: single-threaded operations with extreme low latency, because a billion-dollar GPU cluster stalls waiting on one CPU thread.

Storage and CPU architecture both get redesigned for this pattern, and Vera Rubin is that redesign. Feynman, the generation after, is likely built for swarms of agents running sub-agents https://www.the-ai-corner.com/p/five-agent-sales-team-build-weekend-2026?r=1krivi running their own sub-agents. Jensen names the method: identify the compute pattern, understand how it differs from the past, and build the system to match the actual workload.

Try this → map the actual compute pattern of your most important agent workload https://www.the-ai-corner.com/p/ai-agent-reliability-playbook?r=1krivi , find where latency enters, and ask whether your infrastructure was designed for that pattern. The gap is your performance ceiling.

9. Energy for AI is probably 1,000x current levels, Jensen has done the math, and the market is starting to agree

The bottleneck after chips is energy. This has been true for five years. Most people are still learning it.

“The amount of energy that we need for computing is likely probably a thousand times more than we currently have. And so I think if you said we need a thousand times, I wouldn’t be surprised if we’re off by a couple of orders of magnitude.”

The reasoning is structural. Old computing is on-demand and retrieval-based: a server sits idle until a request arrives, responds, and goes back to idle, consuming energy per query.
New computing is generative and continuous: a model runs at all times, contextually aware, generating outputs before you ask. The energy profile https://www.the-ai-corner.com/t/business-and-investing?r=1krivi is a different category, not a 2x bump.

Two levers sit inside NVIDIA’s control:

▫️ Tokens per watt, already improved 50x, compounding through every generation https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi
▫️ Co-design efficiency choices made from the chip level up

The lever outside its control but within reach of the market is energy infrastructure. Solar and nuclear needed subsidies to pencil out five years ago, and the compute demand is now strong enough that the market funds them https://www.thevccorner.com/p/vcs-betting-on-ai-2025?r=1krivi without subsidies. Jensen calls this the best moment in history to invest in sustainable energy, which is a market-structure observation, because the demand curve justifies the capital https://www.thevccorner.com/t/investor-lists?sort=top .

Track this → tokens per watt across your inference stack. The teams improving it fastest will hold a structural cost advantage within three years that is hard to close from behind.

10. The AI doom narrative is not just wrong, it is built to sound unfalsifiable, and Jensen is not being polite about it

Jensen does not hedge here.

“It is not true that we have no idea how these systems work. It is not true that the technology is going to somehow in some nanosecond become infinitely powerful and therefore it’s going to take over the world. It is not true. These things are all being made up.”

The singularity-flash claim runs like this: at some unknown Wednesday at 7pm, AI becomes infinitely powerful, the game ends, no way to know when, no way to defend, some chance it ends civilization.

His counter is specific rather than “trust us.” We understand how these systems work https://www.the-ai-corner.com/t/claude-and-anthropic?r=1krivi , capabilities scale predictably with investment and architecture decisions, defenses can be built and tested, and the trajectory is legible. None of that fits the singularity-flash story. The more grounded version of the safety conversation https://www.thevccorner.com/p/dario-amodei-safe-ai-agi-anthropic?r=1krivi looks nothing like the doom one.

The harm is concrete. The students in that hall are deciding whether to enter computer science, and the narrative shapes that choice. A generation raised to believe AI is an unfathomable danger enters policy rather than engineering https://www.the-ai-corner.com/p/ai-engineer-roadmap-production-projects-2026?r=1krivi , advocates for restriction rather than building, and cedes the field to people who do not share that hesitation.

GPUs https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi are general-purpose compute. A billion people own them, running medical imaging, logistics networks, climate models, and video games. The comparison to atomic bombs makes useful thinking impossible, because every policy conclusion that starts there is contaminated.

Try this → the next time you read an AI safety claim, ask whether it is falsifiable. If no evidence could disprove it, it is a narrative, not a safety argument, so treat it accordingly.

The stack changed, all of it. Here is what that means depending on where you sit.

Jensen’s thesis is not about chips, it is about co-design as a principle applied at every scale. Optimize a layer and you get linear improvement. Co-design across layers and you get compounding. That holds for hardware, for product architecture, and for organizations https://www.the-ai-corner.com/t/business-and-investing?r=1krivi .
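One way to see why single-layer optimization hits a ceiling is Amdahl's Law, which Jensen mentions in the inference discussion. The sketch below is an illustration, not NVIDIA's method: the three layers and their runtime shares are hypothetical numbers chosen for the example, and it models a workload as sequential stages whose times simply add.

```python
def single_layer_speedup(share: float, factor: float) -> float:
    """Amdahl's Law: overall speedup when one layer, taking `share` of the
    runtime, is made `factor` times faster and the rest is untouched."""
    return 1.0 / ((1.0 - share) + share / factor)


def all_layers_speedup(shares: list[float], factors: list[float]) -> float:
    """Overall speedup when every layer i, taking shares[i] of the runtime,
    is made factors[i] times faster. Shares are assumed to sum to 1."""
    return 1.0 / sum(share / factor for share, factor in zip(shares, factors))


# Hypothetical runtime shares for three layers (compute, memory, network).
shares = [0.5, 0.3, 0.2]

# Make only the largest layer 100x faster: the other 50% caps the gain.
print(round(single_layer_speedup(0.5, 100.0), 2))  # 1.98

# Make every layer 100x faster: nothing is left behind to cap the gain.
print(round(all_layers_speedup(shares, [100.0] * 3), 2))  # 100.0
```

In this toy model, no speedup of a layer that takes half the runtime can push the whole system past 2x, while improving every layer together lets the gains carry through to the total. The article's stronger claim, that co-designed layers outperform the sum of separately optimized ones, goes beyond what this simple model captures.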
Founders

The compute pattern for agents https://www.the-ai-corner.com/t/ai-agents?r=1krivi differs from the pattern for training, so build for the actual one and know whether your infrastructure matches the workload. The founders who map this in the next 24 months will build the platforms everyone else runs on, and the startup ideas worth building right now https://www.thevccorner.com/p/ai-agent-startup-ideas-2025?r=1krivi are already public.

Investors

The energy constraint is structural, and the market is now large enough to fund the solution without subsidies. The compounding returns sit in efficient inference https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi , agent-optimized infrastructure, and the energy layer underneath all of it. Incumbency does not predict who wins, first-principles reasoning https://www.thevccorner.com/p/what-top-vcs-look-for-2026-founder-playbook?r=1krivi does, so find the teams doing it before the consensus catches up https://www.thevccorner.com/p/vcs-betting-on-ai-2025?r=1krivi . Start from the investor list of lists https://www.thevccorner.com/t/investor-lists?sort=top and where the venture money is moving https://www.thevccorner.com/p/q1-2026-us-fund-activity-record-fundraising?r=1krivi .

Operators

The computing model that held for 60 years just changed. The teams who internalize that as a first-principles fact will rebuild the stack correctly, while the teams who treat it as an upgrade will spend the decade debugging assumptions that were never right for this environment. Start with how to use Claude like the top 1% of users https://www.the-ai-corner.com/p/chatgpt-claude-power-user-setup-guide-2026?r=1krivi and the complete guide to AI coding in 2026 https://www.the-ai-corner.com/p/ai-coding-tools-complete-guide-2026?r=1krivi .
Anyone entering the field

The door is wider open than it has been since 1964, every layer of the stack is being rethought, and the people doing the rethinking are the ones reasoning most sharply about what changed. The 2026 AI engineer roadmap https://www.the-ai-corner.com/p/ai-engineer-roadmap-production-projects-2026?r=1krivi and the Claude Architect certification curriculum https://www.the-ai-corner.com/p/claude-certified-architect-curriculum-2026?r=1krivi are where that reasoning starts.

Three principles to carry forward

▫️ Co-design wins. Individual layer optimization https://www.the-ai-corner.com/t/ai-tools-and-models?r=1krivi has a ceiling, and integration removes it.
▫️ The compute pattern determines the architecture. Know the pattern https://www.the-ai-corner.com/t/ai-agents?r=1krivi before you write a line of code, because everything else is guessing.
▫️ The trajectory is legible. GPT made agents obvious, agents will make swarms obvious https://www.the-ai-corner.com/p/ai-agent-reliability-playbook?r=1krivi , and the signals exist, so reason from them instead of waiting for the market to confirm them.

The stack changed, all of it. This is the best time in 60 years to be the person rebuilding it.

Full lecture: Stanford CS153 Frontier Systems, Jensen Huang from NVIDIA on the Compute Behind Intelligence https://www.youtube.com/watch?v=tsQB0n0YV3k

If this breakdown saved you an hour, share it with one engineer, founder, or operator who needs to see it. They will thank you later.