At NVIDIA GTC 2026, Jensen Huang packed two hours of announcements into a keynote that most people (developers, investors, and gamers alike) won’t have time to watch in full. Miss the details, and you’re working from incomplete information in a field where the specs genuinely change what’s buildable and what’s worth buying. This breakdown covers every major announcement from the GTC 2026 keynote in plain language: what each technology does, what the numbers actually mean, and why it matters. GTC 2026 ran March 16-19 at the SAP Center in San Jose. Reviewing the keynote recordings and official transcripts as a data analyst, what stood out most were the throughput numbers for the Vera Rubin platform: the claimed gains are significantly larger than previous generational jumps. The event produced 17 press releases in a single day and covered seven major announcements: the Vera Rubin AI platform (shipping H2 2026), the Groq 3 LPU inference chip (ships Q3 2026), the Feynman GPU roadmap for 2028, DLSS 5 neural rendering, the NemoClaw enterprise AI agent framework, Level 4 autonomous vehicle partnerships with BYD, Hyundai, Nissan, and Geely, and a $1 trillion purchase order figure that made the AI capex picture unusually concrete.

Key Takeaways
- Vera Rubin is NVIDIA’s next-generation AI platform: seven purpose-built chips, 336 billion transistors, and 35x the inference throughput of Blackwell at its H2 2026 launch.
- Groq 3 LPU is a new inference-specific chip born from NVIDIA’s $20 billion acquisition of Groq, handling response generation while Vera Rubin handles prompt reading.
- DLSS 5 moves AI rendering to the geometry level; it arrives as a free driver update for RTX 50 series cards in autumn 2026.
- NemoClaw + OpenClaw are NVIDIA’s play for the enterprise AI agent operating system, analogous to Linux for agentic infrastructure.
- Jensen Huang cited $1 trillion in purchase orders for Blackwell and Vera Rubin systems through 2027: signed commitments, not projected demand.
What Is NVIDIA GTC 2026?
NVIDIA GTC (GPU Technology Conference) 2026 is NVIDIA’s annual developer and hardware summit, held March 16-19, 2026, at the SAP Center in San Jose. CEO Jensen Huang delivered the keynote to approximately 10,000 in-person attendees and millions watching online. The conference is where NVIDIA announces new hardware architectures, developer tools, and industry partnerships.
GTC is not a consumer product launch. The primary audience is cloud providers, enterprise data centre teams, AI researchers, and developers building on NVIDIA’s stack. When Jensen Huang announces a GPU platform at GTC, the buyers are Microsoft Azure and Meta, not individual consumers. Consumer products like DLSS and RTX cards are downstream effects of that hardware ecosystem.
GTC 2026 covered seven distinct announcement categories: AI supercomputing (Vera Rubin), inference chips (Groq 3 LPU), future GPU architecture (Feynman), gaming rendering (DLSS 5), enterprise agentic AI (NemoClaw/OpenClaw), physical AI (autonomous vehicles and robotics), and a financial demand forecast that offered record visibility into AI capex spending.
Vera Rubin: NVIDIA’s New AI Supercomputer, Explained

The Vera Rubin platform is NVIDIA’s successor to the Blackwell architecture: not a single GPU, but a rack-scale supercomputer built from seven purpose-designed chips.
What’s Inside the Vera Rubin Platform?
The Rubin GPU packs 336 billion transistors on TSMC 3nm and delivers 288GB of HBM4 memory at 22 TB/s of bandwidth. The bandwidth figure alone is more than double Blackwell’s HBM3e configuration. The Vera CPU pairs 88 custom Armv9.2 Olympus cores with up to 1.5TB of LPDDR5X memory, handling system coordination tasks that would otherwise consume GPU cycles.
In a standard NVLink 72 configuration, 72 Rubin GPUs operate as a single unified machine with no data-copy overhead across the interconnect. Per NVIDIA’s GTC 2026 press release, the Vera Rubin platform delivers 35x higher throughput versus Blackwell at premium inference tiers, and 10x better inference per watt.
Microsoft Azure confirmed it has already deployed the first Vera Rubin rack. Anthropic, Meta, Mistral AI, and OpenAI have all confirmed adoption plans. Those aren’t hypothetical customers: Azure’s confirmation of a deployed rack means Vera Rubin is past the prototype stage.
Performance Jump vs. Blackwell
The 35x throughput figure requires context. It is measured at high-tier inference (the most demanding prompt sizes and output generation), not average workloads. That’s not a criticism; it’s how GPU benchmarking works across every generation. The relevant number for enterprise buyers is the per-watt figure: 10x better inference per watt translates directly into data centre operating costs.
| Metric | Blackwell (B200) | Vera Rubin | Improvement |
|---|---|---|---|
| GPU Memory | 192GB HBM3e | 288GB HBM4 | 50% more |
| Memory Bandwidth | 8 TB/s | 22 TB/s | 2.75x |
| High-Tier Inference Throughput | Baseline | 35x | 35x |
| Inference Efficiency (per watt) | Baseline | 10x | 10x |
| Max NVLink Config | NVL72 | NVL72 (standard) | – |
| Transistor Count | ~208B | 336B | ~62% more |
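To make the per-watt figure concrete, here is a back-of-the-envelope sketch of electricity cost per token. Every input below (rack power draw, electricity price, baseline token rate) is a hypothetical placeholder, not an NVIDIA number; only the 10x efficiency ratio comes from the keynote.

```python
# Back-of-the-envelope: what "10x inference per watt" does to electricity
# cost per token. All inputs are hypothetical placeholders; only the 10x
# efficiency ratio comes from NVIDIA's GTC 2026 announcement.

RACK_POWER_KW = 120          # assumed rack draw, kW (placeholder)
PRICE_PER_KWH = 0.08         # assumed industrial electricity price, USD/kWh
BASE_TOKENS_PER_SEC = 1e6    # assumed Blackwell-rack token rate (placeholder)
EFFICIENCY_GAIN = 10         # per-watt improvement cited at GTC 2026

def usd_per_million_tokens(tokens_per_sec: float) -> float:
    kwh_per_sec = RACK_POWER_KW / 3600          # energy drawn each second
    usd_per_sec = kwh_per_sec * PRICE_PER_KWH   # electricity cost per second
    return usd_per_sec / tokens_per_sec * 1e6   # scale to one million tokens

blackwell = usd_per_million_tokens(BASE_TOKENS_PER_SEC)
# Same power budget, 10x the tokens per joule -> 10x the tokens per second.
vera_rubin = usd_per_million_tokens(BASE_TOKENS_PER_SEC * EFFICIENCY_GAIN)

print(f"Blackwell:  ${blackwell:.4f} per 1M tokens (electricity only)")
print(f"Vera Rubin: ${vera_rubin:.4f} per 1M tokens (electricity only)")
```

Whatever the real inputs turn out to be, the division is the point: a 10x per-watt gain cuts the electricity component of inference cost by 10x at constant power.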
Vera Rubin Ultra and the Kyber Rack
Beyond the standard NVL72, NVIDIA announced Vera Rubin Ultra, packaged in the Kyber rack. The headline Kyber number: it doubles the compute density per rack by scaling to 144 Rubin Ultra GPUs and introducing a second NVLink switch tier. This is the configuration targeting the largest model training runs and hyperscale inference clusters.
What Comes After Vera Rubin? (Ultra & Feynman Roadmap)
Vera Rubin ships H2 2026. Vera Rubin Ultra follows in 2027. Feynman, covered below, is targeted for 2028. NVIDIA has published a three-generation forward roadmap, which is unusual for the company. It’s a fairly direct bet that TSMC’s process schedule holds.
Groq 3 LPU: NVIDIA’s New Inference Chip Explained

The Groq 3 LPU (Language Processing Unit) is a chip designed exclusively for AI inference: the process of running a trained model to generate responses.
What Is AI Inference and Why Does It Need Its Own Chip?
Training a model and running a model are fundamentally different computational tasks. Training is embarrassingly parallel: thousands of matrix multiplications happening simultaneously across months of computation. Inference has a sequential bottleneck: to generate each word in a response, the model must complete the previous step first. GPUs are extraordinarily good at parallel tasks. They’re less optimal for sequential token generation.
An LPU is designed specifically for that sequential bottleneck. Its memory architecture, clock design, and compute layout prioritize single-token generation speed over parallel throughput. The result is faster response generation at lower power than a GPU doing the same task.
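A minimal sketch of that sequential bottleneck, with a toy stand-in for the model (nothing here reflects a real inference stack): each decode step consumes the tokens produced by all previous steps, so the loop cannot be parallelized the way training batches can.

```python
import numpy as np

def decode(model_step, prompt_tokens, max_new_tokens):
    """Greedy autoregressive decoding: one token per step."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model_step(tokens)          # must see every previous token
        next_token = int(np.argmax(logits))  # most likely next token
        tokens.append(next_token)            # step N+1 waits on step N
    return tokens

# Toy stand-in "model" over a 100-id vocabulary: random scores biased toward
# ids seen so far. Purely illustrative; there is no real model here.
rng = np.random.default_rng(0)
def toy_model(tokens):
    return rng.standard_normal(100) + np.bincount(tokens, minlength=100)

print(decode(toy_model, [1, 2, 3], max_new_tokens=5))
```

The dependency in the loop is the whole story: no matter how many parallel units a chip has, token N+1 cannot start until token N exists.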
The $20 Billion Acquisition Behind It
NVIDIA acquired Groq’s technology assets in December 2025 for approximately $20 billion, bringing on co-founder Jonathan Ross and president Sunny Madra. Within two months, the Groq 3 LPU moved into production, an unusually fast integration timeline that suggests the acquisition included active chip designs, not just IP.
The original Groq chip had already demonstrated best-in-class inference speed on public benchmarks. NVIDIA’s contribution is integrating that design into its full software stack via Dynamo, its data centre orchestration layer.
How the Groq 3 LPU Works with Vera Rubin
Inference has two phases: prefill (the GPU reads your prompt) and decode (the chip generates each token of the response). The Vera Rubin GPU handles prefill. The Groq 3 LPU handles decode.
Working together through NVIDIA Dynamo software, the combination delivers 35x more throughput per megawatt compared to Blackwell GPUs running inference alone. A Groq 3 LPX rack holds 256 LPUs and is designed to sit beside Vera Rubin systems in data centres. Shipping is confirmed for Q3 2026.
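To make the division of labour concrete, here is a hypothetical sketch of that split; the function names and the KV-cache handoff are illustrative assumptions, not Dynamo’s actual API.

```python
# Hypothetical sketch of disaggregated inference: a batched prefill pass
# (the Vera Rubin role) hands its attention state to a token-by-token
# decode loop (the LPU role). No real device APIs are used.

from dataclasses import dataclass

@dataclass
class KVCache:
    """Attention key/value state produced by prefill, consumed by decode."""
    entries: list

def prefill(prompt_tokens):
    # Parallel-friendly: the whole prompt is processed in one batched pass,
    # which is why a throughput-oriented GPU handles this phase.
    return KVCache(entries=[("kv", t) for t in prompt_tokens])

def decode_step(cache: KVCache, last_token: int) -> int:
    # Sequential: each step reads the full cache and appends one entry,
    # which is why a latency-oriented chip handles this phase.
    cache.entries.append(("kv", last_token))
    return (last_token * 31 + len(cache.entries)) % 100  # toy next token

def generate(prompt_tokens, n_tokens):
    cache = prefill(prompt_tokens)       # phase 1: prompt reading
    out, tok = [], prompt_tokens[-1]
    for _ in range(n_tokens):            # phase 2: response generation
        tok = decode_step(cache, tok)
        out.append(tok)
    return out

print(generate([5, 9, 2], n_tokens=6))
```

The handoff object is the design point: once prefill has built the cache, the decode device never re-reads the prompt, only the cache.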
Feynman: A Look at NVIDIA’s 2028 GPU
Feynman is NVIDIA’s next generation GPU architecture, targeting a 2028 production launch. It’s named after physicist Richard Feynman, continuing NVIDIA’s pattern of naming GPU platforms after scientists (Blackwell, Rubin, Feynman).
What We Know About Feynman
Feynman uses TSMC’s 1.6nm A16 process node, one full node ahead of Vera Rubin’s 3nm. Per TSMC’s December 2025 A16 process announcement, the node delivers approximately 15-20% active power reduction and 8-10% speed improvement over N2P. For NVIDIA, that means more transistors at lower power, which translates directly into better inference efficiency.
The architectural addition that matters most: Feynman introduces 3D die stacking, with HBM memory stacked directly on the compute die for the first time in NVIDIA’s commercial GPU history. This eliminates the main bandwidth bottleneck that has constrained GPU memory throughput for three generations.
Feynman pairs with a Rosa CPU and an LP40 LPU co-developed with the Groq team. The top configuration, NVL1152, will hold 1,152 Feynman GPUs in a single logical system, 8x the density of Vera Rubin Ultra’s Kyber rack. No throughput benchmarks have been released; NVIDIA’s roadmap disclosure covers architecture and process node only at this stage.
DLSS 5: What It Means for PC Gaming

DLSS 5 is NVIDIA’s next generation AI rendering technology for RTX 50 series cards, arriving as a free driver update in autumn 2026.
What Is DLSS?
DLSS (Deep Learning Super Sampling) uses AI to reconstruct a high-resolution image from a lower-resolution rendered frame. The GPU renders at, say, 1080p, and DLSS fills in the detail to produce a 4K output. This trades a small amount of visual fidelity for a large gain in frame rate. DLSS 4 added Multi Frame Generation, which predicts and inserts AI-generated frames between rendered ones. The results on supported titles were significant, effectively doubling or tripling frame rates at high settings.
What’s New in DLSS 5?
The change in DLSS 5 is architectural, not cosmetic. Previous DLSS versions operated as post-process filters; they received a finished rendered frame and upscaled it. DLSS 5 operates at the geometry stage, before final lighting and shading are calculated.
This matters because post-process upscaling can’t recover information that was never rendered. Geometry-level AI generation can synthesize new geometric detail based on what should be there, not just what was rendered. The practical effect is sharper edges, more accurate reflections, and better temporal stability at lower-resolution inputs. DLSS 5 arrives as a free driver update for existing RTX 50 series cards, meaning no hardware purchase is required for current-generation owners.
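A toy pipeline sketch of the difference (purely illustrative: there is no real graphics API here, and the stage names are assumptions): a post-process upscaler only ever sees the shaded frame, while a geometry-stage hook runs before shading, so shading operates on the synthesized detail.

```python
# Toy pipeline contrasting post-process upscaling (DLSS 4 style) with a
# geometry-stage hook (DLSS 5 style, as described at GTC). Frames are just
# lists of "detail" values; every stage is an invented stand-in function.

def rasterize(scene, res):
    # Produce per-tile geometric detail at the render resolution.
    return [scene["detail"] for _ in range(res)]

def shade(geometry):
    # Lighting/shading bakes geometry into final pixels; information
    # absent from `geometry` is unrecoverable after this point.
    return [d * 0.9 for d in geometry]

def postprocess_upscale(frame, factor):
    # DLSS 4 style: can only interpolate what the shaded frame contains.
    return [p for p in frame for _ in range(factor)]

def geometry_upscale(geometry, factor):
    # DLSS 5 style: synthesize extra geometric detail *before* shading.
    return [d + 1 for d in geometry for _ in range(factor)]  # toy new detail

scene = {"detail": 3}
low_res, factor = 4, 2   # stand-ins for 1080p rendering and a 2x upscale

dlss4 = postprocess_upscale(shade(rasterize(scene, low_res)), factor)
dlss5 = shade(geometry_upscale(rasterize(scene, low_res), factor))
print("post-process:  ", dlss4)  # detail was fixed before upscaling
print("geometry-level:", dlss5)  # shading sees the synthesized detail
```

The ordering is the whole claim: in the second pipeline, shading runs after detail synthesis, which is why geometry-level generation can produce edges and reflections a post-process filter cannot.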
Agentic AI: NemoClaw, OpenClaw, and What Comes Next

NVIDIA’s agent announcements at GTC 2026 are its play for the infrastructure layer of enterprise AI, the same position it holds for GPU-based model training.
What Is OpenClaw?
OpenClaw is NVIDIA’s open source AI agent orchestration framework, released under an Apache 2.0 license. It defines the protocol layer for multi-agent systems: how agents communicate, how they call tools, and how they coordinate on tasks that span multiple steps or multiple specialized models. Jensen Huang described it as the equivalent of Linux for agentic AI, a foundational open standard that no single company controls.
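As a way to picture what such a protocol layer covers, here is a minimal hypothetical sketch: structured messages, tool calls, and multi-step coordination. None of these classes or fields come from OpenClaw, whose actual API is not detailed in the keynote coverage.

```python
# Hypothetical agent-orchestration sketch. Every name is invented for
# illustration; this is not OpenClaw's API.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Message:
    sender: str
    task: str
    payload: dict

@dataclass
class Agent:
    name: str
    tools: dict[str, Callable[[dict], dict]] = field(default_factory=dict)

    def handle(self, msg: Message) -> Message:
        # An agent services a task by calling one of its registered tools.
        result = self.tools[msg.task](msg.payload)
        return Message(sender=self.name, task=msg.task, payload=result)

class Orchestrator:
    """Routes each step of a multi-step task to the agent that owns it."""
    def __init__(self, agents):
        self.registry = {t: a for a in agents for t in a.tools}

    def run(self, steps):
        state = {}
        for msg in steps:
            msg.payload |= state                  # pass prior results forward
            state |= self.registry[msg.task].handle(msg).payload
        return state

# Two specialized agents coordinating on a two-step task.
retriever = Agent("retriever", {"search": lambda p: {"docs": ["spec.pdf"]}})
writer = Agent("writer", {"draft": lambda p: {"draft": f"Summary of {p['docs']}"}})
plan = [Message("user", "search", {"q": "Vera Rubin specs"}),
        Message("user", "draft", {})]
print(Orchestrator([retriever, writer]).run(plan))
```

Standardizing the message and tool-call shapes is exactly the Linux analogy: any vendor’s agent can plug into any orchestrator that speaks the protocol.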
The open source framing is deliberate. NVIDIA isn’t trying to charge for OpenClaw. It’s trying to make OpenClaw the default standard, so that NemoClaw, the enterprise version, becomes the obvious commercial choice, the same way Red Hat became the commercial layer on Linux.
What Is NemoClaw?
NemoClaw is the enterprise-hardened, commercially supported version of OpenClaw. It adds the features enterprise IT departments require: private network deployment (no data leaves the corporate firewall), access control, audit logging, compliance reporting, and formal SLA support. Companies running AI agents on sensitive internal data (finance, healthcare, legal) can’t use a public API. NemoClaw is built for that constraint.
The platform currently integrates with Microsoft 365, Salesforce, and ServiceNow. NVIDIA hasn’t released pricing; enterprise licensing is handled through its cloud and reseller partners.
New Frontier Model Coalition: Nemotron
NVIDIA’s Nemotron model family underpins both OpenClaw and NemoClaw. At GTC 2026, NVIDIA announced an expanded coalition of frontier model partners: Cosmos 2 (world simulation and physical AI), Groot 2 (humanoid robot control), and Alpamayo (reasoning and code). Each model is built for a specific agent task type rather than being a single general model trying to handle everything. The coalition approach means NemoClaw agents can route tasks to the best-suited model rather than running all tasks through one system, as sketched below.
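A tiny routing sketch of that idea. The mapping of task types to Cosmos 2, Groot 2, and Alpamayo follows the keynote descriptions above, but the dispatch code itself is invented for illustration.

```python
# Illustrative task router for a coalition of specialized models. Task
# categories mirror the keynote descriptions; the mechanics are invented.

ROUTES = {
    "world_simulation": "Cosmos 2",   # world simulation / physical AI
    "robot_control":    "Groot 2",    # humanoid robot control
    "reasoning":        "Alpamayo",   # reasoning and code
    "code":             "Alpamayo",
}

def route(task_type: str) -> str:
    """Pick the coalition model suited to a task type."""
    if task_type not in ROUTES:
        raise ValueError(f"no coalition model registered for {task_type!r}")
    return ROUTES[task_type]

for task in ("robot_control", "code", "world_simulation"):
    print(f"{task:16s} -> {route(task)}")
```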
Self Driving Cars and Robotics: The Physical AI Leap

The GTC 2026 autonomous vehicle announcements were notable not for new hardware, but for the speed of deployment.
NVIDIA confirmed Level 4 autonomous vehicle partnerships (full self-driving capability without human supervision in defined operating domains) with BYD, Hyundai, Nissan, and Geely. All four partnerships are in active production, not pilots. BYD’s partnership is particularly significant given the company’s global volume: BYD sold more vehicles than any other manufacturer in 2025.
Jensen Huang described the current moment as the “ChatGPT moment” for autonomous driving, the point where the technology transitions from research curiosity to commercial product. The hardware running these systems is NVIDIA Thor, an automotive SoC that combines an NVIDIA Blackwell GPU with an Arm-based CPU in a single automotive-grade chip.
On robotics: NVIDIA announced that its Isaac robotics platform now supports both the Groot 2 humanoid control model and NVIDIA’s simulation tools for training physical AI systems. More than 50 robotics companies have committed to building on Isaac, including Figure AI, Boston Dynamics, and Apptronik.
Vera Rubin Space 1, an orbital variant of the Vera Rubin platform deployed on low-Earth-orbit satellites, was announced in partnership with a consortium of defence and commercial space operators. Specific payload specs and launch timelines were not disclosed at GTC.
The $1 Trillion Outlook: What Jensen’s Numbers Mean

$1 Trillion in Orders
The most quoted number from the keynote: Jensen Huang cited $1 trillion in purchase orders for Blackwell and Vera Rubin systems through 2027, per NVIDIA’s official GTC announcements. The figure is based on committed purchase orders across hyperscalers and large enterprises: not projected demand, but signed commitments.
For context: NVIDIA’s total revenue in fiscal year 2025 was approximately $130 billion, the majority from data centre GPU sales. A $1 trillion order pipeline through 2027 represents roughly 7x that annual run rate compressed into a two-year window. The number tells you two things at once: how much money is flowing into AI infrastructure, and how completely NVIDIA controls the supply side of that spending.
CUDA Turns 20: Why It Still Matters
Worth noting separately: NVIDIA’s CUDA platform turned 20 at GTC 2026. Launched in 2006, CUDA is the programming model that lets developers write software for NVIDIA GPUs using a C-like language. Every major AI framework (TensorFlow, PyTorch, JAX) is built on top of CUDA. The 20-year installed base of CUDA-optimized code is NVIDIA’s deepest competitive moat. Competitors can build faster GPUs; they can’t replicate two decades of optimized software libraries without rebuilding from scratch.
NVIDIA used the milestone to announce expanded CUDA 20 tooling, including new profiling tools and a library of pre-optimized inference primitives for Vera Rubin.
7x Speed from Software Alone
NVIDIA also demonstrated that Dynamo software upgrades alone have improved inference throughput on existing Blackwell systems by 7x since the original Blackwell launch. This matters for enterprise buyers: the hardware you buy today will be meaningfully faster six months from now through software optimization. It’s also a signal that Vera Rubin’s 35x figure will improve further after launch.
NVIDIA GTC 2026: Frequently Asked Questions
What was announced at NVIDIA GTC 2026? Seven major areas: the Vera Rubin AI supercomputer platform, the Groq 3 LPU inference chip, the Feynman GPU roadmap for 2028, DLSS 5 neural rendering, the NemoClaw and OpenClaw AI agent frameworks, Level 4 autonomous vehicle partnerships (BYD, Hyundai, Nissan, Geely), and a $1 trillion AI demand forecast through 2027.
What is the Vera Rubin AI supercomputer? Vera Rubin is NVIDIA’s next-generation AI computing platform with seven purpose-built chips, including the Rubin GPU (336 billion transistors, TSMC 3nm, 288GB HBM4) and the Vera CPU (88-core Armv9.2). It delivers 35x more high-tier inference throughput than Blackwell and ships to cloud providers in H2 2026. The Vera Rubin Ultra configuration in the Kyber rack scales to 144 GPUs.
What is the Groq 3 LPU? A chip designed exclusively for AI inference, developed from NVIDIA’s December 2025 acquisition of Groq for approximately $20 billion. It handles the decode phase of inference (generating tokens) while Vera Rubin GPUs handle prefill (reading the prompt). The combination delivers 35x more throughput per megawatt than Blackwell running inference alone. Ships Q3 2026.
What is DLSS 5? NVIDIA’s geometry-level AI rendering technology for RTX 50 series cards. Where DLSS 4 processed a finished rendered frame, DLSS 5 operates before final shading, letting AI synthesize geometric detail that was never rendered. It arrives as a free driver update for existing RTX 50 series cards in autumn 2026.
What is NemoClaw? NVIDIA’s enterprise AI agent platform, built on the open source OpenClaw framework. It runs AI agents inside corporate networks without exposing private data to external APIs, built for finance, healthcare, and legal teams that can’t use a public API. Integrates with Microsoft 365, Salesforce, and ServiceNow.
What is OpenClaw? NVIDIA’s open source AI agent orchestration framework, released under Apache 2.0 at GTC 2026. It defines how agents communicate, coordinate tasks, and call tools in multi-agent systems. NemoClaw is the commercially supported, enterprise-hardened version with private deployment, compliance logging, and SLA support.
What is the Feynman GPU? NVIDIA’s 2028 GPU architecture, built on TSMC’s 1.6nm A16 process with 3D die stacking, the first time NVIDIA has used die stacking in a commercial GPU. Scales to NVL1152, holding 1,152 GPUs as a single logical system. Pairs with a Rosa CPU and an LP40 LPU co-developed with the Groq team.
Conclusion
The GTC 2026 keynote wasn’t a single announcement; it was NVIDIA describing an infrastructure transition. The Vera Rubin platform shifts AI compute to a rack-scale architecture where 72 GPUs operate as one machine. The Groq 3 LPU splits inference into two optimized stages. NemoClaw and OpenClaw position NVIDIA as the operating system layer for enterprise AI agents. And the $1 trillion order figure makes the scale of that transition concrete.
If one number deserves your attention from the entire keynote, it’s the 10x inference efficiency improvement in Vera Rubin. Data centre power costs are the binding constraint on AI deployment right now, and efficiency gains compound directly into what AI services can cost at scale.

