Daily AI: Jensen Huang Has 39,000 People in San Jose. The Rest of Us Should Be Paying Attention Too.
AI & Tech



The Lead


NVIDIA's GTC 2026 opens today in San Jose with Jensen Huang's keynote at 11 a.m. Pacific, and the sheer density of what's expected to drop (Vera Rubin architecture, the NemoClaw enterprise agent platform, a peek at the Feynman inference chip, major robotics updates) makes this the single most consequential AI event of Q1. Thirty-nine thousand attendees from 190 countries. Over 700 workshops. If you run a company that touches AI (which, at this point, is most of them), this week sets your roadmap context for the rest of the year.

The Deep Dive: The Inference Economy Is Here

Every GTC tells you something about where NVIDIA thinks the money is moving. Last year it was training scale. This year, the unmistakable signal is inference.


Vera Rubin, NVIDIA's next-generation GPU architecture, is reportedly packing up to 288GB of HBM4 memory with a promised 10x reduction in inference token costs. That number matters more than any benchmark score. When you cut the cost of running AI by an order of magnitude, you don't just make existing applications cheaper. You make entirely new categories of applications viable. The agent workflow that was too expensive to run continuously at $0.03 per call becomes a no-brainer at $0.003.
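To make the order-of-magnitude point concrete, here's a back-of-envelope cost model. The per-call prices are the illustrative figures above; the call volume is an assumption I'm adding, not a measured workload:

```python
# Back-of-envelope: what a 10x drop in inference cost does to a
# continuously running agent's monthly bill.
# Per-call prices are the article's illustrative figures; the call
# volume below is an assumed workload, not a measurement.

CALLS_PER_MINUTE = 2              # assumed: agent polls/acts twice a minute
MINUTES_PER_MONTH = 60 * 24 * 30  # 30-day month

def monthly_cost(price_per_call: float) -> float:
    """Monthly spend for an agent running around the clock."""
    return price_per_call * CALLS_PER_MINUTE * MINUTES_PER_MONTH

before = monthly_cost(0.03)   # pre-reduction pricing
after = monthly_cost(0.003)   # with the promised 10x reduction

print(f"before: ${before:,.2f}/mo")  # before: $2,592.00/mo
print(f"after:  ${after:,.2f}/mo")   # after:  $259.20/mo
```

Same agent, same workload: roughly $2,600 a month versus $260. One of those is a line item you argue about in a budget meeting; the other is a rounding error you just ship.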


This is the shift I've been watching for. With Scorpiox, I've been building multi-agent systems where the bottleneck isn't the model's intelligence. It's the economics of keeping agents running persistently. Every orchestration layer, every tool call, every reasoning chain burns tokens. Vera Rubin doesn't solve the architecture challenges, but it removes the cost ceiling that's been forcing builders to make ugly tradeoffs between capability and budget.


Then there's NemoClaw, NVIDIA's open-source enterprise agent platform. This is their play to own the orchestration layer, giving companies a framework to deploy AI agents that execute multi-step tasks across enterprise systems. Think of it as NVIDIA saying: we sold you the GPUs to train the models, now here's the platform to actually put them to work. The positioning is smart. The agent infrastructure layer is wide open right now, and whoever provides the default scaffolding that Fortune 500 IT teams reach for first wins a massive distribution advantage.
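The orchestration-layer idea is easy to sketch even without NemoClaw's actual API (which isn't documented here). The following is a generic illustration of what any such framework does at its core: register tools, walk a plan step by step, and thread results through a shared context. Every name in it is hypothetical:

```python
# Generic sketch of an agent orchestration layer: a plan of steps,
# each dispatched to a registered tool, with results merged into a
# shared context. This is NOT NemoClaw's API; it's a minimal
# illustration of the concept the text describes.

from typing import Callable

class Orchestrator:
    def __init__(self) -> None:
        self.tools: dict[str, Callable[[dict], dict]] = {}

    def register(self, name: str, fn: Callable[[dict], dict]) -> None:
        self.tools[name] = fn

    def run(self, plan: list[str], context: dict) -> dict:
        # Execute each step in order, merging its output into context
        # so later steps can see earlier results.
        for step in plan:
            context.update(self.tools[step](context))
        return context

# Hypothetical tools standing in for real enterprise integrations.
orc = Orchestrator()
orc.register("fetch_ticket", lambda ctx: {"ticket": "printer is down"})
orc.register("draft_reply", lambda ctx: {"reply": f"Re: {ctx['ticket']}"})

result = orc.run(["fetch_ticket", "draft_reply"], {})
print(result["reply"])  # Re: printer is down
```

The value of owning this layer isn't the loop itself; it's being the default place where the tool registry, the plans, and the enterprise connectors live.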


The third piece is Feynman, the inference-first chip architecture built on TSMC's 1.6nm process. While Vera Rubin is the near-term play, Feynman is the 2028 bet, purpose-built for long-context, multi-step reasoning. The kind of sustained compute that AI agents actually need. NVIDIA is telegraphing that they see the future of AI compute as fundamentally different from the training-dominated past. Inference isn't a side business. It's the business.


For anyone building on top of these models, this is the week to update your assumptions about what's economically feasible. The applications you shelved six months ago because the unit economics didn't work? Time to revisit them.


Also Worth Knowing

  • Morgan Stanley says the grid can't keep up. A report released Friday projects a U.S. power shortfall of 9 to 18 gigawatts through 2028: a 12% to 25% gap between what AI infrastructure needs and what the grid can deliver. Data center demand could hit 74 GW by 2028 with a projected 49 GW shortfall. Developers are already converting Bitcoin mining operations to AI compute centers and spinning up natural gas turbines. The AI boom has an energy problem, and it's not theoretical anymore. If you're evaluating cloud providers or colocation partners, power availability just became a top-three selection criterion.
  • Cognizant's research confirms what practitioners already know: plug-and-play AI is a myth. Their study of 600 AI decision makers found that 63% of enterprises report moderate-to-large gaps between AI ambitions and current capabilities. The most damning finding: companies cite generic, off-the-shelf AI solutions as a leading reason to reject an AI provider. Custom solutions and flexible engagement models rank as the most important factor when selecting a partner, ahead of pricing and time to value. This tracks with everything I see in the field. The companies getting real results are building custom, not buying prepackaged solutions.

  • a16z dropped their 6th edition of the Top 100 Gen AI Consumer Apps. ChatGPT still dominates at 900 million weekly active users, 2.7x larger than Gemini on web. But the interesting signal is at the edges: Claude grew paid subscribers over 200% year-over-year, and the report now includes AI-native features in products like Canva, Notion, and Grammarly. The "thin wrapper" era is dying. The products winning are the ones where AI is deeply integrated into the workflow, not bolted on top.


The Builder's Take

Here's what I keep coming back to as I watch GTC kick off: the constraint is shifting.


For the last two years, the constraint was capability. Could the models do the thing? Could they reason well enough, write code reliably enough, handle the context window you needed? That era isn't over, but it's maturing fast. GPT-5.4 shipped two weeks ago with native computer-use capabilities and a million-token context window. Gemini 3.1 Pro is winning 13 of 16 major benchmarks. The models are good enough for an enormous range of real business tasks.


The new constraint is deployment economics and integration complexity. How cheaply can you run inference at scale? How do you wire agents into your actual systems: your CRM, your ERP, your messy internal tools? How do you monitor and govern AI that's doing real work, not just answering questions in a chat window?
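Monitoring and governance sound abstract, but they reduce to concrete plumbing: every tool call an agent makes should pass through a policy check and land in an audit trail. A minimal sketch of that pattern, with a hypothetical allowlist and tool names of my own invention:

```python
# Minimal governance wrapper for agent tool calls: each call passes
# through an allowlist check and is recorded in an audit log.
# The tool names and policy here are hypothetical; the pattern of
# gating and logging every action is the point.

import time

AUDIT_LOG: list[dict] = []
ALLOWED_TOOLS = {"crm_lookup", "send_email"}  # assumed policy

def governed_call(tool: str, fn, *args, **kwargs):
    """Run fn only if the tool is allowed; log the attempt either way."""
    if tool not in ALLOWED_TOOLS:
        AUDIT_LOG.append({"tool": tool, "status": "blocked", "ts": time.time()})
        raise PermissionError(f"tool {tool!r} not allowed by policy")
    result = fn(*args, **kwargs)
    AUDIT_LOG.append({"tool": tool, "status": "ok", "ts": time.time()})
    return result

# Usage: a CRM lookup is allowed; an unreviewed tool is blocked.
governed_call("crm_lookup", lambda cid: {"id": cid, "tier": "gold"}, "c-42")
try:
    governed_call("delete_records", lambda: None)
except PermissionError:
    pass
print([e["status"] for e in AUDIT_LOG])  # ['ok', 'blocked']
```

None of this requires the next model release. It requires someone deciding what the allowlist is, where the log goes, and who reviews it.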


That's what makes this GTC different. NVIDIA isn't just showing off faster chips. They're showing off the plumbing: NemoClaw for orchestration, Vera Rubin for cost-effective inference, Feynman for sustained agent compute. They're building the infrastructure stack for a world where AI does work, not just thinks.


If you're a builder, a founder, or a leader trying to figure out where to place your bets this quarter, here's my take: stop waiting for the next model to unlock your use case. The models are here. Start investing in the integration layer: the connectors, the workflows, the monitoring, the governance. That's where the real competitive advantage lives now, and it's where most companies are still embarrassingly behind.


The companies that win this next phase won't be the ones with access to the best model. Everyone has access to the best models. The winners will be the ones who figured out how to actually wire AI into how their business operates. That's not a technology problem. That's an operations problem.


Keep building,

— JW