Jensen Huang's Stunning CES Announcement: The ChatGPT Moment for Physical AI Is Coming Soon

Jensen Huang announced at CES that "the ChatGPT moment for Physical AI is coming soon." NVIDIA unveiled the Vera Rubin platform with a 5x increase in compute, and open-sourced the Alpamayo autonomous driving model, the Cosmos world model, and the Groot robot model. He emphasized that two platform shifts are happening at once: AI is remaking applications, and Physical AI is remaking reality.
Three Major Breakthroughs of the Vera Rubin Hardware Revolution
At the start of his CES keynote, Jensen Huang said, "We have 15 kilograms of content to pack into today's speech. There are 3,000 people in this hall, another 2,000 watching in the courtyard outside, people on the fourth floor are online, and hundreds of millions are watching around the world." But he quickly shifted the focus back to hardware, announcing that Vera Rubin is now fully in production.
The name Vera Rubin comes from the 20th-century astronomer who observed that the rotation speeds at the edges of galaxies are similar to those at the center, which contradicts Newtonian physics unless unseen matter—dark matter—exists. NVIDIA named its next-generation computing platform Vera Rubin because the “invisible stuff” they face is also expanding: computational demand.
Jensen Huang described the almost insane growth in computing demand. Model sizes grow roughly tenfold every year; reasoning models after o1 add an explicit "thinking process," and post-training introduces reinforcement learning, driving a surge in compute; test-time scaling multiplies the tokens generated per query by about 5x; and each new frontier generation is expected to push the cost per token down roughly tenfold per year. That is the race NVIDIA is in: it must "advance the state-of-the-art in computation every year, without fail."
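To make the compounding concrete, here is a rough, illustrative calculation using only the growth rates quoted above; the way the factors are combined is a simplification for intuition, not a figure from the keynote.

```python
# Illustrative back-of-envelope using only the growth rates quoted above.
# The individual multipliers come from the keynote; combining them this way
# is a simplification for intuition, not an NVIDIA figure.

model_growth = 10         # model size grows ~10x per year
tokens_per_query = 5      # test-time scaling generates ~5x more tokens per answer
cost_per_token_drop = 10  # cost per token is expected to fall ~10x per year

# Compute per query grows with model size and with tokens generated per answer.
compute_per_query_growth = model_growth * tokens_per_query                 # ~50x
# If, on top of that, the price per token must also fall 10x, the stack has
# to deliver roughly this much more useful compute per dollar each year:
required_perf_per_dollar = compute_per_query_growth * cost_per_token_drop  # ~500x

print(f"Compute per query: ~{compute_per_query_growth}x per year")
print(f"Required compute per dollar: ~{required_perf_per_dollar}x per year")
```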
Vera Rubin’s Six-Core Collaborative Architecture
Custom Vera CPU: Roughly double the performance, sharing data bidirectionally with the Rubin GPU from the start.
Rubin GPU: A single compute board reaches 100 PFLOPS of AI compute, five times the previous generation.
Networking and security: ConnectX-9 provides 1.6 Tbps of scale-out bandwidth; the BlueField-4 DPU offloads storage and security.
A compute board contains roughly 17,000 components, and the compute tray has been redesigned to be "cable-free, tube-free, and fan-free." Sixth-generation NVLink switches fuse 18 nodes into one, which then scale out at 200G with co-packaged optics, forming "AI factories" out of thousands of racks. For a training job with 10 trillion parameters and 100 trillion tokens, Rubin needs only about a quarter of the system scale Blackwell would need to finish within the same month-scale window.
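As a sanity check on those numbers, here is a rough estimate using the common 6 × parameters × tokens approximation for training compute; the per-board throughput comes from the figures above, while the utilization (and treating the quoted PFLOPS as usable training throughput) are assumptions for illustration.

```python
# Back-of-envelope sizing of the training job mentioned above, using the common
# ~6 * parameters * tokens rule of thumb for training FLOPs. The 100 PFLOPS per
# compute board comes from the text; the 40% sustained utilization (and treating
# the quoted PFLOPS as usable training throughput) are assumptions.

params = 10e12        # 10 trillion parameters
tokens = 100e12       # 100 trillion training tokens
flops_needed = 6 * params * tokens            # ~6e27 FLOPs total

board_flops = 100e15                          # 100 PFLOPS per compute board
utilization = 0.40                            # assumed sustained efficiency
effective_flops = board_flops * utilization

seconds_per_month = 30 * 24 * 3600
boards_for_one_month = flops_needed / (effective_flops * seconds_per_month)

print(f"Total training compute: {flops_needed:.1e} FLOPs")
print(f"Boards to finish in ~1 month: ~{boards_for_one_month:,.0f}")
```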
Jensen Huang also touched on real-world engineering constraints. NVIDIA used to have a rule: at most one or two new chips per product generation, to avoid overcomplicating the supply chain. But once Moore's Law slowed, transistor growth could no longer keep pace with 10x larger models, 5x more tokens per query, and 10x annual cost reductions. Without co-design it would be impossible to keep up, so this generation required redesigning every chip. The Rubin GPU delivers five times Blackwell's floating-point performance with only about 1.6 times the transistors. Process scaling has hit a ceiling; performance now has to come from architecture and system-level co-design.
Collective Leap Strategy for Open-Source Models
Over the past year, what has truly excited Jensen Huang is not the progress of any closed-source model but the collective leap of open-source models. He said open models are now only about "six months behind the frontier," and every six months a smarter generation appears, driving explosive download numbers: startups want to participate, big companies want in, researchers want to contribute, students want to learn, and almost every country wants to join.
Many people misread NVIDIA as simply "selling GPUs," but Jensen Huang repeatedly emphasized at CES that NVIDIA is becoming a builder of frontier open-source models, and that they are fully open. He listed several of them: the hybrid Transformer-SSM Nemotron, the Cosmos world model, and the Groot humanoid robot model. Even in autonomous driving, Alpamayo is not only open-source but also comes with shared training data.
Jensen Huang introduced Alpamayo as "the world's first thinking and reasoning autonomous driving AI." It is end-to-end from camera to actuator, trained on large amounts of real-world, human-demonstrated driving, on mileage generated by Cosmos, and on "hundreds of thousands" of carefully annotated samples. The key point is that it does not just output steering, brake, and throttle commands; it also explains what it is about to do, why, and along what trajectory.
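The article only describes Alpamayo's behavior (control commands plus an explanation and a trajectory). The sketch below is a hypothetical illustration of what such a "reasoning" driving output could look like; the field names and types are invented and are not Alpamayo's actual interface.

```python
# Hypothetical sketch of a "reasoning" driving-policy output, based only on the
# behaviors described above (controls + explanation + trajectory). Field names
# and types are invented for illustration; they are not Alpamayo's actual API.
from dataclasses import dataclass

@dataclass
class DrivingDecision:
    steering: float          # normalized steering command, -1.0 (left) to 1.0 (right)
    throttle: float          # 0.0 to 1.0
    brake: float             # 0.0 to 1.0
    trajectory: list[tuple[float, float]]  # planned (x, y) waypoints in meters
    rationale: str           # natural-language explanation of the chosen action

decision = DrivingDecision(
    steering=-0.1,
    throttle=0.0,
    brake=0.4,
    trajectory=[(0.0, 0.0), (1.5, -0.2), (3.0, -0.5)],
    rationale="Pedestrian entering crosswalk ahead; slowing and nudging left to keep clearance.",
)
print(decision.rationale)
```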
He said NVIDIA started on autonomous driving eight years ago because it predicted early on that deep learning would overhaul the entire computing stack. Alpamayo was open-sourced "today," and the project is massive: Jensen Huang said the AV team numbers "several thousand people," and Mercedes-Benz has been working with them for five years. He predicts there could eventually be 1 billion autonomous vehicles, "each with thinking capabilities, each driven by AI."
Three Computer Architectures for Physical AI
Jensen Huang spent a long time on one point: getting AI to understand common sense about the physical world is much harder than getting it to understand language. Object permanence, causality, inertia, friction, gravity: these are intuitive to human children but completely foreign to AI. And real-world data is never enough. Doing physical AI, he said, takes "three computers": one to train models, one to run inference at the edge in cars, robots, and factories, and one for simulation.
Omniverse is the digital-twin simulation world; Cosmos is the foundational world model; the robot models include Groot and Alpamayo. Where does the training data come from? Language models can draw on vast amounts of text, but real-world video, plentiful as it is, falls far short of covering the diversity of physical interactions. So NVIDIA generates synthetic data that obeys physical laws, selectively creating the training samples it needs.
Cosmos can generate realistic video from a single image, produce coherent motion from a 3D scene description, generate surround-view video from sensor logs, and create "edge cases" from scene prompts. It can also run closed-loop simulation: an action is taken, the world responds, and Cosmos infers what happens next. Jensen Huang calls autonomous driving the first "large-scale mainstream market" for physical AI and asserts that "the inflection point is now," with a large share of the world's cars becoming highly automated over the next decade.
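A minimal sketch of the closed-loop idea described above: a policy proposes an action, a world model predicts how the scene responds, and the loop repeats. The WorldModel and Policy interfaces here are stand-ins invented for illustration (e.g. a Cosmos-style world model paired with a driving or robot policy); none of these APIs are NVIDIA's.

```python
# Minimal sketch of closed-loop simulation with a world model. The interfaces
# below are invented stand-ins for illustration, not NVIDIA's actual APIs.

class WorldModel:
    def step(self, state, action):
        """Predict the next world state given the current state and an action."""
        raise NotImplementedError

class Policy:
    def act(self, state):
        """Choose an action (e.g. steering/throttle) from the current state."""
        raise NotImplementedError

def closed_loop_rollout(world: WorldModel, policy: Policy, initial_state, horizon: int = 100):
    """Roll the policy forward inside the world model and record the trajectory."""
    state, trajectory = initial_state, []
    for _ in range(horizon):
        action = policy.act(state)          # the agent acts
        state = world.step(state, action)   # the simulated world responds
        trajectory.append((state, action))  # keep the rollout for training/evaluation
    return trajectory
```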
He believes future AI will be not just multimodal but "multi-model." The smartest systems will call the most suitable model for each task, and the most realistic deployments will naturally span multi-cloud, hybrid cloud, and edge computing. The implication is that the essence of an AI application is a scheduling and reasoning architecture: an agent that can judge intent, select models, call tools, and combine the results. Jensen Huang sees this kind of agent becoming the new "user interface," replacing Excel, forms, and the command line.
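A toy sketch of that "multi-model" routing idea: classify the user's intent, dispatch the request to the most suitable model, and return the result. The intent labels, model names, and the call_model() helper are all invented for illustration.

```python
# Toy sketch of a multi-model routing agent: judge intent, select a model,
# call it, and return the result. All names below are invented placeholders.

def classify_intent(request: str) -> str:
    """Very naive intent detection; a production router would use a model here."""
    text = request.lower()
    if any(word in text for word in ("image", "photo", "render")):
        return "vision"
    if any(word in text for word in ("prove", "integrate", "solve")):
        return "math"
    return "general"

def call_model(name: str, request: str) -> str:
    """Placeholder for an API call to whichever model the router selected."""
    return f"[{name}] response to: {request}"

ROUTES = {"vision": "image-model", "math": "reasoning-model", "general": "chat-model"}

def route(request: str) -> str:
    intent = classify_intent(request)   # judge intent
    model = ROUTES[intent]              # select the most suitable model
    return call_model(model, request)   # call it and return the result

print(route("Solve the integral of x^2 from 0 to 1"))
```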
NVIDIA is living through a paradoxical moment. It looks like the most dangerous of all the bubble narratives, with a market cap at unprecedented heights. Yet it also looks like the strongest of all the AI names, with quarterly revenue and net profit in the tens of billions of dollars and gross margins above 70%. When a company controls the technology narrative, occupies a key position in the industry chain, and commands the financial markets' attention all at once, it becomes a totem. The advantage of a totem is that faith brings a premium; the downside is that the premium means you cannot afford to make mistakes.