Nvidia Blackwell GPU Costs Double, But Per-Token Inference Expense Falls 35x vs Hopper

GateNews

2026-04-30 04:17:22

According to Nvidia’s latest blog analysis, Blackwell GPUs cost nearly twice as much per hour as the Hopper generation, yet deliver per-token inference costs roughly 35 times lower. Using DeepSeek-R1 as the test model, Blackwell (GB300 NVL72) rents at $2.65 per GPU per hour versus $1.41 for Hopper, but per-GPU throughput jumps from 90 to 6,000 tokens per second. This roughly 65x throughput gain cuts the cost per million tokens from $4.20 to $0.12.
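The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration only, not Nvidia's methodology: the function name `cost_per_million_tokens` is hypothetical, and the calculation assumes each GPU runs at the quoted throughput for the whole billed hour. The inputs are the rounded prices and throughputs quoted above.

```python
def cost_per_million_tokens(price_per_gpu_hour: float, tokens_per_second: float) -> float:
    """Return the cost in dollars of generating one million tokens on one GPU.

    Assumption (not from the source): the GPU sustains `tokens_per_second`
    for the entire billed hour, with no idle time.
    """
    tokens_per_hour = tokens_per_second * 3600
    return price_per_gpu_hour / tokens_per_hour * 1_000_000


# Rounded figures quoted in the article.
hopper = cost_per_million_tokens(price_per_gpu_hour=1.41, tokens_per_second=90)
blackwell = cost_per_million_tokens(price_per_gpu_hour=2.65, tokens_per_second=6000)

price_ratio = 2.65 / 1.41        # how much more a Blackwell GPU-hour costs
throughput_ratio = 6000 / 90     # how many more tokens per second it produces
cost_ratio = hopper / blackwell  # how much cheaper each token becomes

print(f"Hopper:    ${hopper:.2f} per million tokens")
print(f"Blackwell: ${blackwell:.2f} per million tokens")
print(f"Price ratio: {price_ratio:.2f}x, throughput ratio: {throughput_ratio:.1f}x")
print(f"Per-token cost reduction: {cost_ratio:.1f}x")
```

With these rounded inputs the sketch gives about $4.35 per million tokens for Hopper and about $0.12 for Blackwell, a reduction of roughly 35x. The small gap from the quoted $4.20 presumably reflects rounding in the published price and throughput figures.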

The $0.12 figure assumes full software optimization, including FP4 low-precision inference and multi-token prediction (MTP). Without MTP enabled, the cost is approximately $2.35 per million tokens; with it enabled, the cost drops to about $0.11, a roughly 21x reduction attributable to that feature alone.


