DeepSeek V4 Launches with 1M Context Window; Huawei Ascend and Cambricon Chips Achieve Full Compatibility

Gate News message, April 24 — DeepSeek V4-Pro and DeepSeek V4-Flash were officially released and open-sourced on April 24, with context processing length significantly expanded from 128K to 1M, representing nearly a 10-fold capacity increase. Huawei Computing announced that its Ascend supernode products fully support DeepSeek V4 series models through close collaboration between chip and model technologies.

Huawei Ascend 950 achieves high-throughput, low-latency DeepSeek V4 model inference deployment through fused kernel and multi-stream parallelism techniques to reduce Attention computation and memory access overhead. For DeepSeek V4-Pro with 8K input, Ascend 950 achieves approximately 20ms TPOT with 4,700 TPS single-card Decode throughput; for DeepSeek V4-Flash under 8K input, it reaches approximately 10ms TPOT with 1,600 TPS throughput. Ascend A3 supernode series also achieves full compatibility, with training reference implementations provided for rapid fine-tuning. Based on Ascend A3 64-card supernode with large EP mode, DeepSeek V4-Flash achieves over 2,000 TPS single-card Decode throughput in 8K/1K input-output scenarios using vLLM inference engine. Huawei’s full Ascend A2, A3, and 950 product lines support both DeepSeek V4-Flash and V4-Pro.

Huawei Cloud announced first-mover compatibility with DeepSeek V4, providing developers with one-click API token services through its MaaS platform. Huawei Cloud optimized system layer, operator layer, and cluster layer capabilities to ensure rapid model adaptation and high-performance deployment. Enterprises including Kingsoft WPS and 360 have already integrated DeepSeek’s new model via Huawei Cloud.

Cambricon also announced Day 0 compatibility with DeepSeek V4-Flash and V4-Pro based on the vLLM inference framework, with adaptation code open-sourced to the GitHub community. Cambricon previously achieved first-mover adaptation when DeepSeek V3.2 was released last year, having conducted deep software-hardware collaborative performance optimization on DeepSeek series models.

Disclaimer: The information on this page may come from third parties and does not represent the views or opinions of Gate. The content displayed on this page is for reference only and does not constitute any financial, investment, or legal advice. Gate does not guarantee the accuracy or completeness of the information and shall not be liable for any losses arising from the use of this information. Virtual asset investments carry high risks and are subject to significant price volatility. You may lose all of your invested principal. Please fully understand the relevant risks and make prudent decisions based on your own financial situation and risk tolerance. For details, please refer to Disclaimer.

Related Articles

Google Plans to Invest Up to $40 Billion in Anthropic, Pledges 5+ Gigawatts of Computing Power

Gate News message, April 24 — According to market sources, Google plans to invest up to $40 billion in Anthropic. The investment will begin with an initial $10 billion in cash, followed by up to an additional $30 billion contingent on achieving specified milestones. Google will also provide Anthrop

GateNews13m ago

Swiss Regulator FINMA Warns That Anthropic's Mythos AI Tool Poses Financial Risk

Gate News message, April 24 — Switzerland's Financial Market Supervisory Authority (FINMA) has stated that allowing banks to rapidly and conveniently adopt Anthropic's artificial intelligence tool Mythos would pose serious risks to the

GateNews35m ago

Fere AI Completes $1.3M Funding Round Led by Ethereal Ventures

Gate News message, April 24 — Fere AI, an AI-powered digital asset trading agent platform, announced the completion of a $1.3 million funding round led by Ethereal Ventures, with Galaxy Vision Hill and Kosmos Ventures participating. The platform supports cross-chain networks including Ethereum,

GateNews1h ago

Anthropic Rolls Back Claude Code Changes After Quality Decline; All Fixes Complete

Gate News message, April 24 — Anthropic has acknowledged a recent decline in Claude Code quality and confirmed that all related issues have been resolved through rollbacks and fixes. The problems stemmed from three product and prompt adjustments made between early and mid-April. On March 4, the

GateNews3h ago

NeoSoul Co-Founder Kaelan: AI Industry Should Allow Toys to Exist, Innovation Often Starts as Experimental Products

Gate News message, April 24 — At a recent Hong Kong forum on intelligent encrypted finance, NeoSoul co-founder Kaelan shared insights on evaluating AI projects in the early-stage, rapidly evolving AI industry. Beyond assessing current products, teams must demonstrate the ability to keep pace with un

GateNews3h ago

Meta to Deploy Tens of Millions of AWS Graviton5 Chips in Multi-Year Billion-Dollar Deal

Gate News message, April 24 — Meta has signed a multi-year agreement with Amazon Web Services to deploy tens of millions of Graviton5 processors for its AI infrastructure, making Meta one of AWS's largest Graviton customers globally. The deal, spanning three to five years, is valued at billions of d

GateNews3h ago
Comment
0/400
No comments