Huawei Is Shipping 750,000 AI Chips That Outrun the Best GPU Nvidia Can Sell in China, and ByteDance Wrote a $5.6 Billion Check to Prove It

By Aryan Mehta · AI & Tools Editor at The Deep Wire

The most important chip launch of 2026 didn’t happen in Santa Clara. It happened in Shenzhen. Huawei’s Ascend 950PR entered mass production in March, and the numbers coming out of China should make Jensen Huang lose sleep. ByteDance alone has committed $5.6 billion in orders. Alibaba is piling in. And the chip itself? It delivers 1.56 petaflops of AI inference performance, roughly 2.8 times the FP4 throughput of Nvidia’s H20, the best chip the US government still allows into China.

This isn’t a prototype announcement or a conference slide deck. Huawei is shipping 750,000 units this year. The Ascend 950PR is real silicon, in real data centers, running real workloads. And it just fundamentally changed the economics of the US-China chip war.

The Numbers That Break Nvidia’s China Thesis

Nvidia made $17.1 billion from China (including Hong Kong) in fiscal year 2025. That was already down from $20.3 billion the year before, thanks to export controls that killed the A100 and H100 in that market. Nvidia’s answer was the H20 — a deliberately neutered chip designed to comply with US restrictions while still giving Chinese customers something.

The problem: the H20 was always a stopgap, and Huawei just made it obsolete. Analyst consensus now projects Nvidia’s China revenue dropping to $12–14 billion in fiscal 2026 as the 950PR absorbs demand that would otherwise flow to the H20. Measured against fiscal 2025’s $17.1 billion, that is a revenue hole of roughly $3–5 billion. Measured against the $20.3 billion of the year before, it is $6–8 billion.

Follow the money: Huawei expects AI chip revenue to hit $12 billion in 2026, up 60% from $7.5 billion in 2025. In other words, Huawei’s projected $4.5 billion gain is on the same scale as what Nvidia stands to lose in China. This isn’t a rising tide lifting both boats. This is a zero-sum replacement cycle, and Huawei is winning.
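
For readers who want to check the arithmetic, here is a minimal Python sketch that uses only the figures quoted above. The variable and function names are illustrative, and the inputs are this article’s reported and projected numbers, not audited financials.

```python
# All figures are in billions of US dollars, as quoted in the article.
NVIDIA_CHINA_FY2024 = 20.3
NVIDIA_CHINA_FY2025 = 17.1
NVIDIA_CHINA_FY2026_RANGE = (12.0, 14.0)  # projected (low, high)
HUAWEI_AI_2025 = 7.5
HUAWEI_AI_2026 = 12.0


def shortfall(baseline, projected_range):
    """Return the (smallest, largest) drop from a baseline to a projected range."""
    low, high = projected_range
    return (round(baseline - high, 1), round(baseline - low, 1))


def growth_pct(old, new):
    """Return percentage growth from old to new, rounded to one decimal."""
    return round((new - old) / old * 100, 1)


hole_vs_fy2025 = shortfall(NVIDIA_CHINA_FY2025, NVIDIA_CHINA_FY2026_RANGE)
hole_vs_fy2024 = shortfall(NVIDIA_CHINA_FY2024, NVIDIA_CHINA_FY2026_RANGE)
huawei_gain = round(HUAWEI_AI_2026 - HUAWEI_AI_2025, 1)
huawei_growth = growth_pct(HUAWEI_AI_2025, HUAWEI_AI_2026)

print(f"Nvidia shortfall vs FY2025: ${hole_vs_fy2025[0]}B to ${hole_vs_fy2025[1]}B")
print(f"Nvidia shortfall vs FY2024: ${hole_vs_fy2024[0]}B to ${hole_vs_fy2024[1]}B")
print(f"Huawei gain: ${huawei_gain}B ({huawei_growth}% growth)")
```

Against fiscal 2025 the projected drop works out to $3.1–5.1 billion, against the prior year $6.3–8.3 billion, and Huawei’s forecast implies a $4.5 billion gain, which is the 60% growth cited.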

The CUDA Problem That Wasn’t

For years, the conventional wisdom was that Nvidia’s real moat wasn’t hardware — it was CUDA, the software ecosystem that every AI researcher on Earth learned to code in. Switching from Nvidia to anything else meant rewriting your entire stack. It was the strongest lock-in in all of computing.

Huawei just picked that lock. The Ascend 950PR ships with a CUDA-compatible software stack that dramatically lowers migration barriers. Chinese AI labs no longer have to choose between performance and keeping their existing codebases. They get both. ByteDance’s $5.6 billion commitment isn’t just a hardware purchase; it’s a vote of confidence that Huawei’s software stack actually works at production scale.

The killer detail buried in the spec sheets: the 950PR runs the same model architectures, the same training frameworks, the same inference pipelines. DeepSeek V4’s launch reportedly drove a surge in Ascend 950 orders, because Chinese AI labs realized they could run frontier models on domestic hardware without meaningful performance penalties. The CUDA moat, it turns out, was a speed bump.

Who Gets Hurt — And It’s Not Just Nvidia

The obvious loser here is Nvidia’s China business, but the second-order effects are what matter. The total addressable market for AI accelerators in China is projected to reach $30–35 billion in 2026, according to TrendForce and SemiAnalysis estimates. If Huawei captures even 40% of that, or $12–14 billion, it becomes one of the five largest semiconductor companies on the planet by AI revenue alone.
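
The share math is easy to sanity-check. The sketch below is illustrative only: it takes the market-size range and the 40% share from the paragraph above, plus Huawei’s own $12 billion forecast from earlier in the piece, and the names are made up for the example.

```python
# Figures in billions of US dollars, as quoted in the article.
TAM_2026_RANGE = (30.0, 35.0)   # projected China AI accelerator market (low, high)
HYPOTHETICAL_SHARE = 0.40       # the "even 40%" scenario
HUAWEI_AI_2026_FORECAST = 12.0  # Huawei's own 2026 revenue expectation


def implied_revenue(tam_range, share):
    """Revenue implied by a market share at each end of the market-size range."""
    return tuple(round(tam * share, 1) for tam in tam_range)


def implied_share_pct(revenue, tam_range):
    """Share (in percent) that a revenue figure represents at each end of the range."""
    low, high = tam_range
    # A bigger market means the same revenue is a smaller share.
    return (round(revenue / high * 100, 1), round(revenue / low * 100, 1))


revenue_at_40 = implied_revenue(TAM_2026_RANGE, HYPOTHETICAL_SHARE)
share_of_forecast = implied_share_pct(HUAWEI_AI_2026_FORECAST, TAM_2026_RANGE)

print(f"40% share implies ${revenue_at_40[0]}B to ${revenue_at_40[1]}B")
print(f"$12B forecast implies {share_of_forecast[0]}% to {share_of_forecast[1]}% share")
```

In other words, the 40% scenario is not a stretch case: Huawei’s own forecast already implies a share of roughly 34% to 40% of the projected market.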

That changes the geopolitical calculus entirely. The entire theory behind US export controls was that restricting advanced chips would slow China’s AI development by 2–3 years. Instead, the controls created a $12 billion annual market for Huawei that didn’t exist three years ago. The sanctions didn’t kill China’s AI chip industry. They funded it.

Meanwhile, Huawei is already planning the Ascend 950DT for Q4 2026 — an upgraded version that will likely close whatever performance gaps remain. The development cycle is accelerating, not slowing down. Every quarter of export controls gives Huawei another quarter of captive domestic demand to fund R&D.

The Translation: What This Means for the AI Race

Here’s the quiet part nobody in Washington wants to say out loud: China no longer needs Nvidia to build frontier AI. Not for inference, which is where the real money is in 2026. Not for the workloads that actually run in production. The training gap still exists — Nvidia’s B200 and GB200 clusters remain ahead for training trillion-parameter models from scratch — but inference is 80% of commercial AI compute, and Huawei just became price-competitive there.

The numbers tell the story. ByteDance isn’t ordering 950PRs because of patriotism or government pressure. It’s ordering them because the total cost of ownership works. When you can get 2.8x the inference throughput of the H20 on domestic silicon with a compatible software stack, the procurement decision makes itself.
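
A back-of-envelope sketch shows what those multiples imply, using only numbers from this article. It assumes peak spec-sheet throughput scales linearly across units, which real deployments rarely achieve. The H20 baseline is derived from the 2.8x ratio rather than taken from an Nvidia spec sheet. All names are illustrative.

```python
# Inputs quoted in the article.
ASCEND_950PR_PFLOPS = 1.56   # peak inference throughput per chip, in petaflops
RATIO_VS_H20 = 2.8           # claimed throughput multiple over the H20
UNITS_2026 = 750_000         # planned 950PR shipments this year

# Derived values (assumption: linear scaling of peak throughput).
implied_h20_pflops = round(ASCEND_950PR_PFLOPS / RATIO_VS_H20, 3)
h20_equivalents = round(UNITS_2026 * RATIO_VS_H20)
aggregate_exaflops = round(UNITS_2026 * ASCEND_950PR_PFLOPS / 1000, 1)

print(f"Implied H20 baseline: {implied_h20_pflops} petaflops")
print(f"H20-equivalent units: {h20_equivalents:,}")
print(f"Aggregate peak throughput: {aggregate_exaflops:,} exaflops")
```

On those assumptions, this year’s 750,000 chips are worth about 2.1 million H20s in peak inference throughput. Price and power figures are not given here, so the sketch says nothing about total cost of ownership itself.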

And that’s what makes this different from every previous “China chip challenger” story. This isn’t Huawei announcing vaporware at a government-sponsored conference. This is the largest consumer internet company in China writing a check bigger than the GDP of some countries because the product actually delivers.

The Verdict

The US spent three years trying to strangle China’s AI chip supply. The result: Huawei built a chip that’s faster than anything Nvidia is allowed to sell in China, made it software-compatible enough that switching costs are minimal, and secured enough orders to fund two more generations of development. Nvidia’s China revenue is in structural decline. Huawei’s is in structural ascent. The export control strategy didn’t fail quietly — it failed at $12 billion a year, in public, with receipts.

750,000 chips. $5.6 billion from ByteDance alone. CUDA compatibility. And an upgrade already in the pipeline. The chip war isn’t over, but the scoreboard just flipped.
