Huawei AI Chips: What Ascend Means for China’s Compute Stack

May 31, 2026

Artificial intelligence at scale requires enormous computing resources. In recent years, the United States has limited exports of advanced graphics processors to China, prompting Chinese firms to invest in domestic alternatives.

The Huawei AI chip story centers on the Ascend series, designed by Huawei’s HiSilicon arm. These are data-center accelerators rather than consumer-grade components, and they represent China’s attempt to build an indigenous AI hardware and software stack resilient to sanctions and supplier risk.

As Beijing encourages self‑reliance, Huawei has become the country’s flagship supplier of AI accelerators. This article explores the latest generation of Ascend chips, the surrounding ecosystem, and the implications for China’s AI ambitions.

Why Huawei AI Chips Matter Now

[Image: Huawei executive presenting the Ascend 910 AI processor on stage during a product launch event]

The importance of Huawei AI chips comes from timing and necessity. Chinese AI companies face limited access to advanced Nvidia GPUs due to export restrictions. As a result, they are moving quickly toward domestic alternatives.

Recent developments confirm this shift:

Chinese tech giants are actively securing Huawei chips for AI workloads
AI models like DeepSeek V4 are now optimized to run on Huawei hardware
Huawei expects AI chip revenue to grow significantly in 2026 due to rising demand

This demand signals something deeper. Huawei is no longer a backup option. It is becoming a primary computing provider inside China.

What Huawei Ascend Actually Represents

Huawei introduced Ascend AI accelerators in 2018, but deployment scaled up only recently, after export restrictions limited access to foreign GPUs. The current mainstream device is the Huawei Ascend 910C.

Built on SMIC’s enhanced 7nm process, the 910C delivers about one-third of the BF16 throughput of Nvidia’s B200. To compensate, Chinese companies scale horizontally by clustering large numbers of chips. For example, DeepSeek R1 was trained on Nvidia H800 GPUs but runs inference on Huawei 910C clusters, showing strong support for inference workloads.

Despite using older HBM2E memory, the 910C reaches about 80 percent of Nvidia H20’s memory bandwidth. However, it remains two generations behind leading memory technology. This gap affects reasoning models where bandwidth is critical.
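To see why bandwidth matters so much for reasoning models, here is a rough sketch of the usual bandwidth-bound ceiling on token generation: each new token requires streaming the active weights from memory, so the ceiling is bandwidth divided by bytes read per token. The 80 percent ratio comes from the paragraph above. The H20 bandwidth of 4.0 TB/s and the 70 GB weight size are assumptions added for illustration, not figures from this article.

```python
# Bandwidth-bound ceiling on single-stream decode speed.
# Assumption: every generated token streams the active weights from memory once.

def decode_tokens_per_second(bandwidth_tb_s: float, weight_bytes: float) -> float:
    """Upper bound: bytes per second available divided by bytes read per token."""
    return bandwidth_tb_s * 1e12 / weight_bytes

# Assumed values (not from the article): H20 bandwidth and a hypothetical model size.
H20_BANDWIDTH_TB_S = 4.0
WEIGHT_BYTES = 70e9  # hypothetical: 70 GB of active weights

# "About 80 percent" of H20 bandwidth, per the article.
ASCEND_910C_BANDWIDTH_TB_S = 0.8 * H20_BANDWIDTH_TB_S

h20_ceiling = decode_tokens_per_second(H20_BANDWIDTH_TB_S, WEIGHT_BYTES)
ascend_ceiling = decode_tokens_per_second(ASCEND_910C_BANDWIDTH_TB_S, WEIGHT_BYTES)

print(f"H20 ceiling:  {h20_ceiling:.1f} tokens/s")
print(f"910C ceiling: {ascend_ceiling:.1f} tokens/s")
print(f"ratio: {ascend_ceiling / h20_ceiling:.2f}")
```

Whatever model size is plugged in, the ratio between the two ceilings tracks the bandwidth ratio. Long reasoning traces generate many tokens, so this gap compounds over a full response.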

Software remains a key constraint. Nvidia’s CUDA has nearly two decades of ecosystem support, while Huawei’s CANN is newer and less stable. Developers must rewrite code to move off CUDA, which slows adoption.

To address this, Huawei is open-sourcing CANN. The company announced plans to release all CANN operators on GitCode by September 2025.

Training Clusters and Infrastructure Scale

The real strength of Huawei Ascend chips becomes visible at the cluster level.

China has already deployed large-scale AI clusters built on Ascend chips:

A ten-thousand-card cluster has been activated using Huawei hardware
This system delivers massive compute capacity for training workloads

These clusters reveal a clear pattern. Instead of relying on a few high-performance GPUs, China is scaling compute horizontally using domestic chips.

This approach changes how performance is measured. It moves from single-chip benchmarks to cluster-level throughput.
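A minimal sketch of what cluster-level accounting looks like, assuming a simple model where aggregate throughput is chip count times per-chip throughput times a scaling efficiency factor. The one-third per-chip figure is the article's 910C estimate relative to Nvidia's B200. The 90 percent efficiency and the target of 1,000 B200-equivalents are hypothetical, and real efficiency depends on interconnect, workload, and software.

```python
import math

def cluster_throughput(chips: int, per_chip: float, scaling_efficiency: float) -> float:
    """Aggregate throughput, in the same units as per_chip."""
    return chips * per_chip * scaling_efficiency

def chips_for_parity(target: float, per_chip: float, scaling_efficiency: float) -> int:
    """Smallest whole number of chips whose aggregate throughput reaches target."""
    return math.ceil(target / (per_chip * scaling_efficiency))

PER_CHIP = 1 / 3   # relative to one B200, per the article's 910C estimate
EFFICIENCY = 0.9   # hypothetical scaling efficiency

# A ten-thousand-card cluster, expressed in B200-equivalents.
print(round(cluster_throughput(10_000, PER_CHIP, EFFICIENCY)))  # 3000

# Chips needed to match a hypothetical 1,000 B200-equivalents.
print(chips_for_parity(1000, PER_CHIP, EFFICIENCY))  # 3334
```

The efficiency term carries most of the uncertainty. A weaker chip only closes the gap if the interconnect keeps that factor high as the cluster grows.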

System‑Level Innovation: CloudMatrix 384

[Image: Huawei CloudMatrix data center system showing large-scale AI compute clusters for model deployment]

Huawei’s hardware strategy does not rely solely on individual chip performance. In July 2025, the company showcased the CloudMatrix 384 system, a rack‑scale supernode containing 384 Ascend 910C chips. Analysts described it as a direct competitor to Nvidia’s GB200 NVL72 system.

According to SemiAnalysis, the CloudMatrix system outperforms Nvidia’s NVL72 on some metrics despite using weaker individual chips. The performance improvement stems from Huawei’s system design, which uses a supernode architecture that allows chips to interconnect at very high speeds.
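A back-of-envelope check shows how weaker chips can still add up to a stronger system. It uses only figures from this article: 384 chips per CloudMatrix, 72 GPUs per NVL72, and roughly one-third of a B200's BF16 throughput per 910C. It assumes each NVL72 GPU is B200-class and ignores interconnect and scaling losses, so treat the result as a rough illustration rather than a benchmark.

```python
from fractions import Fraction

ASCEND_CHIPS = 384               # chips in one CloudMatrix 384
NVL72_GPUS = 72                  # GPUs in one NVL72 (assumed B200-class)
RELATIVE_BF16 = Fraction(1, 3)   # 910C vs B200 per-chip BF16, per the article

# CloudMatrix capacity expressed in B200-equivalents.
cloudmatrix_equiv = ASCEND_CHIPS * RELATIVE_BF16

# System-level ratio and the chip-count ratio needed to achieve it.
system_ratio = cloudmatrix_equiv / NVL72_GPUS
chip_count_ratio = Fraction(ASCEND_CHIPS, NVL72_GPUS)

print(f"B200-equivalents: {float(cloudmatrix_equiv):.0f}")        # 128
print(f"System BF16 ratio: {float(system_ratio):.2f}x")           # 1.78x
print(f"Chips used per NVL72 GPU: {float(chip_count_ratio):.2f}")  # 5.33
```

On these assumptions the system comes out ahead on aggregate BF16 while using more than five times as many chips. That is consistent with "outperforms on some metrics" and also hints at the power and cost tradeoffs.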

Rather than chasing single‑chip supremacy, Huawei achieves parity by connecting more chips and innovating at the system level. Nvidia’s chief executive, Jensen Huang, acknowledged that Huawei has been moving quickly, citing CloudMatrix as evidence.

This system‑level strategy illustrates how domestic compute infrastructure is evolving. By designing its own interconnect protocols and memory pooling, Huawei reduces dependence on foreign technologies. The CloudMatrix platform is already operational on Huawei’s cloud. It serves as a reference architecture for Chinese data centers deploying large language models and generative AI services.

Next Generation Hardware: Ascend 950 and 950PR

[Image: Huawei presentation slide showing Ascend 950DT chip specifications for AI training and inference performance]

Huawei has moved beyond the Ascend 910C with the 2026 launch of the Ascend 950PR. This chip delivers 1.56 petaflops of FP4 compute, targeting large-scale inference workloads where low-precision processing drives efficiency.

The architecture reflects a clear focus on throughput and scalability:

112 GB of HiBL 1.0 memory
1.4 terabytes per second memory bandwidth
2 terabytes per second interconnect via LingQu protocol

These specifications enable efficient scaling across large clusters. Huawei has also improved software compatibility: around 80 percent of standard PyTorch inference workloads can run with minor adjustments through its CANN framework, whose programming model resembles CUDA’s.
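One way to read the compute and bandwidth figures together is the standard roofline model, where attainable throughput is the lesser of peak compute and memory bandwidth times arithmetic intensity (operations per byte moved). The sketch below takes the 1.56 petaflops and 1.4 TB/s figures above at face value. The two example intensities are hypothetical, chosen only to show both sides of the ridge.

```python
# Roofline sketch for the 950PR using the article's headline figures.
PEAK_FLOPS = 1.56e15       # 1.56 petaflops FP4, per the article
MEMORY_BANDWIDTH = 1.4e12  # 1.4 TB/s, per the article

# Arithmetic intensity (FLOP per byte) at which the chip stops being
# bandwidth-bound and becomes compute-bound.
ridge_point = PEAK_FLOPS / MEMORY_BANDWIDTH

def attainable_flops(intensity: float) -> float:
    """Roofline ceiling for a kernel with the given FLOP-per-byte intensity."""
    return min(PEAK_FLOPS, MEMORY_BANDWIDTH * intensity)

print(f"ridge point: {ridge_point:.0f} FLOP/byte")

# Hypothetical intensities: a low-intensity decode-like kernel
# and a high-intensity large-batch matrix multiply.
for intensity in (2.0, 2000.0):
    share = attainable_flops(intensity) / PEAK_FLOPS
    print(f"intensity {intensity:>6.0f}: {share:.2%} of peak")
```

A ridge point above 1,000 FLOP per byte means the headline FP4 number is only reachable with large batches or heavy data reuse. Low-intensity kernels are limited by the 1.4 TB/s figure instead.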

Demand accelerated after DeepSeek optimized its V4 model for Huawei hardware. Major Chinese firms moved quickly to secure supply, confirming a shift toward domestic compute infrastructure.

The chip also supports low-precision formats such as FP4, which allow more operations per second at lower cost. This design targets real-world deployment rather than peak benchmark performance.
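Low precision also stretches memory capacity. As a rough sketch, the snippet below estimates how many model parameters fit in the 950PR's 112 GB at different formats. It assumes 1 GB is 1e9 bytes and counts weights only, ignoring KV cache, activations, and any per-block scaling metadata that real FP4 schemes carry, so actual capacity would be lower.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "fp4": 0.5}

MEMORY_BYTES = 112e9  # 112 GB per chip, per the article (assuming 1 GB = 1e9 bytes)

def max_parameters(memory_bytes: float, fmt: str) -> float:
    """Weights-only upper bound on parameter count for the given format."""
    return memory_bytes / BYTES_PER_PARAM[fmt]

for fmt in BYTES_PER_PARAM:
    billions = max_parameters(MEMORY_BYTES, fmt) / 1e9
    print(f"{fmt}: up to {billions:.0f}B parameters")
# bf16: up to 56B parameters
# fp8: up to 112B parameters
# fp4: up to 224B parameters
```

Halving the bytes per weight also halves the memory traffic per token, which is why FP4 helps on bandwidth-limited inference as well as on raw operation counts.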

There are still tradeoffs. The 950PR trails Nvidia’s top chips in memory bandwidth and operates at higher power levels. Manufacturing constraints also shape its design. Without access to advanced packaging technologies, Huawei uses a monolithic approach to maintain production stability.

Even with these constraints, the 950PR delivers competitive pricing and sufficient performance for large-scale inference. Huawei plans to ship hundreds of thousands of units in 2026, reinforcing its role as a primary compute supplier inside China.

Adoption and Market Dynamics

Adoption of Huawei AI chips is rising across China’s AI sector. Firms such as iFlytek, SenseTime, and China Mobile use Ascend chips for training, while DeepSeek runs R1 and V4 inference on Ascend hardware.

Procurement is scaling quickly. ByteDance plans to spend $5.6 billion on Huawei chips in 2026. Alibaba and Tencent have also placed large orders. Hyperscaler demand could reach $12-$15 billion.

Adoption remains uneven. Many firms still rely on Nvidia for training due to stronger software support and higher bandwidth. The Huawei Ascend 910C uses older memory, which limits training performance.

Software migration slows adoption. Developers must adapt code and lose access to mature CUDA tools. Some companies continue sourcing Nvidia GPUs through secondary markets.

Supply constraints persist. Export controls restrict SMIC production, which limits chip availability. Most companies use a hybrid model. Training stays on Nvidia systems, while inference shifts to Huawei AI chips.

Geopolitical Context and Implications

[Image: Close-up of an Nvidia GPU chip on a motherboard used for AI training and high-performance computing]

The rise of Huawei AI chip production is intertwined with geopolitical tensions. U.S. export controls, which began in 2022, restricted chips with compute and bandwidth above certain thresholds. Nvidia responded with cut‑down models like the H800 and H20 for the Chinese market, but further tightening in 2023 and 2024 made even those chips difficult to obtain.

Huawei, on the U.S. Entity List since 2019, accelerated investment in domestic fabrication and packaging. Observers note that rather than hindering China’s AI aspirations, export controls have incentivized a push toward self‑sufficiency.

The emergence of the Ascend series also signals the creation of a separate Chinese AI stack. Huawei’s chips, CANN, and the MindSpore framework form a hardware–software ecosystem distinct from Nvidia’s CUDA. This divergence could lead to differences in model characteristics and developer workflows.

As domestic accelerators improve and more Chinese firms adopt them, the gap between the Chinese and global AI ecosystems may widen, potentially limiting interoperability and increasing barriers to collaboration. Analysts warn that U.S. policy must carefully calibrate export controls; overly restrictive measures could accelerate China’s self‑sufficiency and reduce U.S. influence.

Understand China’s AI Stack Before It Becomes the Default

Huawei’s push into AI chips is not an isolated story. It reflects a deeper shift in how China is building a full, vertically integrated technology stack, from chips to cloud to applications.

At ChoZan, we help global leaders decode these shifts and turn them into a strategic advantage.

Executive briefings on China’s AI and computing landscape
Deep-dive research on companies like Huawei, Alibaba, and DeepSeek
China innovation tours with direct exposure to real-world deployments
Advisory support to translate insights into actionable strategy

If you want to understand how China’s AI infrastructure is evolving and what it means for your business, this is where it starts.

Book a consultation with ChoZan to explore how these shifts impact your strategy.

Frequently Asked Questions

How do Huawei AI chips perform in real-world enterprise deployments?

Huawei AI chips perform best in large-scale, optimized environments where infrastructure is built around the Ascend architecture. Enterprises see stronger results in inference-heavy workloads, especially when paired with Huawei Cloud and pre-optimized models.

Are Huawei AI chips available outside China?

Huawei AI chips have limited availability outside China due to export restrictions and geopolitical concerns. Most deployments remain within China or in closely aligned markets where regulatory risk is lower.

What industries are adopting Huawei AI chips fastest?

Finance, telecom, energy, and government sectors are adopting Huawei AI chips most aggressively. These industries benefit from policy support and require secure, domestic compute infrastructure for large-scale AI deployment.

What is the Huawei Ascend 910C?

It is a seven‑nanometer AI accelerator produced by Huawei’s HiSilicon design arm. Each chip delivers roughly one‑third of Nvidia’s B200 BF16 throughput and offers about 80 percent of the memory bandwidth of Nvidia’s H20.

How does the Ascend 950PR compare to Nvidia GPUs?

The 950PR provides 1.56 petaflops of FP4 compute and features 112 gigabytes of proprietary memory with 1.4 terabytes per second of bandwidth. It outperforms Nvidia’s H20 in FP4 throughput but remains behind the B200 in memory bandwidth and power efficiency.

Why are Chinese firms adopting Huawei chips?

U.S. export controls restrict the availability of Nvidia’s most advanced GPUs. Chinese companies like DeepSeek, ByteDance, and Alibaba are turning to Huawei as a domestic alternative, ordering hundreds of thousands of Ascend chips for inference and, increasingly, for training.

Does CANN support existing AI frameworks?

Huawei’s CANN Next introduces a programming model similar to CUDA, enabling about 80 percent of standard PyTorch inference code to run with minimal changes. Training workloads and custom kernels still require more adaptation.

Will Huawei’s chips replace Nvidia’s in China?

It is too early to tell. Ascend chips are improving quickly, and the 950PR offers competitive inference performance. However, Nvidia still holds advantages in memory bandwidth, ecosystem maturity, and training performance. Many Chinese firms continue to use Nvidia GPUs for training and adopt Huawei chips mainly for inference.

What are the main risks of adopting Huawei AI chips for enterprises?

Key risks include software compatibility challenges, ecosystem maturity, and geopolitical exposure. Companies must also consider long-term support, talent availability, and integration complexity before adoption.

How are Chinese AI startups adapting to Huawei’s chip ecosystem?

Startups are increasingly designing models and applications around domestic hardware constraints. This leads to more efficient architectures that work within the bandwidth and compute limits of Huawei AI chips.
