GLM-5.1 — zero NVIDIA, trained on 100,000 Huawei Ascend chips
Z.ai's GLM-5.1 was trained on 100,000 Huawei Ascend 910B chips, with zero NVIDIA GPUs. A near-frontier model proving hardware independence is real.
- [01] The Late-April 2026 Chinese LLM Stack — Dev Community 2026-05-08
- [02] DeepSeek V4 vs Kimi K2.6 vs GLM-5.1 — AI Stack Choice 2026-05-08
- [03] State of AI: May 2026 — Air Street Press 2026-05-08
With GLM-5.1, one detail deserves more attention than the model's output: Z.ai trained it on 100,000 Huawei Ascend 910B accelerators, with zero NVIDIA GPUs. A near-frontier model trained on China's domestic silicon is a signal that could shift the balance of hardware supply over the next five years.
What was announced
Z.ai (formerly Zhipu AI) announced GLM-5.1 on March 27, then released the open weights on Hugging Face on April 8. The model scored 1535 on the GDPval-AA benchmark, just below DeepSeek V4 Pro (1554) and above Kimi K2.6 (1484). The headline, however, is the training infrastructure: all pre-training and post-training was done on Huawei's Ascend 910B AI accelerators, with zero CUDA dependency. That is Z.ai's public claim, backed by extensive Huawei press materials.
What changed
Three layers of difference:
Hardware side: Until now, the prevailing assumption was that "the Chinese AI ecosystem can't operate without NVIDIA," and it held through the H800 export restrictions and what came after. GLM-5.1 broke that thesis: training a near-frontier model on domestic silicon is possible, at least in engineering terms.
Software side: The MindSpore + CANN (Compute Architecture for Neural Networks) stack, the alternative to PyTorch + CUDA, was tested for the first time at "1T parameters + 100K accelerators" scale, not as a claim on paper but in a real production training run.
Strategic side: The Chinese side's message, "we can do frontier on our own stack," speaks to both domestic and international markets. GLM-5.1 becomes a reference design for "AI works without NVIDIA" sales pitches in the Middle East, Southeast Asia, and Latin America.
First impressions
I haven't tested GLM-5.1 myself in depth — Hugging Face Inference Endpoint's Ascend backend is not yet public, and the official API endpoint doesn't provide consistent access from Turkish IPs. Community benchmarks describe it as a general-purpose model at "GPT-4 class" level, but with no clearly differentiating feature. The technical detail I'm following this time isn't the model — it's the silicon story behind it.
As an indie maker, I'm watching this through one lens: NVIDIA H100 / H200 prices and lead times are out of reach for a solo developer anyway. But on a 5-10 year horizon, if Ascend 910B-class silicon pushes cloud rental prices down, it opens the door to local fine-tuning and self-hosting. That raises the chance of running models I own, independent of closed APIs, behind "AI-native" features in Cubitz- and Konnex-class products.
Practical impact
For developers: Direct impact today is limited. There is no practical Western-market infrastructure for testing GLM-5.1 yet; most users will keep running DeepSeek V4 or Kimi K2.6 over the NVIDIA + CUDA route.
For indie makers: A trend to watch, not act on yet. But over the next 18 months, Ascend cloud providers (Huawei Cloud, China Mobile Cloud) are likely to enter the global market with aggressive pricing, and at that point the math changes.
For Turkey: An interesting note. Turkey's AI infrastructure is 95%+ NVIDIA-dependent, and the FX wall is serious. Trial deployments of alternative silicon such as Ascend in Turkish academia and large enterprises look close.
Limits and concerns
The "zero NVIDIA" claim demands transparency. Independently verifying that no part of Z.ai's training ran on NVIDIA hardware is hard. The claim is strong, but there has been no independent audit yet.
Performance gap is still real. Training efficiency (FLOPS/Watt, time-to-train) of Ascend 910B + CANN is reported at roughly 40-60% of H100 + CUDA. Taking the midpoint, 100K Ascend chips amount to roughly 50K effective H100s (40K-60K across the reported range). The gap is closeable with engineering work, but not yet closed.
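The arithmetic behind that "effective H100" figure is easy to redo with your own numbers. The sketch below is a back-of-envelope conversion only. It treats the reported 40-60% efficiency as a flat per-chip throughput ratio, which is my simplifying assumption (it ignores interconnect, utilization, and power draw). The function name and the integer-percent parameter are made up for illustration.

```python
def effective_h100_equivalents(ascend_chips: int, efficiency_pct: int) -> int:
    """Convert an Ascend chip count into rough H100-equivalents.

    efficiency_pct is the assumed per-chip training efficiency of
    Ascend 910B + CANN relative to H100 + CUDA, as a whole percent.
    This is a flat ratio (an assumption), not a measured scaling law.
    """
    if not 0 < efficiency_pct <= 100:
        raise ValueError("efficiency_pct must be in the range 1..100")
    # Integer math keeps the result exact for whole-percent inputs.
    return ascend_chips * efficiency_pct // 100


ASCEND_CHIPS = 100_000          # chip count claimed for GLM-5.1 training
REPORTED_RANGE_PCT = (40, 60)   # reported efficiency range vs H100 + CUDA

low_pct, high_pct = REPORTED_RANGE_PCT
mid_pct = (low_pct + high_pct) // 2

for pct in (low_pct, mid_pct, high_pct):
    equivalents = effective_h100_equivalents(ASCEND_CHIPS, pct)
    print(f"{pct}% efficiency -> {equivalents:,} H100-equivalents")
```

Under these assumptions the range works out to 40,000-60,000 H100-equivalents, with 50,000 at the midpoint. Swap in a different percentage if better efficiency numbers get published.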
Software ecosystem risk. The PyTorch + CUDA world has about a decade of accumulated progress (PyTorch dates to 2016, and CUDA itself to 2007). MindSpore + CANN is much younger; community forums discuss stability issues in edge cases. Production-grade infrastructure still lives on the NVIDIA side.
Export / sanctions framework. In parallel with US export restrictions on silicon to China, adoption of the Chinese domestic stack may complicate selling products into certain markets at the contract layer. Read your contracts carefully.
Bottom line
As a model, GLM-5.1 is solid but mid-tier, and that wasn't the topic of this piece. We seem to have entered the year of geopolitical decoupling in AI infrastructure: the claim that "NVIDIA isn't the only path" becomes concrete in 2026. Direct effect on your products today: none. But it is a trend that will shape 2027-2028 cloud prices. For an indie maker: watch, don't act yet.
Sources
The Late-April 2026 Chinese LLM Stack — Dev Community