Zhipu AI breaks US chip reliance with first major model trained on Huawei stack

Zhipu claims GLM-Image achieved industry-leading scores among open-source models for text rendering and Chinese character generation

Zhipu said the achievement proved the feasibility of developing powerful multimodal models without US chips. Photo: Shutterstock Images
Vincent Chow

Chinese artificial intelligence firm Zhipu AI said its new image generation model was trained on chips from Huawei Technologies, making it the first powerful open-source model to be developed on an entirely domestic training stack.

The Beijing-based company, fresh off its Hong Kong initial public offering, said on Wednesday that the achievement proved the feasibility of developing powerful multimodal models without US semiconductors, as Beijing pushes for self-reliance in China’s AI industry amid US export restrictions on cutting-edge chips.

According to Zhipu, the entire training pipeline for GLM-Image, from data preparation to the final training run, was conducted on Huawei’s Ascend Atlas 800T A2 server, which incorporates the company’s in-house Ascend AI processors, using MindSpore, Huawei’s all-in-one machine learning framework.

“We hope this can provide valuable reference for the community to explore the potential of domestic computing power,” Zhipu said.

Powerful multimodal AI models that can natively process text, voice, image and video are widely seen by industry experts as the next frontier of AI.

Huawei’s booth at the World Artificial Intelligence Conference in Shanghai, July 26, 2025. Photo: Costfoto/NurPhoto via Getty Images

Zhipu’s model has a hybrid architecture made up of both autoregressive and diffusion elements, a design that enables the multimodal capabilities pioneered by Google DeepMind’s Nano Banana Pro, which can accurately generate both images and text.
