Alibaba’s Qwen AI models enable low-cost DeepSeek alternatives from Stanford, Berkeley

Stanford’s S1 and Berkeley’s TinyZero are two examples of how researchers are increasingly using Alibaba tech to lower AI training costs

Alibaba’s open-source Qwen2.5 models are increasingly being used by researchers to find ways of training AI models at even lower costs. Photo: Reuters

Ben Jiang in Beijing

Published: 9:00pm, 10 Feb 2025 | Updated: 9:54pm, 10 Feb 2025

The race to produce the cheapest top-performing artificial intelligence (AI) model is heating up following the breakthrough success of China’s DeepSeek. A new reasoning model from US computer scientists, including renowned Chinese-American “AI godmother” Li Feifei, was trained for under US$50 on the back of Alibaba Group Holding’s open-source technology.

The S1 reasoning model was developed on top of the Chinese e-commerce giant’s Qwen2.5-32B-Instruct model by researchers from Stanford University, where Li works, and the University of Washington, according to a research paper published last week.

The capabilities of Alibaba’s model are fresh evidence of how China is narrowing the AI gap with leading US players, following DeepSeek’s release of low-cost, high-performance open-source models that captured global attention. Hong Kong-listed shares of Alibaba, owner of the South China Morning Post, gained 6 per cent on Monday.

After being trained with answers to 1,000 carefully curated questions and the “thinking process” distilled from Google’s Gemini Thinking Experimental model, the S1 model outperformed OpenAI’s o1-preview on maths and programming skills, according to the paper.

The cost of running just the graphics processing units (GPUs) to develop S1 could be as low as US$14 based on the compute noted in the research, which says the model was trained for 26 minutes on 16 Nvidia H100s. Each of these chips can be rented for US$2 per hour.
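The arithmetic behind that estimate can be checked in a few lines. The sketch below is illustrative only: the function name `gpu_rental_cost` is hypothetical, and it assumes the US$2 hourly rate applies to each GPU and that billing is proportional to the minutes used.

```python
def gpu_rental_cost(num_gpus: int, minutes: float, usd_per_gpu_hour: float) -> float:
    """Return the rental cost in US dollars for a single training run."""
    # Total GPU-hours consumed across all chips.
    gpu_hours = num_gpus * minutes / 60
    return gpu_hours * usd_per_gpu_hour


# Figures quoted in the article; the per-GPU reading of the rate is an assumption.
s1_cost = gpu_rental_cost(num_gpus=16, minutes=26, usd_per_gpu_hour=2.0)
print(f"Estimated GPU rental cost: US${s1_cost:.2f}")
```

Under those assumptions the run uses just under seven GPU-hours and costs about US$13.87, in line with the US$14 figure.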