Chinese start-up DeepSeek launches AI model that outperforms Meta, OpenAI products
DeepSeek’s V3 model was trained for two months at a cost of US$5.58 million, using significantly fewer computing resources than its rivals
Ben Jiang in Beijing
Chinese start-up DeepSeek’s release of a new large language model (LLM) has made waves in the global artificial intelligence (AI) industry, as benchmark tests showed that it outperformed rival models from the likes of Meta Platforms and ChatGPT creator OpenAI.
The Hangzhou-based company said in a WeChat post on Thursday that its namesake LLM, DeepSeek V3, comes with 671 billion parameters and was trained in around two months at a cost of US$5.58 million, using significantly fewer computing resources than models developed by bigger tech firms.
LLM refers to the technology underpinning generative AI services such as ChatGPT. In AI, a high parameter count is pivotal in enabling an LLM to adapt to more complex data patterns and make more precise predictions.
Reacting to the Chinese start-up’s technical report on its new AI model, computer scientist Andrej Karpathy – a founding team member at OpenAI – said in a post on social-media platform X: “DeepSeek making it look easy … with an open weights release of a frontier-grade LLM trained on a joke of a budget.”
Open weights refers to releasing only the pretrained parameters, or weights, of an AI model, which allows a third party to use the model for inference and fine-tuning but nothing more. The model's training code, original data set, architecture details and training methodology are not provided.
DeepSeek’s development of a powerful LLM – at a fraction of the capital outlay that bigger companies like Meta and OpenAI typically invest – shows how far Chinese AI firms have progressed, despite US sanctions that have blocked their access to advanced semiconductors used for training models.