DeepSeek kicks off 2026 with paper signalling push to train bigger models for less
DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning architecture

Chinese artificial intelligence start-up DeepSeek has ushered in 2026 with a new technical paper, co-authored by founder Liang Wenfeng, that proposes a rethink of the fundamental architecture used to train foundational AI models.
The method – dubbed Manifold-Constrained Hyper-Connections (mHC) – forms part of the Hangzhou firm’s push to make its models more cost-effective as it strives to keep pace with better-funded US rivals that have deeper access to computing power.
It also reflects the increasingly open, collaborative culture among Chinese AI companies, which have been publishing a growing share of their research openly.
For industry watchers, DeepSeek’s papers often provide an important early signal of the engineering choices that will shape the start-up’s next major model release.
In the paper, released on Thursday, a team of 19 DeepSeek researchers said they tested mHC on models with 3 billion, 9 billion and 27 billion parameters, and found it scaled without adding significant computational burden.
“Empirical results confirm that mHC effectively … [enables] stable large-scale training with superior scalability compared with conventional HC (hyper-connections),” wrote the researchers, led by Zhenda Xie, Yixuan Wei and Huanqi Cao.
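The paper itself is not quoted in technical detail here, but the "hyper-connections" line of work it builds on generally replaces a transformer's single residual stream with several parallel streams mixed by small learnable weights. As a rough, illustrative sketch of that general idea – not DeepSeek's mHC method, and with all names, shapes and the mixing scheme being assumptions – a toy block might look like this:

```python
# Illustrative sketch only: a toy "hyper-connections"-style block in PyTorch.
# Nothing here is taken from DeepSeek's mHC paper; names, shapes and the
# mixing scheme are assumptions meant to convey the general idea of keeping
# several residual streams and mixing them with small learnable matrices.
import torch
import torch.nn as nn


class HyperConnectionBlock(nn.Module):
    """Wraps a sub-layer (e.g. attention or MLP) with n parallel residual streams."""

    def __init__(self, sublayer: nn.Module, n_streams: int = 4):
        super().__init__()
        self.sublayer = sublayer
        # Learnable weights deciding how much each stream feeds the sub-layer input...
        self.read = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        # ...and how the sub-layer output is written back into each stream.
        self.write = nn.Parameter(torch.ones(n_streams))
        # Learnable mixing among the streams themselves (identity-initialised).
        self.mix = nn.Parameter(torch.eye(n_streams))

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, seq, d_model)
        x = torch.einsum("n,nbsd->bsd", self.read, streams)          # combine streams
        y = self.sublayer(x)                                         # run the layer once
        streams = torch.einsum("nm,mbsd->nbsd", self.mix, streams)   # mix the streams
        return streams + self.write[:, None, None, None] * y         # write output back
```

The appeal of such designs, as the researchers' claim of "stable large-scale training with superior scalability" suggests, is that the extra streams and mixing weights are tiny compared with the sub-layers themselves, so expressiveness can grow without a proportional rise in compute.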