
When context is everything, AI models still struggle in the real world: Tencent

Despite rapid advances, today’s top AI models are still “brittle” in messy, real-world environments, according to new Tencent research

Tencent research says that AI models will remain ‘brittle’ until contextual learning is conquered. Photo: Shutterstock
Vincent Chow
Leading US and Chinese artificial intelligence models are frustrating to use in real-world settings because they struggle to learn from context, Tencent Holdings said in a new technical paper – the first co-authored by Vinces Yao Shunyu since he took up the role of chief AI scientist at the firm.

AI developers need to place “context learning” at the centre of future model design if their products are to become genuinely useful outside controlled environments, according to researchers from Tencent and Fudan University’s Institute of Trustworthy Embodied AI.

“Models often fail in subtle but consequential ways,” the researchers wrote in a paper published on Tuesday. “Until [context learning] improves, [models] will remain brittle precisely in the settings where we most want them to help: messy, dynamic, real-world environments.”

The research comes as Yao – a former star researcher at OpenAI – seeks to reinvigorate Tencent’s foundational model efforts after a series of internal restructurings.

The Shenzhen-based conglomerate’s Hunyuan models trail domestic rivals such as DeepSeek, while its flagship consumer AI app, Yuanbao, had roughly half as many users as ByteDance’s market-leading Doubao as of January.

Chinese tech giant Tencent poached Vinces Yao Shunyu from OpenAI in 2025. Photo: Handout

To gauge how well existing models learn from context, Tencent’s researchers developed a new benchmark called CL-bench and ran 19 leading models through 1,899 tasks designed to measure on-the-job learning.
