Chinese artificial intelligence (AI) start-up DeepSeek has disclosed technical details about its low-cost, high-performance models, refuting allegations that it misrepresented their training costs and drawing cheers from the open-source community.
The Hangzhou-based research firm also made good on its promise to release five open-source AI infrastructure projects this week, publishing the first two, FlashMLA and DeepEP, on Monday and Tuesday. Both are aimed at squeezing the best performance out of chips for cost-efficient model training and inference tasks.
By open-sourcing the techniques it used for model training, DeepSeek is “effectively refuting the frequently made claim that ‘they lied’ about their training procedures”, Stephen Pimentel, chief technology officer at San Francisco-based AI industry solutions provider Dragonscale Industries, said in a post on X.
Open-source developers cheered DeepSeek’s new projects. “DeepSeek is once again pushing the envelope on what’s possible with AI infrastructure,” said one commenter on X.
DeepSeek has released two groundbreaking open-source AI models – the V3 large language model and the R1 reasoning model – that rival some of the best proprietary models from US AI powerhouses, including Microsoft-backed OpenAI and Amazon-backed Anthropic.