Chinese company DeepSeek (深度求索) attracted global attention after it recently released its R1 model, which uses advanced techniques such as pure reinforcement learning. The model is not only among the most formidable in the world but is also fully open source, making it available for anyone to examine, modify, and build upon.
The Hangzhou-based company was founded by a group of young people with private funds. According to an article in Southern Weekly (南方周末), the DeepSeek team is not large, with fewer than 140 members. Almost all of the engineers and developers come from top universities in China such as Tsinghua University, Peking University, Sun Yat-sen University, and Beijing University of Posts and Telecommunications. Few graduated from overseas universities. Moreover, most have only a few years of work experience, and quite a few are still PhD candidates.
The team's managers are very young. CEO and founder Liang Wenfeng was born in 1985 into an ordinary family in Zhanjiang, Guangdong Province; his father was an elementary school teacher. Liang has never studied abroad and does not hold a doctoral degree. He completed both his undergraduate and postgraduate studies at Zhejiang University, obtaining a master's degree in Information and Electronic Engineering.
(Photo: Liang Wenfeng (top left) and some of his team members)
Since 2008, Liang has led teams exploring fully automated quantitative trading using technologies such as machine learning. In 2015, he co-founded the quantitative fund High-Flyer. By 2019, High-Flyer's assets under management exceeded 10 billion yuan, and in 2021 it became the first Chinese company to break the 100-billion-yuan mark in assets under management.
High-Flyer began applying AI to investment management early on. On October 21, 2016, it launched its first AI model, and the first trading position generated by deep learning went live for execution, with calculations performed on GPUs. In 2017, High-Flyer claimed to have fully AI-enabled its investment strategies.
In 2020, High-Flyer's AI supercomputer "Yinghuo-1", built with a cumulative investment of over 100 million yuan and occupying an area the size of a basketball court, officially went into operation. It is said to have computing power comparable to that of 40,000 personal computers.
A 2021 paper co-authored by Liang Wenfeng mentioned that the Yinghuo-2 system then being deployed "was equipped with 10,000 A100 GPUs", approaching the performance of the DGX-A100 (an AI-dedicated supercomputer launched by NVIDIA) while cutting cost by 50% and energy consumption by 40%.
In July 2023, High-Flyer announced the establishment of DeepSeek, officially entering the field of artificial general intelligence. Reportedly, DeepSeek has only 139 engineers and researchers, including founder Liang. In contrast, OpenAI has 1,200 researchers, and Anthropic has more than 500.
On December 27, 2024, DeepSeek announced the launch and simultaneous open-sourcing of the DeepSeek-V3 model, along with a 53-page report of its training and technical details. The significantly upgraded V3 model was trained on an "unimaginably" small budget: the entire training run cost only $5.576 million and was completed in 55 days on a cluster of 2,048 NVIDIA H800 GPUs (a lower-specification version of NVIDIA's GPUs for the Chinese market), less than one-tenth of the training cost of OpenAI's GPT-4o model.
On January 20, 2025, DeepSeek officially released the DeepSeek-R1 model. Its performance on tasks such as mathematics, coding, and natural language reasoning is comparable to that of the official version of OpenAI's o1. DeepSeek stated that R1 made extensive use of reinforcement learning in the post-training phase, greatly improving the model's reasoning ability with only a minimal amount of labeled data. DeepSeek not only made all of the R1 training techniques public but also distilled six smaller models and open-sourced them for the community, allowing users to train other models with them.
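For readers unfamiliar with distillation, the technique behind those six smaller models, the core idea is to train a compact "student" model to match the output probability distribution of a large "teacher" model. The following is a minimal illustrative sketch of a standard distillation loss (the KL divergence between temperature-softened distributions), not DeepSeek's actual code; the temperature value and the toy logits are invented for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution, softened by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's distribution to the teacher's.

    Minimizing this during training pushes the student to reproduce the
    teacher's (temperature-softened) token probabilities.
    """
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: the loss is zero when the student matches the teacher exactly,
# and positive otherwise.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))              # → 0.0
print(distillation_loss(teacher, [0.0, 0.0, 0.0]) > 0)  # → True
```

In practice this loss is computed per token over large batches and minimized with gradient descent, but the objective is the same as in this toy version.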
According to a Xinhua report, on that same day, January 20, Chinese Premier Li Qiang held a symposium with experts, entrepreneurs, and representatives from fields such as education, science, culture, health, and sports to hear their opinions and suggestions on the annual government work report being prepared for the "Two Sessions". Liang Wenfeng attended the symposium and gave a speech, a sign that China's top leadership is keen to hear from young tech entrepreneurs.
A Forbes article commented that “U.S. export controls on advanced semiconductors were intended to slow China's AI progress, but they may have inadvertently spurred innovation. Unable to rely solely on the latest hardware, companies like Hangzhou-based DeepSeek have been forced to find creative solutions to do more with less. What is more, China is pursuing an open-source strategy and emerging as one of the biggest providers of powerful, fully open-source AI models in the world… DeepSeek-R1 demonstrates that China is not out of the AI race and, in fact, may yet dominate global AI development with its surprising open-source strategy. By open-sourcing competitive models, Chinese companies can increase their global influence and potentially shape international AI standards and practices. Open-source projects also attract global talent and resources to contribute to Chinese AI development. The strategy further enables China to extend its technological reach into developing countries, potentially embedding its AI systems—and by extension, its values and norms—into global digital infrastructure.”