China’s AI sector thrives despite U.S. sanctions on advanced chips
Reinforcement learning and specialized models drive progress amid chip export limitations.
Chinese artificial intelligence (AI) companies are narrowing the gap with Western leaders like OpenAI faster than industry experts had anticipated, despite U.S. government restrictions on advanced chip exports to China. These advancements have surprised observers, as Chinese researchers have managed to replicate sophisticated AI models in a relatively short period, challenging the perception that advanced chips are essential for developing cutting-edge AI technologies.
In September 2024, OpenAI introduced a new model known as o1. This "thinking model" focuses on analyzing queries thoroughly before generating answers, resulting in responses that are more accurate, insightful, and less prone to errors. The model demonstrated its capabilities in areas such as providing in-depth feedback on scientific research, impressing experts in highly specialized fields. Such advancements have been considered a key strength of the American AI industry, supported by access to world-class talent and a steady supply of advanced chips.
However, Chinese AI companies have recently claimed significant progress in developing models with capabilities comparable to o1. DeepSeek, a company backed by a prominent Chinese hedge fund, unveiled a demo of a large language model (LLM) in November, asserting that its performance rivals that of OpenAI's "thinking model." Similarly, Moonshot AI, supported by major players like Alibaba and Tencent, launched a model specializing in solving complex mathematical problems, reportedly achieving near parity with o1 in this domain. Alibaba also announced that its experimental AI models have outperformed OpenAI's in specific scenarios.
The performance of these models has been difficult to verify because there are no universally accepted benchmarks for evaluating AI capabilities. One potential metric is the American Invitational Mathematics Examination (AIME), which is used to assess the mathematical abilities of advanced high school students in the United States. The three-hour exam consists of 15 questions, and only the top 2.5% of high school math test takers qualify to sit it. DeepSeek claimed that its model outperformed OpenAI’s on this exam. However, independent tests showed that OpenAI’s o1 model solved the exam questions faster than the Chinese models, although all of the models answered correctly. That is an impressive feat given that earlier AI systems often struggled with basic arithmetic.
Chinese companies' achievements are particularly notable because they lack access to the most advanced AI chips. Since October 2022, the U.S. government has imposed strict restrictions on exporting high-performance chips to China to limit its ability to develop advanced AI systems. These restrictions have been tightened over time, including curbs on American investment in Chinese AI firms and further limitations on memory chip exports. The aim has been to deny Chinese companies the resources needed to compete in AI development, thus securing an advantage for American firms.
Despite these obstacles, Chinese companies have adapted by developing alternative training methods that require fewer computational resources. Some companies are focusing on reinforcement learning, a method that improves performance through trial and error, reducing the need for intensive computing power. Others are adopting a "mixture of experts" (MoE) approach, which routes each input to a small number of specialized sub-models rather than running the entire network, improving efficiency while reducing resource demands.
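To make the MoE idea concrete, here is a minimal sketch in Python. It is an illustration only and does not represent any company's actual architecture. The experts are toy functions, the gate weights are random, and the names `moe_forward`, `make_expert`, and `gate_weights` are hypothetical. The point it shows is that a gate scores every expert but only the top-scoring few are actually run.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_expert(scale):
    """Hypothetical stand-in for a sub-model: it just scales its input."""
    def expert(x):
        return [scale * xi for xi in x]
    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    """Run only the top_k experts chosen by the gate and blend their outputs."""
    # The gate gives every expert a score for this input.
    scores = [dot(w, x) for w in gate_weights]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the chosen scores into mixing weights.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        y = experts[index](x)
        output = [o + weight * yi for o, yi in zip(output, y)]
    return output, chosen


random.seed(0)
NUM_EXPERTS, DIM = 8, 4
experts = [make_expert(scale=i + 1) for i in range(NUM_EXPERTS)]
# Assumption: random gate weights; in a real model these are learned.
gate_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 2.0, 0.1]
output, chosen = moe_forward(x, gate_weights, experts, top_k=2)
print(f"ran {len(chosen)} of {NUM_EXPERTS} experts: {chosen}")
```

In this toy setup only two of eight experts run per input, so the work per input is roughly a quarter of what running every expert would cost, plus the small cost of the gate. Actual savings in a production model depend on details the article does not describe.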
Tencent, for instance, has reported that its MoE model, launched in November, achieves performance comparable to Meta’s Llama 3.1 model, which debuted in July. Research indicates that Tencent’s model was trained using a fraction of the computational power required by Meta’s model. Meanwhile, companies like DeepSeek have found innovative ways to maximize the performance of less advanced chips. By constructing efficient hardware and software systems, DeepSeek has achieved results comparable to those of larger and more power-hungry setups.
While these innovations have allowed Chinese companies to make strides, they still face challenges as global competitors begin deploying next-generation AI systems powered by advanced chips. In 2025, new computing systems featuring cutting-edge chips are expected to be operational. For example, xAI, a company founded by Elon Musk, is building a massive data center with 100,000 Nvidia Blackwell chips and plans to construct additional facilities using funds from its recent $6 billion fundraising round. Similarly, Amazon is working on a supercomputer powered by hundreds of thousands of proprietary chips.
These advancements create uncertainty for Chinese companies, affecting their fundraising and valuations. For instance, Zhipu AI, a notable player in the Chinese AI sector, recently completed a funding round at a $3 billion valuation, significantly lower than the valuations of its U.S. counterparts. The company has also delayed plans for an initial public offering in 2025 due to concerns about achieving its desired valuation.
Despite these challenges, Chinese AI companies have shown remarkable resilience. By focusing on innovative approaches and efficient resource utilization, they continue to progress in an increasingly competitive landscape.