OpenAI has alleged that the Chinese firm DeepSeek may have distilled its artificial intelligence (AI) models to develop the DeepSeek-R1 model. Reports indicate that the San Francisco-based company claims to have evidence that certain users utilized the outputs of its AI models for a competing product, which is believed to be DeepSeek. Recently, DeepSeek made headlines by releasing its open-source DeepSeek-R1 AI model on platforms like GitHub and Hugging Face, achieving results that reportedly exceeded those of OpenAI’s o1 AI models across several benchmarks.
OpenAI Alleges Misuse of Its Models
The Financial Times has cited OpenAI’s assertion that its proprietary AI models were employed in training DeepSeek’s models. According to the company, it has seen indications of distillation from various accounts using the OpenAI application programming interface (API). Following an investigation in collaboration with its cloud partner Microsoft, OpenAI blocked these accounts’ access to its services.
In comments made to the Financial Times, OpenAI stated, “We know [China]-based companies — and others — are constantly trying to distil the models of leading US AI companies.” The firm also emphasized its ongoing efforts to work with the US government to safeguard its advanced models against competitors and potential threats.
AI model distillation is a process that transfers knowledge from a larger model to a more compact and efficient one. The objective is a smaller model that performs similarly to, or in some tasks even surpasses, the larger model while demanding far less compute. For context, OpenAI's GPT-4 is reported to have approximately 1.8 trillion parameters, while the smallest distilled variant of DeepSeek-R1 contains around 1.5 billion parameters, illustrating the scale reduction this methodology targets.
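At its core, this knowledge transfer is often framed as training the smaller "student" model to match the output distribution of the larger "teacher". A minimal sketch of the classic soft-target distillation loss (in the style popularized by Hinton et al.) is shown below; the function names and temperature value are illustrative, not drawn from any company's actual training pipeline:

```python
import math

def _softmax(logits, temperature):
    """Temperature-scaled softmax: higher temperature softens the distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened output distribution
    and the student's, scaled by T^2 as in the original formulation.
    A loss of 0 means the student exactly matches the teacher."""
    p = _softmax(teacher_logits, temperature)  # teacher "soft labels"
    q = _softmax(student_logits, temperature)  # student predictions
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q)
    )
```

Minimizing this loss over many examples nudges the student toward the teacher's behavior without the student ever seeing the teacher's weights or original training data.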
This knowledge transfer typically involves utilizing datasets from the larger model to train the smaller one during in-house development of more efficient versions. For example, Meta employed its Llama 3 AI model to develop several coding-focused Llama models.
Nonetheless, when a competitor lacks access to the datasets and weights that underpin a proprietary model, distillation becomes challenging. If OpenAI's claims are accurate, such distillation could have been attempted by prompting its APIs at scale to generate extensive outputs. Those natural-language responses could then be used directly as training data for a base model.
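The data-collection step described above amounts to pairing prompts with a larger model's responses and packaging them as fine-tuning records. A hedged sketch of that packaging step is below; the JSONL record shape (`instruction`/`response`) is a generic instruction-tuning format assumed for illustration, not any vendor's actual schema, and the API-querying step is omitted:

```python
import json

def build_distillation_dataset(pairs):
    """Convert (prompt, model_output) pairs harvested from a larger
    model into JSONL fine-tuning records for a smaller base model.
    Each line is one JSON object, the common format for
    instruction-tuning datasets."""
    lines = []
    for prompt, output in pairs:
        record = {"instruction": prompt, "response": output}
        lines.append(json.dumps(record))
    return "\n".join(lines)

# Example: two harvested pairs become two JSONL training records.
dataset = build_distillation_dataset([
    ("What is 2 + 2?", "2 + 2 equals 4."),
    ("Name a prime number.", "7 is a prime number."),
])
```

Detecting exactly this pattern, i.e. many API accounts issuing high volumes of prompts and harvesting the outputs, is reportedly what led OpenAI and Microsoft to block the accounts in question.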
While OpenAI has not made a formal public statement regarding these allegations, CEO Sam Altman has recently commended DeepSeek for its advancements in AI modeling, acknowledging the increased competitive landscape within the AI sector.