
Stanford Researchers Unveil Low-Cost AI Rival to OpenAI

Researchers from Stanford University and the University of Washington have introduced an open-source artificial intelligence (AI) model that reportedly matches the performance of OpenAI’s o1 model. Rather than focusing solely on building a high-caliber reasoning AI, the team set out to reverse-engineer how the San Francisco-based AI company achieves test-time scaling in its o1 series models, and to reproduce that behavior at a significantly lower financial and computational cost.

Researchers Develop s1-32B AI Model

The methodology behind the model is detailed in a study posted to the preprint server arXiv. The team distilled data from an existing AI model and applied techniques such as supervised fine-tuning (SFT), alongside ablation experiments, during development. The s1-32B model is now available in a GitHub repository.

It is important to note that the s1-32B model was not built from scratch. Instead, the researchers fine-tuned Qwen2.5-32B-Instruct on distilled reasoning data to produce this large language model (LLM). Although the model is robust, its size and limited reasoning capabilities keep it from fully competing with OpenAI’s o1, which launched in September 2024.

In the course of their research, the scientists used the Gemini Flash Thinking application programming interface (API) to generate reasoning traces and responses. They collected a total of 59,000 triplets, each consisting of a question, a reasoning trace (often referred to as a chain of thought, or CoT), and the corresponding answer. From this material they curated a dataset named s1K, containing 1,000 high-quality, diverse, and challenging questions along with their reasoning traces and responses.
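The curation step described above can be sketched as follows. The field names and the length-based filter below are illustrative assumptions, not the paper's exact schema or selection criteria:

```python
# A minimal sketch of the triplet structure and subset selection
# described above. Field names ("question", "thinking", "answer")
# and the length-based filter are assumptions for illustration.

def build_triplet(question, reasoning_trace, answer):
    """Package one (question, chain-of-thought, answer) example."""
    return {"question": question, "thinking": reasoning_trace, "answer": answer}

def select_subset(triplets, limit=1000, min_trace_chars=50):
    """Keep up to `limit` examples with long reasoning traces: a crude
    stand-in for the quality/diversity/difficulty filtering that
    reduced 59,000 candidates to the final 1,000."""
    kept = [t for t in triplets if len(t["thinking"]) >= min_trace_chars]
    kept.sort(key=lambda t: len(t["thinking"]), reverse=True)
    return kept[:limit]
```

The real selection reportedly also balanced topic diversity, which a single length threshold cannot capture.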

After assembling the s1K dataset, the researchers performed supervised fine-tuning on the Qwen2.5-32B-Instruct model using basic fine-tuning hyperparameters. The distillation process took just 26 minutes of training on 16 Nvidia H100 GPUs.
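For a sense of scale, the quoted training budget works out as a small amount of total GPU time:

```python
# Back-of-the-envelope arithmetic for the training budget quoted
# above: 26 minutes of training on 16 Nvidia H100 GPUs.
gpus = 16
train_minutes = 26
gpu_hours = gpus * train_minutes / 60  # total GPU-hours consumed
print(f"{gpu_hours:.1f} GPU-hours")    # prints "6.9 GPU-hours"
```

Roughly seven GPU-hours is tiny compared with the budgets typically associated with frontier reasoning models.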

At this juncture, the researchers had yet to uncover the methods OpenAI employs to make its model “think” and to decide when to halt the reasoning process. Without such mechanisms, a model risks overthinking indefinitely, squandering valuable computational resources.

While fine-tuning, the team made an intriguing discovery: they could control inference time by inserting XML-style think tags. Upon reaching the end tag, the model was instructed to switch to an authoritative tone for its final answer. Inference time here refers to the period in which the model generates its output, normally near real time, and extending it deliberately requires careful adjustments to the decoding code.
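A training example with explicit thinking delimiters might be laid out as below. The tag names and surrounding text are assumptions for illustration, not the authors' exact template:

```python
# An illustrative prompt layout with XML-style think tags: the
# reasoning trace sits between the tags, and the text after the
# closing tag is the model's authoritative final answer.

def format_example(question, thinking, answer):
    """Format one training example with explicit thinking delimiters."""
    return (f"Question: {question}\n"
            f"<think>{thinking}</think>\n"
            f"Final answer: {answer}")
```

Training on examples in this shape teaches the model that the closing tag marks the hand-off from deliberation to the final response.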

When running the s1-32B model, the researchers introduced a “wait” command that compelled the model to keep thinking beyond its usual inference period. With this adjustment, the model began to double-check and refine its output. The tags could thus be used to shorten or lengthen the test-time scaling phase.
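The “wait” trick can be sketched as a decoding loop: whenever the model tries to close its thinking section too early, the closing delimiter is stripped and “Wait” is appended so it keeps reasoning. Here `step_fn` stands in for a real decoder, and the delimiter string is an assumption; this is not the authors' actual implementation:

```python
# A hedged sketch of "wait"-based test-time scaling. `step_fn` is a
# stand-in for a real decoding call; END_THINK is an assumed
# end-of-thinking delimiter.

END_THINK = "</think>"

def budget_forced_think(step_fn, prompt, extensions=1):
    """Decode until the thinking section closes, forcing up to
    `extensions` continuations by suppressing the end tag and
    appending 'Wait'."""
    text = prompt
    forced = 0
    while True:
        chunk = step_fn(text)
        if chunk.endswith(END_THINK) and forced < extensions:
            # Strip the close and push the model to keep thinking.
            text += chunk[: -len(END_THINK)] + " Wait"
            forced += 1
        else:
            return text + chunk
```

Setting `extensions=0` lets the model stop as soon as it wants, while larger values lengthen the thinking phase, which matches the dial-up/dial-down behavior the article describes.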

The researchers also experimented with alternative phrases such as “alternatively” and “hmm,” but found that the best performance came from the “wait” token. Since the method brings the model’s performance close to that of o1, the researchers suggest it may mirror OpenAI’s own technique for fine-tuning reasoning models.

A report from TechCrunch highlights that the researchers built the s1-32B AI model for a mere $50 (approximately Rs. 4,380), underscoring that a post-training pipeline for reasoning models can be established at minimal expense.
