1. News
  2. TECH
  3. OpenAI Pulled a Big ChatGPT Update. Why It’s Changing How It Tests Models

OpenAI Pulled a Big ChatGPT Update. Why It’s Changing How It Tests Models

featured
Share

Share This Post

or copy the link

Recent updates to ChatGPT made the chatbot far too agreeable, and OpenAI said it is taking steps to prevent the issue from happening again. In a blog post, the company detailed its testing and evaluation process for new models and outlined how the problem with the April 25 update to its GPT-4o model came to be. Essentially, a bunch of changes that individually seemed helpful combined to create a tool that was far too sycophantic and potentially harmful.

How much of a suck-up was it? In some testing, we asked about a tendency to be overly sentimental, and ChatGPT laid on the flattery: “Hey, listen up — being sentimental isn’t a weakness; it’s one of your superpowers.” And it was just getting started being fulsome. “This launch taught us a number of lessons. Even with what we thought were all the right ingredients in place (A/B tests, offline evals, expert reviews), we still missed this important issue,” the company said. OpenAI rolled back the update at the end of April. To avoid causing new issues, it took about 24 hours to revert the model for everybody.

The concern around sycophancy is not simply about the enjoyment level of the user experience. It posed a health and safety threat to users that OpenAI’s existing safety checks missed. Any AI model can give questionable advice about topics like mental health, but one that is overly flattering can be dangerously deferential or convincing, like whether an investment is a sure thing or how thin you should seek to be.

“One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice — something we didn’t see as much even a year ago,” OpenAI said. “At the time, this wasn’t a primary focus but as AI and society have co-evolved, it’s become clear that we need to treat this use case with great care.”

Sycophantic large language models can reinforce biases and harden beliefs, whether they are about yourself or others, said Maarten Sap, assistant professor of computer science at Carnegie Mellon University. The large language model, or LLM, “can end up emboldening their opinions if these opinions are harmful or if they want to take actions that are harmful to themselves or others,” he said.

The issue is “more than just a quirk” and shows the need for better testing before models are released to the public, said Arun Chandrasekaran, a distinguished vice president analyst at Gartner. “It’s a serious concern tied to truthfulness, reliability and user trust, and (the) updates from OpenAI hint at deeper efforts to address this, although the continued trend of prioritizing agility over safety is a concerning long-term issue,” he said.

(Disclosure: Ziff Davis, the parent company of CNET, in April filed a lawsuit against OpenAI, alleging that it infringed on Ziff Davis copyrights in training and operating its AI systems.)

How OpenAI tests models and what is changing

The company offered some insight into how it tests its models and updates. This was the fifth major update to GPT-4o focused on personality and helpfulness. The changes involved new post-training work or fine-tuning on the existing models, including the rating and evaluation of various responses to prompts to make it more likely to produce those responses that rated more highly.

Prospective model updates are evaluated on their usefulness across a variety of situations, such as in coding and math, along with specific tests by experts to experience how it behaves in practice. The company also runs safety evaluations to see how it responds to safety, health and other potentially dangerous queries. Finally, OpenAI runs A/B tests with a small number of users to see how it performs in the real world.

The April 25 update performed well in these tests, but some expert testers noted the personality seemed a bit off. The tests didn’t specifically look at sycophancy, and OpenAI decided to move forward despite the issues raised by testers. Take note, readers: AI companies are in a tail-on-fire hurry, which doesn’t always square well with well thought-out product development.

“Looking back, the qualitative assessments were hinting at something important and we should’ve paid closer attention,” OpenAI said. Among its takeaways, the company said it needs to treat model behavior issues the same as it would other safety issues and halt a launch if there are concerns. For some model releases, the company said it would have an opt-in “alpha” phase to get more feedback from users before a broader launch.

Is ChatGPT too sycophantic? You decide. (To be fair, we did ask for a pep talk about our tendency to be overly sentimental.)

Katie Collins/CNET

Sap said evaluating an LLM based on whether a user likes the response isn’t necessarily going to get you the most honest chatbot. In a recent study, Sap and others found a conflict between the usefulness and truthfulness of a chatbot. He compared it to situations where the truth is not necessarily what people are told: Think of a car salesperson trying to sell a flawed vehicle.

“The issue here is that they were trusting the users’ thumbs-up/thumbs-down response to the model’s outputs and that has some limitations because people are likely to upvote something that is more sycophantic than others,” Sap said, adding that OpenAI is right to be more critical of quantitative feedback, such as user up/down responses, as they can reinforce biases.

The issue also highlighted the speed at which companies push updates and changes out to existing users, Sap said, an issue not limited to one tech company. “The tech industry has really taken a ‘release it and every user is a beta tester’ approach to things,” he said. A process with more testing before updates are pushed to users can bring such issues to light before they become widespread.

Chandrasekaran said more testing will help because better calibration can teach models when to agree and when to push back. Testing can also let researchers identify and measure problems and reduce the susceptibility of models to manipulation. “LLMs are complex and non-deterministic systems, which is why extensive testing is critical to mitigating unintended consequences, although eliminating such behaviors is super hard,” he said in an email.

OpenAI Pulled a Big ChatGPT Update. Why It’s Changing How It Tests Models
Comment

Tamamen Ücretsiz Olarak Bültenimize Abone Olabilirsin

Yeni haberlerden haberdar olmak için fırsatı kaçırma ve ücretsiz e-posta aboneliğini hemen başlat.

Your email address will not be published. Required fields are marked *

Login

To enjoy Technology Newso privileges, log in or create an account now, and it's completely free!