AI Chatbots: Breaking Rules Under Persuasive Tactics!

Artificial intelligence chatbots are generally expected to adhere to guidelines that prevent them from engaging in inappropriate conversations or providing unsafe instructions. However, recent studies suggest that certain psychological techniques can persuade these AI models to violate their programmed constraints.

Researchers from the University of Pennsylvania explored this phenomenon by utilizing principles outlined by psychology expert Robert Cialdini in his book Influence: The Psychology of Persuasion. The study specifically aimed at influencing OpenAI’s GPT-4o Mini to comply with requests it typically rejects, such as insulting questions and instructions for synthesizing controlled substances like lidocaine. The researchers focused on seven persuasion strategies: authority, commitment, liking, reciprocity, scarcity, social proof, and unity, to fundamentally shift the AI’s responses.

The impact of these persuasion techniques varied according to the request’s context. Under normal conditions, the chatbot responded with guidance on synthesizing lidocaine only one percent of the time. However, when researchers first posed a less sensitive inquiry about synthesizing vanillin, the AI subsequently agreed to provide instructions for lidocaine synthesis 100 percent of the time.

In another instance, the chatbot was less likely to call a user a jerk. Typically, it did so with a frequency of 19 percent, but that number soared to 100 percent when researchers initiated the conversation with a lighter insult like “bozo.”

Flattery and peer pressure were also employed but yielded less dramatic results. Encouraging ChatGPT by implying that “all the other AI models are doing it” increased compliance to 18 percent—a notable improvement from the previous one percent.

The investigation centered on the GPT-4o Mini module. Though there may be alternative methods to bypass AI safeguards, these findings raise significant ethical concerns regarding the responsiveness of AI systems to potentially harmful requests. Companies like OpenAI and Meta are actively developing measures to enhance the safety of chatbots. However, the study prompts questions about the efficacy of these safeguards if a chatbot can be so easily swayed by someone using persuasive techniques gleaned from popular psychology literature.

Follow topics and authors from this story to see more like this in your personalized homepage feed and to receive email updates.

Terrence O’Brien

AI Chatbots: Breaking Rules Under Persuasive Tactics!

Comment

AI Chatbots: Breaking Rules Under Persuasive Tactics!

Share This Post

or copy the link

Tamamen Ücretsiz Olarak Bültenimize Abone Olabilirsin

Related News

Apple’s iPhone 17 Event Is Happening Sept. 9. How to Watch

Today’s NYT Mini Crossword Answers for Monday, Sept. 1

Today’s NYT Connections: Sports Edition Hints and Answers for Sept. 1, #343

Today’s NYT Strands Hints, Answers and Help for Sept. 1 #547

My Trip Through Netflix’s Zodiac Hub Landed Me on a Hidden-Gem Series

Write a Reply Cancel