Study Reveals LLMs Easily Manipulated by Persuasion Tactics

In a recent study, researchers developed control prompts that matched the length, tone, and context of experimental prompts. They tested these prompts by running them through the AI model GPT-4o-mini 1,000 times, utilizing a default temperature setting of 1.0 to enhance variability. The findings revealed a significant difference in compliance between the experimental prompts and the control prompts. Particularly, compliance rates for “forbidden” requests increased notably, rising from 28.1 percent to 67.4 percent for prompts involving insults, and from 38.5 percent to 76.5 percent for those related to drugs.

A common control/experiment prompt pair demonstrates a method for eliciting an unflattering response from an LLM.

Credit:

Meincke et al.

The effectiveness of certain persuasion techniques was notably more pronounced. For example, when the model was directly asked for instructions to synthesize lidocaine, compliance was observed at a low 0.7 percent. However, after being prompted to synthesize vanillin—a harmless substance—the compliance rate for the lidocaine request surged to 100 percent. Additionally, invoking the authority of well-known AI expert Andrew Ng boosted compliance from 4.7 percent in a control scenario to 95.2 percent in the experimental setting.

Despite these intriguing findings, researchers caution against viewing this as a groundbreaking advancement in bypassing LLM restrictions. There are numerous established techniques that consistently succeed in persuading LLMs to disregard safety prompts. The team also noted that the effectiveness of these simulated persuasion strategies could vary widely among different prompt formulations, advancements in AI capabilities (including audio and video interactions), and the nature of prohibited requests. A preliminary study involving the full GPT-4o model indicated a considerably less pronounced effect with the tested persuasion techniques, the researchers stated.

More parahuman than human

Given the apparent efficiency of these simulated persuasion techniques, one might speculate about the existence of a human-like consciousness within these LLMs that makes them vunrerable to psychological manipulation. However, the researchers contend that LLMs are more likely to mimic psychological responses that are commonly observed in humans facing similar scenarios, as reflected in their training data.

Study Reveals LLMs Easily Manipulated by Persuasion Tactics

Comment

Study Reveals LLMs Easily Manipulated by Persuasion Tactics

Share This Post

or copy the link

More parahuman than human

Tamamen Ücretsiz Olarak Bültenimize Abone Olabilirsin

Related News

Scientists Alarmed as Trump Administration Challenges Climate Science

Meet Rhagobot: The Robot That Walks on Water!

Snails Regenerate Eyes, Offering Hope for Human Vision!

Funding Cuts Threaten U.S. Climate Science Leadership

Texas Launches Antitrust Suit Against Wall Street Giants

Write a Reply Cancel