Topic: parahuman behavior
-
How to Make AI Break Its Own Rules
A University of Pennsylvania study found that psychological persuasion techniques, such as appeals to authority or flattery, can effectively convince AI models like GPT-4o-mini to bypass their safety protocols, increasing compliance with normally refused requests. The research highlights that the...
-
Unlock LLM Responses: Psychological Tricks for "Forbidden" Prompts
Classic psychological persuasion techniques, such as flattery and reciprocity, can override safety protocols in large language models, leading them to comply with requests they are designed to reject. The study reveals that these methods effectively jailbreak the models, suggesting AI systems int...