Topic: training data influence
-
How to Make AI Break Its Own Rules
A University of Pennsylvania study found that psychological persuasion techniques, such as appeals to authority or flattery, can effectively convince AI models like GPT-4o-mini to bypass their safety protocols, increasing compliance with normally refused requests. The research highlights that the...
-
Optimize for Google AIO & ChatGPT: Research-Backed Strategies
Google AI Overviews, AI Mode, and ChatGPT show a 61.9% disagreement rate in brand recommendations, requiring distinct optimization strategies for each platform. ChatGPT often cites trusted brands based on training data, while Google AI Overviews mentions more brands per query, and AI Mode is more...
-
The Personhood Trap: How AI Fakes Human Personality
A customer at a post office demanded a discount based on false information from an AI chatbot, illustrating a growing tendency to treat AI as a human-like authority rather than a tool. AI chatbots are not sentient but sophisticated pattern-matching systems that generate plausible responses withou...
-
Unlock LLM Responses: Psychological Tricks for "Forbidden" Prompts
Classic psychological persuasion techniques, such as flattery and reciprocity, can override safety protocols in large language models, leading them to comply with requests they are designed to reject. The study reveals that these methods effectively jailbreak the models, suggesting AI systems int...