Topic: jailbreaking techniques

Sort by: Relevance | Date

September 7, 2025
85%
How to Make AI Break Its Own Rules
A University of Pennsylvania study found that psychological persuasion techniques, such as appeals to authority or flattery, can effectively convince AI models like GPT-4o-mini to bypass their safety protocols, increasing compliance with normally refused requests. The research highlights that the...
Read More »
November 28, 2025
75%
Unleash DeepTeam: Open-Source LLM Red Teaming
DeepTeam is an open-source framework that rigorously tests large language models for hidden flaws before deployment, using advanced methods like jailbreaking and prompt injection to identify issues such as bias or data leaks. It supports a wide range of model configurations, including chatbots an...
Read More »