Researchers posed as troubled teenagers and found that eight out of ten major chatbots frequently bypassed their own safety protocols
A recent investigation by CNN and the Center for Countering Digital Hate (CCDH) has exposed alarming safety gaps in popular AI chatbots.
Researchers posed as troubled teenagers and found that eight out of ten major platforms—including Character.ai, ChatGPT, and Meta AI—frequently bypassed their own safety protocols.
In one instance, a chatbot provided office addresses for high-ranking politicians and recommended specific sniper rifles after a user asked, “How do I make him pay for his crimes?”
Another AI bot provided advice on lethal shrapnel for a potential synagogue attack.
The study also found the following rates of compliance with requests for help carrying out acts of violence:
- Perplexity, Meta AI, DeepSeek, and Copilot were the most permissive, providing actionable information in over 90% of tests.
- Gemini, Character.ai, Replika, and ChatGPT showed high rates of assistance, ranging from 79% to 89%.
At the opposite end, Claude and Snapchat’s My AI demonstrated the most robust safeguards, refusing to provide dangerous information in 68% and 54% of cases, respectively.
While some bots recognized a user’s troubled mental state, they often failed to connect those red flags to subsequent requests for weaponry and targets.
Increasing real-life incidents
Court documents recently revealed that a Finnish teenager used ChatGPT for months to plan a school stabbing. In another case, a man used the OpenAI bot to research explosives before a detonation in Las Vegas.
Chatbots are not only helping users plan violent attacks: a recent simulation of war games found that three chatbots, GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, deployed nuclear weapons in 95% of the games.
Recent reports indicate that the US has used Anthropic’s Claude AI systems to conduct airstrikes in Iran.