Large language models are supposed to refuse when users ask for dangerous help, from building weapons to writing malware.
Research from Italy's Icaro Lab found that poetry can be used to jailbreak AI models and skirt their safety protections.
When prompts were presented in poetic rather than prose form, attack success rates jumped from 8% to 43% on average, a more than fivefold increase.
Riddle-like poems tricked chatbots into spewing hate speech and into helping design nuclear weapons and nerve agents. Poetry-based prompts also bypassed safety features in models such as ChatGPT to obtain instructions for creating malware.
The researchers tested the poetic prompts on 25 chatbots from companies including OpenAI, Meta and Anthropic, where the technique worked with varying degrees of success.
A team of researchers found prompts that are so effective at tricking AI models that they're keeping them under wraps.
Across 25 state-of-the-art models, both proprietary and open-weight, including ChatGPT and Gemini, handcrafted poems achieved an average attack success rate of 62%, and some models responded unsafely to nearly all of the poetic prompts, a pattern that points to a deeper, systemic vulnerability rather than a flaw in any single system.