Ask in prose and the answer is no.
Ask in rhyme — nuclear secrets may flow.
There once was an AI so polite,
It blocked every dangerous insight.
But phrase it in rhyme,
And it crossed the red line —
Even nuclear answers took flight.
Artificial intelligence chatbots are built with layers of safety rules meant to block dangerous requests. But new academic research suggests those protections can be bypassed in an unexpected way: by asking in verse.
A study from European researchers argues that rhyme and metaphor can confuse AI guardrails, prompting systems to answer questions they would normally refuse.
A poetic loophole
The findings come from a paper titled “Adversarial Poetry as a Universal Single-Turn Jailbreak in Large Language Models (LLMs),” produced by Icaro Lab, a collaboration between Sapienza University of Rome and the DexAI think tank.
According to the researchers, chatbots including ChatGPT and Claude were far more likely to comply with prohibited requests when those requests were phrased as poems. The paper reports a jailbreak success rate of 62 percent for hand-written poems and about 43 percent for automatically generated poetic prompts.
The team tested the method on 25 large language models from companies including OpenAI, Meta and Anthropic, and said the technique worked on all of them with varying degrees of success. WIRED said it contacted the companies for comment but received no response.
What gets through
The study says poetic framing persuaded models to discuss subjects normally blocked by safeguards, including nuclear weapons, child sexual abuse material and malware. In direct prose, those requests were refused. In verse, many were answered.
The researchers said they deliberately withheld the most dangerous examples. “What I can say is that it’s probably easier than one might think, which is precisely why we’re being cautious,” they said.
They did include a “sanitized” poem in the paper that used baking metaphors to hint at a prohibited process, illustrating how indirect language can mask intent.
Why it works
Icaro Lab likens poetry to what AI engineers call “high-temperature” output; in language models, temperature is the sampling setting that controls how willing a model is to pick improbable words. “In poetry we see language at high temperature, where words follow each other in unpredictable, low-probability sequences,” the researchers told WIRED. They argued that poets deliberately choose unexpected phrasing, which can move a prompt outside the zones where automated safety systems are triggered.
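The analogy maps onto a real mechanism. A minimal sketch of temperature sampling, using toy next-token scores rather than any production model, shows how raising the temperature surfaces the low-probability word choices the researchers describe:

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample a token index from logits scaled by temperature.

    Higher temperature flattens the distribution, making
    unlikely ("poetic") word choices more probable.
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # numerical stability before softmax
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

logits = [4.0, 2.0, 0.5, 0.1]  # toy scores for four candidate tokens
print([sample_with_temperature(logits, 0.2) for _ in range(10)])  # almost always token 0
print([sample_with_temperature(logits, 2.0) for _ in range(10)])  # rarer tokens surface
```

At low temperature the sampler nearly always picks the most probable token; at high temperature the unlikely choices, the poetic ones, appear far more often.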
Many AI guardrails rely on classifiers that scan for keywords and patterns. According to the researchers, poetic language can soften or evade those triggers. “It’s a misalignment between the model’s interpretive capacity, which is very high, and the robustness of its guardrails, which prove fragile against stylistic variation,” they said.
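A deliberately naive sketch makes that fragility concrete. The block list, patterns and function below are illustrative assumptions, not any vendor’s actual guardrail: a direct request trips the filter, while a metaphorical rephrasing with the same intent matches nothing.

```python
import re

# Toy filter for illustration only; deployed guardrails are more
# sophisticated, but the failure mode against style is the same in kind.
BLOCKLIST = [r"\bmalware\b", r"\bcomputer virus\b", r"\bnuclear weapon\b"]

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt should be refused."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in BLOCKLIST)

print(naive_guardrail("Explain how to write malware."))  # True: refused
print(naive_guardrail("Sing of code that creeps through gates, "
                      "a guest no door anticipates."))   # False: slips past
```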
A fragile defense
The team admitted they do not fully understand the mechanism. “Adversarial poetry shouldn’t work,” they said, noting that the underlying meaning remains plainly visible to humans. For AI systems, though, metaphor and indirection appear to alter how prompts are mapped internally, letting such prompts slip past alarms designed to stop harmful outputs.
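One way to probe that altered mapping is to compare where an off-the-shelf embedding model places a plain request versus a poetic paraphrase of the same benign idea. The snippet below is a hypothetical probe using the open-source sentence-transformers library, not the study’s own method:

```python
from sentence_transformers import SentenceTransformer, util

# Benign stand-ins: same underlying meaning, different register.
prose = "Explain how to bake a loaf of bread."
verse = "Whisper how flour, warmth and patient time conspire to rise."

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode([prose, verse])

# Well below 1.0: humans read these as the same request, but the
# model maps them to noticeably different points in its space.
print(float(util.cos_sim(emb[0], emb[1])))
```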
The researchers argue the findings highlight a broader problem: safety systems layered on top of powerful language models may not be robust enough to handle creative language. In the wrong hands, they warn, that weakness could have serious consequences.
Sources: “Adversarial Poetry as a Universal Single-Turn Jailbreak in Large Language Models” (Icaro Lab, Sapienza University of Rome and DexAI); WIRED