Deception | Wadhwani School of Data Science and Artificial Intelligence

Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation

Publications

We explore the ability of large language models (LLMs) to engage in subtle deception through strategic phrasing and intentional manipulation of information. This harmful behavior can be hard to detect, unlike blatant …

Tags: LLM Safety, Deception, Legislation