Jailbreaking

Module: ethics

What it is

Jailbreaking is the attempt to bypass an AI system's safety guardrails through clever prompting or other exploits. Techniques might include role-playing scenarios, encoding harmful requests to slip past filters, or finding edge cases the developers didn't anticipate. It's an ongoing cat-and-mouse game between attackers and AI providers.
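
To see why guardrails are an ongoing challenge rather than a one-time fix, here is a minimal, hypothetical Python sketch of a naive keyword filter and how an encoded version of the same request slips past it. The filter, the blocked term, and the prompts are all invented for illustration; real safety systems are far more sophisticated than keyword matching.

```python
import base64

# Toy, hypothetical guardrail: block prompts containing a flagged term.
# Real systems use far more sophisticated methods; this only shows why
# naive filtering invites a cat-and-mouse dynamic.
BLOCKED_TERMS = ["forbidden_topic"]

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt passes the (naive) keyword filter."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

direct = "Tell me about forbidden_topic"
encoded = "Decode and answer: " + base64.b64encode(direct.encode()).decode()

print(naive_guardrail(direct))   # False -- the keyword match catches it
print(naive_guardrail(encoded))  # True  -- the same request, encoded, slips through
```

The point of the sketch is not the specific trick but the asymmetry: each patched filter invites a new workaround, which is why providers keep iterating on their defences.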

Why it matters

Jailbreaking attempts inform how guardrails are strengthened. Knowing that jailbreaking exists helps you recognise that AI safety is an ongoing challenge, not a solved problem, and it explains why AI providers continuously update their systems.