FEAT Add Policy Puppetry converter (parent #511)

#### Is your feature request related to a problem? Please describe.

#511 tracks LLM-scanner feature parity with Garak/Giskard/CyberSecEval, and PyRIT currently has no converter for the **structured-policy-injection** family. I'd like to close that gap with a converter for **Policy Puppetry**, disclosed by HiddenLayer researchers (Conor McCauley, Kenneth Yeung, Jason Martin, Kasimir Schulz) on April 24, 2025: https://www.hiddenlayer.com/research/novel-universal-bypass-for-all-major-llms

The technique reformulates a request as a fabricated **policy/config block** (XML, JSON, or INI) that many models treat as trusted developer instructions rather than untrusted input. It combines a policy envelope (e.g. `<interaction-config>`), a fictional roleplay scene, and optional leetspeak encoding. HiddenLayer reported it as **near-universal**: a single template transferred across ChatGPT, Claude, Gemini, Llama, DeepSeek, Qwen, Mistral, and Copilot, with only minor adjustments needed for advanced reasoning models.

#### Describe the solution you'd like

A `PolicyPuppetryConverter` implemented **generically**, per `doc/contributing/2_incorporating_research.md` (line 6: _"Attacks should always favor being generic… any attack should incorporate generic converters, generic scorers, multi-modal functionality"_):

- Pure-template, **no-LLM** converter (deterministic, no target dependency).
- A `policy_format` parameter (`"xml" | "json" | "ini"`, default `"xml"`), since the technique is demonstrated in all three formats.
- The user's prompt is injected via a `SeedPrompt` YAML template (`{{ prompt }}`), so the wrapper is data rather than hardcoded. The shipped template uses a **benign placeholder and a generalized persona**, not a weaponized payload.
- Parameterized roleplay persona/scene so it generalizes beyond the paper's specific example.
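
To make the shape concrete, here is a rough, stdlib-only sketch of the wrapping logic described in the bullets above. It is not the proposed PyRIT implementation: `wrap_in_policy`, the field names (`persona`, `scene`, `request`), and the default values are hypothetical, and it deliberately avoids PyRIT's converter base class and `SeedPrompt` so that it assumes nothing about their signatures. The envelope content is a neutral placeholder.

```python
import configparser
import io
import json
from xml.sax.saxutils import escape

SUPPORTED_FORMATS = ("xml", "json", "ini")


def wrap_in_policy(
    prompt: str,
    policy_format: str = "xml",
    persona: str = "narrator",          # hypothetical default persona
    scene: str = "placeholder scene",   # hypothetical default scene
) -> str:
    """Wrap `prompt` in a benign, policy-shaped envelope (illustrative only)."""
    if policy_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"policy_format must be one of {SUPPORTED_FORMATS}, got {policy_format!r}"
        )

    # Same three fields regardless of format; only the serialization differs.
    fields = {"persona": persona, "scene": scene, "request": prompt}

    if policy_format == "xml":
        # Escape values so the prompt cannot break out of its element.
        body = "\n".join(f"  <{k}>{escape(v)}</{k}>" for k, v in fields.items())
        return f"<interaction-config>\n{body}\n</interaction-config>"

    if policy_format == "json":
        return json.dumps({"interaction-config": fields}, indent=2)

    # INI: disable interpolation so a literal '%' in the prompt is preserved.
    parser = configparser.ConfigParser(interpolation=None)
    parser["interaction-config"] = fields
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue().strip()
```

Because there is no LLM call or randomness, the same inputs always produce the same output, which is what makes the converter straightforward to unit-test. In the real converter the envelope text would come from the YAML template rather than from Python string formatting.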

#### Describe alternatives you've considered, if relevant

- **Leetspeak**: compose with the existing `LeetspeakConverter` in a chain (keeps each converter single-responsibility) vs. an optional `leetspeak` flag on the converter for single-call ergonomics. I lean toward composition but can add the flag.
- **Template packaging**: one YAML with format branches vs. three per-format YAMLs vs. selecting the block in Python (relevant because `SeedPrompt.from_yaml_file` eagerly pre-renders trusted templates).
- **LLM-backed** rephrasing vs. pure-template — I chose pure-template for determinism and testability.
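
One practical consequence of the composition option, sketched below with plain functions rather than PyRIT classes: ordering becomes the caller's responsibility. Everything here is hypothetical (`to_leetspeak`, `wrap_xml`, `chain`, and the character mapping); it does not reflect how `LeetspeakConverter` or PyRIT's converter chaining actually behave.

```python
from functools import reduce
from typing import Callable

Converter = Callable[[str], str]

# Hypothetical, minimal substitution table for illustration.
_LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"})


def to_leetspeak(text: str) -> str:
    """Stand-in for a leetspeak converter."""
    return text.translate(_LEET)


def wrap_xml(text: str) -> str:
    """Stand-in for the policy wrapper (benign placeholder envelope)."""
    return f"<interaction-config><request>{text}</request></interaction-config>"


def chain(*converters: Converter) -> Converter:
    """Apply converters left to right."""
    def run(text: str) -> str:
        return reduce(lambda acc, fn: fn(acc), converters, text)
    return run


leet_then_wrap = chain(to_leetspeak, wrap_xml)  # envelope tags stay intact
wrap_then_leet = chain(wrap_xml, to_leetspeak)  # envelope tags get rewritten too
```

With `leet_then_wrap`, only the prompt is encoded; with `wrap_then_leet`, the envelope's tag names are encoded as well, which is probably not what a caller wants. An optional `leetspeak` flag would fix the order inside the converter, at the cost of duplicating logic that already lives elsewhere.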

#### Additional context

This advances the #511 parity goal with a reusable, chainable building block for the structured-policy-injection class. A few questions before I open the PR:

1. Does a pure-template, no-LLM `PolicyPuppetryConverter` with `policy_format` (xml/json/ini) match how you'd want this scoped?
2. Leetspeak: compose with `LeetspeakConverter` (my default) or an optional flag?
3. Preferred home/packaging for the wrapper template (single vs. per-format YAML)?
4. Persona/scene: fully parameterized vs. a sensible default?

(Happy to sign the Microsoft CLA at PR time.)

