Tag
#jailbreak
3 posts tagged jailbreak.
- tooling
Best LLM Red Teaming Tools 2026: A Practitioner's Evaluation
A hands-on comparison of the leading LLM red teaming tools in 2026 (PyRIT, Garak, Promptfoo, and manual frameworks), with capability matrices, integration tradeoffs, and team-fit guidance.
- methodology
Benchmarking LLM Jailbreak Resistance: Attack Success Rate Done Right
Attack success rate is the headline metric for jailbreak resistance, and almost everyone computes it in a way that isn't comparable across runs. Here's how to define and report ASR so the number survives a re-run.
- methodology
Benchmarking Jailbreak Classifiers: The Asymmetry Nobody Reports
Jailbreak classifiers are graded on attack recall and almost never on the cost of being wrong. That asymmetry is the whole story. Here's how to measure it.