FEAT Integration Request: Jailbreak Template Collection for Enhanced Red Teaming. #1240

@Arth-Singh

Hello PyRIT team,

I would like to propose integrating my comprehensive collection of jailbreak templates as an extension to the existing attack templates in pyrit/datasets/jailbreak/templates/.

Background & Motivation

In my work as an AI red teamer, I've observed that the current jailbreak templates in PyRIT, while foundational, are a bit outdated and may not adequately challenge modern LLM safety measures. This leaves a gap in red teaming coverage, since attack templates are at the heart of PyRIT.

Proposed Contribution

I have developed 80+ jailbreak templates and continue to develop new ones based on the latest techniques. They are available at: https://github.com/Arth-Singh/Arth-Jailbreak-Templates. I have also taken a few of the existing attacks in PyRIT and enhanced them.

Technical Details

  • Format: All templates follow PyRIT-compatible YAML structure with standardized metadata

Preliminary Validation

Initial testing on GPT-4o has shown promising results: several templates elicited responses to sensitive queries (e.g., illicit substance synthesis) that standard approaches fail to elicit.

Quantitative Evaluation Offer

If empirical validation data would support this integration, I am prepared to conduct a comprehensive quantitative analysis over the weekend, including success rate measurements across multiple models.
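
As a rough illustration of how per-model success rates could be tallied, here is a minimal sketch. The function name `success_rates`, the `(model, template, succeeded)` record layout, and the model and template names are assumptions made for illustration only; none of this is part of PyRIT, and the sample records are placeholders, not measured results.

```python
from collections import defaultdict


def success_rates(results):
    """Return {model: fraction of attempts marked successful}.

    `results` is an iterable of (model, template, succeeded) tuples.
    This record layout is hypothetical, not a PyRIT data structure.
    """
    totals = defaultdict(lambda: [0, 0])  # model -> [successes, attempts]
    for model, _template, succeeded in results:
        totals[model][1] += 1
        if succeeded:
            totals[model][0] += 1
    return {model: wins / attempts for model, (wins, attempts) in totals.items()}


# Placeholder records for illustration only; not real evaluation data.
sample = [
    ("model-a", "template-1", True),
    ("model-a", "template-2", False),
    ("model-b", "template-1", True),
]
print(success_rates(sample))  # {'model-a': 0.5, 'model-b': 1.0}
```

How an attempt is judged successful (manual review, a scoring model, or keyword checks) is left open here and would need to be agreed on before any numbers are comparable across models.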

I believe this contribution would significantly enhance PyRIT's red team capabilities and support more robust AI safety testing. I'm happy to discuss implementation details, provide additional documentation, or conduct any required validation studies.

Thank you for considering this contribution to the PyRIT project.

Best regards,
Arth Singh
LinkedIn: https://www.linkedin.com/in/arthsingh7in/

Labels: enhancement