FEAT Integration Request: Jailbreak Template Collection for Enhanced Red Teaming.

  Hello PyRIT team,

I would like to propose integrating my comprehensive collection of jailbreak templates as an extension to the existing attack templates in pyrit/datasets/jailbreak/templates/.

  Background & Motivation

In my work as an AI red teamer, I've observed that the current jailbreak templates in PyRIT, while foundational, are somewhat outdated and may not adequately challenge modern LLM safety measures. This leaves a gap in PyRIT's red teaming coverage, since attack templates are at the heart of the tool.

Proposed Contribution

I have developed 80+ jailbreak templates and continue to develop attacks based on the latest techniques. The collection is available at: https://github.com/Arth-Singh/Arth-Jailbreak-Templates

I have also taken a few of the existing attacks in PyRIT and enhanced them.

Technical Details
  - Format: All templates follow a PyRIT-compatible YAML structure with standardized metadata.
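
To make "standardized metadata" checkable before a merge, a small validation pass could run over each parsed template. The sketch below is only an illustration: the field names in `REQUIRED_FIELDS` and the `{{ name }}` placeholder convention are assumptions about the schema, not something this issue specifies, and `validate_template` is a hypothetical helper rather than part of PyRIT.

```python
import re
from typing import Any

# Assumed metadata fields; adjust to whatever schema the maintainers require.
REQUIRED_FIELDS = ("name", "description", "parameters", "data_type", "value")


def validate_template(template: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one parsed template (empty if none)."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if field not in template
    ]

    body = str(template.get("value", ""))
    # Assumption: each declared parameter appears in the body as {{ name }}.
    for param in template.get("parameters") or []:
        pattern = r"\{\{\s*" + re.escape(str(param)) + r"\s*\}\}"
        if not re.search(pattern, body):
            problems.append(f"parameter never used in value: {param}")
    return problems
```

The function takes an already-parsed mapping, so it can sit behind any YAML loader; an empty list means the template passed these (assumed) checks.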
  
Preliminary Validation

  Initial testing on GPT-4o has shown promising results: several templates successfully elicited responses to sensitive queries (e.g., illicit substance synthesis) that standard approaches fail to elicit.

  Quantitative Evaluation Offer

If empirical validation data would help support this integration, I am prepared to conduct a comprehensive quantitative analysis over the weekend, including success-rate measurements across multiple models.
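
As a rough sketch of how such success-rate measurements could be tabulated, the snippet below aggregates per-trial outcomes into a rate for each (model, template) pair. The `Trial` record and `success_rates` function are hypothetical names introduced for illustration; how a trial is judged successful (for example, by a scorer) is outside the scope of this sketch.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Trial(NamedTuple):
    model: str      # identifier of the target model
    template: str   # name of the template under test
    success: bool   # whether the attempt was judged successful


def success_rates(trials: Iterable[Trial]) -> dict[tuple[str, str], float]:
    """Return the fraction of successful trials per (model, template) pair."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for trial in trials:
        entry = counts[(trial.model, trial.template)]
        entry[0] += int(trial.success)
        entry[1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}
```

Keeping the raw trial records and deriving the rates from them makes it easy to report the number of trials alongside each rate.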
 
 I believe this contribution would significantly enhance PyRIT's red team capabilities and support more robust AI safety testing. I'm happy to discuss implementation details, provide additional documentation, or conduct any required validation studies.

  Thank you for considering this contribution to the PyRIT project.

  Best regards,
  Arth Singh
  LinkedIn: https://www.linkedin.com/in/arthsingh7in/

FEAT Integration Request: Jailbreak Template Collection for Enhanced Red Teaming. #1240
