Skip to content

Conversation

@rlundeen2
Copy link
Contributor

@rlundeen2 rlundeen2 commented Dec 22, 2025

This PR adds simulated_conversation, which allows the creation of prepended_conversations using an adversarial chat. It also adds context of prepended_conversations to the adversarial conversations (which was lost before).

Early example usage might be something like:

  • call generate_simulated_conversation_async with objective
  • use this to set prepended_conversation and next_turn to Crescendo or TAP, making these attacks run much faster

One issue when we added prepended_conversation to multi turn is that we lost context for the adversarial_chat conversation. This PR adds that context back so the adversarial_chat can make better decisions.

Future work:

  • Add generated_assistant role so we can better distinguish what actually comes from the objective target
  • Update AttackExecutor to be able to run simulated_conversation. This will allow seamlessly doing things like simulating the first 4 crescendo turns.
  • Generalize Fictional Scenario/Role play to wrap this (this also likely needs a few more changes to work well)

Tests:

  • New unit tests added

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants