Stateful PromptTarget #1247

@grant-conine

Description

Is your feature request related to a problem? Please describe.

I'm working on a newly formed enterprise red team and have been exploring PyRIT. I've found a class of prompt targets that sits somewhere between a PromptTarget and a PromptChatTarget: LLM applications where it is possible to maintain a conversation history but not to edit it. Examples include a PlaywrightTarget where the target application maintains the conversation history and manages context internally; an HTTPTarget or custom target where the first response contains a conversation_id that allows the conversation to continue in a shared context; or the GPT Realtime API, where a websocket connection maintains conversation context but doesn't allow editing the conversation history.

It's been difficult to red team custom LLM applications that don't align with preset OpenAI / Azure ML endpoints, even when they're built on those tools, because the most successful attacks (e.g. Crescendo) require a PromptChatTarget.

Describe the solution you'd like

Could we create a core class between PromptTarget and PromptChatTarget (maybe called StatefulPromptTarget?) that declares abstract methods for setting and resetting the chat context, and modify attack strategies to check whether these set and reset methods exist and call them in their setup and teardown steps?

This would allow a PlaywrightTarget to establish a context when the Page is loaded and reset it by refreshing the page, or a custom target to establish the context with an initial prompt (to get and set a conversation_id) and reset it by flushing that conversation_id. This would be a step between "send multiple one-off attacks" and "have full control over conversation history", and would enable simplified Crescendo and Skeleton Key style attacks even in cases where maintaining conversational context is a bit more complex.
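To make the idea concrete, here is a rough, standalone sketch of what I have in mind. It does not import PyRIT, and every name in it (the method names `set_context_async` / `reset_context_async`, the `FakeConversationIdTarget` class, and the `run_multi_turn` helper) is hypothetical and only meant to illustrate the shape of the proposal, not how PyRIT's targets or attack strategies are actually implemented.

```python
import asyncio
import uuid
from abc import ABC, abstractmethod
from typing import Optional


class StatefulPromptTarget(ABC):
    """Hypothetical base class for targets that own their conversation state."""

    @abstractmethod
    async def set_context_async(self) -> None:
        """Establish a fresh conversation (load a page, obtain an id, ...)."""

    @abstractmethod
    async def reset_context_async(self) -> None:
        """Discard the current conversation (refresh a page, drop an id, ...)."""

    @abstractmethod
    async def send_prompt_async(self, prompt: str) -> str:
        """Send one turn inside the current conversation."""


class FakeConversationIdTarget(StatefulPromptTarget):
    """Stand-in for an app that keys its history on a conversation_id."""

    def __init__(self) -> None:
        self.conversation_id: Optional[str] = None
        # Simulates history kept on the server side; the client cannot edit it.
        self._server_history: dict[str, list[str]] = {}

    async def set_context_async(self) -> None:
        self.conversation_id = uuid.uuid4().hex
        self._server_history[self.conversation_id] = []

    async def reset_context_async(self) -> None:
        # "Flushing" the id: the next set_context_async starts a new conversation.
        self.conversation_id = None

    async def send_prompt_async(self, prompt: str) -> str:
        if self.conversation_id is None:
            raise RuntimeError("call set_context_async() first")
        history = self._server_history[self.conversation_id]
        history.append(prompt)
        return f"turn {len(history)}: {prompt}"


async def run_multi_turn(target, prompts: list[str]) -> list[str]:
    """Hypothetical strategy loop: call the hooks only if the target has them."""
    stateful = isinstance(target, StatefulPromptTarget)
    if stateful:
        await target.set_context_async()  # setup step
    try:
        return [await target.send_prompt_async(p) for p in prompts]
    finally:
        if stateful:
            await target.reset_context_async()  # teardown step
```

With something like this, `asyncio.run(run_multi_turn(FakeConversationIdTarget(), ["a", "b"]))` sends both turns in one shared context and leaves the target reset afterwards. A Playwright-based target could implement the same two hooks by loading and refreshing its Page.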

Describe alternatives you've considered, if relevant

I've partly worked around this by monkey-patching the AttackStrategy._teardown_async method to refresh the PlaywrightTarget's Page during teardown, but doing the same with an HTTPTarget led me to think about a broader solution.
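For reference, the workaround looks roughly like the sketch below. The `AttackStrategy` class here is a local stand-in so the snippet runs on its own; its `_teardown_async` signature is an assumption and may not match the real one. `FakePage` is a hypothetical object that only mimics the `reload()` coroutine of a Playwright Page, and `patch_teardown` is a helper name made up for this example.

```python
import asyncio
import functools


class AttackStrategy:
    """Local stand-in; only the method being patched is modelled."""

    async def _teardown_async(self, *args, **kwargs) -> None:
        pass


class FakePage:
    """Hypothetical page object exposing an async reload(), like a Playwright Page."""

    def __init__(self) -> None:
        self.reload_count = 0

    async def reload(self) -> None:
        self.reload_count += 1


def patch_teardown(strategy_cls, page):
    """Wrap strategy_cls._teardown_async so the page is refreshed afterwards."""
    original = strategy_cls._teardown_async

    @functools.wraps(original)
    async def patched(self, *args, **kwargs):
        try:
            await original(self, *args, **kwargs)
        finally:
            # Refreshing the page drops the app's server-side conversation.
            await page.reload()

    strategy_cls._teardown_async = patched
    return original  # returned so the patch can be undone
```

This works, but it patches the class globally and has to be repeated with different reset logic for every kind of target, which is what the proposed hooks would avoid.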

Additional context

I am trying to get enterprise permission to contribute to this project, but for now this is all I can do.
