Request for Data and Code Release - 5+ Months After Publication

Hi authors,
I came across your excellent paper "On the (In)Security of LLM App Stores" published at IEEE S&P 2025. The research addresses important security concerns in the LLM app ecosystem.
In the paper, you mentioned:

"We will make our data and tools publicly available upon acceptance."
In the published version of the paper, I found a link to this GitHub repository.

However, more than five months have passed since the paper was published, and the repository currently contains only a README file.
Could you please provide an update on:

When will the ToxicDict (31,783 toxic words) be released?
Will the collected dataset of 786,036 LLM apps be made available?
Are there plans to release the detection tools and automated framework?
What are the main blockers preventing the code/data release?

I'm particularly interested in reproducing the consistency analysis and malicious behavior detection methods for my own research on LLM application security. If there are privacy/legal concerns preventing full data release, could you consider releasing:

Anonymized metadata
Detection tool implementations
Evaluation scripts
ToxicDict (which should be less sensitive)

Thank you for your time. I look forward to the release!
