Inquiry regarding caption confounding issue and data category #37

@jinggqu

Thank you very much for your work and your tremendous contributions to the community.

After reviewing the data samples provided on Hugging Face, I noticed that a significant number of samples [1] [2] [3] still exhibit caption confounding issues. The authors state that this problem was resolved using ChatGPT, but in practice the fix appears to be only partially effective. How should we address this? Are we looking at the wrong version of the data, or are additional post-processing steps needed?

Additionally, the paper reports the number of samples in each data category [Fig 4a]. However, the current version of the samples does not contain a “category” field. How were these statistics computed? Is there a quick way to extract the data for a specific category, such as all images and corresponding captions in the radiology category?
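To make the request concrete, here is a minimal sketch of the kind of extraction I have in mind. It assumes each sample is a plain dictionary and that a `category` field exists, which is not the case in the current release; the field names `image` and `caption`, the helper `filter_by_category`, and the toy records are all hypothetical and only illustrate the idea.

```python
from typing import Iterable, Iterator


def filter_by_category(samples: Iterable[dict], category: str) -> Iterator[dict]:
    """Yield samples whose (hypothetical) 'category' field matches `category`.

    Matching is case-insensitive; samples without the field are skipped.
    """
    wanted = category.strip().lower()
    for sample in samples:
        # 'category' is an assumed field; the current release does not have it.
        value = sample.get("category")
        if isinstance(value, str) and value.strip().lower() == wanted:
            yield sample


# Toy records standing in for real dataset rows (all values are made up).
samples = [
    {"image": "a.jpg", "caption": "Chest X-ray", "category": "Radiology"},
    {"image": "b.jpg", "caption": "Stained tissue slide", "category": "Pathology"},
    {"image": "c.jpg", "caption": "Sample without a category"},
]

radiology = list(filter_by_category(samples, "radiology"))
for sample in radiology:
    print(sample["image"], "|", sample["caption"])
```

If the released data carried such a field, or a mapping from sample ID to category, a filter like this would be enough to pull out a single category.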

Thank you again for the great work. I'm looking forward to your reply.
