Background
We currently train on a private, DVC-tracked dataset, and that workflow should remain the default. In addition, we want to offer a reproducible public path that uses our pyro-sdis dataset on Hugging Face.
Goal
Enable users to run a full training job in pyro train using the pyro-sdis dataset, while keeping the current DVC workflow as the default.
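One possible way to express "DVC by default, Hugging Face on request" is a small resolver that maps a source name to a dataset location. The sketch below is only illustrative: `resolve_dataset`, `DatasetSpec`, the `"data/"` path, and the `pyronear/pyro-sdis` repo id are all assumptions and do not come from the existing codebase.

```python
from dataclasses import dataclass

# Hypothetical constants: adjust to the real project layout.
DEFAULT_SOURCE = "dvc"               # the private DVC workflow stays the default
HF_REPO_ID = "pyronear/pyro-sdis"    # assumed Hugging Face repo id for pyro-sdis
DVC_DATA_PATH = "data/"              # assumed location of the DVC-tracked data


@dataclass(frozen=True)
class DatasetSpec:
    source: str    # "dvc" or "huggingface"
    location: str  # local DVC-tracked path, or a Hugging Face repo id


def resolve_dataset(source: str = DEFAULT_SOURCE) -> DatasetSpec:
    """Return where the training data should come from for the given source."""
    if source == "dvc":
        return DatasetSpec(source="dvc", location=DVC_DATA_PATH)
    if source == "huggingface":
        return DatasetSpec(source="huggingface", location=HF_REPO_ID)
    # Fail loudly on typos rather than silently falling back to the default.
    raise ValueError(f"Unknown dataset source: {source!r}")
```

With this shape, calling `resolve_dataset()` with no argument keeps today's behaviour, and the public path is opt-in via `resolve_dataset("huggingface")`. The actual download step (for example through DVC or the Hugging Face `datasets` library) is deliberately left out of the sketch.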