There should be scripts to download a configurable number of zips containing labelled data. In the overall pipeline, these will be run by whoever is training the model, so they should need no thought from the user: check which zips are already present in the target location and download only the new ones from the drive.
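The check-then-download behaviour described above could be sketched roughly as follows. This is a minimal illustration, not a finished script: `sync_zips`, `remote_names`, `dest_dir`, `fetch`, and `max_new` are all hypothetical names, the drive is hidden behind a caller-supplied `fetch` function because the note does not say which storage service or client is used, and "configurable number" is assumed here to mean a cap on new downloads per run.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def sync_zips(
    remote_names: Iterable[str],
    dest_dir: Path,
    fetch: Callable[[str, Path], None],
    max_new: int,
) -> List[str]:
    """Download up to `max_new` zips that are not yet in `dest_dir`.

    `remote_names` is the listing of zip file names on the drive, and
    `fetch(name, target)` is assumed to write that zip to `target`.
    Both are placeholders for whatever drive client is actually used.
    Returns the names that were downloaded in this run.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Zips already present locally are skipped.
    existing = {p.name for p in dest_dir.glob("*.zip")}

    # Sort so that repeated runs pick up new zips in a stable order.
    missing = [
        name
        for name in sorted(set(remote_names))
        if name.endswith(".zip") and name not in existing
    ]

    downloaded: List[str] = []
    for name in missing[: max(max_new, 0)]:
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete zip on the next run.
        partial = dest_dir / (name + ".part")
        fetch(name, partial)
        partial.replace(dest_dir / name)
        downloaded.append(name)
    return downloaded
```

Running the function a second time with the same arguments downloads nothing that is already in place, which is what lets the person training the model call it without checking the folder first.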