-
Notifications
You must be signed in to change notification settings - Fork 17
Improvements to archives handling #74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
gbrls
commented
Oct 26, 2023
- Added infer as a dependency so the inference of the archive type (zip, gzip) doesn't rely on the file extension.
- Added option to extract the archive to a specific directory.
- The raw archive is deleted after it's extracted. (This may cause problems, in my testings the archive are downloaded again if they're deleted)
- added infer as dependency to check for zip and gz archives. - added option to extract to a configurable directory. - after an archive is extracted, the compressed file is deleted.
This reverts commit 31c43b8.
Yea this probably break some assumptions, but I think we can fix that. If we keep track of the extraction path in the metadata we could avoid downloading again when the extraction path exists. |
|
I found a bug in my implementation, it's still redownloading the full archive. |
epwalsh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you're still working on this but a couple general comments
| let mut existing_meta: Vec<Meta> = vec![]; | ||
| let glob_string = format!( | ||
| "{}.*.meta", | ||
| "{}*.meta", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this was intentional or not, but the glob pattern before wasn't matching against the .meta files. Changing to this fixed it, and the files stopped being downloaded again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, that looks like a bug. Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is super important to fix for offline mode, hope this PR can pass soon
epwalsh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to be careful not to break existing workflows. I think removing the compressed cached file for remote resources is fine (when extract=true), but when the resource is a local path we shouldn't remove that file. I think this would be a quick fix, just need to add an argument to extract_archive that specifies whether to keep or delete that file, then set that accordingly in Cache::cached_path_with_options().
Also looks like there's a couple CI failures that should be easy to fix. Let me know if you have trouble there.
Other than that looks good to me!
| let mut existing_meta: Vec<Meta> = vec![]; | ||
| let glob_string = format!( | ||
| "{}.*.meta", | ||
| "{}*.meta", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, that looks like a bug. Good catch!
|
Makes total sense to me! I'll be back later with those changes, thank you for the reviews and guidance. |
|
This is related to #68, right? What's missing for this to work? I was trying to download a |