Fiboa datasets are currently published as .parquet files of widely varying sizes: some are hundreds of MB, others are many GB. This makes further processing difficult, because each file needs a different amount of memory to process. Splitting the data into multiple files of similar size could help here.
Some of the larger fiboa datasets (e.g. japan) take ~120-150 GB of memory to process, which becomes quite expensive and unwieldy. It probably makes sense to roll out this change along with the move to a dedicated source repo: fiboa/data#51
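
To make the chunking idea a bit more concrete, here is a minimal sketch of how output files of similar size could be planned. It is not an existing fiboa feature: the function name `plan_chunks`, the `target_size` parameter, and the example sizes are all hypothetical. It assumes the per-row-group sizes of a dataset are already known (for example, read from the Parquet file metadata) and greedily groups consecutive row groups until a target size is reached.

```python
from typing import Iterable, List


def plan_chunks(row_group_sizes: Iterable[int], target_size: int) -> List[List[int]]:
    """Greedily group consecutive row groups into chunks of roughly target_size.

    Returns a list of chunks; each chunk is a list of row-group indices.
    Sizes and target_size must use the same unit (e.g. MB).
    A row group larger than target_size ends up in a chunk of its own.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")

    chunks: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(row_group_sizes):
        # Close the current chunk if adding this row group would exceed the target.
        if current and current_size + size > target_size:
            chunks.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


# Hypothetical row-group sizes (in MB) for one large dataset file.
sizes_mb = [300, 300, 500, 200, 900, 100]
plan = plan_chunks(sizes_mb, target_size=1000)
print(plan)
```

Each inner list would then be written out as one file, so that every published file needs a comparable amount of memory to process. The greedy approach keeps rows in their original order; it does not try to find an optimal packing.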