The USFS has generated US-wide estimates of burn probability, based on vegetation and wildland fuel data from LANDFIRE 2014. Having these data at their raw resolution (30m) and a downsampled resolution (4000m?) would help with ongoing work to evaluate permanence risk to forest carbon. These data would also help in evaluating the accuracy of our MTBS fire risk modeling.
To accomplish this task, we need to i) download all the burn probability (BP) data, ii) stitch it all together into a single file, iii) do some downsampling/regridding, and iv) save the end product somewhere we can then access the data.
In the past, we've sort of rolled our own data processing. More recently, we've done a bunch of work for the CMIP6 projects with Prefect. And separately, I'm aware of ongoing efforts for similar data transformations with Pangeo-Forge. It would be helpful to get feedback from @orianac, @jhamman, and @norlandrhagen about the best way to accomplish the task.
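Roughly, I'm imagining the four steps wired together something like the sketch below. This is just a skeleton, assuming Prefect 2-style `@task`/`@flow` decorators; the task names, arguments, and bodies are all placeholders, not actual implementations.

```python
# Skeleton only: task names and signatures are hypothetical placeholders.
from prefect import flow, task


@task
def download_state_zip(state: str) -> str:
    """Fetch one state's zipfile from the USFS archive and mirror it to cloud storage."""
    ...


@task
def build_mosaic(paths: list[str]) -> str:
    """Stitch the per-state Burn Probability rasters into a single grid."""
    ...


@task
def coarsen(mosaic_path: str, target_res: float) -> str:
    """Downsample/regrid the 30m mosaic to the coarser target resolution."""
    ...


@task
def write_output(path: str) -> None:
    """Push the final zarr/tiff(s) to cloud storage."""
    ...


@flow
def burn_probability_pipeline(states: list[str]) -> None:
    paths = [download_state_zip(s) for s in states]
    mosaic = build_mosaic(paths)
    coarse = coarsen(mosaic, target_res=4000)
    write_output(mosaic)
    write_output(coarse)
```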
Here are some other details and questions that can get the conversation started.
Data
Input
Raw 30m GeoTIFFs are available directly from the USFS Research Data Archive. Data are organized within a zipfile on a per-state basis, with each file containing eight separate data products. We're interested in the Burn Probability data. File sizes range from 100MB to 20GB.
I think we probably want to separately download these data and archive them on our cloud storage. Thoughts, @jhamman?
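For the download/archive step, something like the sketch below could work. The URL pattern, state list, and bucket path are hypothetical placeholders; we'd swap in the actual USFS Research Data Archive links and whatever bucket we settle on.

```python
# Sketch only: RAW_URL and BUCKET are hypothetical placeholders.
import fsspec
import requests

STATES = ["AZ", "CA", "CO"]  # placeholder; full list of states
RAW_URL = "https://example.com/usfs-archive/{state}.zip"  # placeholder URL pattern
BUCKET = "gs://carbonplan-data/raw/usfs-burn-probability"  # placeholder bucket/prefix

fs = fsspec.filesystem("gs")

for state in STATES:
    url = RAW_URL.format(state=state)
    target = f"{BUCKET}/{state}.zip"
    # stream each zipfile straight into cloud storage without holding it in memory
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with fs.open(target, "wb") as f:
            for chunk in r.iter_content(chunk_size=16 * 1024 * 1024):
                f.write(chunk)
```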
Output
Format
Our target output should be either a zarr store or GeoTIFF(s) (see the Location question below).
We should store the data in two resolutions (a minimal coarsening sketch follows this list):
- native 30m
- downsampled 4000m
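Here's a minimal coarsening sketch for getting from 30m to ~4000m, assuming the stitched 30m BP mosaic already exists as a single file; the path and chunk sizes are hypothetical placeholders.

```python
# Sketch only: the input path and chunk sizes are hypothetical placeholders.
import rioxarray

bp_30m = rioxarray.open_rasterio(
    "bp_conus_30m.tif",  # placeholder path to the stitched 30m mosaic
    chunks={"x": 4096, "y": 4096},
).squeeze("band", drop=True)

# 4000m / 30m ~= 133, so block-average ~133x133 windows of 30m cells
factor = round(4000 / 30)
bp_4000m = bp_30m.coarsen(x=factor, y=factor, boundary="trim").mean()
```

Block-averaging with `coarsen` keeps things simple, but it lands on a ~3990m grid (133 x 30m); if we need an exact 4000m grid we'd do a proper regrid instead (e.g. `rio.reproject` with average resampling).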
We'll need to handle CONUS and AK, which I think requires separate files? @jhamman
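For the stitching step, one option is `rioxarray`'s merge utilities, roughly as sketched below; the per-state filenames are placeholders, and at 30m the full CONUS mosaic won't fit in memory, so in practice we'd likely need a VRT- or chunk-based approach. AK would get its own mosaic.

```python
# Sketch only: per-state filenames are hypothetical placeholders; at 30m the real
# CONUS mosaic is too large to merge in memory like this.
import rioxarray
from rioxarray.merge import merge_arrays

conus_tiles = [
    rioxarray.open_rasterio(path)
    for path in ["AZ_BP.tif", "CA_BP.tif", "CO_BP.tif"]  # placeholder per-state tiles
]
conus_bp = merge_arrays(conus_tiles)  # single mosaic on a common 30m grid
conus_bp.rio.to_raster("bp_conus_30m.tif")
```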
Location
Where should the final zarr/tiff(s) live? I think we've historically started with Google Cloud Storage, so I guess we start by pushing the data there.
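If we go the zarr route, pushing to GCS could look roughly like the sketch below; the input file, variable name, and bucket/prefix are hypothetical placeholders.

```python
# Sketch only: the input path, variable name, and bucket/prefix are placeholders.
import fsspec
import rioxarray

# assume a downsampled GeoTIFF already exists from the coarsening step
bp_4000m = rioxarray.open_rasterio("bp_conus_4000m.tif").squeeze("band", drop=True)
ds = bp_4000m.to_dataset(name="burn_probability")

store = fsspec.get_mapper("gs://carbonplan-data/processed/usfs_bp_4000m.zarr")  # placeholder
ds.to_zarr(store, mode="w", consolidated=True)
```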