The repo to host an end-to-end workflow to run GenCast with GEOS-FP

wmputman/GenCast_GEOS-FP

 
 

GenCast-FP end-to-end workflow

This workflow generates GenCast predictions using GEOS-FP data as input. Follow the steps below to set it up and run it. The workflow currently runs only on Discover filesystems.

Quickstart

The following command runs preprocessing, prediction, and postprocessing for a given date range using the Discover A100 systems. You will need access to a single GPU to run this workflow. Note that the following command can be run from any Discover login node.

For a single day (end_date defaults to the same day):

sbatch --partition=gpu_a100 --constraint=rome --ntasks=10 --gres=gpu:1 --mem-per-gpu=100G -t 10:00:00 -J gencast-fp --wrap="module load singularity; singularity exec --nv -B $NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm /discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest gencast-fp run --start_date 2025-11-19:12 --output_dir /discover/nobackup/jacaraba/development/GenCast_FP/tests/gencast-run"

To run for multiple past days, pass both a start and an end date:

sbatch --partition=gpu_a100 --constraint=rome --ntasks=10 --gres=gpu:1 --mem-per-gpu=100G -t 10:00:00 -J gencast-fp --wrap="module load singularity; singularity exec --nv -B $NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm /discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest gencast-fp run --start_date 2025-11-10:00 --end_date 2025-11-15:00 --output_dir /discover/nobackup/jacaraba/development/GenCast_FP/tests/gencast-run"

Example Slurm submission script:

#!/bin/bash
#SBATCH --partition=gpu_a100
#SBATCH --constraint=rome
#SBATCH --ntasks=10
#SBATCH --gres=gpu:1
#SBATCH --mem-per-gpu=100G
#SBATCH --time=10:00:00
#SBATCH --job-name=gencast-fp
#SBATCH --output=gencast-fp_%j.out
#SBATCH --error=gencast-fp_%j.err


# Load modules
source /usr/share/modules/init/bash
module purge
module load singularity

# Run the container
singularity exec --nv \
    -B $NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm \
    /discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest \
    gencast-fp run \
    --start_date 2025-11-20:00 \
    --output_dir /discover/nobackup/jacaraba/development/GenCast_FP/tests/gencast-run
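
The one-line submissions above differ only in their date arguments, so the command can also be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names build_wrap_command and build_sbatch_argv are hypothetical, while the flags, bind paths, and container path are copied from the examples above.

```python
import shlex

# Container path and bind mounts copied from the examples in this README.
CONTAINER = (
    "/discover/nobackup/projects/QEFM/containers/"
    "gencast-fp-containers/gencast-fp-latest"
)
BIND_PATHS = "$NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm"


def build_wrap_command(start_date, output_dir, end_date=None):
    """Return the shell string for sbatch --wrap (hypothetical helper)."""
    run_args = ["gencast-fp", "run", "--start_date", start_date]
    if end_date is not None:
        # Omitting --end_date runs a single day, as described above.
        run_args += ["--end_date", end_date]
    run_args += ["--output_dir", output_dir]
    # $NOBACKUP is deliberately left unquoted so a shell can expand it.
    return (
        "module load singularity; "
        f"singularity exec --nv -B {BIND_PATHS} {CONTAINER} "
        + " ".join(shlex.quote(arg) for arg in run_args)
    )


def build_sbatch_argv(wrap_command):
    """Return the sbatch argument list with the resources used in this README."""
    return [
        "sbatch", "--partition=gpu_a100", "--constraint=rome", "--ntasks=10",
        "--gres=gpu:1", "--mem-per-gpu=100G", "-t", "10:00:00",
        "-J", "gencast-fp", f"--wrap={wrap_command}",
    ]
```

Passing the list from build_sbatch_argv to subprocess.run would submit the job. In that case $NOBACKUP is expanded by the job's shell rather than at submission time, which assumes the variable is exported to the job environment.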

Making your own changes

If you need changes that are not available in the container package, we recommend that you fork this repository and expose your modified copy to the container through the PYTHONPATH variable.

Assuming you clone the software to Discover at /discover/nobackup/myusername/GenCast_FP, add the --env option to the singularity command as follows:

sbatch --partition=gpu_a100 --constraint=rome --ntasks=10 --gres=gpu:1 --mem-per-gpu=100G -t 10:00:00 -J gencast-fp --wrap="module load singularity; singularity exec --nv -B $NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm --env PYTHONPATH=/discover/nobackup/myusername/GenCast_FP /discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest gencast-fp run --start_date 2025-11-20:00 --output_dir /discover/nobackup/jacaraba/development/GenCast_FP/tests/gencast-run"
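
To confirm that the PYTHONPATH override takes effect, it can help to check which file Python would import the package from. The sketch below is an illustration only: module_origin is a hypothetical helper, and the package name gencast_fp is inferred from the directory layout /opt/GenCast_FP/gencast_fp used elsewhere in this README.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Assumed package name; adjust if your fork renames the package.
    print(module_origin("gencast_fp"))
```

Run inside the container with the same --env setting, the printed path should point into your clone rather than /opt/GenCast_FP if the override is active.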

If you get an error about a Fortran library not being available, copy the compiled extension module (the .so file) from the container into the preprocess directory of the clone where you are making your code modifications:

singularity exec --nv -B $NOBACKUP,/css,/gpfsm/dmd/css,/nfs3m,/gpfsm --env PYTHONPATH="/discover/nobackup/myusername/GenCast_FP" /discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest cp /opt/GenCast_FP/gencast_fp/preprocess/eta2xprs_.cpython-310-x86_64-linux-gnu.so /discover/nobackup/myusername/GenCast_FP/gencast_fp/preprocess
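
Whether the copy is needed can be checked before submitting a job. This is a minimal sketch with a hypothetical helper, find_compiled_extension. It assumes only the directory layout and file name pattern visible in the command above.

```python
from pathlib import Path


def find_compiled_extension(repo_root):
    """List eta2xprs_*.so files under gencast_fp/preprocess in a clone."""
    preprocess_dir = Path(repo_root) / "gencast_fp" / "preprocess"
    if not preprocess_dir.is_dir():
        return []
    # Pattern taken from the file name copied out of the container above.
    return sorted(preprocess_dir.glob("eta2xprs_*.so"))
```

An empty result for /discover/nobackup/myusername/GenCast_FP suggests the compiled file has not been copied into the clone yet.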

Dependencies

The sections below give more detail on the commands and their options. A prebuilt container is available on Discover at:

/discover/nobackup/projects/QEFM/containers/gencast-fp-containers/gencast-fp-latest

Downloading the Container

To download the container yourself, run one of the following commands. The latest tag contains the most recent changes, while each versioned tag corresponds to a specific release of the repository.

Latest Release

singularity build --sandbox gencast-fp-latest docker://nasanccs/gencast-fp:latest

Specific Version

singularity build --sandbox gencast-fp-0.2.0 docker://nasanccs/gencast-fp:0.2.0

Pipeline Details

Individual steps of the pipeline can also be run using the container and CLI. Some examples with arguments are listed below. The pipeline has three steps: preprocess, predict, and postprocess. While we advise running the full pipeline, it is sometimes easier to develop in stages.

Preprocessing

usage: gencast_fp_cli.py preprocess [-h] --start_date START_DATE --end_date END_DATE [--output_dir OUTPUT_DIR] [--expid EXPID] [--res_value RES_VALUE] [--nsteps NSTEPS]

options:
  -h, --help            show this help message and exit
  --start_date START_DATE
                        Start date to process (YYYY-MM-DD:HH)
  --end_date END_DATE   End date to process (YYYY-MM-DD:HH)
  --output_dir OUTPUT_DIR
                        Output directory for preprocessed files
  --expid EXPID         Experiment ID for the output files
  --res_value RES_VALUE
                        Resolution (default 1.0 resolution)
  --nsteps NSTEPS       Number of steps for rollout (default 30, 15 days)
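
The date format and the rollout length in the options above can be illustrated with a short sketch. It assumes a 12-hour step, which is inferred from the help text (30 steps covering 15 days) rather than stated explicitly; parse_cli_date and rollout_end are hypothetical helpers, not functions from the package.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d:%H"  # YYYY-MM-DD:HH, as in the help text
STEP_HOURS = 12  # assumption: 30 steps over 15 days implies 12 h per step


def parse_cli_date(text):
    """Parse a date string such as 2025-11-19:12."""
    return datetime.strptime(text, DATE_FORMAT)


def rollout_end(start, nsteps=30):
    """Return the valid time of the last rollout step."""
    return start + timedelta(hours=STEP_HOURS * nsteps)
```

Under that assumption, a forecast started at 2025-11-19:12 with the default 30 steps ends at 2025-12-04 12:00.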

Prediction

usage: gencast_fp_cli.py predict [-h] --start_date START_DATE --end_date END_DATE --input_dir INPUT_DIR --output_dir OUTPUT_DIR [--ckpt CKPT] [--nsteps NSTEPS] [--res RES] [--ensemble ENSEMBLE]
                                 [--container_meta CONTAINER_META]

options:
  -h, --help            show this help message and exit
  --start_date START_DATE
                        YYYY-MM-DD
  --end_date END_DATE   YYYY-MM-DD
  --input_dir INPUT_DIR, -i INPUT_DIR
                        Preprocessed input directory
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Where to write predictions
  --ckpt CKPT           Path to GenCast .npz checkpoint (overrides container default)
  --nsteps NSTEPS
  --res RES
  --ensemble ENSEMBLE
  --container_meta CONTAINER_META
                        Where to load default ckpt/configs if --ckpt not passed

Postprocessing

usage: gencast_fp_cli.py postprocess [-h] --start_date START_DATE --end_date END_DATE --input_dir INPUT_DIR --predictions_dir PREDICTIONS_DIR [--output_dir OUTPUT_DIR] [--no_ens_mean]

options:
  -h, --help            show this help message and exit
  --start_date START_DATE
                        Start date (YYYY-MM-DD)
  --end_date END_DATE   End date (YYYY-MM-DD)
  --input_dir INPUT_DIR
                        Directory with GEOS inputs (for initial conditions)
  --predictions_dir PREDICTIONS_DIR
                        Directory with GenCast predictions
  --output_dir OUTPUT_DIR
                        Directory for CF-compliant NetCDF outputs
  --no_ens_mean         Disable ensemble mean (keep all ensemble members)
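
The three stages feed into one another through their directory arguments. The sketch below shows one way the calls could be chained. Several points are assumptions: that the subcommands are reachable through the gencast-fp entry point used in the Quickstart, that the preprocess output directory serves as --input_dir for both later stages, and the subdirectory names. The help text also lists different date formats per stage, so check each stage's --help before reusing one date string for all three.

```python
from pathlib import Path


def stage_commands(start_date, end_date, work_dir, cli="gencast-fp"):
    """Return argv lists for preprocess, predict, and postprocess, in order."""
    work = Path(work_dir)
    # Hypothetical subdirectory names, one per stage.
    pre_dir = str(work / "preprocess")
    pred_dir = str(work / "predict")
    post_dir = str(work / "postprocess")
    dates = ["--start_date", start_date, "--end_date", end_date]
    return [
        [cli, "preprocess", *dates, "--output_dir", pre_dir],
        [cli, "predict", *dates, "--input_dir", pre_dir,
         "--output_dir", pred_dir],
        [cli, "postprocess", *dates, "--input_dir", pre_dir,
         "--predictions_dir", pred_dir, "--output_dir", post_dir],
    ]
```

Each list can be appended to the singularity exec prefix from the Quickstart to run one stage at a time.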
