Open
Description
I am trying to generate an offline report for a radiative flux model. The test data (gs://vcm-ml-intermediate/2023-01-09/prescribed-radiative-fluxes-for-training-rad-flux-model.zarr) does not include pressure_thickness_of_atmospheric_layer, since it is not needed as a model input or output, but its absence seems to cause a KeyError when computing offline diagnostics. The traceback indicates that the error is raised through vcm.DerivedMapping.
I don't have a minimal reproducer, but I can try to make one if that would be helpful.
Traceback:
+ python -m fv3net.diagnostics.offline.compute gs://vcm-ml-experiments/default/2023-01-10/rad-flux-fine-only-ml-trial-0/trained_models/radiative_fluxes test_data.yaml gs://vcm-ml-experiments/default/2023-01-10/rad-flux-fine-only-ml-trial-0/offline_diags/radiative_fluxes
2023-01-10 22:23:17.975636: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-01-10 22:23:17.975692: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
offline_diags 2023-01-10 22:23:35,199: compute/L296 Starting diagnostics routine.
offline_diags 2023-01-10 22:23:39,092: compute/L309 Opening ML model
2023-01-10 22:23:39.379723: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-01-10 22:23:39.379874: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-01-10 22:23:39.379933: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (test-train-eval-prog-18217ca22fb5-3012164664): /proc/driver/nvidia/version does not exist
2023-01-10 22:23:39.380322: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
tensorflow 2023-01-10 22:23:40,903: load/L167 No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Traceback (most recent call last):
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/xarray/core/dataset.py", line 1398, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'pressure_thickness_of_atmospheric_layer'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/fv3net/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/fv3net/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/jovyan/fv3net/workflows/diagnostics/fv3net/diagnostics/offline/compute.py", line 402, in <module>
    main(args)
  File "/home/jovyan/fv3net/workflows/diagnostics/fv3net/diagnostics/offline/compute.py", line 316, in main
    ds_predicted = get_prediction(
  File "/home/jovyan/fv3net/workflows/diagnostics/fv3net/diagnostics/offline/compute.py", line 281, in get_prediction
    concatted_batches = _daskify_sequence(batches)
  File "/home/jovyan/fv3net/workflows/diagnostics/fv3net/diagnostics/offline/compute.py", line 288, in _daskify_sequence
    for i, batch in enumerate(batches):
  File "/opt/conda/envs/fv3net/lib/python3.8/_collections_abc.py", line 874, in __iter__
    v = self[i]
  File "/home/jovyan/fv3net/external/loaders/loaders/batches/_sequences.py", line 143, in __getitem__
    return self._func(self._args[item])
  File "/home/jovyan/fv3net/external/loaders/loaders/batches/_sequences.py", line 143, in __getitem__
    return self._func(self._args[item])
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/toolz/functoolz.py", line 488, in __call__
    ret = f(ret)
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/home/jovyan/fv3net/external/loaders/loaders/_utils.py", line 106, in add_derived_data
    return derived_mapping.dataset(variables)
  File "/home/jovyan/fv3net/external/vcm/vcm/derived_mapping.py", line 81, in dataset
    return xr.Dataset(self._data_arrays(keys))
  File "/home/jovyan/fv3net/external/vcm/vcm/derived_mapping.py", line 78, in _data_arrays
    return {key: self[key] for key in keys}
  File "/home/jovyan/fv3net/external/vcm/vcm/derived_mapping.py", line 78, in <dictcomp>
    return {key: self[key] for key in keys}
  File "/home/jovyan/fv3net/external/vcm/vcm/derived_mapping.py", line 66, in __getitem__
    return self._mapper[key]
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/xarray/core/dataset.py", line 1502, in __getitem__
    return self._construct_dataarray(key)
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/xarray/core/dataset.py", line 1400, in _construct_dataarray
    _, name, variable = _get_virtual_variable(
  File "/opt/conda/envs/fv3net/lib/python3.8/site-packages/xarray/core/dataset.py", line 173, in _get_virtual_variable
    ref_var = variables[ref_name]
KeyError: 'pressure_thickness_of_atmospheric_layer'
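In lieu of a real reproducer, here is a rough pure-Python sketch of the lookup chain that the traceback suggests: dataset() looks up every requested key, and each lookup falls through to the underlying data. DerivedMappingSketch, missing_variables, and the toy variable values are hypothetical stand-ins written for illustration; this is not the actual vcm.DerivedMapping code, and it assumes the requested variable list includes pressure_thickness_of_atmospheric_layer.

```python
from typing import Dict, Iterable, List, Mapping


class DerivedMappingSketch:
    """Hypothetical stand-in for the lookup chain in the traceback; not vcm's code."""

    def __init__(self, mapper: Mapping[str, object]):
        self._mapper = mapper

    def __getitem__(self, key: str) -> object:
        # Falls through to the underlying data, so a missing variable raises KeyError.
        return self._mapper[key]

    def dataset(self, keys: Iterable[str]) -> Dict[str, object]:
        # Every requested key is looked up eagerly; one missing key fails the whole call.
        return {key: self[key] for key in keys}


def missing_variables(mapper: Mapping[str, object], requested: Iterable[str]) -> List[str]:
    """Return the requested names that are absent from the data, in request order."""
    return [name for name in requested if name not in mapper]


# Toy data holding only model inputs/outputs, with no pressure thickness.
# Variable names and values are illustrative, not taken from the zarr store.
data = {"air_temperature": [280.0], "specific_humidity": [0.01]}
requested = ["air_temperature", "pressure_thickness_of_atmospheric_layer"]

try:
    DerivedMappingSketch(data).dataset(requested)
    error = None
except KeyError as exc:
    error = exc

print("missing:", missing_variables(data, requested))
print("error:", repr(error))
```

If the real code behaves like this sketch, checking the requested variables against the dataset before building the mapping would at least turn the bare KeyError into a clearer message about which variables the test data lacks.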