Running 4DVar application in JEDI (on Discover)#
Loading Modules#
To load Spack-Stack 1.9.0 modules with the GNU compiler on Discover, run:
#!/bin/bash
echo "Loading EWOK-SKYLAB Environment Using Spack-Stack 1.9.0"
# load modules
module purge
module use /discover/swdev/gmao_SIteam/modulefiles-SLES15
module use /discover/swdev/jcsda/spack-stack/scu17/modulefiles
module use /gpfsm/dswdev/jcsda/spack-stack/scu17/spack-stack-1.9.0/envs/ue-gcc-12.3.0/install/modulefiles/Core
module load stack-gcc/12.3.0
module load stack-openmpi/4.1.6
module load stack-python/3.11.7
module load singularity
# Discover compiler modules set the environment variable COMPILER; fix it for R2D2
export COMPILER=gnu
module load jedi-fv3-env
module load ewok-env
To load Spack-Stack 1.7.0 modules with the GNU compiler on Discover, run:
#!/bin/bash
echo "Loading EWOK-SKYLAB Environment Using Spack-Stack 1.7.0 GNU SCU17"
# load modules
module purge
module use /discover/swdev/gmao_SIteam/modulefiles-SLES15
module use /discover/swdev/jcsda/spack-stack/scu17/modulefiles
module load ecflow/5.11.4
module use /gpfsm/dswdev/jcsda/spack-stack/scu17/spack-stack-1.7.0/envs/ue-gcc-12.3.0/install/modulefiles/Core
module load stack-gcc/12.3.0
module load stack-openmpi/4.1.6
module load stack-python/3.10.13
# Discover compiler modules set the environment variable COMPILER; fix it for R2D2
export COMPILER=gnu
module load jedi-fv3-env
module load ewok-env
module load sp
# To build more expensive fv3-jedi (tier 2) tests
#export FV3JEDI_TEST_TIER=2
Check here for the latest Spack-Stack modules on Discover. Note that this is a JCSDA private repository.
To build the jedi-bundle, follow the instructions here.
For a (slightly) faster build, comment out these repos in jedi-bundle/CMakeLists.txt: MOM6, soca, MPAS-Model, mpas-jedi, and coupling.
YAML Structure#
The JEDI code is under active development, and some of the YAML keys used in these examples may change over time, so parts of this document may become outdated. It is important to understand both the structure of the YAML files and the meaning of the keys, and to consult the latest ctest examples in the JEDI repositories.
This section provides an overview of the different components within the 4DVar YAML files.
The directory /discover/nobackup/mabdiosk/garage/applications/var-app contains the YAML files for running different 4DVar cases, the input files, and a run script (run_4dvar.sh).
4dvar_geos-cf_fv3lm_c24_p12.yaml is an example 4DVar experiment at C24 resolution.
cost function:
cost type: 4D-Var
time window:
begin: 2021-08-05T03:00:00Z #always beginning of the window
length: PT6H
In this example, the assimilation window is from 2021-08-05 03Z to 2021-08-05 09Z.
The beginning of the time window does not depend on the DA method (cost type) and is always set to the beginning of the assimilation window.
The length of the window is typically set to PT6H (6 hours).
model:
name: FV3LM
namelist filename: input/geometry_input/input_geos_c24_p12.nml
tstep: PT15M
filetype: cube sphere history
lm_do_dyn: 1
lm_do_trb: 0
lm_do_mst: 0
model variables: &modelvars [ud,vd,ua,va,T,DELP,SPHU,qi,ql,NO2]
In 4DVar, you need to compute (or pre-compute) the model state at every tstep within the assimilation window. Ideally, you would run the full model and generate the model state at every tstep, but this can be costly. To reduce cost, you can (1) compute the model state with a simplified model such as FV3LM, or (2) read pre-computed model states using PSEUDO. In this example we are using FV3LM; see 4dvar_geos-cf_pseudo.yaml for a PSEUDO example.
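For reference, the PSEUDO option replaces the model block with one that reads pre-computed states from disk. Below is a minimal sketch, assuming the state files follow the same naming convention as the background files; the exact keys may differ by JEDI version, so check 4dvar_geos-cf_pseudo.yaml for the actual configuration.
model:
  name: PSEUDO
  filetype: cube sphere history
  datapath: input/states          # hypothetical path to pre-computed states
  filename: GCv14.0_GCMv1.17_c24.geoscf_jedi.%yyyy%mm%ddT%hh%MM%ssZ.nc4
  tstep: PT1H                     # a state file must exist at every tstep
  model variables: [ud,vd,ua,va,T,DELP,SPHU,qi,ql,NO2]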
The FV3LM model requires ud, vd, ua, va, T, DELP, SPHU, qi, and ql to be in the model variables list (and available in the background files). Trace gas and aerosol variables can be added to this list; they are treated as tracers (transported only, similar to moisture).
Linear turbulence scheme and linear moist physics are turned off.
analysis variables: [eastward_wind,
northward_wind,
air_temperature,
air_pressure_thickness,
specific_humidity,
cloud_liquid_ice,
cloud_liquid_water,
volume_mixing_ratio_of_no2]
The list of variables you want to assimilate; these will be available in the analysis output. Note that the first seven variables (everything except volume_mixing_ratio_of_no2) are required to be on the list.
geometry:
fms initialization:
namelist filename: input/geometry_input/fmsmpp.nml
akbk: input/geometry_input/akbk72.nc4
npx: 25
npy: 25
npz: 72
layout: [1,2]
field metadata override: input/geometry_input/geos_cf_ewok.yaml
This geometry section is repeated a few times in this YAML file and appears in almost all JEDI YAML files. Here, you define the geometry (grid setup) of your background (input) files; in this example the background files are at C24 resolution with 72 vertical levels.
layout is another important setting: together with the 6 cubed-sphere tiles, it determines the number of processors you must use to run the application. With a layout of [1,2] per tile, you must use 1x2x6 = 12 processors; with a layout of [2,2], you need 2x2x6 = 24 processors, as sketched below.
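For example, a sketch of the same geometry block adapted for 24 processors (remember that the namelist filename under model must also match the new resolution and layout):
geometry:
  fms initialization:
    namelist filename: input/geometry_input/fmsmpp.nml
  akbk: input/geometry_input/akbk72.nc4
  npx: 25
  npy: 25
  npz: 72
  layout: [2,2]      # 2 x 2 ranks per tile x 6 tiles = 24 MPI tasks
  field metadata override: input/geometry_input/geos_cf_ewok.yaml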
field metadata override points to a file that maps the variables in the background files to JEDI: long name in this file refers to the variable name in fv3-jedi, and io name refers to the variable name in the background (input) files.
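As an illustration, an entry in such a file might look like the sketch below (hypothetical; see input/geometry_input/geos_cf_ewok.yaml for the actual entries):
field metadata:
- long name: volume_mixing_ratio_of_no2   # variable name in fv3-jedi
  io name: NO2                            # variable name in the background files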
background:
datetime: 2021-08-05T03:00:00Z #background beginning of the window
filetype: cube sphere history
datapath: input/bg/geoscf_c24_ewok
filename: GCv14.0_GCMv1.17_c24.geoscf_jedi.%yyyy%mm%ddT%hh%MM%ssZ.nc4
state variables: [ud,vd,ua,va,T,SPHU,qi,ql,DELP,NO2,phis]
In 4DVar, the background must be available at the beginning of the assimilation window.
state variables is the list of variables that JEDI reads from the background files into the state. Anything listed under model variables must also appear under state variables.
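For example, to add another tracer such as ozone, it must appear in both lists; a sketch, assuming an O3 variable exists in the background files:
# under model:
model variables: &modelvars [ud,vd,ua,va,T,DELP,SPHU,qi,ql,NO2,O3]
# under background:
state variables: [ud,vd,ua,va,T,SPHU,qi,ql,DELP,NO2,O3,phis]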
background error:
covariance model: SABER
saber central block:
saber block name: ID
In this example, for simplicity, we use an identity matrix for the background error covariance; a scientific experiment would use a modeled covariance instead.
observations:
observers:
- obs space:
name: NO2
obsdatain:
engine:
type: H5File
obsfile: input/obs/obs.tropomi_s5p_no2_tropo.2021-08-05T060000Z.nc4
obsdataout:
engine:
type: H5File
obsfile: output/fb.4dvar.c24.tropomi_s5p_no2_tropo.20210805T060000Z.nc
simulated variables: [nitrogendioxideColumn]
obs operator:
name: ColumnRetrieval
nlayers_retrieval: 34
tracer variables: [volume_mixing_ratio_of_no2]
isApriori: false
isAveragingKernel: true
stretchVertices: topbottom #options: top, bottom, topbottom, none
obs error:
covariance model: diagonal
get values:
time interpolation: linear
The observations and the observation operator are specified here. Note that obsfile under obsdatain points to a TROPOMI NO2 observation file. This file includes measurements from 03Z to 09Z, spanning our assimilation window.
The output file specified under obsdataout is the feedback file. It includes the observation values, the model (background) values at the observation locations/times (hofx0), and the analysis values after one outer-loop iteration (hofx1). oman is (observation - analysis) and ombg is (observation - background).
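Note that observers is a list, so additional instruments can be assimilated by appending more entries. A hypothetical sketch of a second entry (the file name, variable names, and operator settings are placeholders, not part of this example):
- obs space:
    name: O3
    obsdatain:
      engine:
        type: H5File
        obsfile: input/obs/obs.o3_retrieval.2021-08-05T060000Z.nc4   # placeholder
    simulated variables: [ozoneColumn]                               # placeholder
  obs operator:
    name: ColumnRetrieval
    nlayers_retrieval: 34                  # depends on the retrieval product
    tracer variables: [volume_mixing_ratio_of_ozone]                 # placeholder
  obs error:
    covariance model: diagonal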
final:
diagnostics:
departures: oman
analysis to latlon:
local interpolator type: oops unstructured grid interpolator
resolution in degrees: 15.0 # low resolution for testing
variables to output: [volume_mixing_ratio_of_no2]
#pressure levels in hPa: [500]
model levels: [71]
#bottom model level: true
frequency: PT3H
datapath: output
exp: 4dvar.c24
type: an
output:
filetype: cube sphere history
provider: geos
datapath: output/
filename: ana.4dvar.c24.%yyyy%mm%dd_%hh%MM%ssz.nc4
first: PT0H
frequency: PT6H
It is possible to write out the analysis (and increment) on a lat/lon grid at specific model or pressure levels by setting analysis to latlon under the final section. The lat/lon filename takes its prefix from exp and a suffix of latlon.modelLevels.nc; here, the output filename will be 4dvar.c24.an.*Z.latlon.modelLevels.nc.
The frequency of the analysis output can be set as low as tstep under model. A frequency of PT6H means two analysis files are generated: one at the beginning and one at the end (beginning + 6 h) of the window; PT3H would produce three files (03Z, 06Z, and 09Z).
variational:
minimizer:
algorithm: DRPCG
iterations:
- ninner: 2
gradient norm reduction: 1e-10
test: on
geometry:
akbk: input/geometry_input/akbk72.nc4
npx: 25
npy: 25
npz: 72
layout: [1,2]
field metadata override: input/geometry_input/geos_cf_ewok.yaml
diagnostics:
departures: ombg
linear model:
name: FV3JEDITLM
namelist filename: input/geometry_input/input_geos_c24_p12.nml
linear model namelist filename: input/geometry_input/inputpert_4dvar.nml
tstep: PT15M
tlm variables: *modelvars
lm_do_dyn: 1
lm_do_trb: 0
lm_do_mst: 0
trajectory:
model variables: *modelvars
This section sets up the minimizer for the 4dvar experiment.
ninner is the number of iterations in the inner loop. For testing it is set to 2; in scientific experiments it is usually set to a larger number, such as 100. gradient norm reduction is the threshold for convergence.
In JEDI, you can run the minimizer at a different (coarser) resolution to reduce the computational cost. Here it is set to the same resolution as the analysis.
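Each entry under iterations corresponds to one outer loop, and each entry carries its own geometry, so a coarser grid can be used for the minimization. A hypothetical sketch of two outer loops at C12 (the linear model namelist would also need to match this resolution):
iterations:
- ninner: 50
  gradient norm reduction: 1e-3
  geometry:
    akbk: input/geometry_input/akbk72.nc4
    npx: 13            # C12 instead of C24
    npy: 13
    npz: 72
    layout: [1,2]
    field metadata override: input/geometry_input/geos_cf_ewok.yaml
  # diagnostics and linear model blocks as above, with a C12 namelist
- ninner: 50
  gradient norm reduction: 1e-3
  # second outer loop: same geometry and linear model keys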
FV3JEDITLM is the tangent linear model (and its adjoint) used in the iterative minimization of the cost function. namelist filename changes with model resolution and layout; make sure what is specified in this file matches the 4DVar YAML.
tstep for FV3JEDITLM cannot be smaller than tstep in FV3LM (or PSEUDO), meaning that a model state must be available at every step of FV3JEDITLM.
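A sketch of what this constraint means in practice (the PT30M value is illustrative; typically the TLM tstep is a multiple of the model tstep):
# under model (FV3LM):
tstep: PT15M         # model states produced every 15 minutes
# under linear model (FV3JEDITLM):
tstep: PT30M         # allowed: a model state exists at every TLM step
# tstep: PT10M       # not allowed: no model state at the 10-minute marks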
In this example, tlm variables and model variables under trajectory are the same as model variables under FV3LM.
To run this application, log into a compute node and run:
export JEDIBUILD=/discover/nobackup/mabdiosk/jedi-bundle/build-gnu-spack-1.7.0/bin/
export WORKDIR=/gpfsm/dnb33/mabdiosk/garage/applications/var-app
/discover/swdev/gmao_SIteam/MPI/openmpi/4.1.6-SLES15/gcc-12.3.0/bin/mpiexec -n 12 $JEDIBUILD/fv3jedi_var.x $WORKDIR/4dvar_geos-cf_fv3lm_c24_p12.yaml
The JEDI executable for running variational applications (3DVar or 4DVar) is fv3jedi_var.x.
Note that we are requesting 12 processors here, consistent with the [1,2] layout.