wgrib2: -new_grid usage
Introduction
In an operational center, the model output has to be converted
from the model grid to the various user grids. Since the needs of
the users are diverse, there are many different grids to produce.
This page shows the steps for a fast (wall clock time) interpolation.
Step 1: Combine the Vector Quantities
There are two basic types of interpolation, scalar and vector.
Processing can be made faster by combining the U and V components of
each vector quantity into a single grib message; once that is done, every
grib message can be interpolated independently of the others. The following
script uses wgrib2 to put each pair of vector components into its own grib message.
#!/bin/sh
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
wgrib2 $1 | sed -e 's/:UGRD:/:UGRDu:/' -e 's/:VGRD:/:UGRDv:/' \
-e 's/:VUCSH:/:VUCSHu:/' -e 's/:VVCSH:/:VUCSHv:/' \
-e 's/:UFLX:/:UFLXu:/' -e 's/:VFLX:/:UFLXv:/' \
-e 's/:UGUST:/:UGUSTu:/' -e 's/:VGUST:/:UGUSTv:/' \
-e 's/:USTM:/:USTMu:/' -e 's/:VSTM:/:USTMv:/' \
-e 's/:VDFUA:/:VDFUAu:/' -e 's/:VDFVA:/:VDFUAv:/' \
-e 's/:MAXUW:/:MAXUWu:/' -e 's/:MAXVW:/:MAXUWv:/' \
-e 's/:UOGRD:/:UOGRDu:/' -e 's/:VOGRD:/:UOGRDv:/' \
-e 's/:UICE:/:UICEu:/' -e 's/:VICE:/:UICEv:/' \
| sort -t: -k3,3 -k5 -k4,4 | \
wgrib2 -i -mem_init 0 $1 @mem:0 -new_grid_vectors "$vectors" -submsg_uv $2.tmp
One optimization is the "-mem_init 0 $1" in the last wgrib2. This option
reads the file $1 and saves it into the memory file "@mem:0". wgrib2 then processes
the memory file, which is much faster than the disk file when doing random access.
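To make the memory-file idea concrete, here is a minimal stand-alone sketch
(input.grb and out.inv are placeholder names, not files from the script above):
# load input.grb into memory file 0, then read the inventory from the
# memory file instead of the disk file
wgrib2 -mem_init 0 input.grb @mem:0 -s > out.inv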
Step 2a: Slow
A typical single-threaded interpolation looks like this (VGTYP and SOTYP,
vegetation type and soil type, are categorical fields, so the -if clause
switches them from bilinear to neighbor interpolation):
config="-new_grid_winds earth -new_grid_interpolation bilinear -if :(VGTYP|SOTYP): -new_grid_interpolation neighbor -fi"
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
grid="latlon 0:360:1 -90:181:1"
wgrib2 $2.tmp $config -new_grid_vectors "$vectors" -new_grid $grid $2
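The grid above is a global 1 x 1 degree latitude-longitude grid: the
longitudes start at 0 with 360 points at a 1 degree increment, and the
latitudes start at -90 with 181 points at a 1 degree increment. As an
illustration, a hypothetical 0.5 degree version of the same global grid would be
grid="latlon 0:720:0.5 -90:361:0.5"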
Step 2b: Faster, Non-operational
Step 1 puts the vector quantities into their own grib messages. After Step 1, you can interpolate
all the grib messages independently. This embarrassingly parallel job can be run using wgrib2mv.
wgrib2mv is an easy-to-use Perl script that runs N copies of wgrib2 and combines the results. Converting
the above wgrib2 script into a parallel command is easy.
config="-new_grid_winds earth -new_grid_interpolation bilinear -if :(VGTYP|SOTYP): -new_grid_interpolation neighbor -fi "
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
grid="latlon 0:360:1 -90:181:1"
wgrib2mv 2 $2.tmp $config -new_grid_vectors "$vectors" -new_grid $grid $2
The only difference between the two script fragments is that "wgrib2" is replaced by "wgrib2mv N".
(N is 2 for this example.)
By using wgrib2mv, you are breaking the interpolation into 2 pieces. Each wgrib2 does 50% of the
interpolations, and the program "gmerge" combines the output from the 2 copies of wgrib2.
wgrib2mv does trivial parallelism and is limited by the number of cores on the node, the speed of
the pipes, and I/O speed.
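Since the parallel run should reproduce the single-threaded result, a simple
sanity check (a sketch; serial.grb and parallel.grb are placeholder names for
the Step 2a and Step 2b outputs) is to compare the two inventories:
# compare the inventories of the serial and parallel outputs; if the
# records come out in the same order, diff reports no differences
wgrib2 -s serial.grb > serial.inv
wgrib2 -s parallel.grb > parallel.inv
diff serial.inv parallel.inv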
Step 2c: Faster, Operational
Step 2b works on my Linux box, so why isn't it ready for operations? The Perl script, wgrib2mv,
creates temporary files in /tmp, which isn't allowed in NCEP operational jobs. Some HPC
machines require you to run the executables on a compute node; for example, the Cray machines
require you to use "aprun". Finally, the output of wgrib2mv has all the vector quantities
in their own grib messages. This unusual configuration will have to be changed for
operations.
The first step is to change "wgrib2mv 2" to "wgrib2mv -2" (2 is the N used
in this example). With a negative N, wgrib2mv writes out a script fragment instead of running it:
export OMP_NUM_THREADS=1
mkfifo /tmp/11938.pipe.1.1 /tmp/11938.pipe.1.1.b
wgrib2 -inv /dev/null /tmp/11938.pipe.1.1 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  -submsg_uv /tmp/11938.pipe.1.1.b &
mkfifo /tmp/11938.pipe.2.1 /tmp/11938.pipe.2.1.b
wgrib2 -inv /dev/null /tmp/11938.pipe.2.1 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  -submsg_uv /tmp/11938.pipe.2.1.b &
gmerge junk.mv /tmp/11938.pipe.1.1.b /tmp/11938.pipe.2.1.b &
wgrib2 -for 1::2 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  "junk.tmp" "-new_grid_winds" "earth" "-new_grid_interpolation" "bilinear" \
  "-if" ":(VGTYP|SOTYP):" "-new_grid_interpolation" "neighbor" "-fi" \
  "-new_grid" "latlon" "0:360:1" "-90:181:1" /tmp/11938.pipe.1.1 &
wgrib2 -for 2::2 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  "junk.tmp" "-new_grid_winds" "earth" "-new_grid_interpolation" "bilinear" \
  "-if" ":(VGTYP|SOTYP):" "-new_grid_interpolation" "neighbor" "-fi" \
  "-new_grid" "latlon" "0:360:1" "-90:181:1" /tmp/11938.pipe.2.1 &
wait
rm /tmp/11938.pipe.1.1 /tmp/11938.pipe.1.1.b /tmp/11938.pipe.2.1 /tmp/11938.pipe.2.1.b
The number 11938 is the PID (process ID) of the process that ran the wgrib2mv script. The PID
was used to create unique filenames for the various pipes. However, on a multi-node
machine the PID is no longer a unique value, and a different scheme will have to
be used to create unique filenames.
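One possible scheme (a sketch only; the naming convention and the use of
TMPDIR are assumptions, not part of wgrib2mv) is to build the pipe-name
prefix from the hostname plus a random mktemp suffix:
# build a pipe-name prefix that stays unique across nodes by combining
# the hostname with a random suffix from "mktemp -u" (prints a name
# without creating the file)
tmpdir=${TMPDIR:-/tmp}
prefix=$(mktemp -u "$tmpdir/$(hostname).XXXXXX")
mkfifo "$prefix.pipe.1.1" "$prefix.pipe.1.1.b"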
The script fragment should be considered a starting point for writing
a job that will run in operations. On a Cray HPC, you will have to translate
the various "(wgrib2/gmerge) ... &" commands into a single aprun command.
See also: -mem_final,
wgrib2mv