wgrib2: -new_grid usage
Introduction
In an operational center, the model output has to be converted
from the model grid to the various user grids. Since the needs of
the users are diverse, there are many different grids to produce.
This page shows the steps for a fast (wall clock time) interpolation.
Step 1: Combine the Vector Quantities
There are two basic types of interpolation, scalar and vector.
Processing can be made faster by combining the U and V components of
each vector quantity into a single grib message; once that is done, every
grib message can be interpolated independently of the others. The following
script uses wgrib2 to put each pair of vector components into its own grib message.
#!/bin/sh
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
wgrib2 $1 | sed -e 's/:UGRD:/:UGRDu:/' -e 's/:VGRD:/:UGRDv:/' \
-e 's/:VUCSH:/:VUCSHu:/' -e 's/:VVCSH:/:VUCSHv:/' \
-e 's/:UFLX:/:UFLXu:/' -e 's/:VFLX:/:UFLXv:/' \
-e 's/:UGUST:/:UGUSTu:/' -e 's/:VGUST:/:UGUSTv:/' \
-e 's/:USTM:/:USTMu:/' -e 's/:VSTM:/:USTMv:/' \
-e 's/:VDFUA:/:VDFUAu:/' -e 's/:VDFVA:/:VDFUAv:/' \
-e 's/:MAXUW:/:MAXUWu:/' -e 's/:MAXVW:/:MAXUWv:/' \
-e 's/:UOGRD:/:UOGRDu:/' -e 's/:VOGRD:/:UOGRDv:/' \
-e 's/:UICE:/:UICEu:/' -e 's/:VICE:/:UICEv:/' \
| sort -t: -k3,3 -k5 -k4,4 | \
wgrib2 -i -mem_init 0 $1 @mem:0 -new_grid_vectors "$vectors" -submsg_uv $2.tmp
One optimization is the "-mem_init 0 $1" in the last wgrib2. This option
reads the file $1 and saves it into the memory file "@mem:0". wgrib2 then processes
the memory file, which is much faster than the disk file when doing random access.
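To make the memory-file idea concrete, here is a minimal stand-alone sketch
(input.grb and out.inv are placeholder names, not files from the script above):
# load input.grb into memory file 0, then read the inventory from the
# memory file instead of the disk file
wgrib2 -mem_init 0 input.grb @mem:0 -s > out.inv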
Step 2a: Slow
A typical single-threaded interpolation looks like this (VGTYP and SOTYP,
vegetation type and soil type, are categorical fields, so the -if clause
switches them from bilinear to neighbor interpolation):
config="-new_grid_winds earth -new_grid_interpolation bilinear -if :(VGTYP|SOTYP): -new_grid_interpolation neighbor -fi"
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
grid="latlon 0:360:1 -90:181:1"
wgrib2 $2.tmp $config -new_grid_vectors "$vectors" -new_grid $grid $2
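The grid above is a global 1 x 1 degree latitude-longitude grid: the
longitudes start at 0 with 360 points at a 1 degree increment, and the
latitudes start at -90 with 181 points at a 1 degree increment. As an
illustration, a hypothetical 0.5 degree version of the same global grid would be
grid="latlon 0:720:0.5 -90:361:0.5"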
Step 2b: Faster, Non-operational
Step 1 puts the vector quantities into their own grib messages. After Step 1, you can interpolate
all the grib messages independently. This embarrassingly parallel job can be run using wgrib2mv.
wgrib2mv is an easy-to-use Perl script that runs N copies of wgrib2 and combines the results. Converting
the above wgrib2 script into a parallel command is easy.
config="-new_grid_winds earth -new_grid_interpolation bilinear -if :(VGTYP|SOTYP): -new_grid_interpolation neighbor -fi "
vectors="UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUV:MAXVW:UOGRD:VOGRD:UICE:VICE"
grid="latlon 0:360:1 -90:181:1"
wgrib2mv 2 $2.tmp $config -new_grid_vectors "$vectors" -new_grid $grid $2
The only difference between the two script fragments is that "wgrib2" is replaced by "wgrib2mv N".
(N is 2 for this example.)
By using wgrib2mv, you are breaking the interpolation into 2 pieces. Each wgrib2 does 50% of the
interpolations, and the program "gmerge" combines the output from the 2 copies of wgrib2.
wgrib2mv does trivial parallelism and is limited by the number of cores on the node, the speed of
the pipes, and I/O speed.
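Since the parallel run should reproduce the single-threaded result, a simple
sanity check (a sketch; serial.grb and parallel.grb are placeholder names for
the Step 2a and Step 2b outputs) is to compare the two inventories:
# compare the inventories of the serial and parallel outputs; if the
# records come out in the same order, diff reports no differences
wgrib2 -s serial.grb > serial.inv
wgrib2 -s parallel.grb > parallel.inv
diff serial.inv parallel.inv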
Step 2c: Faster, Operational
Step 2b works on my Linux box, so why isn't it ready for operations? The Perl script, wgrib2mv,
creates temporary files in /tmp, which isn't allowed in NCEP operational jobs. Some HPC
machines require you to run the executables on a compute node; for example, the Cray machines
require you to use "aprun". Finally, the output of wgrib2mv has all the vector quantities
in their own grib messages. This unusual configuration will have to be changed for
operations.
The first step is to change "wgrib2mv 2" to "wgrib2mv -2" (2 is the N used
in this example). With a negative N, wgrib2mv writes out a script fragment instead of running it:
export OMP_NUM_THREADS=1
mkfifo /tmp/11938.pipe.1.1 /tmp/11938.pipe.1.1.b
wgrib2 -inv /dev/null /tmp/11938.pipe.1.1 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  -submsg_uv /tmp/11938.pipe.1.1.b &
mkfifo /tmp/11938.pipe.2.1 /tmp/11938.pipe.2.1.b
wgrib2 -inv /dev/null /tmp/11938.pipe.2.1 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  -submsg_uv /tmp/11938.pipe.2.1.b &
gmerge junk.mv /tmp/11938.pipe.1.1.b /tmp/11938.pipe.2.1.b &
wgrib2 -for 1::2 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  "junk.tmp" "-new_grid_winds" "earth" "-new_grid_interpolation" "bilinear" \
  "-if" ":(VGTYP|SOTYP):" "-new_grid_interpolation" "neighbor" "-fi" \
  "-new_grid" "latlon" "0:360:1" "-90:181:1" /tmp/11938.pipe.1.1 &
wgrib2 -for 2::2 -new_grid_vectors \
  "UGRD:VGRD:VUCSH:VVCSH:UFLX:VFLX:UGUST:VGUST:USTM:VSTM:VDFUA:VDFVA:MAXUW:MAXVW:UOGRD:VOGRD:UICE:VICE" \
  "junk.tmp" "-new_grid_winds" "earth" "-new_grid_interpolation" "bilinear" \
  "-if" ":(VGTYP|SOTYP):" "-new_grid_interpolation" "neighbor" "-fi" \
  "-new_grid" "latlon" "0:360:1" "-90:181:1" /tmp/11938.pipe.2.1 &
wait
rm /tmp/11938.pipe.1.1 /tmp/11938.pipe.1.1.b /tmp/11938.pipe.2.1 /tmp/11938.pipe.2.1.b
The number 11938 is the PID (process ID) of the process that ran the wgrib2mv script. The PID
was used to create unique filenames for the various pipes. However, on a multi-node
machine the PID is no longer a unique value, and a different scheme will have to
be used to create unique filenames.
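One possible scheme (a sketch only; the naming convention and the use of
TMPDIR are assumptions, not part of wgrib2mv) is to build the pipe-name
prefix from the hostname plus a random mktemp suffix:
# build a pipe-name prefix that stays unique across nodes by combining
# the hostname with a random suffix from "mktemp -u" (prints a name
# without creating the file)
tmpdir=${TMPDIR:-/tmp}
prefix=$(mktemp -u "$tmpdir/$(hostname).XXXXXX")
mkfifo "$prefix.pipe.1.1" "$prefix.pipe.1.1.b"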
The script fragment should be considered a starting point for writing
a job that will run in operations. On a Cray HPC, you will have to translate
the various "(wgrib2/gmerge) ... &" commands into a single aprun command.
See also: -mem_final,
wgrib2mv