Climate Prediction Center
 
 

 

wgrib2ms

Introduction

Wgrib2 was designed to be parallelized by what may be called dataflow programming: data flows into a black box and data flows out. One way to parallelize is to divide the data flow into N streams, process each stream separately, and recombine the streams at the end of the processing. Wgrib2ms parallelizes wgrib2 this way. The limitation of this approach is that it uses pipes, so throughput is bounded by pipe speed, disk speed, the number of CPUs on the node, and the overhead of setting up and running the parallel job. Pipe speed can be increased by increasing the pipe buffer size (Linux kernel 2.6.35+).

Wgrib2ms parallelizes a wgrib2 command by dividing the data flow into N streams which are processed independently. Only a limited number of output options are supported. Note that the inventory from wgrib2ms is in a different order than the inventory from a wgrib2 command. Each grid (submessage) is processed by sending it to one of the N streams. Since -new_grid requires that vector fields be processed in order, this division of labor is incompatible with -new_grid. Any wgrib2 option that requires an order of processing is incompatible with wgrib2ms.
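
The split / process / recombine pattern described above can be sketched with a toy example. This is not the wgrib2ms implementation: lines of text stand in for grib submessages, tr stands in for wgrib2, plain concatenation stands in for the merge step, and the round-robin assignment is an assumption. The function name run_streams is hypothetical.

```shell
#!/bin/sh
# Toy illustration only: text lines play the role of grib submessages.

# run_streams N: read stdin, send line k to stream ((k-1) mod N),
# transform every stream in parallel, then concatenate the stream outputs.
run_streams() {
    n=$1
    tmp=$(mktemp -d) || return 1
    cat > "$tmp/in"                      # save the input so each stream can scan it
    i=0
    while [ "$i" -lt "$n" ]; do
        # stream i handles every n-th line, starting at line i+1
        awk -v n="$n" -v i="$i" '(NR - 1) % n == i' "$tmp/in" \
            | tr 'a-z' 'A-Z' > "$tmp/out.$i" &
        i=$((i + 1))
    done
    wait                                 # all streams must finish before merging
    i=0
    while [ "$i" -lt "$n" ]; do
        cat "$tmp/out.$i"
        i=$((i + 1))
    done
    rm -rf "$tmp"
}
```

With two streams, the input lines a, b, c, d, e come out as A, C, E, B, D: every record is processed, but the output order differs from the serial order, which is why the inventory order changes.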

wgrib2 output options supported by wgrib2ms

  1. -grib
  2. -grib_out
  3. -ijsmall_grib
  4. -new_grid
  5. all other output options should not be used

wgrib2ms restrictions on the output options

  1. Each output option must write to a different file
  2. Each output option must write to the output file for every record processed.
  3. You can use the -match option because -match selects the record prior to processing
  4. You cannot use -if to select the record to be output (see restriction 2)
  5. Output options can only write grib (e.g., -netcdf and -csv are not allowed)
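
Some of the restrictions above can be checked before a job is launched. The following is a minimal sketch, assuming a hypothetical helper named check_args that is not part of wgrib2ms. It only looks for the three options named in the list (-if, -netcdf, -csv) and is not an exhaustive test of every unsupported option.

```shell
#!/bin/sh
# check_args ARG...: hypothetical pre-flight check, not part of wgrib2ms.
# Returns non-zero if one of the options ruled out above is present.
check_args() {
    for arg in "$@"; do
        case "$arg" in
            -if|-netcdf|-csv)
                echo "check_args: option $arg cannot be used with wgrib2ms" >&2
                return 1 ;;
        esac
    done
    return 0
}
```

A wrapper could call check_args "$@" and only run wgrib2ms when it succeeds. Note that -match passes the check, in line with restriction 3.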

wgrib2 reading options supported by wgrib2ms

  1. processing a regular grib file (not a pipe)
  2. -i (reading inventory from stdin) added v1.1
  3. -import will cause problems

wgrib2 options that work differently in wgrib2ms

Some options still work but may behave differently in wgrib2ms. Since the processing is split into N streams, each copy of wgrib2 will not see all the records. For example, you may want to calculate the 1000 mb - 500 mb thickness. If one copy of wgrib2 gets the 1000 mb Z and another copy gets the 500 mb Z, then neither copy can calculate the thickness. This affects:

  1. -rpn
  2. -import

Usage

wgrib2ms N (wgrib2 subset options)
  for N > 1, execute wgrib2 (wgrib2 subset options) in N streams
  for N < -1, produces script running -N streams

v1.1+
  grep ":HGT:" nam.idx | wgrib2ms 3 -i nam.grb2 -set_grib_type c3 -grib_out HGT.c3

Example

wgrib2ms 4 IN.grb -set_grib_type c3 -new_grid_winds earth -new_grid ncep grid 221 out22.grb -new_grid ncep grid 3 out3.grb

Observations

Using CentOS 6.4 on an FX 8320 (8 cores), there was little speedup with N > 4 when using 1 MB grib messages. With grib messages smaller than 64 KB (the pipe buffer size), the processing scaled better with the number of streams. The merge program, gmerge, should be rewritten to be multithreaded.
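
Since the best N depends on message size and hardware, it may be worth timing a few values on your own data. The loop below is a rough sketch: bench_streams is a hypothetical helper, the file names IN.grb and out.N.grb are placeholders, and the resolution is whole seconds because it relies on date +%s.

```shell
#!/bin/sh
# bench_streams CMD N...: run "CMD N IN.grb ..." once per stream count and
# print the elapsed wall-clock time. CMD would normally be wgrib2ms.
bench_streams() {
    cmd=$1
    shift
    for n in "$@"; do
        start=$(date +%s)
        "$cmd" "$n" IN.grb -set_grib_type c3 -grib_out "out.$n.grb" > /dev/null
        end=$(date +%s)
        echo "N=$n elapsed=$((end - start))s"
    done
}
```

For example, bench_streams wgrib2ms 2 4 6 8 prints one timing line per stream count, which makes it easy to see where adding streams stops helping.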

Code location: https://www.ftp.cpc.ncep.noaa.gov/wd51we/wgrib2_aux_progs/wgrib2m

See also: wgrib2m


NOAA/ National Weather Service
National Centers for Environmental Prediction
Page last modified: Aug, 2014