How to convert data for Geogrid

From openwfm
Revision as of 03:31, 23 March 2010 by Jbeezley

Running WRF-Fire using real data requires additional datasets not included in the standard WPS input data tarball. For these, it is necessary to convert your source data into the geogrid data format. The geogrid data format consists of a directory of tiled binary files with names indicating the index range contained in each tile. For instance, a file named 00101-00200.00051-00100 would contain columns 101 through 200 and rows 51 through 100. The files themselves contain only an array of integers, with no associated metadata. The metadata for the dataset is contained in a file called index, which contains a series of keyword=value statements telling geogrid the details of the data tiles, such as the width, the height, and the number of bytes per integer. The index file also contains metadata for the dataset itself, such as the projection information and physical units.
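The naming convention can be illustrated with a short shell sketch. The function name tile_name is hypothetical and is not part of WPS or autoWPS; it only formats the four index bounds as five-digit, zero-padded fields, matching the example file name above.

```shell
# Hypothetical helper (not part of WPS or autoWPS): build a geogrid tile
# file name from the first/last column and first/last row of the tile.
tile_name() {
    # $1 = first column, $2 = last column, $3 = first row, $4 = last row
    # Pass plain decimal integers without leading zeros.
    printf '%05d-%05d.%05d-%05d\n' "$1" "$2" "$3" "$4"
}

tile_name 101 200 51 100   # prints 00101-00200.00051-00100
```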

We have made available a series of scripts and utilities (autoWPS) intended to automate many of the steps required to create a dataset for a WRF real run. WPSGeoTiff, a utility program included in this repository, is a simple command line tool that takes a GeoTiff file and converts the data into geogrid binary format. The GeoTiff format was chosen because it is widely used for georeferenced data and has a cross-platform, open source C library called libGeoTIFF. Most, if not all, GIS application software is capable of converting or exporting a dataset as a GeoTiff file.

Prerequisites

The WPSGeoTiff utility requires two external libraries in order to compile. You should pick a location to install the libraries, for instance PREFIX=${HOME}/opt.

  • libTIFF: The following commands, run from the libTIFF source directory, are sufficient to install libTIFF on most systems.
./configure --prefix=$PREFIX
make
make install
  • libGeoTIFF: A similar set of commands installs libGeoTIFF, but we must tell the configure script where we put libTIFF.
./configure --prefix=$PREFIX --with-libtiff=$PREFIX
make
make install

Obtaining the WPSGeoTiff source

The autoWPS package is available as a git repository hosted on GitHub. The repository can be cloned with the following git command:

git clone git://github.com/jbeezley/autoWPS.git 

A tarball of the latest release can also be obtained at http://github.com/jbeezley/autoWPS/tarball/master.

Compiling WPSGeoTiff

Inside the main directory of the autoWPS release is a subdirectory called WPSGeoTiff. This utility is written in pure C and builds into a standalone binary. In most cases it can be compiled by issuing the following command from within that subdirectory.

LIBTIFF=$PREFIX GEOTIFF=$PREFIX make

The main binary, convert_geotiff.x, should now reside in the current directory.

Running WPSGeoTiff

Running convert_geotiff.x with no arguments will produce a usage description.

Usage: ./convert_geotiff.x [OPTIONS] FileName

Converts geotiff file `FileName' into geogrid binary format
into the current directory.

Options:
-h         : Show this help message and exit
-c NUM     : Indicates categorical data (NUM = number of categories)
-b NUM     : Tile border width (default 3)
-w [1,2,4] : Word size in output in bytes (default 2)
-z         : Indicates unsigned data (default FALSE)
-t NUM     : Output tile size (default 100)
-s SCALE   : Scale factor in output (default 1.)
-m MISSING : Missing value in output (default 0., ignored for categorical data)
-u UNITS   : Units of the data (default "NO UNITS")
-d DESC    : Description of data set (default "NO DESCRIPTION")

All of the files will be created in the current directory, so it is best to run the program from an empty directory. A more detailed description of the arguments to this program follows.
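Before turning to the individual options, here is a minimal sketch of the advice about running from an empty directory. The wrapper name convert_into_dir and the CONVERT_GEOTIFF variable are hypothetical and not part of autoWPS; the sketch assumes only that the binary takes its options followed by the GeoTiff file name, as in the usage message above.

```shell
# Hypothetical wrapper (not part of autoWPS): run convert_geotiff.x inside a
# newly created directory so the tiles and the index file stay separate from
# other files. The function body is a subshell, so the caller's working
# directory and variables are left unchanged.
convert_into_dir() (
    # $1 = GeoTiff file, $2 = output directory (must not exist yet),
    # remaining arguments = options passed through to convert_geotiff.x
    [ $# -ge 2 ] || { echo "usage: convert_into_dir TIF OUTDIR [OPTIONS]" >&2; exit 2; }
    tif=$1
    outdir=$2
    shift 2
    # CONVERT_GEOTIFF is an assumed variable pointing at the compiled binary.
    bin=${CONVERT_GEOTIFF:-./convert_geotiff.x}
    # Make both paths absolute before changing directory.
    case $tif in /*) ;; *) tif=$PWD/$tif ;; esac
    case $bin in /*) ;; *) bin=$PWD/$bin ;; esac
    mkdir "$outdir" || exit 1   # fails if the directory already exists
    cd "$outdir" && "$bin" "$@" "$tif"
)
```

A call might look like convert_into_dir fuel.tif fuel_data -c 14 -w 1, where the file name, directory name, and category count are placeholders rather than values taken from a real dataset.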

  • -b

The data tiles in the geogrid binary format are allowed to overlap by a fixed number of grid points. The extra border around the tile is called the halo, and this argument sets the width of the halo. For instance, with a halo of width three, the file named 00101-00200.00051-00100 would actually contain columns 98-203 and rows 48-103 of the full dataset. This halo is necessary for the interpolation scheme inside WPS. The default should be acceptable for most situations.
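The halo arithmetic can be checked with a short shell sketch. The function name tile_extent is hypothetical and not part of WPS or autoWPS; it simply widens each index range in the file name by the halo width, as in the example above.

```shell
# Hypothetical helper (not part of WPS or autoWPS): given a tile file name
# and a halo width, print the column and row ranges stored in the file.
tile_extent() {
    # $1 = tile file name, $2 = halo width in grid points
    # The name is split on '-' and '.'; awk reads the zero-padded fields
    # as decimal numbers.
    echo "$1" | awk -F'[-.]' -v b="$2" '{
        printf "columns %d-%d rows %d-%d\n", $1 - b, $2 + b, $3 - b, $4 + b
    }'
}

tile_extent 00101-00200.00051-00100 3   # prints columns 98-203 rows 48-103
```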