How to convert data for Geogrid

From openwfm
Revision as of 04:46, 23 March 2010 by Jbeezley (talk | contribs)

Running WRF-Fire using real data requires additional datasets not included in the standard WPS input data tarball. For these, it is necessary to convert your source data into the geogrid data format. The geogrid data format consists of a directory of tiled binary files with names indicating the index range contained in each tile. For instance, a file named 00101-00200.00051-00100 would contain columns 101 through 200 and rows 51 through 100. The files themselves contain only an array of integers, with no associated metadata. The metadata for the dataset is contained in a file called index, which contains a series of keyword=value statements telling geogrid the details of the data tiles, such as the width, the height, and the number of bytes per integer. The index file also contains metadata for the data set itself, such as the projection information and physical units.
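The naming convention described above can be sketched in a few lines of shell. This is only an illustration: tile_name is a hypothetical helper, not part of WPS or autoWPS, and the five-digit zero padding is inferred from the example file name in the text.

```shell
# Build a geogrid tile file name from its column and row index ranges.
# tile_name is a hypothetical helper, not part of WPS or autoWPS.
# The five-digit zero padding is assumed from the example file name.
tile_name() {
  # args: first_col last_col first_row last_row
  printf '%05d-%05d.%05d-%05d\n' "$1" "$2" "$3" "$4"
}

tile_name 101 200 51 100   # prints 00101-00200.00051-00100
```

The column range comes first and the row range second, separated by a period.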

We have made available a series of scripts and utilities (autoWPS) intended to automate many of the steps required to create a dataset for a WRF real run. A utility program included in this repository, WPSGeoTiff, is a simple command line utility that takes a GeoTiff file and converts the data into geogrid binary format. The GeoTiff format was chosen because it is a widely used format for georeferenced data and it has a cross-platform, open source C library called libGeoTIFF. Most, if not all, GIS application software is capable of converting or exporting a dataset as a GeoTiff file.

Prerequisites

The WPSGeoTiff utility requires two external libraries in order to compile. You should pick a location to install the libraries, for instance PREFIX=${HOME}/opt.

  • libTIFF The following commands are sufficient to install libTIFF for most systems.
./configure --prefix=$PREFIX
make
make install
  • libGeoTIFF A similar set of commands installs libGeoTIFF, but we must tell the configure script where we put libTIFF.
./configure --prefix=$PREFIX --with-libtiff=$PREFIX
make
make install

Obtaining the WPSGeoTiff source

The autoWPS package is available as a git repository hosted on GitHub. The repository can be cloned with the git command:

git clone git://github.com/jbeezley/autoWPS.git 

A tarball of the latest release can also be obtained at http://github.com/jbeezley/autoWPS/tarball/master.

Compiling WPSGeoTiff

Inside the main directory of the autoWPS release is a subdirectory called WPSGeoTiff. This utility is written in pure C and builds as a standalone binary. In most cases it can be compiled by issuing the following command from that subdirectory.

LIBTIFF=$PREFIX GEOTIFF=$PREFIX make

The main binary, convert_geotiff.x, should now reside in the current directory.

Running WPSGeoTiff

Running convert_geotiff.x with no arguments will produce a usage description.

Usage: ./convert_geotiff.x [OPTIONS] FileName

Converts geotiff file `FileName' into geogrid binary format
into the current directory.

Options:
-h         : Show this help message and exit
-c NUM     : Indicates categorical data (NUM = number of categories)
-b NUM     : Tile border width (default 3)
-w [1,2,4] : Word size in output in bytes (default 2)
-z         : Indicates unsigned data (default FALSE)
-t NUM     : Output tile size (default 100)
-s SCALE   : Scale factor in output (default 1.)
-m MISSING : Missing value in output (default 0., ignored for categorical data)
-u UNITS   : Units of the data (default "NO UNITS")
-d DESC    : Description of data set (default "NO DESCRIPTION")

All of the files will be created in the current directory, so it is best to run the program from an empty directory. A more detailed description of the arguments to this program follows.

  • -b

The data tiles in the geogrid binary format are allowed to overlap by a fixed number of grid points. The extra border around the tile is called the halo, and this argument sets the width of the halo. For instance, with a halo of width three, the file named 00101-00200.00051-00100 would actually contain columns 98-203 and rows 48-103 of the full dataset. This halo is necessary for the interpolation scheme inside of WPS. The default should be acceptable for most situations.
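The halo arithmetic can be checked with a short shell sketch. tile_extent is a hypothetical helper written for this page, not part of WPSGeoTiff; it simply widens each nominal index range by the halo width on both sides.

```shell
# Print the index ranges actually stored in a tile, given its nominal
# ranges and the halo width. tile_extent is a hypothetical helper.
tile_extent() {
  # args: first_col last_col first_row last_row halo
  echo "columns $(( $1 - $5 ))-$(( $2 + $5 )) rows $(( $3 - $5 ))-$(( $4 + $5 ))"
}

tile_extent 101 200 51 100 3   # prints: columns 98-203 rows 48-103
```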

  • -w

The number of bytes used to represent each data point as an integer. The data values are scaled by the scaling parameter before being truncated to integers. A lower value will make the output data smaller, at the cost of accuracy or the dynamic range of the input.

  • -m

Any grid point that is missing data, such as the outer border of the edge tiles or grid points that the GeoTIFF file indicates as missing, will be set to this value. This argument is currently ignored when the categorical flag is set; instead, missing data will be set to the maximum category + 1.

  • -s

Because the data is always stored as an integer, a scaling parameter is needed to represent fractional numbers or large values. The data set will be divided by this number prior to being truncated to an integer. If the data set has an accuracy of 2 decimal places, a reasonable scale to use would be 0.01.
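The interaction between the scale factor and the word size can be illustrated with a little arithmetic. The helpers below are hypothetical and only mirror the description above (divide by the scale, then truncate); they assume signed output, and they are not taken from the WPSGeoTiff source.

```shell
# Hypothetical helpers illustrating the -s and -w options.

# Integer that would be stored for a value: value / scale, truncated.
to_stored() {
  # args: value scale
  awk -v v="$1" -v s="$2" 'BEGIN { printf "%d\n", int(v / s) }'
}

# Largest physical value representable by a signed integer of the given
# word size (in bytes) at the given scale.
max_value() {
  # args: word_size_bytes scale
  max_int=$(( (1 << (8 * $1 - 1)) - 1 ))
  awk -v m="$max_int" -v s="$2" 'BEGIN { printf "%.2f\n", m * s }'
}

to_stored 123.456 0.01   # prints 12345
max_value 2 0.01         # prints 327.67
```

With a scale of 0.01 and the default word size of 2 bytes, values above roughly 327 would not fit, so a data set with a larger range needs either a larger word size or a coarser scale.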

  • -u, -d

The units and a short description of the data set should be included as arguments. Multi-word arguments should be quoted as follows.

-u meters -d "elevation above sea level"
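Putting the pieces together, a complete run might look like the following sketch. The paths and the output directory name are hypothetical and should be adjusted for your system; only the option flags come from the usage message above. The input path must be absolute, because the program is run from inside the output directory.

```shell
# Hypothetical locations; adjust for your system.
AUTOWPS_DIR="${AUTOWPS_DIR:-$HOME/autoWPS}"
CONVERT="$AUTOWPS_DIR/WPSGeoTiff/convert_geotiff.x"
INPUT="${INPUT:-$HOME/data/elevation.tif}"
OUTDIR="${OUTDIR:-geogrid_elevation}"

# Output is written to the current directory, so work in a fresh one.
mkdir -p "$OUTDIR"
if [ -x "$CONVERT" ] && [ -f "$INPUT" ]; then
  # Store whole meters in 2-byte integers, with units and a description.
  (cd "$OUTDIR" && "$CONVERT" -w 2 -u meters -d "elevation above sea level" "$INPUT")
else
  echo "convert_geotiff.x or input file not found; nothing converted" >&2
fi
```

After a successful run, the output directory should contain the tile files and the index file.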