PGRaster SQL Interface
Public Interfaces into PGRaster:
This document is a set of requirements for the interface associated with a new user-defined type called PGRaster. PGRaster represents two-dimensional arrays of numeric values that are navigated within geographic coordinate systems. PGRaster allows cell values to be either integer or floating-point, thus unifying the concepts of geonavigated images and grids into a single data type.
The requirements use "shall" language to promote clarity. However, this language should not be construed as a statement of definitive correctness. Dissenting viewpoints are welcome.
1. Storage format
PGRaster shall be defined as a PostgreSQL data type. The implementation of this data type will vary greatly depending upon which storage model is used to represent raster data within the database. See PGRaster Storage Models for details.
PGRaster data shall be stored as binary data in one of several supported raster formats. The exact list of supported formats is TBD, but will most likely be determined by the format support available from external packages, such as GDAL or GRASS. The supported formats must be able to provide the following information:
- Raster data itself, i.e. the raster dimensions and the pixel or grid point values
- Value type, i.e. the underlying data type of the raster data, e.g. byte, int16, int32, float32, float64.
- Byte order of the cell values, i.e. big endian or little endian
- Geonavigation, i.e. the location of the raster relative to some spatial reference system
The following information should be extracted from the raster format, if it is present, but does NOT need to be supported by the format.
- The value or values to be treated as a "missing value". This is only relevant for raster data that contains such a concept.
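As a rough illustration of why the value type, byte order, and missing value all need to be known before cell values can be interpreted, the following Python sketch decodes a flat buffer of cell values. The type names follow the list above. The mapping to struct format codes, the row-major layout, and the helper name decode_cells are assumptions made for this example and are not part of PGRaster.

```python
import struct

# Assumed mapping from the value type names listed above to struct codes.
# Treating "byte" as unsigned is an assumption of this sketch.
STRUCT_CODES = {"byte": "B", "int16": "h", "int32": "i",
                "float32": "f", "float64": "d"}

def decode_cells(buf, value_type, big_endian, nx, ny, missing=None):
    """Decode nx*ny cell values (assumed row-major) into a list of rows.

    Cells equal to `missing` are returned as None.
    """
    code = STRUCT_CODES[value_type]
    fmt = "%s%d%s" % (">" if big_endian else "<", nx * ny, code)
    if len(buf) != struct.calcsize(fmt):
        raise ValueError("buffer holds %d bytes, expected %d"
                         % (len(buf), struct.calcsize(fmt)))
    flat = [None if v == missing else v for v in struct.unpack(fmt, buf)]
    return [flat[r * nx:(r + 1) * nx] for r in range(ny)]
```

The same eight bytes decode to different int16 values depending on the byte order flag, which is why the format must record it.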
In order to support more optimized access, PGRaster may also store data derived from the formatted raster data. Such derived data includes things like the bounding box of the raster as a PostGIS GEOMETRY. However, such derived data will never be directly input or output with the PGRaster data. PGRaster data shall always be input and output in one of the supported image formats, without additional headers, wrapping protocols, or other auxiliary data.
2. Data insertion and deletion
Insertion of PGRaster data into the database will vary depending on the storage model used to represent the data.
For a bytea/TOAST storage model, PGRaster data shall be inserted into the database using the SQL INSERT statement and providing the properly escaped binary data for the image format. More efficient insertion can be obtained by creating a prepared insert statement and using low-level interface functions to provide the binary data separate from the SQL text. In the PostgreSQL C API, PQexecPrepared and PQexecParams provide this low-level functionality; similar functionality is also found in other client-side APIs.
For a LO storage model, PGRaster data shall be inserted into the database using the appropriate "lo_" functions defined in the PostgreSQL C interface. The binary data must be inserted prior to the referencing PGRaster data record, so that the oid of the large object can be properly associated with the referencing row. Trigger functions can be defined to automatically delete the large object when its referencing row is deleted. For simplicity of the storage model, PGRaster large objects shall not be referenced by more than one data record.
For all storage implementations, PGRaster data shall be removed from storage by deleting the row containing the PGRaster data type. For LO and multitable implementations, trigger functions or an equivalent mechanism shall be used to effect any other deletions that are necessary to completely remove a PGRaster. For bytea/TOAST implementations, deletion of the data with its referencing record will be automatic.
3. Data extraction
The simplest form of data extraction is to retrieve the binary raster data in its native form. For a bytea/TOAST implementation, this requires only the use of SQL SELECT statements; however, for efficiency these should use the binary mode to avoid the need to escape and unescape the binary data. For a LO implementation, the appropriate "lo_" functions would be used.
For more complex or tailored data extraction, a number of SQL functions shall be defined. These functions shall take PGRaster data types as arguments, do manipulations within the database server backend, and return the results to the caller of the SQL function.
Many of the extraction functions accept a textual argument called "options". This argument contains 0 or more parameters in the form "name=value;name2=value2". Arguments are provided in this generic form to allow additional functionality to be added to the functions without changing their signature. An error will be raised if an invalid parameter name or value is specified. Omission of a parameter name will result in the parameter being set to its default value.
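The behavior described for the "options" argument can be sketched in Python as follows. This is only an illustration of the parsing rules stated above, not the server-side implementation. The function name parse_options and the option names used in the usage note are hypothetical, and tolerating a trailing ";" is an assumption.

```python
def parse_options(options, defaults):
    """Parse "name=value;name2=value2" into a dict.

    `defaults` maps every valid parameter name to its default value.
    Omitted parameters keep their defaults; unknown names raise an error.
    """
    result = dict(defaults)
    if not options:
        return result
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue  # assumption: an empty item (trailing ";") is ignored
        name, sep, value = item.partition("=")
        name = name.strip()
        if not sep or name not in defaults:
            raise ValueError("invalid option: %r" % item)
        result[name] = value.strip()
    return result
```

For example, with hypothetical defaults {"method": "bilinear", "fill": "0"}, the string "method=nearest" would override only the method, while "bogus=1" would raise an error. Validation of the values themselves would be specific to each function and is left out here.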
To make it more obvious which functions are related to PGRaster data, the prefix "pgr_" will be added to all function names.
3.0 Data types used in raster extraction
The extraction functions use a data type called PGRasterCell to represent a single cell of a PGRaster object. Here the term "cell" refers to a single pixel or grid box within a raster. For now, the PGRasterCell type will be defined as a domain type, but it may be more appropriate to define it as a user-defined type. This would allow more control over casting behavior and better type checking. However, it would require definition of accessor methods.
CREATE DOMAIN PGRasterCell AS TEXT;
TODO: Determine if the values should be of some type other than TEXT, e.g. double precision numbers, or if PGRasterCell should be its own user-defined type.
The extraction functions require some new data types built on top of the PostGIS GEOMETRY type. These types are needed to associate values extracted from the raster cells with geometries derived from the raster. They are used as inputs and outputs to several of the functions below. These data types are shown here as aggregate types, but it may be better to define them as user-defined types, so that arrays of these types can also be used.
CREATE TYPE geometry_value AS (
    geom geometry,
    val PGRasterCell
);
CREATE TYPE geometry_value_range AS (
    geom geometry,
    lower PGRasterCell,
    upper PGRasterCell
);
3.1 Raster information functions
Raster information functions involve extracting metadata about a given raster. Effectively, they are the accessor methods for the PGRaster class.
pgr_get_bbox(PGRaster raster, float sample_factor)
Returns: GEOMETRY
Returns the bounding box for the raster as a PostGIS geometry. The sample_factor is used to determine how many points to use in the bounding box. If sample_factor is NULL or 0, the bounding box will be a simple rectangle. If sample_factor is 1.0, the bounding box will have a vertex at each grid point location along its boundary. Values between 0 and 1 will give intermediate resolution in the number of vertices. Sampling is useful if the bounding box is going to be remapped into a different coordinate system.
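One possible reading of the sample_factor behavior is sketched below in Python, working in grid index coordinates rather than geographic ones. The requirement does not say how intermediate values map to vertex counts, so the linear rule used here is an assumption, as are the helper names edge_positions and bbox_ring.

```python
def edge_positions(n, sample_factor):
    """Vertex positions (in grid index units) along an edge of n grid points."""
    if not sample_factor:  # None or 0: the two corners only
        count = 2
    else:
        if not 0 < sample_factor <= 1:
            raise ValueError("sample_factor must be between 0 and 1")
        # Assumption: vertex count grows linearly from 2 up to n.
        count = max(2, 2 + round(sample_factor * (n - 2)))
    last = n - 1
    return [last * i / (count - 1) for i in range(count)]

def bbox_ring(nx, ny, sample_factor):
    """Closed ring of (x, y) vertices around an nx-by-ny grid."""
    xs = edge_positions(nx, sample_factor)
    ys = edge_positions(ny, sample_factor)
    bottom = [(x, 0.0) for x in xs]
    right = [(xs[-1], y) for y in ys[1:]]
    top = [(x, ys[-1]) for x in reversed(xs[:-1])]
    left = [(0.0, y) for y in reversed(ys[:-1])]
    return bottom + right + top + left
```

With a sample factor of 0 the ring has five points (four corners plus the closing point). With 1.0 it has one vertex per boundary grid point, which helps the outline stay accurate when it is later reprojected into a coordinate system where straight edges become curves.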
pgr_get_type(PGRaster raster)
Returns: TEXT
Returns the name of the underlying data type used to represent cell values in the raster.
pgr_get_xdim(PGRaster raster)
Returns: int
Returns the number of points along the x-dimension in the given PGRaster.
pgr_get_ydim(PGRaster raster)
Returns: int
Returns the number of points along the y-dimension in the given PGRaster.
pgr_get_srid(PGRaster raster)
Returns: int
Returns the identifier for the spatial reference system associated with the given PGRaster.
pgr_get_missing_value(PGRaster raster)
Returns: PGRasterCell
Returns the missing value associated with the raster as a PGRasterCell type, or NULL if the raster does not have a value to represent missing data. The value is returned as the generic PGRasterCell type to allow rasters of all types to return their missing values in a consistent way.
pgr_set_missing_value(PGRaster raster, PGRasterCell miss_val)
Returns: void
Sets the missing value associated with the raster. If miss_val is NULL, the missing value for the raster becomes undefined. If miss_val contains a value that is incompatible with the raster's underlying data type, an error is generated.
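The compatibility check between a textual PGRasterCell value and the raster's underlying type might look like the Python sketch below. It uses the value type names listed in the storage format section. The function name check_cell_value is hypothetical, and treating "byte" as an unsigned 8-bit value is an assumption.

```python
# Assumed integer ranges for the value type names in the storage section.
INT_RANGES = {
    "byte": (0, 255),  # assumption: unsigned 8-bit
    "int16": (-2**15, 2**15 - 1),
    "int32": (-2**31, 2**31 - 1),
}

def check_cell_value(text, value_type):
    """Convert a textual cell value to the given type, or raise ValueError."""
    if value_type in INT_RANGES:
        value = int(text)  # raises ValueError for text such as "1.5"
        low, high = INT_RANGES[value_type]
        if not low <= value <= high:
            raise ValueError("%s out of range for %s" % (text, value_type))
        return value
    if value_type in ("float32", "float64"):
        return float(text)  # float32 range is not checked in this sketch
    raise ValueError("unknown value type: %s" % value_type)
```

Under these assumptions, "-9999" is accepted as a missing value for an int16 raster but rejected for a byte raster.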
3.2 Raster creation functions
Raster creation functions involve returning a derived PGRaster as the result. They do not require PGRaster data as input; rather they create a new PGRaster based on caller-provided parameters.
pgr_create_raster(text data_type, int nx, int ny, text options)
Returns: PGRaster
Creates a new PGRaster with the specified underlying type and dimensions. The values are set to the missing value, which can be overridden using the options.
3.3 Raster to Raster transformation
Raster-to-raster extraction functions involve inputting one or more PGRasters and returning one or more derived PGRasters as the result. They may also take additional data and options as input.
pgr_points_to_raster(PGRaster background, GEOMETRY_VALUE[] geom_values, text options)
Returns: PGRaster
Creates a PGRaster populated by integrating a set of geometry-value pairs into a background raster. The raster created has the same dimensions and type as the background. The technique for merging the geometry-value pairs into the raster can be set using the options, but defaults to a successive correction method. The point-to-raster transformation is also known as objective analysis.
NOTE: This function requires GEOMETRY_VALUE to be a user-defined type, rather than an aggregate type, so it can be used as an array.
pgr_scale_raster(PGRaster base, int nx, int ny, text options)
Returns: PGRaster
Creates a PGRaster that covers the same geographic area as the base raster, but has different dimensions. This is used to make a higher or lower resolution copy of the original. The technique for rescaling may be set in the options. By default, supersampling is done by pixel replication and subsampling is done by finding the average or median cell value.
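The default rescaling behavior can be illustrated with a small Python sketch that operates on a list of rows. It maps each output cell to a block of source cells and averages the block. When the output is larger than the input, each block is a single cell, which amounts to pixel replication. The function name scale_cells is hypothetical, only the averaging variant is shown, and missing values are not handled.

```python
def scale_cells(cells, nx, ny):
    """Rescale a raster (list of rows) to nx columns by ny rows."""
    src_ny, src_nx = len(cells), len(cells[0])
    out = []
    for j in range(ny):
        # Source rows covered by output row j (at least one row).
        y0 = j * src_ny // ny
        y1 = max(y0 + 1, (j + 1) * src_ny // ny)
        row = []
        for i in range(nx):
            # Source columns covered by output column i (at least one).
            x0 = i * src_nx // nx
            x1 = max(x0 + 1, (i + 1) * src_nx // nx)
            block = [cells[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

Scaling a 2 by 2 raster to 4 by 4 repeats each cell in a 2 by 2 block, and scaling the result back to 2 by 2 recovers the original values.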
pgr_sectorize_raster(PGRaster base, GEOMETRY bbox, text options)
Returns: PGRaster
Creates a PGRaster by finding all the cells in the base raster that fall within the specified bounding box. The bounding box must have the same SRID as the base raster, and it must be a rectangle.
pgr_subset_raster(PGRaster base, int nx, int ny, GEOMETRY bbox, text options)
Returns: PGRaster
Creates a PGRaster by both scaling and sectorizing the base raster. See the previous two functions for options and limitations.
pgr_merge_raster(PGRaster target, PGRaster[] sources, text options)
Returns: PGRaster
Creates a PGRaster using the rasters in sources as base material. The output raster has the dimensions and geolocation of the target raster. This function is used to integrate a set of raster tiles into a single raster, presumably of greater geographic extent. The underlying type of the target and all the sources must be the same.
pgr_remap_raster(PGRaster dest, PGRaster source, text options)
Returns: PGRaster
Creates a PGRaster with structure and SRID given by dest and populates it with cell values from source. Effectively, it transforms the source data into a raster in a different coordinate system.
3.4 Raster to Geometry transformation
Raster-to-Geometry functions involve transformation of PGRaster data into one or more PostGIS geometry objects. Each geometry created from the PGRaster is associated with non-geometry value data extracted from the raster data values. The combined information is represented as one of the data types defined previously.
pgr_raster_to_points(PGRaster raster, GEOMETRY[] geom_array, text options)
Returns: SETOF GEOMETRY_VALUE
Interpolates values from the raster to all the points specified in the geom_array. It returns one record for each array element given. Associated values are set to NULL for points that are not covered by the raster, or for points where the raster value is set to the missing value. The interpolation method defaults to bilinear interpolation for floating-point rasters and nearest neighbor for integer rasters, but can be overridden using the options.
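The two default interpolation methods can be sketched in Python as follows. The point is given in fractional cell index coordinates, on the assumption that the geonavigation step has already converted the point geometry to grid space. The function name sample_cell is hypothetical, and returning None (NULL) when any contributing corner is missing is one possible interpretation of the requirement.

```python
import math

def sample_cell(cells, x, y, method="bilinear", missing=None):
    """Interpolate a raster (list of rows) at fractional index (x, y).

    Returns None for points outside the raster or touching missing cells.
    """
    ny, nx = len(cells), len(cells[0])
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1):
        return None  # point not covered by the raster
    if method == "nearest":
        value = cells[math.floor(y + 0.5)][math.floor(x + 0.5)]
        return None if value == missing else value
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1)
    fx, fy = x - x0, y - y0
    corners = [cells[y0][x0], cells[y0][x1], cells[y1][x0], cells[y1][x1]]
    if missing is not None and missing in corners:
        return None
    v00, v10, v01, v11 = corners
    # Blend along x on both rows, then blend the two rows along y.
    upper = v00 * (1 - fx) + v10 * fx
    lower = v01 * (1 - fx) + v11 * fx
    return upper * (1 - fy) + lower * fy
```

Bilinear interpolation produces values that may not exist in the raster, which is why nearest neighbor is the more natural default for integer rasters such as classified images.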
pgr_raster_to_lines(PGRaster raster, PGRasterCell[] values, text options)
Returns: SETOF GEOMETRY_VALUE
Contours the raster (i.e. finds lines of constant value) for all values specified in the values array. Values are specified as an array of PGRasterCell type, and must contain values that are compatible with the underlying type of the PGRaster. By default, the contouring algorithm finds crossings of constant values assuming a type of interpolation appropriate to the PGRaster type, e.g. bilinear interpolation for floating-point rasters and nearest neighbor for integer rasters, but this can be overridden using the options. Cells set to the missing value are ignored by the algorithm, i.e. contours are not extended into cells with missing values.
pgr_raster_to_polygons(PGRaster raster, PGRasterCell[] values, text options)
Returns: SETOF GEOMETRY_VALUE_RANGE
Creates a "filled contour" from the raster, i.e. finds polygons that surround grid point values within a range of values. The values are specified in the values array, and polygons are generated for each adjacent pair in the array, e.g. between values 0 and 1, 1 and 2, 2 and 3, etc. Thus at least two values must be specified. The algorithm returns the geometry and its upper and lower bounding values; the number of records returned equals the size of the values array minus 1. Values are specified as an array of PGRasterCell type, and must contain values that are compatible with the underlying type of the PGRaster. By default, the contouring algorithm finds crossings of constant values assuming a type of interpolation appropriate to the PGRaster type, e.g. bilinear interpolation for floating-point rasters and nearest neighbor for integer rasters, but this can be overridden using the options. Cells set to the missing value are ignored by the algorithm, i.e. contours are not extended into cells with missing values.
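The relationship between the values array and the returned records can be shown with a short Python sketch. It only derives the (lower, upper) pairs that would accompany each polygon and does not attempt the contouring itself. The function name value_ranges is hypothetical.

```python
def value_ranges(values):
    """Return the (lower, upper) pair for each adjacent pair of values."""
    if len(values) < 2:
        raise ValueError("at least two values must be specified")
    return list(zip(values[:-1], values[1:]))
```

For the values 0, 1, 2, 3 this yields three ranges, one fewer than the number of values, matching the record count described above. Whether the values must be given in ascending order is not stated in the requirement, so this sketch does not enforce it.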