#930 closed task (fixed)
[raster] ST_SummaryStats
Reported by: | Bborie Park | Owned by: | Bborie Park |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.0.0 |
Component: | raster | Version: | master |
Keywords: | history | Cc: |
Description
Just as ST_AsGDALRaster is the backend for ST_AsTIFF, ST_AsJPEG and ST_AsPNG, ST_SummaryStats is the backend for several summary statistics:
- Count of the population/sample included in the stats
- Mean (ST_Mean or is ST_Average better?)
- Standard Deviation (ST_StdDev)
- Min/Max (ST_MinMax)
The proposed variations are:
- ST_SummaryStats(rast raster, nband int, ignore_nodata boolean) -> record
returns one record of five columns (count, mean, stddev, min, max)
nband: index of the band to process
ignore_nodata: if TRUE, any pixel whose value is nodata is ignored.
ST_SummaryStats(rast, 2, TRUE)
- ST_SummaryStats(rast raster, nband int) -> record
assumes ignore_nodata = TRUE
ST_SummaryStats(rast, 2)
- ST_SummaryStats(rast raster, ignore_nodata boolean) -> record
assumes band index = 1
ST_SummaryStats(rast, FALSE)
- ST_SummaryStats(rast raster) -> record
assumes band index = 1 and ignore_nodata = TRUE
ST_SummaryStats(rast)
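The effect of ignore_nodata on the five returned columns can be sketched outside the database. The following is only an illustration in Python, not the PostGIS implementation: the function name summary_stats, the flat list standing in for a raster band, and the choice of a population (rather than sample) standard deviation are all assumptions made for the example.

```python
import math
from typing import NamedTuple, Optional, Sequence


class SummaryStats(NamedTuple):
    """The five columns proposed for the returned record."""
    count: int
    mean: Optional[float]
    stddev: Optional[float]
    min: Optional[float]
    max: Optional[float]


def summary_stats(pixels: Sequence[float],
                  nodata: Optional[float] = None,
                  ignore_nodata: bool = True) -> SummaryStats:
    """Hypothetical stand-in for the per-band computation.

    `pixels` is a flat list of band values; `nodata` is the band's
    nodata value, or None if the band has none.
    """
    if ignore_nodata and nodata is not None:
        # Skip every pixel whose value equals the nodata value.
        values = [v for v in pixels if v != nodata]
    else:
        values = list(pixels)

    n = len(values)
    if n == 0:
        return SummaryStats(0, None, None, None, None)

    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return SummaryStats(n, mean, stddev, min(values), max(values))
```

With pixels [1.0, 2.0, 3.0, -9999.0] and nodata -9999.0, ignore_nodata=True gives a count of 3, while ignore_nodata=False counts all 4 pixels and lets -9999.0 become the minimum.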
Four approximation functions are also proposed, trading some accuracy for speed, especially on large rasters (10000 x 10000).
- ST_SummaryStats(rast raster, nband int, ignore_nodata boolean, sample_percent double precision) -> record
sample_percent: a value between 0 and 1 giving the fraction of the raster band's pixels to consider when computing the statistics.
ST_SummaryStats(rast, 3, FALSE, 0.1)
ST_SummaryStats(rast, 1, TRUE, 0.5)
- ST_SummaryStats(rast raster, ignore_nodata boolean, sample_percent double precision) -> record
assumes that nband = 1
ST_SummaryStats(rast, FALSE, 0.01)
ST_SummaryStats(rast, TRUE, 0.025)
- ST_SummaryStats(rast raster, sample_percent double precision) -> record
assumes that nband = 1 and ignore_nodata = TRUE
ST_SummaryStats(rast, 0.25)
- ST_SummaryStats(rast raster) -> record
assumes that nband = 1, ignore_nodata = TRUE and sample_percent = 0.1
ST_SummaryStats(rast)
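How sample_percent trades accuracy for speed can be illustrated with a small Python sketch. The ticket does not say how pixels are chosen, so simple random sampling without replacement is an assumption here, as are the function name sample_pixels and the seed parameter (added only to make the example repeatable).

```python
import random
from typing import List, Sequence


def sample_pixels(pixels: Sequence[float],
                  sample_percent: float,
                  seed: int = 0) -> List[float]:
    """Return roughly sample_percent of the pixels (hypothetical helper).

    Assumption: uniform random sampling without replacement; the
    actual sampling strategy is not specified in the ticket.
    """
    if not 0.0 < sample_percent <= 1.0:
        raise ValueError("sample_percent must be in (0, 1]")
    if not pixels:
        return []
    # Always keep at least one pixel so the stats are defined.
    k = max(1, round(len(pixels) * sample_percent))
    return random.Random(seed).sample(list(pixels), k)
```

For a 10000-pixel band and sample_percent = 0.1, only 1000 values are examined. Statistics computed on the subset are estimates: the sampled minimum can never be lower than the true minimum and the sampled maximum can never be higher than the true maximum, so extremes may be missed.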
New tickets for ST_Mean and ST_StdDev will be posted next.
Functions that can depend upon the basic stats (ST_Histogram and ST_Quantile) will be proposed later.
Attachments (1)
Change History (7)
follow-up: 2 comment:1 by , 14 years ago
comment:2 by , 14 years ago
Replying to pracine:
"ignore_nodata boolean" should be replaced with "hasnodata boolean " to be more consistent with the way we already specify to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.
Thanks for the correction. I'll make the appropriate changes to what has been written.
We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.
I'll get to those once I get the simple case complete.
comment:3 by , 13 years ago
Status: | new → assigned |
---|
A set of ST_SummaryStats and ST_ApproxSummaryStats variations for processing coverages:
- ST_SummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean) -> double precision
ST_SummaryStats('tmax_2010', 'rast', 1, FALSE)
ST_SummaryStats('precip_2011', 'rast', 1, TRUE)
- ST_SummaryStats(rastertable text, rastercolumn text, nband int) -> double precision
hasnodata is set to FALSE
ST_SummaryStats('tmax_2010', 'rast', 1)
- ST_SummaryStats(rastertable text, rastercolumn text, hasnodata boolean) -> double precision
nband is set to 1
ST_SummaryStats('precip_2011', 'rast', TRUE)
- ST_SummaryStats(rastertable text, rastercolumn text) -> double precision
nband is set to 1 and hasnodata is set to FALSE
ST_SummaryStats('tmin_2009', 'rast')
Variations for ST_ApproxSummaryStats are:
- ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean, sample_percent double precision) -> double precision
ST_ApproxSummaryStats('tmax_2010', 'rast', 1, FALSE, 0.5)
ST_ApproxSummaryStats('precip_2011', 'rast', 1, TRUE, 0.2)
- ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, sample_percent double precision) -> double precision
hasnodata is set to FALSE
ST_ApproxSummaryStats('tmax_2010', 'rast', 1, 0.5)
ST_ApproxSummaryStats('precip_2011', 'rast', 1, 0.2)
- ST_ApproxSummaryStats(rastertable text, rastercolumn text, hasnodata boolean, sample_percent double precision) -> double precision
nband is set to 1
ST_ApproxSummaryStats('tmax_2010', 'rast', FALSE, 0.5)
ST_ApproxSummaryStats('precip_2011', 'rast', TRUE, 0.2)
- ST_ApproxSummaryStats(rastertable text, rastercolumn text, sample_percent double precision) -> double precision
nband is set to 1 and hasnodata is set to FALSE
ST_ApproxSummaryStats('tmax_2010', 'rast', 0.5)
ST_ApproxSummaryStats('precip_2011', 'rast', 0.2)
- ST_ApproxSummaryStats(rastertable text, rastercolumn text) -> double precision
nband is set to 1, hasnodata is set to FALSE and sample_percent is set to 0.1
ST_ApproxSummaryStats('tmax_2010', 'rast')
ST_ApproxSummaryStats('precip_2011', 'rast')
The mean returned in these functions is a weighted mean of the means of each raster tile. The standard deviation returned is the cumulative standard deviation of all raster tiles.
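One way to combine per-tile results into coverage-wide values is sketched below in Python. The ticket does not give the formulas, so this uses a standard pooling identity as an assumption: the mean is weighted by each tile's pixel count, and the variance is rebuilt from each tile's variance plus the squared offset of its mean from the overall mean. The name combine_tile_stats and the (count, mean, stddev) tuple layout are hypothetical, and population standard deviations are assumed throughout.

```python
import math
from typing import Iterable, Tuple

# (count, mean, population stddev) for one raster tile
TileStats = Tuple[int, float, float]


def combine_tile_stats(tiles: Iterable[TileStats]) -> Tuple[int, float, float]:
    """Pool per-tile stats into one (count, mean, stddev) triple."""
    # Tiles with no counted pixels contribute nothing.
    usable = [t for t in tiles if t[0] > 0]
    total = sum(n for n, _, _ in usable)
    if total == 0:
        raise ValueError("no pixels to summarize")

    # Weighted mean of the tile means, weights = pixel counts.
    mean = sum(n * m for n, m, _ in usable) / total

    # Sum of squared deviations from the overall mean:
    # within-tile part (n * sd^2) plus between-tile part (n * (m - mean)^2).
    ss = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in usable)
    return total, mean, math.sqrt(ss / total)
```

For two tiles holding the values 1, 2, 3 and 4, 5, 6, 7, the per-tile inputs are (3, 2.0, sqrt(2/3)) and (4, 5.5, sqrt(1.25)); pooling them gives a count of 7, a mean of 4.0 and a standard deviation of 2.0, the same as computing the statistics over all seven values at once.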
by , 13 years ago
Attachment: | st_summarystats.patch added |
---|
Incremental patch adding ST_SummaryStats function. ST_Band patch must be merged first.
comment:4 by , 13 years ago
Attached patch for ST_SummaryStats function. ST_SummaryStats is the base function for ST_Mean, ST_StdDev, ST_MinMax, ST_Histogram and ST_Quantile. This patch merges cleanly with r7145.
The patch for ST_Band must be merged before this patch.
comment:5 by , 13 years ago
Keywords: | history added |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
Added in r7148.
comment:6 by , 13 years ago
Milestone: | PostGIS Raster Future → PostGIS 2.0.0 |
---|
"ignore_nodata boolean" should be replaced with "hasnodata boolean " to be more consistent with the way we already specify to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.
Users can also normally just do ST_SummaryStats(ST_SetBandNoDataValue(rast, NULL)) to get the same result.
We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.
Thanks dustymugs