Opened 14 years ago

Closed 13 years ago

#934 closed enhancement (fixed)

[raster] ST_Histogram and ST_ApproxHistogram

Reported by: Bborie Park Owned by: Bborie Park
Priority: medium Milestone: PostGIS 2.0.0
Component: raster Version: master
Keywords: history Cc:

Description

ST_Histogram and ST_ApproxHistogram provide methods to determine a raster's data distribution.

ST_Histogram returns the data as absolute counts, while ST_ApproxHistogram returns the data as proportions because it works on a sample of the raster rather than on every pixel.

The return of ST_Histogram and ST_ApproxHistogram is a set of records where each record is (min, max, count).

ST_Histogram should have the following variations.

  1. ST_Histogram(rast raster, nband int, hasnodata boolean, bins int, width double precision[], right boolean) -> set of records

returns set of records of three columns (bin min, bin max, bin count)

nband: index of band to process on

hasnodata: if FALSE, any pixel whose value is nodata is ignored.

bins: the number of categories/bins to have in the histogram. If NULL or value less than one, the number of categories will be auto-computed using Sturges' formula if the number of values >= 30 or Square-root choice if number of values < 30.

http://en.wikipedia.org/wiki/Histogram#Mathematical_definition
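
The automatic bin-count rule above can be sketched as follows. This is only an illustration in Python, not the PostGIS implementation: the function name `auto_bin_count` is hypothetical, and rounding up with `ceil` is an assumption, since the ticket does not say how fractional results are rounded.

```python
import math


def auto_bin_count(n_values: int) -> int:
    """Pick a bin count from the number of values (hypothetical helper).

    Assumption: both formulas round up; the ticket does not specify rounding.
    """
    if n_values < 1:
        raise ValueError("need at least one value")
    if n_values >= 30:
        # Sturges' formula: k = ceil(log2(n) + 1)
        return math.ceil(math.log2(n_values) + 1)
    # Square-root choice: k = ceil(sqrt(n))
    return math.ceil(math.sqrt(n_values))
```

Under these assumptions, 1000 values give 11 bins (Sturges) and 16 values give 4 bins (square root).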

width: an array indicating the width of each category/bin. If the number of bins is greater than the number of widths, the widths are repeated. Example: 9 bins, widths are [a, b, c] will have the output be [a, b, c, a, b, c, a, b, c].
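
The width-repetition rule can be illustrated with a short Python sketch. The helper names `expand_widths` and `bin_edges` are hypothetical. Truncating the list when more widths than bins are supplied is an assumption, as the ticket only describes the case where bins outnumber widths.

```python
from itertools import cycle, islice


def expand_widths(widths, bins):
    """Repeat the widths cyclically until there is one width per bin."""
    if not widths:
        raise ValueError("widths must not be empty")
    # Assumption: extra widths beyond `bins` are simply dropped.
    return list(islice(cycle(widths), bins))


def bin_edges(start, widths, bins):
    """Build consecutive (min, max) pairs starting at `start`."""
    edges = []
    lo = start
    for w in expand_widths(widths, bins):
        edges.append((lo, lo + w))
        lo += w
    return edges
```

For example, `expand_widths([100, 50], 3)` yields `[100, 50, 100]`, so `bin_edges(0, [100, 50], 3)` yields `[(0, 100), (100, 150), (150, 250)]`.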

right: compute the histogram from the right rather than from the left (default). This changes the criteria for evaluating a value x from [a, b) to (a, b].
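
The effect of "right" on bin membership can be sketched in Python as below. The names `in_bin` and `count_bins` are hypothetical, and the sketch applies the half-open rule strictly to every bin. Whether the real function closes the outermost bin so that the extreme value is counted is not stated in the ticket.

```python
def in_bin(x, a, b, right=False):
    """Return True if x falls in [a, b) or, when right is set, in (a, b]."""
    if right:
        return a < x <= b
    return a <= x < b


def count_bins(values, edges, right=False):
    """Count how many values fall in each (min, max) bin."""
    return [sum(in_bin(v, a, b, right) for v in values) for a, b in edges]
```

With values `[0, 5, 10, 10, 20]` and bins `(0, 10)` and `(10, 20)`, the default gives counts `[2, 2]` (the value 20 is outside `[10, 20)`), while `right=True` gives `[3, 1]` (the value 0 is outside `(0, 10]`).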

ST_Histogram(rast, 2, FALSE, NULL, NULL, FALSE)

ST_Histogram(rast, 1, TRUE, 100, NULL, FALSE)

ST_Histogram(rast, 2, FALSE, NULL, ARRAY[100, 50], FALSE)

ST_Histogram(rast, 2, FALSE, 9, ARRAY[1000], TRUE)

ST_Histogram(rast, 2, FALSE, 20, ARRAY[100, 200, 300], TRUE)
  1. ST_Histogram(rast raster, nband int, hasnodata boolean, bins int, right boolean) -> set of records

parameter "width" is not specified thus resulting in all bins having the same widths

ST_Histogram(rast, 2, FALSE, 5, FALSE)
  1. ST_Histogram(rast raster, nband int, hasnodata boolean, bins int) -> set of records

the parameter "right" is removed and assumed to be FALSE

ST_Histogram(rast, 2, FALSE, 5)
  1. ST_Histogram(rast raster, nband int, hasnodata boolean) -> set of records

the parameter "bins" is removed and set to NULL so that function can compute the number of bins to use

  1. ST_Histogram(rast raster, nband int) -> set of records

parameter "hasnodata" is removed and assumed to be FALSE

  1. ST_Histogram(rast raster) -> set of records

assumes that nband is 1.

ST_ApproxHistogram should have the following variations.

  1. ST_ApproxHistogram(rast raster, nband int, hasnodata boolean, sample_percent double precision, bins int, width double precision[], right boolean) -> set of records

sample_percent: a value between 0 and 1 indicating the percentage of the raster band's pixels to consider when generating the histogram.
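
To show why a sampled histogram is naturally reported as proportions, here is a minimal Python sketch. It assumes simple random sampling without replacement over a flat list of pixel values. The ticket does not describe how PostGIS actually selects the sample, and the name `approx_histogram` and its `seed` parameter are hypothetical.

```python
import random


def approx_histogram(values, edges, sample_percent, seed=None):
    """Histogram of a random sample, reported as (min, max, proportion).

    Assumption: simple random sampling without replacement; bins are [a, b).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < sample_percent <= 1:
        raise ValueError("sample_percent must be in (0, 1]")
    rng = random.Random(seed)
    # Sample at least one value so the proportions are defined.
    k = max(1, round(len(values) * sample_percent))
    sample = rng.sample(values, k)
    result = []
    for a, b in edges:
        count = sum(a <= v < b for v in sample)
        result.append((a, b, count / k))
    return result
```

Because each count is divided by the sample size, the result does not depend on how many pixels were drawn, which is what makes proportions comparable across different values of sample_percent.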

ST_ApproxHistogram(rast, 2, FALSE, 0.1, NULL, NULL, FALSE)

ST_ApproxHistogram(rast, 1, TRUE, 1, 100, NULL, FALSE)

ST_ApproxHistogram(rast, 2, FALSE, 0.2, NULL, ARRAY[100, 50], FALSE)

ST_ApproxHistogram(rast, 2, FALSE, 0.25, 9, ARRAY[1000], TRUE)

ST_ApproxHistogram(rast, 2, FALSE, 0.05, 20, ARRAY[100, 200, 300], TRUE)
  1. ST_ApproxHistogram(rast raster, nband int, hasnodata boolean, sample_percent double precision, bins int, right boolean) -> set of records

parameter "width" is not specified thus resulting in all bins having the same widths

  1. ST_ApproxHistogram(rast raster, nband int, hasnodata boolean, sample_percent double precision, bins int) -> set of records

the parameter "right" is removed and assumed to be FALSE

ST_ApproxHistogram(rast, 2, FALSE, 0.1, 5)
  1. ST_ApproxHistogram(rast raster, nband int, hasnodata boolean, sample_percent double precision) -> set of records

the parameter "bins" is removed and set to NULL so that function can compute the number of bins to use

  1. ST_ApproxHistogram(rast raster, nband int, sample_percent double precision) -> set of records

parameter "hasnodata" is removed and assumed to be FALSE

  1. ST_ApproxHistogram(rast raster, nband int) -> set of records

assumes that sample_percent is 0.1

  1. ST_ApproxHistogram(rast raster, sample_percent double precision) -> set of records

assumes that nband is 1

  1. ST_ApproxHistogram(rast raster) -> set of records

assumes that nband is 1 and sample_percent is 0.1

Attachments (1)

st_histogram.patch (28.3 KB) - added by Bborie Park 13 years ago.
Adds ST_Histogram support.


Change History (9)

by Bborie Park, 13 years ago

Attachment: st_histogram.patch added

Adds ST_Histogram support.

comment:1 by Bborie Park, 13 years ago

Adds ST_Histogram function, which requires ST_SummaryStats. Merges cleanly against r7145.

The following patches must be merged first for this patch to merge cleanly:

  1. ST_Band
  1. ST_SummaryStats
  1. ST_Mean
  1. ST_StdDev
  1. ST_MinMax

comment:2 by Bborie Park, 13 years ago

Owner: changed from pracine to Bborie Park
Status: changed from new to assigned

comment:3 by Bborie Park, 13 years ago

Keywords: history added
Resolution: fixed
Status: changed from assigned to closed

Added in r7152

comment:4 by Bborie Park, 13 years ago

Milestone: changed from PostGIS Raster Future to PostGIS 2.0.0

comment:5 by robe, 13 years ago

Bborie, is there any particular reason (aside from summarystats being usable for full coverages and having no breakouts, so its count is more likely to exceed integer) that you decided to make

summarystats count bigint - http://www.postgis.org/documentation/manual-svn/summarystats.html

and histogram count integer - http://www.postgis.org/documentation/manual-svn/histogram.html

I would just as soon make both bigint.

comment:6 by Bborie Park, 13 years ago

Resolution: fixed
Status: changed from closed to reopened

Regina, thanks for catching that. The histogram one should be a bigint. I'll make the change to bigint.

I'm reopening this ticket as I'll be adding the ability to process a coverage table next. Just a heads up...

comment:7 by Bborie Park, 13 years ago

Added coverage table support in r7339. I'll document the added functions once I fix a different bug.

comment:8 by Bborie Park, 13 years ago

Resolution: fixed
Status: changed from reopened to closed