Proposal for clustering functions
geometry[] ST_ClusterIntersecting(geometry geom)
Aggregate function returning an array of GeometryCollections representing the connected components of a set of geometries.
- Accepts [Multi]Point, [Multi]LineString, [Multi]Polygon, or more generally geometries of any type that can be converted into GEOS. (I can't think of a situation where [Multi]Point input would be useful, but that doesn't mean there isn't one...)
- Returns a geometry array. (My current implementation returns a GeometryCollection, but the recursive semantics of ST_Dump then undo all of the hard work.)
Example: if run on a table containing all of the LineStrings in the image below, the function would return an array with two MultiLineString geometries (red and blue).
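To make the "connected components" idea concrete, here is a minimal Python sketch, not the proposed implementation. It assumes lines are plain lists of (x, y) tuples and merges any two lines that intersect using a union-find structure. The names `cluster_intersecting`, `lines_intersect`, and `segments_intersect` are hypothetical, and the brute-force pairwise test stands in for whatever intersection test and spatial index a real implementation would use.

```python
from itertools import combinations


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, -1, or 0 if collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _in_bbox(a, b, c):
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, q1, q2):
    """True if segment p1-p2 and segment q1-q2 share at least one point."""
    o1, o2 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    o3, o4 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _in_bbox(p1, p2, q1))
            or (o2 == 0 and _in_bbox(p1, p2, q2))
            or (o3 == 0 and _in_bbox(q1, q2, p1))
            or (o4 == 0 and _in_bbox(q1, q2, p2)))


def lines_intersect(a, b):
    """True if any segment of polyline a intersects any segment of polyline b."""
    return any(segments_intersect(a[i], a[i + 1], b[j], b[j + 1])
               for i in range(len(a) - 1)
               for j in range(len(b) - 1))


def cluster_intersecting(lines):
    """Group polylines into connected components (hypothetical helper)."""
    parent = list(range(len(lines)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(n^2) pairwise test; assumed acceptable for a sketch only.
    for i, j in combinations(range(len(lines)), 2):
        if lines_intersect(lines[i], lines[j]):
            parent[find(i)] = find(j)

    clusters = {}
    for i, line in enumerate(lines):
        clusters.setdefault(find(i), []).append(line)
    return list(clusters.values())


lines = [
    [(0, 0), (2, 2)],
    [(0, 2), (2, 0)],      # crosses the first line at (1, 1)
    [(2, 0), (4, 0)],      # touches the second line at (2, 0)
    [(10, 10), (12, 10)],
    [(11, 9), (11, 11)],   # crosses the fourth line at (11, 10)
]
clusters = cluster_intersecting(lines)
```

Here `clusters` holds two groups, one with the first three lines and one with the last two. Note that the first and third lines do not intersect each other directly; they end up together only because both intersect the second line.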
geometry[] ST_ClusterWithin(geometry geom, double precision distance)
Aggregate function returning an array of GeometryCollections (or MultiPoints?), where, within each cluster, any component is reachable from any other component through jumps of no more than the specified distance.
- Like ST_ClusterIntersecting, but uses a distance threshold rather than intersection when determining if two geometries should be included in the same component. Could have an implementation very similar to ST_ClusterIntersecting, or could be restricted to points and maybe have a more efficient implementation.
- Differs from k-means in that a distance is provided, not a number of clusters.
Example: In the picture below, an array of five MultiPoints would be returned (color-coded). The threshold distance in this case was greater than the length of the orange line but less than the length of the pink line.
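The distance-based grouping can be sketched in the same spirit. The Python below is an illustration under stated assumptions, not the proposed implementation: it is restricted to 2D points given as (x, y) tuples, treats "no more than the specified distance" as `<=`, and compares every pair of points. The name `cluster_within` is hypothetical.

```python
from itertools import combinations
from math import hypot


def cluster_within(points, distance):
    """Group points so that each cluster is connected by jumps <= distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) comparison; a real implementation would
    # presumably use a spatial index to find nearby candidates.
    for i, j in combinations(range(len(points)), 2):
        (ax, ay), (bx, by) = points[i], points[j]
        if hypot(ax - bx, ay - by) <= distance:
            parent[find(i)] = find(j)

    clusters = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (50, 50)]
clusters = cluster_within(points, 1.5)
```

With a threshold of 1.5, `clusters` contains three groups: the first three points, the next two, and the isolated point at (50, 50). The points (0, 0) and (2, 0) are 2 units apart, which exceeds the threshold, yet they share a cluster because each is within 1.5 of (1, 0). Unlike k-means, the number of clusters is an outcome of the threshold rather than an input.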