wiki:DevClusteringFunctions

Version 3 (modified by dbaston, 10 years ago)

Proposal for two clustering functions:


geometry[] ST_AccumIntersecting(geometry geom)

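Aggregate function returning an array of GeometryCollections, each representing one connected component of the input set of geometries. Two geometries belong to the same component if they intersect, either directly or through a chain of intersecting geometries.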

Example: if run on a table containing all of the LineStrings in the image below, the function would return an array with two MultiLineString geometries (red and blue).

http://i.stack.imgur.com/WNlxX.png
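
The grouping described above can be sketched outside the database. The following Python is only an illustration of the connected-components idea, not a proposed implementation: the helper names (`segments_intersect`, `linestrings_intersect`, `cluster`) and the sample coordinates are made up for this example, LineStrings are represented as plain lists of (x, y) tuples, and every pair of inputs is compared, where a real implementation would presumably use a spatial index.

```python
from itertools import combinations

def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def _on_segment(p, q, r):
    """True if r (already known to be collinear with p, q) lies on segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def segments_intersect(a, b, c, d):
    """True if segment ab and segment cd share at least one point."""
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _on_segment(a, b, c)) or (o2 == 0 and _on_segment(a, b, d))
            or (o3 == 0 and _on_segment(c, d, a)) or (o4 == 0 and _on_segment(c, d, b)))

def linestrings_intersect(l1, l2):
    """True if any segment of l1 intersects any segment of l2."""
    return any(segments_intersect(a, b, c, d)
               for a, b in zip(l1, l1[1:])
               for c, d in zip(l2, l2[1:]))

def cluster(items, related):
    """Group items into connected components of the `related` graph (union-find)."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if related(items[i], items[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())

# Hypothetical sample data: three lines that chain together, two that cross far away.
lines = [
    [(0, 0), (2, 2)],
    [(0, 2), (2, 0)],      # crosses the first line
    [(2, 0), (4, 0)],      # touches the second line at (2, 0)
    [(10, 10), (12, 10)],
    [(11, 9), (11, 11)],   # crosses the fourth line
]
components = cluster(lines, linestrings_intersect)
```

Note that the first and third lines do not intersect each other; they end up in the same component only because both intersect the second line.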


geometry[] ST_AccumWithinDistance(geometry geom, double precision distance)

Aggregate function returning an array of GeometryCollections (or possibly MultiPoints), where any component is reachable from any other component in the same collection through a series of jumps, each no longer than the specified distance.

  • like ST_AccumIntersecting, but uses a distance threshold rather than intersection to decide whether two geometries belong in the same component. The implementation could be very similar to that of ST_AccumIntersecting, or the function could be restricted to points, which might allow a more efficient implementation.
  • differs from kmeans in that a distance is provided, not a number of clusters

Example: In the picture below, an array of five MultiPoints would be returned (color-coded). The threshold distance in this case was greater than the length of the orange line but less than the length of the pink line.

http://ibin.co/1oH1ApWCoW8L
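
A minimal sketch of the distance-based variant, restricted to points, is given here in Python. It is an illustration under assumptions rather than a proposed implementation: the function name `cluster_within_distance` and the sample points are invented for this example, and the pairwise comparison is quadratic in the number of points.

```python
import math
from itertools import combinations

def cluster_within_distance(points, distance):
    """Group points so that each group is connected by jumps of at most `distance`."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of points that lies within the threshold distance.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= distance:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())

# Hypothetical sample data.
points = [(0, 0), (1, 0), (2, 0), (5, 0), (5.5, 0), (10, 10)]
clusters = cluster_within_distance(points, 1.0)
```

With a threshold of 1.0, the points (0, 0) and (2, 0) fall in the same group even though they are 2 units apart, because (1, 0) links them. The number of groups is a result of the threshold, not an input: raising the threshold to 3.0 merges the first two groups.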
