Opened 9 years ago

Closed 9 years ago

#3428 closed enhancement (fixed)

Feature request: A function to convert a geometry's coordinates into a MultiPoint

Reported by: Mike Taves Owned by: dbaston
Priority: medium Milestone: PostGIS 2.3.0
Component: postgis Version: master
Keywords: Cc:

Description

This feature request is for a function to convert the coordinates from any geometry into a MULTIPOINT geometry. For example:

POINT(30 10) => MULTIPOINT(30 10)
LINESTRING Z (30 10 4,10 30 5,40 40 6) => MULTIPOINT Z (30 10 4,10 30 5,40 40 6)
POLYGON((30 10,40 40,20 40,10 20,30 10)) => MULTIPOINT(30 10,40 40,20 40,10 20,30 10)
POLYGON((35 10,45 45,15 40,10 20,35 10),(20 30,35 35,30 20,20 30))
  => MULTIPOINT(35 10,45 45,15 40,10 20,35 10,20 30,35 35,30 20,20 30)
MULTIPOINT M (10 40 1,40 30 2,20 20 3,30 10 4)
  => MULTIPOINT M (10 40 1,40 30 2,20 20 3,30 10 4)

Similar functionality can be done with existing ST_Collect and ST_DumpPoints functions, e.g.:

CREATE OR REPLACE FUNCTION ST_ToMultiPoint(geometry) RETURNS geometry AS
'SELECT ST_CollectionExtract(ST_Collect(geom), 1)
 FROM (SELECT * FROM ST_DumpPoints($1)) AS dp;' LANGUAGE sql IMMUTABLE;

however this is not optimised and has a dismal performance. An equivalent function in C would be very efficient.

As for a function name, similar to ST_Boundary, it could be ST_Coordinate or similar.

Attachments (1)

callgraph.dot.ps (35.2 KB ) - added by dbaston 9 years ago.
call graph for benchmark query

Download all attachments as: .zip

Change History (11)

comment:1 by dbaston, 9 years ago

On my laptop it takes about 1s to call the ST_ToMultiPoint function on a LineString with 1 million points. Not as fast as it could be, but I'm not sure about dismal. The components of the SQL function are all implemented in C, so there's no glaring performance problem, like PLPGSQL array access, that would be resolved by a direct native implementation.

There's certainly overhead -- a few serialization-deserialization round trips, some overhead from the SRF and aggregate machinery, but I'm not sure how much better a direct implementation would be. Certainly not a hard one to put together, though...

I'll attach a postscript file with the call graph and execution times for the following:

-- foo has one record, a linestring with 1M points
SELECT ST_NPoints(ST_ToMultiPoint(geom)) FROM foo

by dbaston, 9 years ago

Attachment: callgraph.dot.ps added

call graph for benchmark query

comment:2 by dbaston, 9 years ago

A quick C implementation in this case takes about 350ms, relative to 1150ms for the SQL version.

comment:3 by Mike Taves, 9 years ago

Version: 2.2.x

The performance degrades with more geometries (e.g. 500 geometries, each with 1000 points):

DROP TABLE IF EXISTS random_lines;
SELECT id, ST_MakeLine(array_agg(pt)) AS geom
INTO TEMP random_lines
FROM (
  SELECT id, ST_MakePoint(random(), random()) pt
  FROM generate_series(1, 500) id, generate_series(1, 1000)
) r
GROUP BY id;
-- Test 1
WITH rows AS (
  SELECT ST_NPoints(ST_ToMultiPoint(geom)) FROM random_lines)
SELECT count(*) FROM rows;
-- Test 2
WITH rows AS (
  SELECT ST_NPoints(geom) FROM random_lines)
SELECT count(*) FROM rows;

With PostGIS 2.1 I'm comparing 500 ms vs. 30 ms.

(Where I really noticed this issue was with an older PostGIS 1.5 that my work insists on using, where I'm comparing a painful 22021 ms vs. 30 ms, which is dismal, but on an unsupported release.)

comment:4 by Mike Taves, 9 years ago

Version: 2.2.x

comment:5 by dbaston, 9 years ago

Timings on my end for the random_lines example: 400ms for the SQL ST_ToMultiPoint, 130ms for C version. It's only about 40 lines of code, so maybe worth adding. I'll defer to others on that.

comment:6 by strk, 9 years ago

I'm fine for a new C function, but need a good name.

How about ST_Points ? So that ST_Dump(ST_Points(g)) would be equivalent to ST_DumpPoints(g)

comment:7 by dbaston, 9 years ago

Milestone: PostGIS FuturePostGIS 2.3.0
Owner: changed from pramsey to dbaston
Status: newassigned
Version: trunk

ST_Points sounds good to me.

comment:8 by dbaston, 9 years ago

PR at https://github.com/postgis/postgis/pull/85, will merge if no issues.

strk, what do you think of the addition of utility functions to cu_tester.c? Seems like it would be nice share some utility functions among test files to reduce boilerplate.

comment:9 by strk, 9 years ago

+1 for sharing utility code See also the ASSERT macros with no CU_ prefix in cu_tester.h, which you may want to use inside the transformation_test (they add printing the obtained result when it doesn't match expected)

comment:10 by dbaston, 9 years ago

Resolution: fixed
Status: assignedclosed

Committed at r14604

Note: See TracTickets for help on using tickets.