Opened 17 months ago

Closed 17 months ago

Last modified 17 months ago

#5392 closed defect (invalid)

Aggregate ST_Union fails to join some "circles" when radius is too large - lwgeom_unaryunion_prec: GEOS Error: TopologyException

Reported by: rotten Owned by: pramsey
Priority: medium Milestone: PostGIS 3.3.4
Component: postgis Version: 3.3.x
Keywords: Cc:

Description

PostGIS 3.3.2 on PostgreSQL 14.7

Using ST_Buffer to create two "Circles" from arbitrary points on the globe, and then using ST_Union as an aggregate function to combine them will fail when the radius exceeds some value (slightly different depending on where the points are on the globe).

I've tried to make these examples as simple as I can.

For example, this case works:

-- radius <= 290085 good
with kpc as (
    select
        ST_Buffer(
            ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
            290085::integer
        )::geometry as location
),
tonga as (
    select
        ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290085::integer
        )::geometry as location
),
all_circles as (
    select location from kpc
    union all
    select location from tonga
),
joined_circles as (
    select
        ST_Union(location) as circle_pair
    from
        all_circles
)
select
    290085::integer as radius,
    ST_Distance(
        ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
        ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography
    ) as kpc_tonga_distance,
    ST_AsEWKT(circle_pair)
from
    joined_circles
;

However, if I increase the radius one more meter, it fails:

-- radius >= 290086 bad
with kpc as (
    select
        ST_Buffer(
            ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
tonga as (
    select
        ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
all_circles as (
    select location from kpc
    union all
    select location from tonga
),
joined_circles as (
    select
        ST_Union(location) as circle_pair
    from
        all_circles
)
select
    290086::integer as radius,
    ST_Distance(
        ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
        ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography
    ) as kpc_tonga_distance,
    ST_AsEWKT(circle_pair)
from
    joined_circles
;

This is the error message:

psql:st_union_issue.sql:77: ERROR: lwgeom_unaryunion_prec: GEOS Error: TopologyException: side location conflict at -174.563034509468 -18.071220344085667. This can occur if the input geometry is invalid.

Even though I know that ST_Buffer emitted valid geometries (I was able to check this with ST_IsValid), if I add a ST_MakeValid to the query that just failed, it works:

-- ST_MakeValid fixes it, even though the individual circles are valid
with kpc as (
    select
        ST_Buffer(
            ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
tonga as (
    select
        ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
all_circles as (
    select location from kpc
    union all
    select location from tonga
),
joined_circles as (
    select
        ST_Union(ST_MakeValid(location)) as circle_pair
    from
        all_circles
)
select
    290086::integer as radius,
    ST_Distance(
        ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
        ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography
    ) as kpc_tonga_distance,
    ST_AsEWKT(circle_pair)
from
    joined_circles
;

Interestingly, if I don't use ST_Union as an Aggregate function, I don't need ST_MakeValid to get the joined circles to work.

-- non aggregate union fixes it
with kpc as (
    select
        ST_Buffer(
            ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
tonga as (
    select
        ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290086::integer
        )::geometry as location
),
joined_circles as (
    select
        ST_Union(kpc.location, tonga.location) as circle_pair
    from
        kpc, tonga
)
select
    290086::integer as radius,
    ST_Distance(
        ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
        ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography
    ) as kpc_tonga_distance,
    ST_AsEWKT(circle_pair)
from
    joined_circles
;

Note that the circles are not yet overlapping when they fail. It also doesn't seem to matter if they cross the international date line. Radius seems to be the key determinant as to when a joined pair of circles will start failing.

It also doesn't matter how many segments I add to the circles. The radius failing threshold for any two points remains the same.

Change History (9)

comment:1 by robe, 17 months ago

Milestone: PostGIS 3.3.3PostGIS 3.3.4

comment:2 by robe, 17 months ago

Thanks for the self-contained examples. I am able to replicate this issue too and get

ERROR:  lwgeom_unaryunion_prec: GEOS Error: TopologyException: side location conflict at -174.563034509468 -18.071220344085667. This can occur if the input geometry is invalid. 

Regarding your email note about Union aggregate use being rare, it's probably one of the most commonly used functions in PostGIS. Sadly this problem is one of the most common people run across, and GEOS project is working hard to remove instances of this.

POSTGIS="3.4.0dev 3.3.0rc2-955-g4ace89312" [EXTENSION] PGSQL="160" GEOS="3.12.0dev-CAPI-1.18.0" PROJ="8.2.1" LIBXML="2.9.14" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.5.0 (Internal)"

My geos 3.12 is a couple of weeks old, so I'll need to check on this to see if newer has a fix for it.

This is what Martin on GEOS and PostGIS teams) likes to call a "robustness" issue.

Can you output what your below returns

SELECT postgis_full_version();

Ultimately I think this is something that needs to be fixed on the GEOS side.

comment:3 by robe, 17 months ago

Summary: Aggregate ST_Union fails to join some "circles" when radius is too largeAggregate ST_Union fails to join some "circles" when radius is too large - lwgeom_unaryunion_prec: GEOS Error: TopologyException

comment:4 by rotten, 17 months ago

select postgis_full_version();
                                                                      postgis_full_version
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="3.3.2 0" [EXTENSION] PGSQL="140" GEOS="3.10.3-CAPI-1.16.1" PROJ="9.1.0" LIBXML="2.9.9" LIBJSON="0.12.99" LIBPROTOBUF="1.3.0" WAGYU="0.5.0 (Internal)"
(1 row)

comment:5 by pramsey, 17 months ago

Resolution: invalid
Status: newclosed

This is less interesting than it appears to be on first inspection (but it is still pretty interesting). Because the SQL starts by taking buffers of points, which intuitively should be easy-to-process circles, the result is surprising to our eyes.

However, taking the SQL apart into something a little more step wise, the real problem is just the normal boring one of feeding invalid inputs to the overlay functions.

create table polys (name text, geom geometry);

insert into polys values ('kpc', ST_Buffer(
            ST_Point(-166.9324011::float8, 65.154889::float8, 4326)::geography,
            290086::integer
        )::geometry);

insert into polys values ('tonga', ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290086::integer
        )::geometry);

select name, st_isvalid(geom) from polys;

The main thing to note is that taking large buffers of objects in geography may yield invalid outputs, since the buffer function in geography is a nasty hack. In geometry of course a buffer of a point is a circle and never invalid, but that's not what is happening here.

There is still an "interesting" bit left, and that is is that the pairwise and the aggregate versions of union are returning different results.

select st_union(geom) from polys;

select st_union(a.geom, b.geom)
  from polys a
  cross join polys b
  where a.name = 'kpc' and b.name = 'tonga';

If you backtrace from inside the overlay code where that TopologyException is thrown, you'll find that the process is actually going through different code lines. The aggregate goes through the cascaded union, which is very complex, while the pairwise just goes straight into the overlay code.

If you make valid inputs, you can get a result from both the aggregate and pairwise union code lines.

insert into polys 
  select name || ' valid', st_makevalid(geom) as geom
  from polys
  where name in ('kpc', 'tonga');

select st_area(st_union(geom)) from polys where name ~ 'valid';

select st_area(st_union(a.geom, b.geom)) 
  from polys a
  cross join polys b
  where a.name = 'kpc valid' 
  and b.name = 'tonga valid'

Note however, that if you compare the result of the pairwise union of the invalid inputs to the result for the valid inputs, the answer is wrong.

select st_area(st_union(a.geom, b.geom)) 
  from polys a
  cross join polys b
  where a.name = 'kpc' 
  and b.name = 'tonga'

So the moral of the story is not "pairwise union is more resistant to invalid inputs than aggregate union", it is "invalid inputs can fail in different ways through different code lines, so you really want to avoid invalid inputs".

I sometimes wonder if we should do automagical makevalid as part of the try/catch block on the overlay code, so the case of the TopologyException would end up actually working, automagically (catch exception, make valid, retry). But seeing the silent failure of the pairwise code makes me question that. Automagical makevalid would make those cases of silent failure even more likely to just slide by people's notice.

Last edited 17 months ago by pramsey (previous) (diff)

comment:6 by robe, 17 months ago

Yah I don't think we should force a make valid. The other issue with make valid aside from it adding overhead is sometimes the geometry that comes out of it is not what you wanted and when you start unioning something that is not what you thought you had, bad things happen. You end up with geometry types you weren't expecting yada yada. I think it's best to just leave it up to the user and have it fail so they know something is wrong.

comment:7 by rotten, 17 months ago

If I understand this, something is wrong when we increase the radius on ST_Buffer by one meter. But less than that, and the output is valid. Shouldn't ST_Buffer fail or emit some sort of warning instead of ST_Union? Or do we always have to check our ST_Buffer output for validity because we'll never know ahead of time where it is going to fail?

I'm still not clear as to why that particular threshold is an invalid input for that location on earth, and why other locations have different radius thresholds. Obviously we need to catch or prevent this from happening rather than waiting until something downstream fails.

I could have sworn I did an ST_IsValid check on this example before submitting it. I apologize for not catching that. It really seems like ST_Buffer should emit some sort of "Invalid Input - buffer overflow - creating an invalid geometry" output if it is going to emit garbage that we can't use downstream without us needing to check it every time we use it.

comment:8 by robe, 17 months ago

@rotten,

ST_Buffer for geometry is generally fine it's the transforms that ST_Buffer(geography..) is doing that converts your valid geometry to an invalid one. There are so many operations that can result in converting a valid geometry to an invalid one that we don't bother checking. The exercise is left up to the user.

The issue is with ST_Buffer(geography ...). As noted https://postgis.net/docs/ST_Buffer.html it does a transform (at least 2) under the hood to guess at an appropriate spatial ref for your data and then convert what it came up with to the 4326 you started with. If your radius is big enough it goes beyond what one can pretend is flat.

Note how the below is invalid

    select
       ST_Isvalid( ST_Buffer(
            ST_Point(-177.2519031::float8, -18.5973127::float8, 4326)::geography,
            290086::integer
        )::geometry)

If you look under the hood of ST_Buffer(geography... is doing)

You will see the hack that pramsey is referring to:

it's determining that this hidden srid 999101 is the best meter based spatial ref sys for the point you presented

SELECT _ST_BestSRID(ST_Point(-177.2519031::float8, -18.5973127::float8, 4326));

-- now take that and buffer it by 290,086 meters -- still valid right yeh

SELECT ST_IsValid(ST_Buffer(ST_Transform(ST_Point(-177.2519031::float8, -18.5973127::float8, 4326), 999101),290086)) ;

Now try to transform it back to 4326, it's invalid

SELECT ST_IsValid(ST_Buffer(ST_Transform(ST_Point(-177.2519031::float8, -18.5973127::float8, 4326), 999101),290086)) ;

I know it's sad, at a certain point our pretend that the world is flat goes hay wire when we try to convert it to another flat representation and the bounds of our flatness stretch beyond what the flat world we picked is suitable and so the transformation function does its best for and every flat world eventually reduces to a wraparound

in reply to:  8 comment:9 by rotten, 17 months ago

Replying to robe:

That is really interesting, thanks for the additional feedback.

Note: See TracTickets for help on using tickets.