#2422 closed defect (fixed)
geography regression difference ST_DWithin
Reported by: | robe | Owned by: | pramsey |
---|---|---|---|
Priority: | high | Milestone: | PostGIS 2.2.0 |
Component: | postgis | Version: | 2.1.x |
Keywords: | Cc: |
Description (last modified by )
I presume geography in 2.0.4 and goegraphy in 2.1 should return the same answer. In benchmarking client's data I found something slightly troubling:
SELECT COUNT(*) FROM face WHERE ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 );
when I run this on
POSTGIS="2.0.4SVN r11708" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 64-bit count ------- 397 (1 row) Time: 157.613 ms
On
POSTGIS="2.1.0rc2 r11726" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 Mar ch 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN " RASTER PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 64-bit count ------- 395 (1 row) Time: 218.903 ms
whammo I got 2 geometries less and lost a good chunk of speed.
I ran a couple of times on each database with similar timings.
I'll try to get permission from client to release this datasource, but in mean time I'll try to narrow down to the offending geometries.
Attachments (1)
Change History (46)
comment:1 by , 11 years ago
Description: | modified (diff) |
---|
comment:2 by , 11 years ago
comment:3 by , 11 years ago
You mean the distance ones and not the _ST_DWithinUncached ones? I'll try both.
comment:4 by , 11 years ago
Running this test
SELECT COUNT(*) FROM face WHERE _ST_DWithinUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 );
POSTGIS="2.1.0rc2 r11726" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 Mar ch 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN " RASTER PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 64-bit count ------- 397 (1 row) Time: 161.412 ms
Looks closer to what I'd expect.
On the 2.0.4 no such functions exist so can't compare.
No such thing as _ST_DWithinTree, so closest which takes painfully long to run is
SELECT COUNT(*) FROM face WHERE _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) <= 1609.344;
I'm still waiting for that one to finish
comment:5 by , 11 years ago
okay 109953 ms later
SELECT COUNT(*) FROM face WHERE _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) <= 1609.344;
yields:
count ------- 779 (1 row)
comment:6 by , 11 years ago
hmm even using _ST_DistanceTree combo returns the right answer with ST_Expand:
SELECT COUNT(*) FROM face WHERE geog && _ST_Expand(st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344) AND _ST_Expand(geog,1609.344) && st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography AND _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) < 1609.344;
Returns:
count ------- 397 (1 row) Time: 233.620 ms
So something must be fishy with the way _ST_DWithin is coded I'm trying
SELECT COUNT(*) FROM face WHERE ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 );
After 112 seconds of patient waiting and it finally returned with a presumably WRONG answer of 395.
comment:7 by , 11 years ago
oops I meant the query I ran on PostGIS 2.1 instance was:
ELECT COUNT(*) FROM face WHERE _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 );
which returned:
count ------- 395 (1 row) Time: 110076.840 ms
comment:8 by , 11 years ago
Let me try that once more -- my memory buffer is giving me problems:
SELECT COUNT(*) FROM face WHERE _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true );
yields this:
count ------- 395 (1 row) Time: 110076.840 ms
thus conclusion _ST_DWithin(geography, geography,dist, use_sphere) has got some issues.
I vaguely remember having this issue with geometry _ST_DWithin/ST_Distance you fixed. I'll try to dig up the ticket for that one.
comment:9 by , 11 years ago
Summary: | geography regression difference → geography regression difference ST_DWithin |
---|
comment:11 by , 11 years ago
No, simpler, just
SELECT someId, _ST_DistanceUnCached(), _ST_DistanceTree() FROM yourtable WHERE _ST_DistanceUnCached() != _ST_DistanceTree();
See? The problem is the algorithms are returning different answers for a very few example geographies, and this will both isolate the ones that are failing and give a hint what the different answers are.
comment:12 by , 11 years ago
Well doing this:
SELECT id, _ST_DistanceUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) , _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) FROM face WHERE _ST_Expand(geog,1609.344) && st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography AND _ST_DistanceUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) != _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography);
returns 0 records. Doing a check without ST_Expand is going to take a bit longer since I think they are big geometries and 300,000 of them.
comment:13 by , 11 years ago
Okay check without ST_Expand:
id | _st_distanceuncached | _st_distancetree --------+----------------------+------------------ 330881 | 2811307.65431338 | 2811411.63144931 (1 row)
comment:14 by , 11 years ago
I ran this test on my isolated dataset that is within the region of the error and this shows the geographies in question I think:
SELECT id FROM face WHERE _ST_DWithinUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 ) AND NOT _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true );
Yields:
id -------- 212164 12301
comment:15 by , 11 years ago
Hmm it might not be specific to geometry. If I put these 2 bad geometries in a table called bad_face2 and run this:
SELECT id , _ST_DistanceTree(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) , _ST_DistanceUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography) FROM bad_face2 WHERE NOT _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ) AND _ST_DWithinUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 );
I get this:
id | _st_distancetree | _st_distanceuncached ------+------------------+---------------------- 12301 | 1398.40705751386 | 1398.40705751386
I get only one back.
comment:16 by , 11 years ago
I think the ST_Distance cache is fine.
I created this function which uses _ST_Distance instead of _ST_DWithin. I frankly don't see why we even bother creating an _ST_DWithin (geography_dwithin) when the distance function takes a tolerance.
CREATE OR REPLACE FUNCTION st_dwithin2(geography, geography, double precision, boolean) RETURNS boolean AS 'SELECT $1 && _ST_Expand($2,$3) AND $2 && _ST_Expand($1,$3) AND _st_distance($1, $2, $3, $4) <= $3' LANGUAGE sql IMMUTABLE COST 100;
I run this query:
SELECT count(id) FROM face WHERE ST_DWithin2(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ); ;
And voila -- get my expected answer:
count ------- 397
Anyway I would like this issue resolved today so I can release today otherwise I'll have to weigh and maybe release tomorrow. In meantime I'll finish doing garden regress against 2.0.4 and hopefully I won't find any other issues.
comment:17 by , 11 years ago
The answer regarding why have a dwithin code line is because the distance code line doesn't actually use the tolerance when evaluating two trees against one another, while the dwithin code line does. The idea is the dwithin line can short circuit out of the evaluation when it hits something within tolerance, while the distance calculation has to check all possibilities to get the right distance. However, the distance code line does take in a tolerance number, it just mostly ignores it.
What this case is showing is a failure in the tolerance-based code, perhaps, although you also have a case that shows the tree and segment based distance calculations coming up different.
comment:18 by , 11 years ago
food for thought with 9.3, you can make it a materialized view as we describe here:
comment:19 by , 11 years ago
Oops I wish I could take comments back. I just realized I posted the wrong comment to this ticket. Meant to post to another. Ignore the last one.
comment:20 by , 11 years ago
Milestone: | PostGIS 2.1.0 → PostGIS 2.1.1 |
---|
comment:21 by , 11 years ago
The point of my previous notes on finding the geometries that provided difference results for tree and brute-force distance calculations was to actually get my hands on the geometries, so I could fix the differences. You need to provide me data, or nothing is going to happen on this, it's not like there's a big bold error in the code, there's a niggly tiny wart somewhere deeply hidden that only intense study with replicable data is going to uncover.
If you think the problem is a difference between the dwithin distance distance code lines, give me (a) a SQL statement and (b) a table the demonstrates it. Try to keep it simple, keep it small.
comment:22 by , 11 years ago
I know I know. Don't worry about this one. I'm a subcontractor on this one and haven't had much luck moving up the chain of approval (it's a big company and everyone evidentally has higher priorities). I will retest after all your recent changes and see if it's still an issue and relook at those geometries.
This table I realized its a weird one because it has a mix of multipolygons and geometry collections.
comment:23 by , 11 years ago
actually never mind the geometry collection bit (its a mix of polygons and geometry collections). If my 2 geometries that failed this
SELECT gid, geog FROM face WHERE NOT _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ) AND _ST_DWithinUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 )
and create a smaller only 2 row table with it. Then only one fails this time but putting the check in the from yield2 true so definitely seems like a caching bug of sorts.
SELECT gid, geog, _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ) FROM face2 WHERE NOT _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ) AND _ST_DWithinUnCached(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344 )
Yields:
gid | _st_dwithin ----+------------- 2 | t
SELECT gid, _ST_DWithin(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography, 1609.344,true ) , ST_Distance(geog, st_setsrid(st_makepoint(-71.0636, 42.3584), 4326)::geography ) FROM face2; gid | _st_dwithin | st_distance -----+-------------+---------------- 1 | t | 1398.407057514 2 | f | 1001.9245264
Berry berry interesting. What's kinda bizarre to me is both records ARE within distance so even caching wise, how could it ever have a cache that has false in it.
I still need to upgrade this db to test again. This is running with released 2.1.0
comment:24 by , 11 years ago
I should add that this artifact requires at least 2 records to exhibit the failure. But maybe that's because your caching logic doesn't try to kick in. So adding a gid = 2 it doesn't fail.
comment:25 by , 11 years ago
Okay I've got another that exhibits the same bug. Try attached -- it also fails on latest PostGIS 2.2. If I change any of the geometries ever so slightly it doesn't fail
See attached file
SELECT gid, _ST_DWithin(geog, tgeog, 1609.344,true ), ST_NPoints(geog::geometry) , ST_Distance(geog, tgeog) FROM dwithgeogbug CROSS JOIN ST_GeogFromText('POINT(-69.83262 43.43636)') As tgeog;
by , 11 years ago
Attachment: | dwithinbug_data.sql added |
---|
comment:26 by , 11 years ago
Milestone: | PostGIS 2.1.1 → PostGIS 2.2.0 |
---|
comment:27 by , 11 years ago
Milestone: | PostGIS 2.2.0 → PostGIS 2.1.2 |
---|
comment:28 by , 11 years ago
Hey I think this might be the same issue I am having here: This is a really sweet short example this guy provided to reproduce:
http://lists.osgeo.org/pipermail/postgis-users/2014-January/038619.html
email message added below. I was able to reproduce issue on my windows 7 64-bit running POSTGIS="2.2.0dev r12204" GEOS="3.4.2-CAPI-1.8.2 r3924" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER
Hello, I am using Ubuntu 12.04 with the official PostgreSQL apt repo (via https://wiki.postgresql.org/wiki/Apt). I am running into an easily reproducible issue, and was hoping for some help to solve this. When using ST_Intersects() not all rows that intersect are returned. This was not the case in previous versions that we have upgraded from. These are the steps to reproduce on a fresh install of Ubuntu 12.04 with all packages updated and PostGIS/PostgreSQL 9.3 installed: test=# CREATE TABLE test (id serial, condition_geo geography); CREATE TABLE test=# INSERT INTO test (condition_geo) VALUES (ST_Buffer(ST_GeogFromWKB(ST_MakePoint(20.0,30.0)),10.0)); INSERT 0 1 test=# SELECT id FROM test WHERE ST_Intersects("condition_geo", ST_Buffer(ST_GeogFromText('POINT(20.0 30.0)'), 20.0)) IS TRUE; id ---- 1 (1 row) test=# INSERT INTO test (condition_geo) VALUES (ST_Buffer(ST_GeogFromWKB(ST_MakePoint(20.0,30.0)),10.0)); INSERT 0 1 test=# SELECT id FROM test WHERE ST_Intersects("condition_geo", ST_Buffer(ST_GeogFromText('POINT(20.0 30.0)'), 20.0)) IS TRUE; id ---- 1 (1 row) test=# SELECT id FROM test WHERE ST_Intersects("condition_geo", ST_Buffer(ST_GeogFromText('POINT(20.0 30.0)'), 20.0)) IS TRUE AND id = 2; id ---- 2 (1 row) Note that the SELECT should return both rows 1 and 2 in the first SELECT. Any thoughts?
comment:29 by , 11 years ago
Summary: | geography regression difference ST_DWithin → geography regression difference ST_DWithin and ST_Intersects |
---|
comment:31 by , 11 years ago
Summary: | geography regression difference ST_DWithin and ST_Intersects → geography regression difference ST_DWithin, ST_Intersects, ST_Distance |
---|
Not sure if it helps -- but definitely something screwy with distance cache -- observe using the example above:
SELECT id , ST_Distance(condition_geo, ST_Buffer(ST_GeogFromText('POINT(20.0 30.0)'), 20.0)) from test; id | st_distance ----+------------- 1 | 0 2 | 9.954690252 (2 rows)
and only picking one of course which means it can't use the cache :)
SELECT id , ST_Distance(condition_geo, ST_Buffer(ST_GeogFromText('POINT(20.0 30.0)'), 20.0)) from test where id = 2; id | st_distance ----+------------- 2 | 0
comment:32 by , 11 years ago
Summary: | geography regression difference ST_DWithin, ST_Intersects, ST_Distance → geography regression difference ST_DWithin |
---|
I have concluded this is not the same issue (at least not exactly) as #2556 . This one seems particular to _ST_DWithin and in this case the distance calcs are fine. Still might be related though.
comment:33 by , 11 years ago
Some sort of floating point abberation with converting to radians:
SELECT gid, _ST_DWithin(geog, tgeog, 1600,true ) , ST_Distance(geog, tgeog) , _ST_DistanceUnCached(geog, tgeog) As uncached_dist FROM dwithgeogbug CROSS JOIN ST_GeogFromText('POINT(-69.83262 43.43636)') As tgeog; gid | _st_dwithin | st_distance | uncached_dist -----+-------------+----------------+------------------ 1 | t | 1400.230346843 | 1400.23034684253 2 | t | 1005.627279769 | 1005.62727976866
SELECT gid, _ST_DWithin(geog, tgeog, 1609,true ) , ST_Distance(geog, tgeog) , _ST_DistanceUnCached(geog, tgeog) As uncached_dist FROM dwithgeogbug CROSS JOIN ST_GeogFromText('POINT(-69.83262 43.43636)') As tgeog; gid | _st_dwithin | st_distance | uncached_dist -----+-------------+----------------+------------------ 1 | t | 1400.230346843 | 1400.23034684253 2 | f | 1005.627279769 | 1005.62727976866
how can a bigger radius result in false when a smaller radius results in true
comment:34 by , 11 years ago
although as of
POSTGIS="2.1.2dev r12213"
#2556 seems fixed, this issue still remains so apparently they are different.
comment:35 by , 11 years ago
Probably "fixed" at r12214, (see comment). If fix confirmed, also needs port to trunk, but precise test case needs to be stripped out to be brought into CUnit and debugged in detail, is likely a subtle failure in the tree-on-tree distance code when used with tolerances.
follow-up: 39 comment:36 by , 11 years ago
okay that seems to fix the issue but like you said its just a bandage doesn't solve your underlying no?
I have dwithinbug_data.sql attached above in case you didn't notice but the geometry that was failing in that dataset has 197 points. You think that's too big to stuff in cunit?
comment:37 by , 11 years ago
pramsey,
I have a potentially stupid question to ask. I see that: FP_LT macro is defined as
FP_LT (A, B ) (((A) + FP_TOLERANCE) < (B))
which is used here: http://postgis.net/docs/doxygen/2.1/de/dc0/lwgeodetic__tree_8c_a337b8cc452a495c52080254b042536c0.html#a337b8cc452a495c52080254b042536c0
Wouldn't that mean that when I define a threshold of X, it is really treating it as X + FP_TOLERANCE ?
So If I happen to hit a node that is between X and (X + FP_TOLERANCE) it would kick out success.
But my X < threshold would fail?
comment:39 by , 11 years ago
Replying to robe:
I have dwithinbug_data.sql attached above in case you didn't notice but the geometry that was failing in that dataset has 197 points. You think that's too big to stuff in cunit?
Well, it's not ideal, but barring having another option it'll have to do. I'll visually examine them to see if there's any clues about the issue there, but it's hard to know without some serious spelunking. The tolerance short circuit in the tree/tree code is quite complex, living in the midst of a recursive function and all. I'll see if I can get it replicated in C, and we can go from there.
comment:40 by , 11 years ago
Milestone: | PostGIS 2.1.2 → PostGIS 2.2.0 |
---|
okay you've bandaged this enough for 2.1.2 I think. Lets do the real fix in 2.1.3 or 2.2.0
comment:41 by , 11 years ago
I'd like to keep on this one, having the knowledge in my head, can you provide the SQL to run against the attached data to demonstrate the problem?
comment:42 by , 11 years ago
I think it was this one:
SELECT gid, _ST_DWithin(geog, tgeog, 1609,true ) , ST_Distance(geog, tgeog) , _ST_DistanceUnCached(geog, tgeog) As uncached_dist FROM dwithgeogbug CROSS JOIN ST_GeogFromText('POINT(-69.83262 43.43636)') As tgeog; gid | _st_dwithin | st_distance | uncached_dist -----+-------------+----------------+------------------ 1 | t | 1400.230346843 | 1400.23034684253 2 | f | 1005.627279769 | 1005.62727976866
my comment being -- how can a bigger radius result in false when a smaller radius results in true
I think it works now because of your prior bandage
comment:43 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
This should now be correctly fixed at r13954
comment:44 by , 9 years ago
regression failure on winnie:
geography .. failed (diff expected obtained: /projects/postgis/tmp/2.2.0dev_pg9.4_geos3.5.0_gdal2.0w64/test_38_diff) ----------------------------------------------------------------------------- --- geography_expected 2015-08-20 14:19:09 -0400 +++ /projects/postgis/tmp/2.2.0dev_pg9.4_geos3.5.0_gdal2.0w64/test_38_out 2015-08-20 14:22:07 -0400 @@ -21,9 +21,9 @@ geog_precision_pazafir|0|0 geog_precision_pazafir|0|0 1|MULTILINESTRING((0 0,1 1)) -#2422|2|1609|t|t|1005.62727977|1006.24818105|1005.62727976958|1005.62727976958 -#2422|2|1600|t|t|1005.62727977|1006.24818105|1005.62727976958|1005.62727976958 -#2422|2|1068|t|t|1005.62727977|1006.24818105|1005.62727976958|1005.62727976958 +#2422|2|1609|t|t|1005.62727977|1006.24818105|1005.62727976922|1005.62727976922 +#2422|2|1600|t|t|1005.62727977|1006.24818105|1005.62727976922|1005.62727976922 +#2422|2|1068|t|t|1005.62727977|1006.24818105|1005.62727976922|1005.62727976922 #2422|1|1609|t|t|1400.23034685|1396.81609491|1400.23034684753|1400.23034684753 #2422|1|1600|t|t|1400.23034685|1396.81609491|1400.23034684753|1400.23034684753 #2422|1|1068|f|f|1400.23034685|1396.81609491|1400.23034684753|1400.23034684753 -----------------------------------------------------------------------------
comment:45 by , 9 years ago
Regression failures fixed at r 13955
It might be hard to find the exact differences because this is going to be a difference between the tree distance and the edge distance algorithms, and the tree distance gets turned on as caching goes active. Replace testing ST_Dwithin with testing for differences between _ST_DistanceUnCached and _ST_DistanceTree to avoid cache effects.