Opened 8 months ago

Last modified 7 months ago

#5691 new enhancement

Revive tiger_topology function

Reported by: robe Owned by: robe
Priority: medium Milestone: PostGIS 3.5.0
Component: tiger geocoder Version: master
Keywords: Cc:

Description

I had written topology_load_tiger eons ago like 20 years ago partly as a tool to stress test topology and also to see if I could retrofit an existings SQL/MM topology model into postgis_topology.

Now that so much is going on with postgis_topology, seems like a good idea to revive that function and revise it to use some of the new enhancements in topology.

The idea I had behind it, was to take the TIGER schema setup (Topologically Integrated Geographic Encoding and Referencing system), which is also based on SQL/MM Topology model (but that they had built on Oracle), and use the same ids they have etc to directly load into a postgis_topology setup (totally skipping all that nice work Sandro had done to load geometries into postgis topology), but then use the validation functions and then moving forward see how it could be edited with postgis_topology.

Somewhere down the line, my original tests became horribly slow and I was too lazy to investigate what went wrong, if it was a change in GEOS or addition of triggers etc. in topology.

Change History (2)

comment:1 by robe, 7 months ago

I did a baseline, yeh it still at least works. Doing my baseline on demo.postgis.net on tiger_2023 database

My baseline was this:

SELECT topology.CreateTopology('topo_dc', 4269);
SELECT cntyidfp FROM tiger.county where statefp = '11';

-- took 1 min 13 secs.
SELECT tiger.topology_load_tiger('topo_dc', 'county', '11001');
SELECT topology.TopologySummary('topo_dc');

outputs:

Topology topo_dc (id 1, SRID 4269, precision 0)
13152 nodes, 21663 edges, 8673 faces, 0 topogeoms in 0 layers

--comparing against the edges and nodes for DC

SELECT (SELECT count(*) FROM tiger.edges WHERE statefp = '11') AS cnt_orig_edges
, (SELECT count(*) FROM tiger.faces WHERE statefp = '11') AS cnt_orig_faces;
cnt_orig_edges | cnt_orig_faces
----------------+----------------
          21663 |           8673
(1 row)

Yeh so that matches

Now for bad news,

SELECT error, count(*) 
FROM  topology.ValidateTopology('topo_dc')
GROUP BY error;

Returns:

         error          | count
-------------------------+-------
 invalid next_left_edge  | 21073
 invalid next_right_edge | 21409
(2 rows)

But I suspect this is my accounting logic shoving the tiger edges and faces into PostGIS topology.

Since I am using the existing tiger edge ids, I also need to reset the sequence id of the topology, which it appears I am not doing, so future possible adds don't conflict with that's been added.

Although it looks like I'm literally taking what tiger is giving me cause in the tiger.edges structure I am mapping to postgis topology edge_data structure as follows

 , t.tnidf As start_node, t.tnidt As end_node, t.tfidl As left_face
                        , t.tfidr As right_face, tl.tlid AS next_left_edge,  tr.tlid As next_right_edge

I am doing some extra math though with negating things and closing stuff up if it doesn't match, so the issue might be there.

Version 0, edited 7 months ago by robe (next)

comment:2 by strk, 7 months ago

Next left is the id of the edge an ant will encounter while walking on the left side of the edge in the same direction of the edge. Next right is the one encountered by walking on the other side and the opposite direction. Those values are: positive if the next edge is encountered from its start point, negative if it is encountered from its end point.

An SQL exposed function to compute the linking is currently unavailable (see #4942), the closest you get is (IIRC) the GetNodeEdges which computes the order rather than using those "next" values - see https://postgis.net/docs/GetNodeEdges.html

Note: See TracTickets for help on using tickets.