Opened 7 months ago

Closed 7 months ago

Last modified 7 months ago

#5707 closed enhancement (fixed)

slow cb_getEdgeByFace seems like missing value for bbox parameter

Reported by: Lars Aksel Opsahl Owned by: strk
Priority: medium Milestone: PostGIS 3.4.3
Component: topology Version: 3.4.x
Keywords: Cc:

Description

In this case push data in to black box in the attached area where there are no data from before .

From pg_stat_activity this in -[ RECORD 1 ]--

total_min | 1.6338761423000014
avg_ms    | 123.15649313819097
calls     | 796
query     | SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE ( left_face = ANY($1)  OR right_face = ANY ($1) )

If I run a test SQL like below we that takes 350 ms.

 explain analyze              SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE ( left_face = ANY('{0}')  OR right_face = ANY ('{0}') )
;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on edge_data  (cost=0.00..29820.10 rows=94479 width=162) (actual time=0.072..346.232 rows=78609 loops=1)
   Filter: ((left_face = ANY ('{0}'::integer[])) OR (right_face = ANY ('{0}'::integer[])))
   Rows Removed by Filter: 243439
 Planning Time: 0.180 ms
 Execution Time: 348.289 ms
(5 rows)


but if I add the bbox paramater it run more than 100 times faster and takes less 3 ms.

 explain analyze              SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE geom && '0103000020A21000000100000005000000FC705E4CD73D15404AD7234690F24D40FC705E4CD73D15405870E3E178F54D40E01FF517366A15405870E3E178F54D40E01FF517366A15404AD7234690F24D40FC705E4CD73D15404AD7234690F24D40' AND ( left_face = ANY('{0}')  OR right_face = ANY ('{0}') )
;
                                                                                                                  QUERY PLAN                                                                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on edge_data  (cost=7.37..133.12 rows=37 width=162) (actual time=2.557..2.610 rows=9 loops=1)
   Recheck Cond: (geom && '0103000020A21000000100000005000000FC705E4CD73D15404AD7234690F24D40FC705E4CD73D15405870E3E178F54D40E01FF517366A15405870E3E178F54D40E01FF517366A15404AD7234690F24D40FC705E4CD73D15404AD7234690F24D40'::geometry)
   Filter: ((left_face = ANY ('{0}'::integer[])) OR (right_face = ANY ('{0}'::integer[])))
   Heap Blocks: exact=3
   ->  Bitmap Index Scan on edge_gist  (cost=0.00..7.36 rows=127 width=0) (actual time=2.519..2.519 rows=9 loops=1)
         Index Cond: (geom && '0103000020A21000000100000005000000FC705E4CD73D15404AD7234690F24D40FC705E4CD73D15405870E3E178F54D40E01FF517366A15405870E3E178F54D40E01FF517366A15404AD7234690F24D40FC705E4CD73D15404AD7234690F24D40'::geometry)
 Planning Time: 4.996 ms
 Execution Time: 2.753 ms
(8 rows)


I assume the bbox parameter could be based on the line added.

I have done vacuum analyze a all the tables and removed all dead rows.

I am running on a compiled postgis code from last week at

git rev-parse HEAD
d5ba6b71cf08baaf67d9646a7ca52d0dfb5c3e9c

git status
HEAD detached at d5ba6b71c

Yes in this case I could use https://trac.osgeo.org/postgis/ticket/5667 but in other cases we also need to add single rows.

Attachments (1)

Screenshot 2024-04-02 at 05.20.13.png (88.6 KB ) - added by Lars Aksel Opsahl 7 months ago.
used_case_picture

Download all attachments as: .zip

Change History (7)

by Lars Aksel Opsahl, 7 months ago

used_case_picture

comment:1 by Lars Aksel Opsahl, 7 months ago

Here are the table def

\d "tmp_dyrkbarjord_04_t3_d_005".edge_data;

                                             Unlogged table "tmp_dyrkbarjord_04_t3_d_005.edge_data"
       Column        |           Type            | Collation | Nullable |                                Default                                 
---------------------+---------------------------+-----------+----------+------------------------------------------------------------------------
 edge_id             | integer                   |           | not null | nextval('tmp_dyrkbarjord_04_t3_d_005.edge_data_edge_id_seq'::regclass)
 start_node          | integer                   |           | not null | 
 end_node            | integer                   |           | not null | 
 next_left_edge      | integer                   |           | not null | 
 abs_next_left_edge  | integer                   |           | not null | 
 next_right_edge     | integer                   |           | not null | 
 abs_next_right_edge | integer                   |           | not null | 
 left_face           | integer                   |           | not null | 
 right_face          | integer                   |           | not null | 
 geom                | geometry(LineString,4258) |           | not null | 
Indexes:
    "edge_data_pkey" PRIMARY KEY, btree (edge_id)
    "edge_data_abs_next_left_edge_idx" btree (abs_next_left_edge)
    "edge_data_abs_next_right_edge_idx" btree (abs_next_right_edge)
    "edge_end_node_idx" btree (end_node)
    "edge_gist" gist (geom)
    "edge_left_face_idx" btree (left_face)
    "edge_right_face_idx" btree (right_face)
    "edge_start_node_idx" btree (start_node)
Foreign-key constraints:
    "end_node_exists" FOREIGN KEY (end_node) REFERENCES tmp_dyrkbarjord_04_t3_d_005.node(node_id)
    "left_face_exists" FOREIGN KEY (left_face) REFERENCES tmp_dyrkbarjord_04_t3_d_005.face(face_id)
    "next_left_edge_exists" FOREIGN KEY (abs_next_left_edge) REFERENCES tmp_dyrkbarjord_04_t3_d_005.edge_data(edge_id) DEFERRABLE INITIALLY DEFERRED
    "next_right_edge_exists" FOREIGN KEY (abs_next_right_edge) REFERENCES tmp_dyrkbarjord_04_t3_d_005.edge_data(edge_id) DEFERRABLE INITIALLY DEFERRED
    "right_face_exists" FOREIGN KEY (right_face) REFERENCES tmp_dyrkbarjord_04_t3_d_005.face(face_id)
    "start_node_exists" FOREIGN KEY (start_node) REFERENCES tmp_dyrkbarjord_04_t3_d_005.node(node_id)
Referenced by:
    TABLE "tmp_dyrkbarjord_04_t3_d_005.edge_data" CONSTRAINT "next_left_edge_exists" FOREIGN KEY (abs_next_left_edge) REFERENCES tmp_dyrkbarjord_04_t3_d_005.edge_data(edge_id) DEFERRABLE INITIALLY DEFERRED
    TABLE "tmp_dyrkbarjord_04_t3_d_005.edge_data" CONSTRAINT "next_right_edge_exists" FOREIGN KEY (abs_next_right_edge) REFERENCES tmp_dyrkbarjord_04_t3_d_005.edge_data(edge_id) DEFERRABLE INITIALLY DEFERRED


and yes it does not use the index with simple query either

 explain analyze              SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE ( left_face = 0 )  OR (right_face = 0 )  
;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on edge_data  (cost=0.00..30625.22 rows=94479 width=162) (actual time=0.017..126.493 rows=78609 loops=1)
   Filter: ((left_face = 0) OR (right_face = 0))
   Rows Removed by Filter: 243439
 Planning Time: 0.161 ms
 Execution Time: 128.823 ms
(5 rows)


Just to sure I rerun a new vacuum

VACUUM (analyze, verbose) "tmp_dyrkbarjord_04_t3_d_005".edge_data;
INFO:  00000: vacuuming "tmp_dyrkbarjord_04_t3_d_005.edge_data"
LOCATION:  lazy_scan_heap, vacuumlazy.c:521
INFO:  00000: scanned index "edge_data_pkey" to remove 21147 row versions
DETAIL:  CPU: user: 0.04 s, system: 0.01 s, elapsed: 0.05 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_data_abs_next_left_edge_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.02 s, elapsed: 0.05 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_data_abs_next_right_edge_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.02 s, elapsed: 0.04 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_gist" to remove 21147 row versions
DETAIL:  CPU: user: 0.03 s, system: 0.03 s, elapsed: 0.06 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_left_face_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.03 s, system: 0.01 s, elapsed: 0.04 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_right_face_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.02 s, elapsed: 0.04 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_start_node_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.02 s, system: 0.01 s, elapsed: 0.04 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: scanned index "edge_end_node_idx" to remove 21147 row versions
DETAIL:  CPU: user: 0.01 s, system: 0.03 s, elapsed: 0.04 s
LOCATION:  lazy_vacuum_index, vacuumlazy.c:1748
INFO:  00000: "edge_data": removed 21147 row versions in 860 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
LOCATION:  lazy_vacuum_heap, vacuumlazy.c:1557
INFO:  00000: index "edge_data_pkey" now contains 322048 row versions in 8461 pages
DETAIL:  21147 index row versions were removed.
177 index pages have been deleted, 102 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_data_abs_next_left_edge_idx" now contains 322048 row versions in 8488 pages
DETAIL:  21147 index row versions were removed.
175 index pages have been deleted, 100 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_data_abs_next_right_edge_idx" now contains 322048 row versions in 8429 pages
DETAIL:  21147 index row versions were removed.
170 index pages have been deleted, 94 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_gist" now contains 322048 row versions in 12886 pages
DETAIL:  21147 index row versions were removed.
1298 index pages have been deleted, 1202 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_left_face_idx" now contains 322048 row versions in 8397 pages
DETAIL:  21147 index row versions were removed.
1035 index pages have been deleted, 1015 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_right_face_idx" now contains 322048 row versions in 8483 pages
DETAIL:  21147 index row versions were removed.
692 index pages have been deleted, 674 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_start_node_idx" now contains 322048 row versions in 8452 pages
DETAIL:  21147 index row versions were removed.
178 index pages have been deleted, 102 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: index "edge_end_node_idx" now contains 322048 row versions in 8472 pages
DETAIL:  21147 index row versions were removed.
167 index pages have been deleted, 93 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: "edge_data": found 207 removable, 8790 nonremovable row versions in 871 out of 51589 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 533943927
There were 7489 unused item identifiers.
Skipped 0 pages due to buffer pins, 20492 frozen pages.
0 pages are entirely empty.
CPU: user: 0.21 s, system: 0.19 s, elapsed: 0.41 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1488
INFO:  00000: "edge_data": truncated 51589 to 50631 pages
DETAIL:  CPU: user: 0.10 s, system: 0.01 s, elapsed: 0.11 s
LOCATION:  lazy_truncate_heap, vacuumlazy.c:1969
INFO:  00000: vacuuming "pg_toast.pg_toast_2235142925"
LOCATION:  lazy_scan_heap, vacuumlazy.c:521
INFO:  00000: index "pg_toast_2235142925_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_cleanup_index, vacuumlazy.c:1801
INFO:  00000: "pg_toast_2235142925": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 533943928
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1488
INFO:  00000: analyzing "tmp_dyrkbarjord_04_t3_d_005.edge_data"
LOCATION:  do_analyze_rel, analyze.c:327
INFO:  00000: "edge_data": scanned 50631 of 50631 pages, containing 322048 live rows and 0 dead rows; 300000 rows in sample, 322048 estimated total rows
LOCATION:  acquire_sample_rows, analyze.c:1132

and rerun this simple sql still with seq scan

explain analyze              SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE ( left_face = 0 )  OR (right_face = 0 ) 
;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on edge_data  (cost=0.00..30146.22 rows=94523 width=162) (actual time=0.017..120.990 rows=78609 loops=1)
   Filter: ((left_face = 0) OR (right_face = 0))
   Rows Removed by Filter: 243439
 Planning Time: 0.775 ms
 Execution Time: 123.262 ms
(5 rows)



Even if rewrite the sql so uses the indexes, we never see anything close the the performance by using the bbox parameter

explain analyze              SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE left_face = 0 UNION  SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE right_face = 0 ; 
;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=46293.27..47311.23 rows=101796 width=44) (actual time=151.638..164.441 rows=78609 loops=1)
   Group Key: edge_data.edge_id, edge_data.left_face, edge_data.right_face, edge_data.geom
   ->  Append  (cost=1189.75..45275.31 rows=101796 width=44) (actual time=14.022..91.000 rows=101787 loops=1)
         ->  Bitmap Heap Scan on edge_data  (cost=1189.75..18353.36 rows=35139 width=162) (actual time=14.021..39.493 rows=35127 loops=1)
               Recheck Cond: (left_face = 0)
               Heap Blocks: exact=8953
               ->  Bitmap Index Scan on edge_left_face_idx  (cost=0.00..1180.96 rows=35139 width=0) (actual time=10.924..10.925 rows=35127 loops=1)
                     Index Cond: (left_face = 0)
         ->  Bitmap Heap Scan on edge_data edge_data_1  (cost=2273.01..25395.02 rows=66657 width=162) (actual time=11.529..47.015 rows=66660 loops=1)
               Recheck Cond: (right_face = 0)
               Heap Blocks: exact=15257
               ->  Bitmap Index Scan on edge_right_face_idx  (cost=0.00..2256.35 rows=66657 width=0) (actual time=9.506..9.506 rows=66660 loops=1)
                     Index Cond: (right_face = 0)
 Planning Time: 0.261 ms
 Execution Time: 168.322 ms
(15 rows)

comment:2 by strk, 7 months ago

Great analysis! The expensive query is probably originating in the _lwt_AddFaceSplit C function when a new edge splits a face. In your case the face being split seems to be the universe face.

Edges are fetched to update their left/right face if they now fall within the new face, so it does make sense to use the bounding box of the new face and the code DOES already pass such bounding box: https://git.osgeo.org/gitea/postgis/postgis/src/commit/d5ba6b71cf08baaf67d9646a7ca52d0dfb5c3e9c/liblwgeom/topo/lwgeom_topo.c#L2075

I'm trying to figure out how that MBR ends up being NULL.

comment:3 by strk, 7 months ago

It turns out the MBR is set to NULL by the call to cb_insertFace, since commit [56023cc5b8fc6bb2f2a57483635e800c6c5eb446/git] which was a fix to ticket #5684 (commit from 5 weeks ago) - this should mean only current master branch is affected, and not 3.4 branch

in reply to:  3 comment:4 by Lars Aksel Opsahl, 7 months ago

Replying to strk:

It turns out the MBR is set to NULL by the call to cb_insertFace, since commit [56023cc5b8fc6bb2f2a57483635e800c6c5eb446/git] which was a fix to ticket #5684 (commit from 5 weeks ago) - this should mean only current master branch is affected, and not 3.4 branch

Thanks this explains why we where able to push more than 70 million polygons through Postgis Topology on Postgis 3.4.

comment:5 by Sandro Santilli <strk@…>, 7 months ago

Resolution: fixed
Status: newclosed

In 94321512/git:

Revert "Always initialize the mbr member of fetched faces"

This reverts commit 56023cc5b8fc6bb2f2a57483635e800c6c5eb446.

Closes #5707
Reopens #5684

comment:6 by Lars Aksel Opsahl, 7 months ago

Very nice, for record rerun test area with the new code now and we down from 123 ms to 0.09 ms for the sane area as you see.

total_min | 0.0019957501166666656
avg_ms    | 0.0966464947538336
calls     | 1239
query     | SELECT edge_id,left_face,right_face,geom FROM "tmp_dyrkbarjord_04_t3_d_005".edge_data WHERE ( left_face = ANY($1)  OR right_face = ANY ($1) ) AND geom && $2
Note: See TracTickets for help on using tickets.