Opened 5 years ago
Closed 4 years ago
#4686 closed enhancement (fixed)
Faster bbox access for PG12+
Reported by: | Algunenano | Owned by: | Algunenano |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.1.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
We currently have to ways to access the bbox of a geometry from "disk":
- gserialized_get_gbox_p: Tries to access the bbox, if it doesn't exist, deserializes the geometry and calculates it.
- gserialized_fast_gbox_p: Tries to access the bbox, if it doesn't exist, returns NULL.
In both cases if the datum was toasted, it has been detoasted (which is expensive).
I think (not tested performance impact yet) that there are multiple cases were we could benefit from following a different approach (similar to #4676):
- Starting with the Datum.
- Read the header only.
- Check the bbox from the header. If available, return it (huge win for toasted value).
- If we really want the bbox (first case), then if there is more data available in the datum* detoast it fully, deserialize it and calculate it.
The possible benefits from this are the callers to those 2 functions, including the ANALYZE calls (compute_gserialized_stats_mode
).
- I'd say most of the time there isn't any more data to read. The header considers a 4d bbox, that is 8 floats (8*32b = 256bits), and the extended flags (64b, which are seldom used). This is enough space for a single 4D point or 2 2D points. In any case, we could artificially increase the size request for the header if we wanted to.
Change History (4)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
I've been having a look and there are only a few places where I was able to use this:
- ANALYZE calls.
- _ST_PointOutside (I did this just to confirm it was useful more than anything).
- postgis_getbbox
- ST_CombineBbox(box2d, geometry)
- ST_ClipByBox2d when the geom is either fully outside or fully inside the bbox.
- ST_BoundingDiagonal
Obviously this only affects compressed geometries.
An example of the impact:
-- 13 big polygons (~260843 vertices on avg) ANALYZE benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020 (the_geom, the_geom_webmercator); -- Master: 67.960 -- Faster bbox: 2.315
It's currently living in https://github.com/Algunenano/postgis/tree/faster_box. I'll create a PR next week with some more numbers.
comment:3 by , 4 years ago
PR: https://github.com/postgis/postgis/pull/572
Numbers:
- ANALYZE:
------------------ POINTS ------------------- -- ANALYZE benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (the_geom, the_geom_webmercator); -- Master: 116.324 -- Faster bbox: 117.262 -- Perf: 0.99x ---------------- BIG POLYGONS ------------------- -- ANALYZE benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020 (the_geom, the_geom_webmercator); -- Master: 70.258 -- Faster bbox: 2.339 -- Perf: 30.03x
- ST_PointOutside:
--------------- BIG POLYGONS ---------------- --------------------------------------------- -- EXPLAIN ANALYZE Select _ST_PointOutside(the_geog) from test_geog; -- Master: 16.925 -- Faster bbox: 0.158 -- Perf: 107.12x
- ST_CombineBbox:
--------------------------------------------- -------------- SMALL POLYGONS --------------- EXPLAIN ANALYZE Select ST_CombineBbox(NULL::box2d, the_geom) from benchmark_af85af967525e69373b7b8678f3bd409a8e78f447671344b7535d; -- Master: 290.640 -- Faster bbox: 311.152 -- Perf: 0.93x -------------- BIG POLYGONS ---------------- -- EXPLAIN ANALYZE Select ST_CombineBbox(NULL::box2d, the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020; -- Master: 17.288 -- Faster bbox: 0.165 -- Perf: 104.77x
- ST_ClipByBox2D (fully inside):
-------------- SMALL POLYGONS --------------- -- EXPLAIN ANALYZE Select count(*) FROM benchmark_af85af967525e69373b7b8678f3bd409a8e78f447671344b7535d WHERE ST_ClipByBox2D(the_geom_webmercator, ST_TileEnvelope(0,0,0)) IS NOT NULL; -- Master: 155.528 -- Faster bbox: 129.040 -- Perf: 1.21x -------------- BIG POLYGONS ---------------- -- EXPLAIN ANALYZE Select count(*) FROM benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020 WHERE ST_ClipByBox2D(the_geom_webmercator, ST_TileEnvelope(0,0,0)) IS NOT NULL; -- Master: 19.266 -- Faster bbox: 0.220 -- Perf: 87.39x
- ST_ClipByBox2D (fully outside):
-------------- SMALL POLYGONS --------------- -- EXPLAIN ANALYZE Select count(*) FROM benchmark_af85af967525e69373b7b8678f3bd409a8e78f447671344b7535d WHERE ST_ClipByBox2D(the_geom_webmercator, ST_TileEnvelope(20,0,0)) IS NOT NULL; -- Master: 171.451 -- Faster bbox: 147.365 -- Perf: 1.16x -------------- BIG POLYGONS ---------------- -- EXPLAIN ANALYZE Select count(*) FROM benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020 WHERE ST_ClipByBox2D(the_geom_webmercator, ST_TileEnvelope(20,0,0)) IS NOT NULL; -- Master: 19.265 -- Faster bbox: 0.226 -- Perf: 85.24x
- ST_BoundingDiagonal (fully outside):
-------------- SMALL POLYGONS --------------- -- EXPLAIN ANALYZE Select ST_BoundingDiagonal(the_geom) from benchmark_af85af967525e69373b7b8678f3bd409a8e78f447671344b7535d; -- Master: 566.389 -- Faster bbox: 446.435 -- Perf: 1.26x -------------- BIG POLYGONS ---------------- -- EXPLAIN ANALYZE Select ST_BoundingDiagonal(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020; -- Master: 19.279 -- Faster bbox: 0.153 -- Perf: 126x
A function that could benefit from something like this would be ST_ClipByBox2D which has 2 different shortcuts (geom bbox inside box, geom bbox fully outsize the bbox).
One of the common use cases for ST_ClipByBox2D is to clip the webmercator world before transforming geometries into 3857, and you expect most of the source geoemtries to fall inside the box.