Opened 5 years ago
Closed 5 years ago
#4674 closed defect (fixed)
lwgeom_cache doesn't live through not inline functions
Reported by: | Algunenano | Owned by: | pramsey |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.1.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
While I was working on #4672, I noticed that the ST_AsGML cache wasn't working all the time (in fact it was working with geographies but not geometries).
After adding some logs, it seems that if the function is inlined, the cache (installed in the parent memory context) lives for the next row, but if it isn't then you loose the cache and everything needs to be recreated again.
Example that isn't inlined:
# explain (analyze, verbose) Select ST_AsGML(the_geom::geometry) from benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 limit 4;NOTICE: GetGenericCacheCollection not found NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null NOTICE: GetGenericCacheCollection not found NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null NOTICE: GetGenericCacheCollection not found NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null NOTICE: GetGenericCacheCollection not found NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..1.25 rows=4 width=32) (actual time=0.780..2.497 rows=4 loops=1) Output: (st_asgml(the_geom, 15, 0)) -> Seq Scan on public.benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (cost=0.00..213260.58 rows=683833 width=32) (actual time=0.778..2.492 rows=4 loops=1) Output: st_asgml(the_geom, 15, 0) Planning Time: 0.213 ms Execution Time: 2.534 ms (6 rows)
Example that is inlined:
# explain (analyze, verbose) Select ST_AsGML(3, the_geom::geometry) from benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 limit 4; NOTICE: GetGenericCacheCollection not found NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: Cache FOUND. Expected: 4326 1 QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..10.25 rows=4 width=32) (actual time=0.418..0.535 rows=4 loops=1) Output: (_st_asgml(3, the_geom, 15, 0, NULL::text, NULL::text)) -> Seq Scan on public.benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (cost=0.00..1751884.83 rows=683833 width=32) (actual time=0.417..0.533 rows=4 loops=1) Output: _st_asgml(3, the_geom, 15, 0, NULL::text, NULL::text) Planning Time: 0.152 ms Execution Time: 0.562 ms (6 rows)
I'm not sure if this affects more functions (probably anything doing indirect calls within SQL is susceptible to this behaviour) but it's indeed a PITA.
Is there a query context that we can hook us to that lives for the whole duration of the query (instead of a transaction / function)?
Change History (10)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
What if we don't store the pointer to GenericCacheCollection
it in fn_extra? We could have a static pointer to the cache and register a callback to nullify the pointer when the context is deleted, right?
We do something similar with the projection cache (to call the PROJ destructor when the context is being removed).
comment:3 by , 5 years ago
I've made it work using CurTransactionContext
but that only free's the cache at the end of the full transaction.
For example, using a BEGIN-END group:
cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# BEGIN; BEGIN cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# explain (analyze, verbose) Select ST_AsGML(the_geom) from benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 limit 4; NOTICE: GetGenericCacheCollection NOT FOUND NOTICE: generic_cache not found NOTICE: Cache not found. Expected: 4326 1. Found: 0 0 null NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------- Limit (cost=0.00..1.25 rows=4 width=32) (actual time=1.112..1.417 rows=4 loops=1) Output: (st_asgml(the_geom, 15, 0)) -> Seq Scan on public.benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (cost=0.00..213260.58 rows=683833 width=32) (actual time=1.110..1.414 rows=4 loops=1) Output: st_asgml(the_geom, 15, 0) Planning Time: 0.459 ms Execution Time: 1.443 ms (6 rows) cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# explain (analyze, verbose) Select ST_AsGML(3, the_geom::geometry) from benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 limit 4; NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: Cache FOUND. Expected: 4326 1 QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------- Limit (cost=0.00..10.25 rows=4 width=32) (actual time=0.096..0.273 rows=4 loops=1) Output: (_st_asgml(3, the_geom, 15, 0, NULL::text, NULL::text)) -> Seq Scan on public.benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (cost=0.00..1751884.83 rows=683833 width=32) (actual time=0.095..0.270 rows=4 loops=1) Output: _st_asgml(3, the_geom, 15, 0, NULL::text, NULL::text) Planning Time: 0.110 ms Execution Time: 0.293 ms (6 rows) cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# explain (analyze, verbose) Select ST_Transform(the_geom, 3857), ST_Transform(the_geom, 3857) from benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 limit 4; NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! NOTICE: GetGenericCacheCollection FOUND!!!!! QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------- Limit (cost=0.00..20.25 rows=4 width=64) (actual time=7.136..7.202 rows=4 loops=1) Output: (st_transform(the_geom, 3857)), (st_transform(the_geom, 3857)) -> Seq Scan on public.benchmark_7773a711c8441d4b494a51fd9feebeac7a9b9c734619398620293 (cost=0.00..3461467.33 rows=683833 width=64) (actual time=7.135..7.201 rows=4 loops=1) Output: st_transform(the_geom, 3857), st_transform(the_geom, 3857) Planning Time: 0.040 ms Execution Time: 7.233 ms (6 rows) cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=# END; NOTICE: PostgisResetInternalCache COMMIT cartodb_dev_user_f80dfdef-ea5c-498d-b98a-82362650d944_db=#
Notice how END
resets the cache.
I think it would be best if the cache only lives until the end of a single query (and not the full transaction) but I haven't found out how to do it yet.
comment:4 by , 5 years ago
I've been looking for it but I don't see any simple way to get the memory context of a single query.
So let's describe the 2 options:
- Use
flinfo->fn_mcxt
- Cache is done per function call.
- Calling
Select ST_Transform(g, 3857), ST_Transform(g2, 3857) from table
will do 2 cache instantiations. - Calling the previous query 2 times inside a transaction (
BEGIN; query; query; END;
) will do 2 x 2 cache instantiations. - Doesn't work across pure SQL functions (unless inlined).
- Use
CurTransactionContext
.- Cache is done per transaction.
- Calling
Select ST_Transform(g, 3857), ST_Transform(g2, 3857) from table
will do only 1 instantiation. - Calling the previous query 2 times inside a transaction (
BEGIN; query; query; END;
) will create the cache only once (no matter the amount of times you call the query). - Works across "everything", as is kept alive until the end of the transaction.
This second approach can be considered more dangerous in case of a leak in the cached objects (as things won't necessarily be deleted immediately) but considering that the boundaries of all the different caches are well established I don't expect it to be an issue.
There is a possible "issue" related to doing a transaction with a ST_Transform, then a modification to spatial_ref_sys, and then another ST_Transform and expect it to be different (but it isn't because it is cached). After giving it some thought I don't consider this a problem since the functions are all labelled as IMMUTABLE, so I don't see a problem with requesting different transactions if you really want this kind of behaviour.
What do you all think? If I don't hear big cries I'll clean up the patch (and break it apart from the changes for #4672) and push it. In any case I'll leave things to be easily reverted (simply change CurTransactionContext
to flinfo->fn_mcxt
).
comment:6 by , 5 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Reopening after a discussion in Github. I intend to revert this change and find a better way to keep the ST_AsGML optimization (by removing the inline SQL).
comment:7 by , 5 years ago
I tried reverting this change but it seems that fcinfo->flinfo
might be NULL in some cases and it's crashing with the new geojson cache:
(gdb) bt #0 0x00007f24776bda8a in GetGenericCacheCollection (fcinfo=0x7ffe6199df80) at lwgeom_cache.c:79 #1 SRIDCacheGet (fcinfo=0x7ffe6199df80) at lwgeom_cache.c:506 #2 GetSRIDCacheBySRS (fcinfo=0x7ffe6199df80, srs=0x558057f24260 "EPSG:3005") at lwgeom_cache.c:520 #3 0x00007f24776517fd in LWGEOM_in (fcinfo=0x7ffe6199df80) at lwgeom_inout.c:150 #4 0x0000558056faab75 in DirectFunctionCall1Coll (func=0x7f2477651670 <LWGEOM_in>, collation=1475494496, arg1=0) at fmgr.c:803 #5 0x00007f247765257e in parse_WKT_lwgeom (fcinfo=0x558057eef4f0) at lwgeom_inout.c:662 #6 0x0000558056cc350d in ExecInterpExpr (state=0x558057eef410, econtext=0x558057eef570, isnull=<optimized out>) at execExprInterp.c:649 #7 0x0000558056d92a8e in ExecEvalExprSwitchContext (state=<optimized out>, econtext=0x558057f24260, isNull=0x7ffe6199e11f) at ../../../../src/include/executor/executor.h:307 #8 evaluate_expr (expr=<optimized out>, result_type=10775331, result_typmod=-1, result_collation=0) at clauses.c:4812 #9 0x0000558056d93664 in evaluate_function (funcid=10775691, result_type=10775331, result_typmod=-1, result_collid=0, input_collid=100, args=0x55805832d138, funcvariadic=<optimized out>, context=0x7ffe6199e660, func_tuple=<optimized out>) at clauses.c:4354 #10 simplify_function (funcid=10775691, result_type=10775331, result_typmod=-1, result_collid=0, input_collid=100, args_p=<optimized out>, funcvariadic=<optimized out>, process_args=<optimized out>, allow_non_const=true, context=0x7ffe6199e660) at clauses.c:3984 #11 0x0000558056d91864 in eval_const_expressions_mutator (node=0x558057e0a1b0, context=0x7ffe6199e660) at clauses.c:2477 #12 0x0000558056d26b06 in expression_tree_mutator (node=<optimized out>, mutator=0x558056d90c30 <eval_const_expressions_mutator>, context=0x7ffe6199e660) at nodeFuncs.c:3012 #13 0x0000558056d933e0 in simplify_function (funcid=10775573, result_type=25, result_typmod=-1, result_collid=100, input_collid=0, args_p=0x7ffe6199e470, funcvariadic=<optimized out>, process_args=<optimized out>, allow_non_const=<optimized out>, context=0x7ffe6199e660) at clauses.c:3975 #14 0x0000558056d91864 in eval_const_expressions_mutator (node=0x55805832c5b8, context=0x7ffe6199e660) at clauses.c:2477 #15 0x0000558056d26829 in expression_tree_mutator (node=0x55805832c608, mutator=0x558056d90c30 <eval_const_expressions_mutator>, context=0x7ffe6199e660) at nodeFuncs.c:2762 #16 0x0000558056d90d4e in eval_const_expressions_mutator (node=0x55805832c608, context=0x7ffe6199e660) at clauses.c:3539 #17 0x0000558056d26b06 in expression_tree_mutator (node=<optimized out>, mutator=0x558056d90c30 <eval_const_expressions_mutator>, context=0x7ffe6199e660) at nodeFuncs.c:3012 #18 0x0000558056d90d4e in eval_const_expressions_mutator (node=0x558057e099d0, context=0x7ffe6199e660) at clauses.c:3539 #19 0x0000558056d90c10 in eval_const_expressions (root=<optimized out>, node=0x0) at clauses.c:2269 #20 0x0000558056d794ab in preprocess_expression (root=<optimized out>, expr=0x558057f24260, kind=1) at planner.c:1087 #21 subquery_planner (glob=<optimized out>, parse=0x558057e09800, parent_root=<optimized out>, hasRecursion=<optimized out>, tuple_fraction=0) at planner.c:769 #22 0x0000558056d78aaf in standard_planner (parse=0x558057e09800, cursorOptions=<optimized out>, boundParams=0x0) at planner.c:406 #23 0x0000558056e51e60 in pg_plan_query (querytree=0x558057e09800, cursorOptions=256, boundParams=0x0) at postgres.c:878 #24 pg_plan_queries (querytrees=<optimized out>, cursorOptions=256, boundParams=0x0) at postgres.c:968 #25 0x0000558056e562b4 in exec_simple_query (query_string=0x558057e08500 "SELECT 'cast2', ST_AsEWKT(st_asgeojson('SRID=3005;MULTIPOINT(1 1, 1 1)'::geometry)::geometry);") at postgres.c:1143 #26 0x0000558056e53d34 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4243 #27 0x0000558056dc2ef7 in BackendRun (port=0x558057e2f5b0) at postmaster.c:4437 #28 0x0000558056dc24df in BackendStartup (port=<optimized out>) at postmaster.c:4128 #29 ServerLoop () at postmaster.c:1704 #30 0x0000558056dbf036 in PostmasterMain (argc=3, argv=0x558057e02230) at postmaster.c:1377 #31 0x0000558056d247d5 in main (argc=3, argv=0x558057e02230) at main.c:228 (gdb) f 0 #0 0x00007f24776bda8a in GetGenericCacheCollection (fcinfo=0x7ffe6199df80) at lwgeom_cache.c:79 79 GenericCacheCollection* cache = fcinfo->flinfo->fn_extra; (gdb) p *fcinfo $6 = {flinfo = 0x0, context = 0x0, resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 1, args = 0x7ffe6199dfa0}
It seems that DirectFunctionCall1 needs to be replaced by CallerFInfoFunctionCall1 to ensure flinfo
is initialized.
comment:9 by , 5 years ago
Pushed the revert.
The change to ST_AsGML to avoid the issue with the inline blocking the cache usage is still pending, so I'm keeping the issue open.
I'm thinking about changing the signature to also use LWGEOM_asGML
and in there detect if the first parameter is an int or a geometry and act accordingly. I remember seeing something similar on some other function (iterating over the parameters) but I have to find where.
Even if we put the cache into TopTransactionContext or something like that, there's no way we could necessarily reach it again if
fn_extra
is getting wiped out. I don't think there's anything we can do.