Opened 10 years ago
Closed 10 years ago
#3080 closed defect (fixed)
PostgreSQL 9.5 regress failure on st_reclass: possible performance regression for all of PostGIS base
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | low | Milestone: | PostGIS 2.2.0 |
Component: | raster | Version: | master |
Keywords: | Cc: |
Description
rt_reclass .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/regress_pgdev/tmp/2_2_pg9.5w64/test_49_diff) ----------------------------------------------------------------------------- --- rt_reclass_expected 2014-04-26 22:51:48.000000000 -0700 +++ /var/lib/jenkins/workspace/postgis/regress_pgdev/tmp/2_2_pg9.5w64/test_49_out 2015-03-16 19:39:25.000000000 -0700 @@ -5,7 +5,11 @@ 59|59|0 141|141|0 NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster +NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster +NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster 3.1415901184082|2.71828007698059|0 NOTICE: Invalid argument for reclassargset. Invalid band index (must use 1-based) for reclassarg of index 0 . Returning original raster +NOTICE: Invalid argument for reclassargset. Invalid band index (must use 1-based) for reclassarg of index 0 . Returning original raster +NOTICE: Invalid argument for reclassargset. Invalid band index (must use 1-based) for reclassarg of index 0 . Returning original raster 3.1415901184082|2.71828007698059|0 900|-900|900
This is against:
-------------- Dependencies -------------- GEOS config: /var/lib/jenkins/workspace/geos/rel-3.4.3devw64/bin/geos-config GEOS version: 3.4.3dev GDAL config: /var/lib/jenkins/workspace/gdal/rel-2.0w64/bin/gdal-config GDAL version: 2.0.0 PostgreSQL config: /var/lib/jenkins/workspace/pg/rel/pg9.5w64/bin/pg_config PostgreSQL version: PostgreSQL 9.5devel PROJ4 version: 47 Libxml2 config: /usr/bin/xml2-config Libxml2 version: 2.7.8 JSON-C support: yes PCRE support: yes PostGIS debug level: 0 Perl: /usr/bin/perl --------------- Extensions --------------- PostGIS Raster: enabled PostGIS Topology: enabled SFCGAL support: disabled Address Standardizer support: enabled -------- Documentation Generation -------- xsltproc: /usr/bin/xsltproc xsl style sheets: /usr/share/xml/docbook/stylesheet/nwalsh dblatex: /usr/bin/dblatex convert: /usr/bin/convert mathml2.dtd: /usr/share/xml/schema/w3c/mathml/dtd/mathml2.dtd
I'm guessing this is a result of recent changes to PostgreSQL 9.5 code base in past 7 or so days.
Change History (10)
comment:1 by , 10 years ago
Priority: | blocker → high |
---|
comment:2 by , 10 years ago
The offending query is this:
SELECT ST_Value(t.rast, 1, 1, 1), ST_Value(t.rast, 1, 10, 10), ST_BandNoDataValue(rast, 1) FROM ( SELECT ST_Reclass( ST_SetValue( ST_SetValue( ST_AddBand( ST_MakeEmptyRaster(100, 100, 10, 10, 2, 2, 0, 0, 0), 1, '32BF', 1, 0 ), 1, 1, 1, 3.14159 ), 1, 10, 10, 2.71828 ), ROW(1, 'a-100]:50-1,(-100-1000]:150-50,(1000-10000]:254-150', '8BUI', 0) ) AS rast ) AS t;
In older versions of PostgreSQL, that would output just one notice. In 9.5 it is outputting the some notice twice.
Full message is
NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster CONTEXT: PL/pgSQL function st_reclass(raster,reclassarg[]) line 14 at RETURN NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster CONTEXT: PL/pgSQL function st_reclass(raster,reclassarg[]) line 14 at RETURN NOTICE: Invalid argument for reclassargset. Invalid expression of reclassexpr for reclassarg of index 0 . Returning original raster CONTEXT: PL/pgSQL function st_reclass(raster,reclassarg[]) line 14 at RETURN
I was hoping I could reduce this down to a query or function that doesn't involve PostGIS, but have been unable to find one.
comment:3 by , 10 years ago
Okay figure out a simple 1 - will post to bug thread I started:
CREATE OR REPLACE FUNCTION dummy_notice(variadic integer[]) RETURNS integer[] AS $$ BEGIN RAISE NOTICE 'This is a dummy notice'; RETURN $1; END; $$ language plpgsql IMMUTABLE STRICT; SELECT v[1] As v1, v[2] As v2 FROM (SELECT dummy_notice(1,2) As v) As t;
In pre-9.5 this would emit one NOTICE message, in 9.5 it's doubled.
comment:4 by , 10 years ago
posted here -- http://www.postgresql.org/message-id/001c01d0647a$397bb260$ac731720$@pcorp.us
I think this one is serious as it seems the function ST_Reclass is being called for every ST_Value column output (e.g if I add another column to my dummy function, I get another NOTICE).
Hopefully it's something PostgreSQL can fix upstream, otherwise we'll be suffering some pretty serious performance problems particularly with raster, where this pattern of flow is fairly common.
comment:5 by , 10 years ago
Okay lots of discussion going on in this thread on pgsql bugs. It sounds like this was an intentional change and one they are hesitant to revert.
Change here: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f4abd0241de20d5d6a79b84992b9e88603d44134
That said we may need to change the definition of a lot of functions (that we marked as IMMUTABLE) to prevent serious degradation of performance.
Key comment from Tom:
"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes: > This doesn't seem like a solution...if the flattened version of an all > constants invocation cannot be run only once where it could if it was not > flattened I would say the we've introduced a likely (and actual) > performance regression that punishes current valid and idiomatic code. I > haven't gone back and devised the entire reasoning for, and potential > benefit of, the flattening but both this and likely functions returning > composites are being negatively affected by this change. Well, it improves some queries and perhaps punishes others, but I should think the overall effect would generally be a win. Even in the case given here, it's far from clear that allowing flattening is a performance loss; the end result would have been a query containing only constants at run time, which in most scenarios would be a Good Thing. As for claiming that this is broken and should be reverted: nyet. In the first place, there is certainly no PG documentation anywhere that promises single evaluation of a function written in the manner shown here. We do, on the other hand, say that OFFSET 0 creates an optimization fence; so I see nothing wrong with my recommendation. In the second place, this was a HEAD-only change, and we certainly do not promise than the planner never changes behavior in major version updates. regards, tom lane
Only fix thus far to prevent this from running for each value in select (or telling everyone to start using CTE for this kind of thing), is to set these functions as VOLATILE. Even making this STABLE or ramping up the cost for my silly example doesn't prevent the notice from repeating multiple times. Although this change was only trapped by raster, I think we really need to do performance metrics and really study our code base to determine the full impact of this change. I suspect it will make some people's production systems very slow (raster, geography, topology whatever)
comment:6 by , 10 years ago
Component: | raster → postgis |
---|---|
Owner: | changed from | to
Priority: | high → blocker |
Summary: | PostgreSQL 9.5 regress failure on st_reclass → PostgreSQL 9.5 regress failure on st_reclass: possible performance regression for all of PostGIS base |
Switched this back to postgis proper and blocker since this is an issue that requires revisiting of our whole code base if the PostgreSQL 9.5 change is not reverted.
comment:7 by , 10 years ago
Component: | postgis → raster |
---|---|
Owner: | changed from | to
Priority: | blocker → low |
Seems I over-reacted. It appears the change Tom made only changes behavior of inline tables and this redundant call stuff has always been the case for real tables.
For example doing this:
DROP TABLE IF EXISTS t2; CREATE TABLE t2(rast raster); INSERT INTO t2(rast) SELECT ST_SetValue( ST_SetValue( ST_AddBand( ST_MakeEmptyRaster(100, 100, 10, 10, 2, 2, 0, 0, 0), 1, '32BF', 1, 0 ), 1, 1, 1, 3.14159 ), 1, 10, 10, 2.71828 ) As rast; SELECT ST_Value(t.rast, 1, 1, 1), ST_Value(t.rast, 1, 10, 10), ST_BandNoDataValue(rast, 1) FROM ( SELECT ST_Reclass( rast, ROW(1, 'a-100]:50-1,(-100-1000]:150-50,(1000-10000]:254-150', '8BUI', 0) ) AS rast FROM t2 ) AS t;
Forces a call of of ST_Reclass twice even in PostgreSQL 9.4. So only performance degradation would be a constant wrapped in a subquery not using a CTE
So we just need to change this test to have an OFFSET 0 and call it a day.
comment:8 by , 10 years ago
Owner: | changed from | to
---|
comment:9 by , 10 years ago
Fixed at r13386 (2.2) (I am not going to bother backporting to 2.1). Let's just not support 9.5 on 2.1 :)
comment:10 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
On closer inspection this one is a bit weird. Only difference between expected output and 9.5 output is that 9.5 repeats the NOTICE 3 times instead of just once. Does the same on my windows 9.5 install. What on earth would cause that?