Opened 15 months ago
Closed 15 months ago
#5467 closed defect (fixed)
woodie fails on raster compile for PG 14, 15
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | critical | Milestone: | PostGIS 3.4.0 |
Component: | QA/buildbots | Version: | master |
Keywords: | Cc: |
Description
Woodie seems to fail often on PG 14, 15 though PG 12 seems always fine.
Not sure when this started.
I think it was going on before I upgraded woodie.
Error seems to be some parallel race condition:
make[3]: Leaving directory '/tmp/postgis-build/15/libpgcommon'
make[2]: Leaving directory '/tmp/postgis-build/15/raster/rt_pg'
make[1]: *** [Makefile:34: pglib] Error 2
make[1]: *** Waiting for unfinished jobs....
/bin/bash ../../libtool --mode=link gcc -std=gnu99 -O2 -Wall -fno-omit-frame-pointer -Werror -fno-math-errno -fno-signed-zeros -Wall -flto -fPIC -DPIC -I/woodpecker/src/git.osgeo.org/gitea/postgis/postgis/raster/loader/../rt_core -I./.. -I. -I../.. -I../../liblwgeom -I/woodpecker/src/git.osgeo.org/gitea/postgis/postgis/liblwgeom -I/usr/local/include -I/usr/local/include ../rt_core/librtcore.a raster2pgsql.o -lm -flto ../../liblwgeom/liblwgeom.la -L/usr/local/lib -lgdal -L/usr/local/lib -lgeos_c -lc -o raster2pgsql
rm -f rtpostgis.sql.tmp
make[2]: Leaving directory '/tmp/postgis-build/15/raster/rt_pg'
libtool: link: gcc -std=gnu99 -O2 -Wall -fno-omit-frame-pointer -Werror -fno-math-errno -fno-signed-zeros -Wall -flto -fPIC -DPIC -I/woodpecker/src/git.osgeo.org/gitea/postgis/postgis/raster/loader/../rt_core -I./.. -I. -I../.. -I../../liblwgeom -I/woodpecker/src/git.osgeo.org/gitea/postgis/postgis/liblwgeom -I/usr/local/include -I/usr/local/include raster2pgsql.o -flto -o raster2pgsql ../rt_core/librtcore.a ../../liblwgeom/.libs/liblwgeom.a -lm -L/usr/local/lib /usr/local/lib/libgeos.so /usr/lib/x86_64-linux-gnu/libproj.so -ljson-c -L/usr/lib/x86_64-linux-gnu -lSFCGAL -lgmpxx /usr/local/lib/libgdal.so /usr/local/lib/libgeos_c.so -lc
make[2]: Leaving directory '/tmp/postgis-build/15/raster/loader'
make[1]: Leaving directory '/tmp/postgis-build/15/raster'
What unfinished jobs?
The other confusing thing is that dronie, which is testing PG 13, never fails either, even though she's using the same shell scripts and docker images as woodie.
So maybe whatever this is only affects PG 14 and PG 15 (and possibly PG 16, which we aren't testing yet), and not PG 12 or PG 13.
Change History (6)
comment:1 by , 15 months ago
comment:2 by , 15 months ago
Example failure is here: https://woodie.osgeo.org/repos/30/pipeline/769/14
The fatal error is on this line 673: https://woodie.osgeo.org/repos/30/pipeline/769/14#L673
Error message is:
make[2]: *** [Makefile:152: rtpostgis_upgrade.sql.in] Error 255
The line which is creating rtpostgis_upgrade.sql.in appears before, on line 657: https://woodie.osgeo.org/repos/30/pipeline/769/14#L657
An error message that looks like the likely cause of the failure is on line 670:
https://woodie.osgeo.org/repos/30/pipeline/769/14#L670
Unable to locate target new version number in rtpostgis.sql
The rtpostgis.sql target is a dependency of the rtpostgis_upgrade.sql.in target, so I guess the make error report is just a bit confusing.
Now, as for why rtpostgis.sql is found broken: there must be multiple processes creating it at the same time. Looking at raster/Makefile.in, I see the all rule has 4 dependencies, which means we don't control the order in which they are built. In turn, the RT_LOADER target has a dependency on librtcore, which builds it again, and that conflicts with any parallel job that is also building it. I'll try cleaning that up and see what happens.
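As an illustration of the kind of cleanup described above, here is a minimal sketch of a top-level makefile that builds the shared core library once and lets the other targets declare it as a prerequisite. This is not the actual raster/Makefile.in: the target names corelib and rtloader are hypothetical, and only pglib and the rt_core, rt_pg and loader directories are taken from the build log.

```makefile
# Hypothetical sketch, NOT the real raster/Makefile.in.
# "corelib" and "rtloader" are made-up target names for illustration;
# "pglib" and the directory names come from the build log above.
.PHONY: all corelib pglib rtloader

all: pglib rtloader

# The shared core library is built by exactly one rule.
corelib:
	$(MAKE) -C rt_core

# Consumers declare the prerequisite instead of re-running the
# rt_core build from inside their own recipes, so under "make -j"
# corelib finishes before either of them starts.
pglib: corelib
	$(MAKE) -C rt_pg

rtloader: corelib
	$(MAKE) -C loader
```

With this shape, pglib and rtloader may still run in parallel with each other, but neither can start (or restart) the rt_core build while the other is using its output.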
comment:4 by , 15 months ago
Woodie is troubled: it keeps getting stuck, so it's hard to tell whether the bug was fixed. See https://trac.osgeo.org/osgeo/ticket/2961
BTW, the bug preventing all steps from being visible was fixed upstream, so it would be good to upgrade: https://github.com/woodpecker-ci/woodpecker/issues/2178
comment:5 by , 15 months ago
Oh, I did upgrade the woodie clients and the woodie server to the latest version, but sadly things are still broken. Not sure why things were even working before and now they are not.
comment:6 by , 15 months ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I think this went away with strk's control raster compile order.
Hmm, but it doesn't always fail; pull request one passed:
https://woodie.osgeo.org/repos/30/pipeline/761/8
It seems to mostly fail when the doc job is triggered. Maybe it's a race condition between containers running at the same time, but that doesn't make sense.