Opened 3 years ago
Closed 3 years ago
#5121 closed defect (fixed)
LTO enabled causes windows, freebsd and some github actions to fail
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | blocker | Milestone: | PostGIS 3.3.0 |
Component: | QA/buildbots | Version: | master |
Keywords: | Cc: |
Description
export-all-symbols -Wl,--out-implib=libpostgis-3.3.a
lto1.exe: internal compiler error: in gen_subprogram_die, at dwarf2out.c:22668
libbacktrace could not find executable to open
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://sourceforge.net/projects/mingw-w64> for instructions.
lto-wrapper.exe: fatal error: C:\ming64gcc81\mingw64\bin\gcc.exe returned 1 exit status
compilation terminated.
C:/ming64gcc81/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
make[1]: *** [E:/jenkins/postgresql/rel/pg14w64gcc81/lib/pgxs/src/makefiles/../../src/Makefile.shlib:374: postgis-3.3.dll] Error 1
make[1]: Leaving directory '/projects/postgis/branches/3.3/postgis'
make: *** [GNUmakefile:24: all] Error 1
Searching for this on the internet led me back to my old ticket from 2 years ago, #4583, which Raul kindly pointed out was caused by #4754.
@komzpa I recall an LTO commit of yours recently. I admittedly have not been paying too much attention; I have been ignoring the problem, hoping it would go away.
bessie32 is also failing, but could be a different issue
15:52:22 libtool: link: gcc8 -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-trunc -fno-math-errno -fno-signed-zeros -Wall -flto -fPIC -DPIC -I./../rt_core -I./.. -I. -I../.. -I../../liblwgeom -I../../liblwgeom -I/usr/local/include -I/usr/local/include -I/usr/local/include -I/usr/local/include raster2pgsql.o -flto -o raster2pgsql ../rt_core/librtcore.a ../../liblwgeom/.libs/liblwgeom.a -L/usr/local/lib -lm -lproj -ljson-c -lSFCGAL -lgdal -lgeos_c -lintl -liconv
15:52:23 /usr/local/bin/ld: /tmp//cczKohy3.ltrans0.ltrans.o: undefined reference to symbol 'rtrealloc'
15:52:23 /usr/local/bin/ld: /usr/local/lib/librttopo.so.1: error adding symbols: DSO missing from command line
15:52:23 collect2: error: ld returned 1 exit status
15:52:23 gmake[3]: *** [Makefile:86: raster2pgsql] Error 1
15:52:23 gmake[3]: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485/raster/loader'
15:52:23 gmake[2]: *** [Makefile:35: rtloader] Error 2
15:52:23 gmake[2]: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485/raster'
15:52:23 gmake[1]: *** [GNUmakefile:24: all] Error 1
15:52:23 gmake[1]: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485'
15:52:23 *** Error code 2
15:52:23
But bessie (64-bit FreeBSD) seems fine.
I still need to confirm I have the same issue on my dev.
Attachments (3)
Change History (20)
by , 3 years ago
Attachment: | debbie-bessie32-consoleText.log added |
---|
comment:1 by , 3 years ago
comment:2 by , 3 years ago
@sergeish,
Thanks for the quick response. I'll test it out on my mingw setup and commit if it works.
comment:3 by , 3 years ago
Okay tested on my mingw setup (my setup is old BTW gcc 8.1 but that is another story)
At any rate, the patch seems to break the ability to find CC:
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 -I../liblwgeom -I../liblwgeom -std=gnu99 -g -O2 -fno-math-errno -fno-signed-zeros -Wall -flto -I../libpgcommon -I../deps/flatgeobuf -I../deps/wagyu -I../deps/uthash/include -I/projects/geos/rel-3.11w64gcc81/include -IC:/ming64gcc81/projects/proj/rel-7.2.1w64gcc81/include -IC:/ming64gcc81/projects/protobuf/rel-3.2.0w64gcc81/include -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include/libxml2 -I/projects/CGAL/rel-sfcgal-1.4.0w64gcc81/include -IC:/ming64gcc81/projects/json-c/rel-0.12w64gcc81/include/json-c -IC:/ming64gcc81/projects/pcre/rel-8.33w64gcc81/include -DNDEBUG -I/projects/postgresql/rel/pg15w64gcc81/include -I/projects/rel-libiconv-1.16w64gcc81/include -DDLL_EXPORT -DPIC -I. -I./ -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/server -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/internal -I/projects/zlib/rel-zlib-1.2.11w64gcc81/include -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include -I./src/include/port/win32 -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include/libxml2 -IC:/ming64gcc81/projects/lz4/rel-lz4-1.9.3w64gcc81/include -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/server/port/win32 -DWIN32_STACK_RLIMIT=4194304 -c -o postgis_module.o postgis_module.c
/bin/sh: line 1: cc: command not found
Looking at the postgis/Makefile generated, it seems to have
CUSTOM_CC := $(CC)
Which I am assuming is the culprit. By comparison, the generated liblwgeom/Makefile has
CC = x86_64-w64-mingw32-gcc
Changing Makefile.in to the line below gets me back to the original error:
CUSTOM_CC := @CC@
FWIW I think @strk was saying we should get rid of PGXS as it's causing more issues than helping.
comment:4 by , 3 years ago
Thanks @robe.
My conclusion was that PGXS has to be replaced, since it introduces another set of compilation options that is not fully controlled by the user. It seems that LTO cannot be reliably enabled by default before doing that, especially if we need to allow selecting CC in ./configure, as in these failing tests.
Thank you for mentioning @strk's opinion. Would be great if he could comment on this issue.
CUSTOM_CC := $(CC)
was my attempt to make PGXS use the same compiler. It seemed to work in my case, but later I noticed cc being used as the compiler in the log.
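One possible explanation for the stray cc, offered as a sketch rather than a confirmed diagnosis: `:=` expands `$(CC)` at the moment the line is read, so if nothing has set CC by then, make substitutes its built-in default. The throwaway Makefile and the `write_demo_makefile` helper below are hypothetical and exist only for this demonstration; they are not part of the PostGIS build.

```shell
#!/bin/sh
# Hypothetical demo: a three-line Makefile that mirrors the generated
# "CUSTOM_CC := $(CC)" line and prints what CUSTOM_CC ended up as.
write_demo_makefile() {
    # $1: path of the Makefile to create
    printf 'CUSTOM_CC := $(CC)\nshow:\n\t@echo $(CUSTOM_CC)\n' > "$1"
}

demo_mk=$(mktemp)
write_demo_makefile "$demo_mk"

if command -v make >/dev/null 2>&1; then
    # CC not set anywhere: make falls back to its built-in default,
    # which is typically "cc".
    (unset CC; make -f "$demo_mk" show)
    # CC given on the command line: CUSTOM_CC picks it up.
    make -f "$demo_mk" show CC=x86_64-w64-mingw32-gcc
fi

rm -f "$demo_mk"
```

If this is what happens in postgis/Makefile, the compiler chosen at ./configure time never reaches CUSTOM_CC unless CC is assigned before that line.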
comment:5 by , 3 years ago
I never liked delegating control to PGXS. The first victim of this was --prefix support, which is still an issue after over 11 years: #635
comment:7 by , 3 years ago
The commit above just disables LTO everywhere:
if test "MINGWBUILD" = "0"; then
should be
if test "$MINGWBUILD" = "0"; then
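The difference between the two test lines can be checked in a plain shell. In this minimal sketch the function names are hypothetical, and the 0/1 values for MINGWBUILD are an assumption about how configure sets it. Without the `$`, the literal string "MINGWBUILD" is compared with "0", which can never match, so the LTO branch is skipped on every platform.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the configure check discussed above.
lto_enabled_buggy() {
    # Compares the literal word MINGWBUILD with 0: never true.
    if test "MINGWBUILD" = "0"; then echo yes; else echo no; fi
}

lto_enabled_fixed() {
    # $1: value of MINGWBUILD (assumed to be 1 on MinGW, 0 elsewhere)
    MINGWBUILD=$1
    if test "$MINGWBUILD" = "0"; then echo yes; else echo no; fi
}

lto_enabled_buggy      # prints "no" regardless of platform
lto_enabled_fixed 0    # prints "yes"
lto_enabled_fixed 1    # prints "no"
```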
comment:9 by , 3 years ago
@sergeish
I just noticed that with the commit where I accidentally disabled LTO everywhere, we got all green lights on github actions:
https://github.com/postgis/postgis/actions/runs/2093940177
So I guess LTO is causing the errors on github too. Does your pull request solve the github issues, do you know?
comment:10 by , 3 years ago
Summary: | winnie is broken with this strange error lto1.exe: internal compiler error: in gen_subprogram_die, at dwarf2out.c:22668 → LTO enabled causes windows, freebsd and some github actions to fail |
---|
Changing the title of this since it seems more involved than just mingw.
comment:11 by , 3 years ago
@robe, sorry, I don't understand your question about github issues. Do you mean whether there is a ticket requesting LTO?
by , 3 years ago
Attachment: | disable_lto_just_for_mingw.png added |
---|
re-enabled LTO for all except mingw
comment:12 by , 3 years ago
@sergeish,
About github actions (not issues). I've added the screen shots to show what I mean. The ticket thing there is a GH pull request which is fine.
When I accidentally disabled LTO for all systems, all GH actions became green. A couple have been red for a while.
# Regina accidentally disabling LTO entirely
When I changed to just disable for mingw, then those went red again though winnie was still happy :)
# Regina changing to just disable for mingw windows
I was baffled by the errors on the GH actions because they are each different, so I thought they were caused by bad docker builds or a change in GDAL.
1) CI (pg14-clang-geosmain-gdal34-proj71, usan_clang) and (pg13-clang-geos39-gdal31-proj71, usan_clang) couldn't find GDALALL:
checking for library containing GDALAllRegister... no
Error: Process completed with exit code 1.
2) CI (pg13-geos39-gdal31-proj71, usan_gcc):
psql:/src/postgis/regress/00-regress-install/share/contrib/postgis/sfcgal.sql:52: ERROR: could not load library "/src/postgis/regress/00-regress-install/lib/postgis_sfcgal-3.so": /src/postgis/regress/00-regress-install/lib/postgis_sfcgal-3.so: undefined symbol: ubsan_handle_mul_overflow
comment:13 by , 3 years ago
Yes, github action errors seem not related at first glance. I decided to switch the breaking PR from draft because of that.
Unfortunately I still don't have a fix. I'm going to proceed as if the plan is to get rid of PGXS and hopefully find some kind of solution in the process.
Adding LTO flags automatically should probably be disabled for now.
comment:15 by , 3 years ago
PR replacing MINGWBUILD check with --enable-lto option: https://github.com/postgis/postgis/pull/681
comment:16 by , 3 years ago
I come from the opposite side of the pgxs, I feel like it cleared up a lot of alternate problems by anal retentively enforcing a "build your extension just like your server" rule which probably saved us from a lot of really obscure mixed-compiler, fun-platform bugs which we are not taking into our calculations of the "cost of pgxs" because we never ever saw them, because they didn't exist.
Hi,
I made that breaking commit, link to Github PR: https://github.com/postgis/postgis/pull/678
I managed to replicate the issue on x86 FreeBSD 12 by installing gcc8 and gcc10, postgresql13-client (and required libraries), configuring with
../configure CC=gcc8 CXX=g++8 AR=gcc-ar8 RANLIB=gcc-ranlib8 CXXFLAGS='-O2 -pipe -fstack-protector-strong -Wl,-rpath=/usr/local/lib/gcc8 -nostdinc++ -isystem /usr/include/c++/v1 -Wl,-rpath=/usr/local/lib/gcc8' CFLAGS='-Wall -Wmissing-prototypes -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-trunc' '--with-libiconv=/usr/local' --without-interrupt-tests
(The FreeBSD test explicitly sets CC and CXX, but configure fails to select the correct ar/ranlib, since they are not called gcc-ar/gcc-ranlib but gcc-ar8/gcc-ranlib8 (please see attached log), so I set them explicitly.)
The problem is a compiler version (LTO version) mismatch between the selected gcc8 and the gcc10 in the PGXS Makefile.global (/usr/local/lib/postgresql/pgxs/src/Makefile.global). I tried to fix this by setting CUSTOM_CC before including pgxs.mk (/usr/local/lib/postgresql/pgxs/src/makefiles/pgxs.mk), and that allowed the extensions to build. PR draft: https://github.com/postgis/postgis/pull/679
This is not quite a solution, since CFLAGS in Makefile.global in this case still contains -Wl,-rpath=/usr/local/lib/gcc10 and cannot be overwritten (but flags can be appended by setting CUSTOM_COPTS or PG_CFLAGS), and postgis-3.so has an incorrect /usr/local/lib/gcc10 runpath.
I haven't tried building with MinGW yet.
That's as far as I could get for now, will appreciate any help.
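For anyone trying to reproduce this, a rough pre-flight check along the following lines might flag the mismatch before the link step fails. It is only a sketch: `compilers_match` is a hypothetical helper, it compares only the first word of each compiler string (so /usr/local/bin/gcc8 and gcc8 would be reported as different), and it assumes `pg_config --cc` reports the compiler that PGXS will use.

```shell
#!/bin/sh
# Hypothetical helper: succeed when both compiler strings start with the
# same command name, ignoring any trailing flags.
compilers_match() {
    # $1: compiler passed to ./configure, $2: compiler recorded by PostgreSQL
    [ "${1%% *}" = "${2%% *}" ]
}

if command -v pg_config >/dev/null 2>&1; then
    pg_cc=$(pg_config --cc)   # CC value used when PostgreSQL was built
    my_cc=${CC:-cc}           # compiler we intend to pass to ./configure
    if compilers_match "$my_cc" "$pg_cc"; then
        echo "compiler matches PostgreSQL build: $pg_cc"
    else
        echo "warning: building with '$my_cc' but PostgreSQL was built with '$pg_cc'"
    fi
fi
```

After a build, `readelf -d postgis-3.so` can be used to inspect which runpath actually ended up in the library.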