#1617 closed defect (fixed)
[raster] several regress failures on raster (old mingw)
Reported by: | robe | Owned by: | Bborie Park |
---|---|---|---|
Priority: | blocker | Milestone: | PostGIS 2.0.0 |
Component: | raster | Version: | master |
Keywords: | mingw | Cc: |
Description
It's been a while since I've tested raster, since before the core postgis tests were failing. Now that all of those pass after Paul's fixes -- thanks Paul :)
I'm seeing failures in raster. Also for some reason my temp folder is not being numbered anymore. It's putting the results in Appdata\Local\Temp\pgis_reg instead of Appdata\Local\Temp\pgis_reg_somerandomnumber
PostgreSQL 9.1.2, compiled by Visual C++ build 1500, 32-bit
Postgis 2.0.0alpha7SVN - r - 2012-02-26 02:29:02
GEOS: 3.3.3dev-CAPI-1.7.3
PROJ: Rel. 4.6.1, 21 August 2008
GDAL: GDAL 1.9.0, released 2011/12/29
Running tests
check_raster_columns .. ok
check_raster_overviews .. ok
rt_io .. ok
rt_bytea .. ok
box3d .. ok
rt_addband .. ok
rt_band .. ok
rt_asgdalraster .. failed (diff expected obtained: /tmp/pgis_reg/test_8_diff)
rt_astiff .. failed (diff expected obtained: /tmp/pgis_reg/test_9_diff)
rt_asjpeg .. failed (diff expected obtained: /tmp/pgis_reg/test_10_diff)
rt_aspng .. ok
rt_union .. ok
create_rt_properties_test .. ok
rt_dimensions .. ok
rt_scale .. ok
rt_pixelsize .. ok
rt_upperleft .. ok
rt_rotation .. ok
rt_georeference .. ok
rt_set_properties .. ok
drop_rt_properties_test .. ok
create_rt_empty_raster_test .. ok
rt_isempty .. ok
rt_hasnoband .. ok
drop_rt_empty_raster_test .. ok
rt_metadata .. ok
create_rt_band_properties_test .. ok
rt_band_properties .. ok
rt_set_band_properties .. ok
rt_summarystats .. ok
rt_count .. ok
rt_histogram .. ok
rt_quantile .. ok
rt_valuecount .. ok
rt_valuepercent .. ok
rt_bandmetadata .. ok
rt_pixelvalue .. ok
drop_rt_band_properties_test .. ok
rt_utility .. ok
create_rt_mapalgebra_test .. ok
rt_mapalgebraexpr .. ok
rt_mapalgebrafct .. ok
rt_mapalgebraexpr_2raster .. ok
rt_mapalgebrafct_2raster .. ok
drop_rt_mapalgebra_test .. ok
create_rt_mapalgebrafctngb_test .. ok
rt_mapalgebrafctngb .. ok
rt_mapalgebrafctngb_userfunc .. ok
drop_rt_mapalgebrafctngb_test .. ok
rt_reclass .. ok
rt_resample .. ok
rt_asraster .. ok
rt_intersection .. ok
rt_clip .. ok
create_rt_gist_test .. ok
rt_above .. ok
rt_below .. ok
rt_contained .. ok
rt_contain .. ok
rt_left .. ok
rt_overabove .. ok
rt_overbelow .. ok
rt_overlap .. ok
rt_overleft .. ok
rt_overright .. ok
rt_right .. ok
rt_same .. ok
drop_rt_gist_test .. ok
rt_spatial_relationship .. ok
rt_intersects .. ok
rt_samealignment .. ok
bug_test_car5 .. ok
tickets .. ok
loader/Basic .... failed ( test: actual SQL does not match expected.,: /tmp/pgis_reg/loader.out) .. ok
loader/BasicCopy .... failed ( test: actual SQL does not match expected.,: /tmp/pgis_reg/loader.out) .. ok
loader/Tiled10x10 ..... ok
loader/Tiled10x10Copy ..... ok
uninstall ... ok (3855)
Run tests: 78
Failed: 5
The rt_asgdalraster test is crashing my service. I suspect astiff and asjpeg are failing because they are being run while the pg service is starting back up.
I've attached the regress folder
Attachments (1)
Change History (24)
by , 13 years ago
Attachment: | pgis_reg.zip added |
---|
comment:1 by , 13 years ago
Keywords: | mingw added |
---|
comment:2 by , 13 years ago
comment:3 by , 13 years ago
I'm guessing it has something to do with the change I had to make for the 8BSI pixel type. 8BSI used to map to GDAL pixel type GDT_Byte but now maps to GDT_Int16 to preserve the sign.
What version of GDAL are you using?
comment:4 by , 13 years ago
Released 1.9.0, compiled under mingw, but Paul might be using trunk and I think his is compiled under VC 2008. Paul, which GDAL are you using?
comment:6 by , 13 years ago
Summary: | several regress failures on raster (old mingw) → [raster] several regress failures on raster (old mingw) |
---|
Can either of you test out raster/test/core/testapi.c from 9303? I've added another test of the GDAL PNG output. I'm expecting that to cause a crash in Windows. I hope that will make the crash easier to debug.
I'm not seeing this issue in Linux 32 and 64-bit (GDAL trunk r24025) and OSX (GDAL release 1.9.0), which makes things even stranger.
follow-up: 17 comment:7 by , 13 years ago
It does crash, but unfortunately the stack trace is just as useless (perhaps because I'm built with minimal debugging on my dependent libraries, or perhaps just because I'm not good looking enough)
Program received signal SIGSEGV, Segmentation fault.
0x7855ae7a in memcpy () from C:\WINDOWS\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.6161_x-ww_31a54e43\msvcr90.dll
(gdb) bt
#0 0x7855ae7a in memcpy () from C:\WINDOWS\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.6161_x-ww_31a54e43\msvcr90.dll
#1 0x006eb768 in gdal!?IReadBlock@MEMRasterBand@@UAE?AW4CPLErr@@HHPAX@Z () from c:\pgsql\bin\gdal.dll
Backtrace stopped: Not enough registers or memory available to unwind further
(gdb)
comment:8 by , 13 years ago
The key to most windows-specific problems seems to be dirty memory, in my experience thus far. For whatever reason, the odds that an address will be zero'ed when you first read from it seem much higher in Linux/OSX. So check that all your variables have initializers, and that you don't assume malloc'ed space will be zero'ed out.
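As a rough illustration of the advice above, here is a minimal C sketch of an allocation routine that gives every variable an initializer and fills malloc'ed space explicitly instead of assuming it starts zeroed. The `demo_band` struct and both functions are hypothetical names made up for this example; they are not PostGIS raster types or API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for illustration only; not a PostGIS type. */
typedef struct {
    int width;
    int height;
    uint8_t *data;
} demo_band;

/* Allocate a band with every field and every pixel explicitly set. */
static demo_band *demo_band_new(int width, int height, uint8_t fill) {
    demo_band *band = NULL;   /* initialized, never left indeterminate */
    size_t npixels = 0;

    if (width <= 0 || height <= 0) return NULL;
    npixels = (size_t)width * (size_t)height;

    band = malloc(sizeof(*band));
    if (band == NULL) return NULL;

    band->width = width;
    band->height = height;
    band->data = malloc(npixels);
    if (band->data == NULL) {
        free(band);
        return NULL;
    }

    /* malloc makes no promise about contents, so fill explicitly. */
    memset(band->data, fill, npixels);
    return band;
}

/* Release a band created by demo_band_new; NULL is accepted. */
static void demo_band_free(demo_band *band) {
    if (band == NULL) return;
    free(band->data);
    free(band);
}
```

Calling `demo_band_new(4, 3, 0)` would yield twelve pixels that are known to be zero on every platform, rather than zero only when the allocator happens to hand back clean pages.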
comment:9 by , 13 years ago
On mine it says it's successful
make -C test check
make[1]: Entering directory `/c/projects/PostGIS/trunk/raster/test'
make -C core check
make[2]: Entering directory `/c/projects/PostGIS/trunk/raster/test/core'
./testapi
Warning 6: PNG driver doesn't support data type Int16. Only eight bit (Byte) and sixteen bit (UInt16) bands supported. Defaulting to Byte
Checking empty and hasnoband functions...
Checking raster properties...
Raster starts with 0 bands
First point on convexhull ring is 0.5,0.5
Second point on convexhull ring is 256.5,1280.5
Third point on convexhull ring is 1280.5,1536.5
Fourth point on convexhull ring is 1024.5,256.5
Fifth point on convexhull ring is 0.5,0.5
Testing rt_raster_gdal_polygonize
Successfully tested rt_raster_gdal_polygonize
Testing 1BB band
Testing 2BB band
Testing 4BUI band
Testing 8BUI band
Testing 8BSI band
Testing 16BSI band
ERROR: rt_band_set_pixel: Coordinates out of range
Testing 16BUI band
ERROR: rt_band_set_pixel: Coordinates out of range
Testing 32BUI band
ERROR: rt_band_set_pixel: Coordinates out of range
Testing 32BSI band
ERROR: rt_band_set_pixel: Coordinates out of range
Testing 32BF band
Testing 64BF band
Testing band hasnodata flag
Testing rt_raster_from_band
Successfully tested rt_raster_from_band
Testing band stats
Successfully tested band stats
Testing rt_raster_replace_band
Successfully tested rt_raster_replace_band
Testing rt_band_reclass
Successfully tested rt_band_reclass
Testing rt_raster_to_gdal
Successfully tested rt_raster_to_gdal
Testing rt_raster_gdal_drivers
Successfully tested rt_raster_gdal_drivers
Testing rt_band_get_value_count
Successfully tested rt_band_get_value_count
Testing rt_raster_from_gdal_dataset
Successfully tested rt_raster_from_gdal_dataset
Testing rt_util_compute_skewed_extent
Successfully tested rt_util_compute_skewed_extent
Testing rt_raster_gdal_warp
Successfully tested rt_raster_gdal_warp
Testing rt_raster_gdal_rasterize
Successfully tested rt_raster_gdal_rasterize
Testing rt_raster_intersects
Successfully tested rt_raster_intersects
Testing rt_raster_same_alignment
Successfully tested rt_raster_same_alignment
Testing rt_raster_from_two_rasters
ERROR: rt_raster_from_two_rasters: The two rasters provided do not have the same alignment
ERROR: rt_raster_from_two_rasters: The two rasters provided do not have the same SRID
ERROR: rt_raster_from_two_rasters: The two rasters provided do not have the same alignment
Successfully tested rt_raster_from_two_rasters
Testing rt_raster_load_offline_band
Successfully tested rt_raster_load_offline_band
./testwkb
in hexwkb len: 122
out hexwkb len: 122
in hexwkb: 00000000003FF0000000000000400000000000000040080000000000004010000000000000401400000000000040180000000000000000000A00070008
out hexwkb: 0100000000000000000000F03F000000000000004000000000000008400000000000001040000000000000144000000000000018400A00000007000800
in hexwkb len: 128
out hexwkb len: 128
in hexwkb len: 138
out hexwkb len: 138
in hexwkb len: 152
out hexwkb len: 152
in hexwkb len: 152
out hexwkb len: 152
ext band path: /tmp/t.tif
ext band num: 3
in hexwkb len: 152
out hexwkb len: 152
SRID value -1 converted to the officially unknown SRID value 0
SRID value -1 converted to the officially unknown SRID value 0
in hexwkb len: 284
out hexwkb len: 284
SRID value -1 converted to the officially unknown SRID value 0
in hexwkb len: 284
out hexwkb len: 284
SRID value -1 converted to the officially unknown SRID value 0
in hexwkb len: 284
out hexwkb len: 284
SRID value -1 converted to the officially unknown SRID value 0
in hexwkb len: 284
out hexwkb len: 284
SRID value -1 converted to the officially unknown SRID value 0
in hexwkb len: 284
out hexwkb len: 284
All tests successful !
comment:10 by , 13 years ago
I agree with Paul though. One reason I'm such a good tester is because windows crashes whenever the memory is dirty in any way and the combination of windows 32 app running on windows 7 64-bit seems to be even better at doing that. So I could consistently cause a crash on those dirty memory array bugs that would take others 10 cycles or more to produce.
comment:11 by , 13 years ago
Paul,
So are you saying the testapi test is crashing for you? If that is the case, it might be the interaction between VC++ and native mingw. In my case the testapi would be a pure mingw gdal test, since my gdal is compiled under mingw and that test doesn't touch postgresql. So I wouldn't see the interaction until my PostgreSQL test, which is a test against a PostgreSQL VC++ build.
So could be at the point where it is trying to output the error something is not being closed right.
comment:12 by , 13 years ago
Yes, testapi crashes, and from the stack trace it looks to crash in the same place as the online test. I find it hard to believe your VC postgresql could have anything to do with it, so perhaps you could try your online test with a mingw postgresql and see if it works there to prove me wrong.
comment:13 by , 13 years ago
Bborie, you could pass the testapi.c through valgrind and look for uninitialized variables and other memory nastinesses.
comment:14 by , 13 years ago
Well I get those 3 regress failures still with the test_8_out (rt_asgdalraster) still giving a server crash notice.
However when I run this
SELECT CASE WHEN length(ST_AsGDALRaster(ST_AddBand(ST_MakeEmptyRaster(200, 200, 10, 10, 2, 2, 0, 0), 1, '8BSI',123, NULL),'PNG')) > 0 THEN 1 ELSE 0 END;
SELECT version(), postgis_full_version();
PostgreSQL 9.1.0 on i686-pc-mingw32, compiled by gcc.exe (GCC) 3.4.5 (mingw-vista special r3), 32-bit
POSTGIS="2.0.0alpha7SVN" GEOS="3.3.3dev-CAPI-1.7.3" PROJ="Rel. 4.6.1, 21 August 2008" GDAL="GDAL 1.9.0, released 2011/12/29" LIBXML="2.7.8" USE_STATS
Under my mingw compiled postgresql it doesn't crash and gives 1. I'll double check on my VC++ build to make sure it still crashes and it consistently crashes. Then I'll recompile my mingw postgres with more debug things enabled if needed so I can troubleshoot the other crash.
comment:15 by , 13 years ago
Okay confirmed. mingw on mingw doesn't crash on Paul's sample query
mingw with VC++ compiled PostgreSQL consistently crashes on Paul's sample query.
I still haven't had a chance to troubleshoot the rgdal crash that seems to happen on both, so I guess that one might be a separate issue.
comment:16 by , 13 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Gah! No need to do additional testing. I know what's causing it. PostGIS Raster has the pixel type 8BSI. GDAL does not support 8BSI so we use the GDAL pixel type GDT_Int16. When converting a raster to a GDAL MEM dataset, we use a pointer to the location of the pixel data. But, there is a mismatch as GDAL is expecting a block of data in 16-bit signed integer when the data is in 8-bit signed integer. And thus the memcpy messages pramsey was getting in gdb. GDAL was expecting the data block to be twice the size of what is in the raster.
So, that's the problem. It only affects the 8BSI pixel type as all other pixel-types have a clean one-to-one match. I hope to have a fix committed sometime today or in the worst case, tomorrow.
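To make the size mismatch concrete, here is a minimal C sketch of one way such a mismatch can be avoided: copy the signed 8-bit pixels into a buffer that is actually 16 bits wide before handing it to a consumer that reads 16-bit values. This is an illustration only, not the change committed for this ticket; `demo_widen_8bsi` is a hypothetical helper name, and no GDAL calls are used.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Copy signed 8-bit pixels into a newly allocated signed 16-bit buffer.
 * Hypothetical helper for illustration; not part of PostGIS or GDAL.
 * Returns NULL on empty input or allocation failure; caller frees.
 */
static int16_t *demo_widen_8bsi(const int8_t *src, size_t npixels) {
    int16_t *dst = NULL;
    size_t i = 0;

    if (src == NULL || npixels == 0) return NULL;

    /* Twice the bytes of the source: one int16_t per pixel. */
    dst = malloc(npixels * sizeof(*dst));
    if (dst == NULL) return NULL;

    for (i = 0; i < npixels; i++)
        dst[i] = (int16_t)src[i];   /* conversion preserves the sign */

    return dst;
}
```

With a buffer like this, a reader that copies `npixels * 2` bytes stays inside allocated memory, whereas pointing it at the original `npixels`-byte block reads past the end, which is consistent with the memcpy crash in the backtrace above.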
comment:17 by , 13 years ago
Replying to pramsey:
It does crash, but unfortunately the stack trace is just as useless (perhaps because I'm built with minimal debugging on my dependent libraries, or perhaps just because I'm not good looking enough)
Program received signal SIGSEGV, Segmentation fault.
0x7855ae7a in memcpy () from C:\WINDOWS\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.6161_x-ww_31a54e43\msvcr90.dll
(gdb) bt
#0 0x7855ae7a in memcpy () from C:\WINDOWS\WinSxS\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.6161_x-ww_31a54e43\msvcr90.dll
#1 0x006eb768 in gdal!?IReadBlock@MEMRasterBand@@UAE?AW4CPLErr@@HHPAX@Z () from c:\pgsql\bin\gdal.dll
Backtrace stopped: Not enough registers or memory available to unwind further
(gdb)
Ah yes, you're probably getting caught by the DWARF2 vs. SJLJ exception handling fun. GCC uses DWARF2 while MSVC uses SJLJ - hence if you're mixing across the two, your stack traces will stop at the point where you switch.
I think personally a better idea based upon your email would be for OSGEO to host a set of pre-built Windows DLLs and library headers so that people can quickly grab a tarball/zip file to get involved with development. It should be fairly easy to build everything consistently on mingw, and then everything would "just work".
comment:18 by , 13 years ago
Can someone test r9313? I've fixed the issue regarding PT_8BSI -> GDT_Int16 and am expecting that these regressions should no longer exist.
comment:20 by , 13 years ago
Online tests now get past the GDAL failures; loader/Basic and loader/BasicCopy still fail in the raster tests... great work, Bborie!
comment:21 by , 13 years ago
Committed a fix to run_test to get around the Basic/BasicCopy failures, now running online tests again... can we get past raster?...
comment:22 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Windows regresses all the way to completion. We have reached the promised land.
Bborie, As I noted in http://www.postgis.org/pipermail/postgis-devel/2012-February/018849.html (I think Paul's recent failure and this might be related).
I tried Paul's example
Works fine with my alpha6 build, but crashes with trunk. I looked at my postgresql logs and this is the last message it gives before it crashes the backend.
So maybe related to your fix for #1616 ?