Opened 17 years ago
Last modified 5 years ago
#462 reopened defect
GDAL FDO provider stability issues (patch)
Reported by: | zspitzer | Owned by: | brucedechant |
---|---|---|---|
Priority: | medium | Milestone: | 4.0 |
Component: | Rendering Service | Version: | |
Severity: | blocker | Keywords: | |
Cc: | stevedang, brucedechant, haris | External ID: |
Description
I have just had a 2.0 RC4 server crash and was forced to reboot the entire machine, which also runs a lot of other applications.
I was playing with a tif via GDAL and the server crashed with the following stack trace. The server remained up afterwards, but the service would not respond to a stop request and was left hung. Nothing was written to the error log after that.
I have seen similar behaviour with 1.2, but I wasn't able to reproduce the error reliably.
<2008-02-26T17:59:07> Administrator
Error: A file IO exception occurred: C:\Program Files\MapGuideOpenSource2.0\Server\Repositories\TileCache\a00059a0-ffff-ffff-8000-00188b8b4aed_en_7F0000010B060B050B04_MapDefinition3/S3/Base Layer Group/R0/C0/3_0.png StackTrace:
- MgTileServiceHandler.ProcessOperation line 83 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\tile\TileServiceHandler.cpp
- MgOpGetTile.Execute line 150 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\tile\OpGetTile.cpp
- MgServerTileService.GetTile line 263 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\tile\ServerTileService.cpp
- MgByteSink::ToFile line 245 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\common\foundation\Data/ByteSink.cpp
- MgByteSink.ToFile line 220 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\common\foundation\Data/ByteSink.cpp A file IO exception occurred: C:\Program Files\MapGuideOpenSource2.0\Server\Repositories\TileCache\a00059a0-ffff-ffff-8000-00188b8b4aed_en_7F0000010B060B050B04_MapDefinition3/S3/Base Layer Group/R0/C0/3_0.png
<2008-02-26T17:59:09> Administrator
Error: Failed to stylize layer: True Marble
An unclassified exception occurred.
- MgMappingUtil.StylizeLayers line 781 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\mapping\MappingUtil.cpp Failed to stylize layer: True Marble
An unclassified exception occurred.
<2008-02-26T17:59:10> Administrator
Error: Failed to stylize layer: True Marble
An unclassified exception occurred.
- MgMappingUtil.StylizeLayers line 781 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\mapping\MappingUtil.cpp Failed to stylize layer: True Marble
An unclassified exception occurred.
Attachments (4)
Change History (34)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Wow, what ugly formatting, let's try again. Change the line that reads
DataConnectionPoolSizeCustom =
to
DataConnectionPoolSizeCustom = OSGeo.Gdal:1
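For context, here is a minimal sketch of how that line might sit in serverconfig.ini. The [FeatureServiceProperties] section name and the comment are assumptions, not taken from this ticket, so check them against your own file.

```ini
[FeatureServiceProperties]
# Assumed placement: per-provider override of the data connection pool size.
# Limits the OSGeo.Gdal provider to a single pooled connection.
DataConnectionPoolSizeCustom = OSGeo.Gdal:1
```

The server most likely needs a restart to pick up the change.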
comment:3 by , 17 years ago
Just tried that. It seems better, but I'm still getting the error below, after which most layers (SDF, not raster) don't render properly until I restart the service.
The service hasn't locked up again since the change.
<2008-02-27T13:07:09> Administrator
Error: Failed to stylize layer: LayerDefinition35
An unclassified exception occurred.
- MgMappingUtil.StylizeLayers line 781 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_22.4\mgdev\server\src\services\mapping\MappingUtil.cpp Failed to stylize layer: LayerDefinition35
An unclassified exception occurred.
comment:4 by , 17 years ago
And then I am seeing:
Cannot create any more connections to the OSGeo.Gdal FDO provider.
comment:5 by , 17 years ago
The connection problem has been fixed by submission r2978 (submitted after RC4 came out)
comment:6 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:7 by , 17 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Still occurring with the final 2.0.0 release version
<2008-03-04T02:09:12> Anonymous
Error: A file IO exception occurred: C:\Program Files\MapGuideOpenSource2.0\Server\Repositories\TileCache\RASTER_True Marble_True Marble Tiled EPSG 4283/S3/True Martble/R120/C150/26_12.png StackTrace:
- MgTileServiceHandler.ProcessOperation line 83 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\TileServiceHandler.cpp
- MgOpGetTile.Execute line 150 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\OpGetTile.cpp
- MgServerTileService.GetTile line 263 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\ServerTileService.cpp
- MgByteSink::ToFile line 245 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\common\foundation\Data/ByteSink.cpp
- MgByteSink.ToFile line 220 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\common\foundation\Data/ByteSink.cpp A file IO exception occurred: C:\Program Files\MapGuideOpenSource2.0\Server\Repositories\TileCache\RASTER_True Marble_True Marble Tiled EPSG 4283/S3/True Martble/R120/C150/26_12.png
<2008-03-04T09:16:13> Anonymous
Error: An unclassified exception occurred. StackTrace:
- MgTileServiceHandler.ProcessOperation line 83 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\TileServiceHandler.cpp
- MgOpGetTile.Execute line 150 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\OpGetTile.cpp
- MgServerTileService.GetTile line 263 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\tile\ServerTileService.cpp
- MgServerRenderingService.RenderTile line 220 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\rendering\ServerRenderingService.cpp
- MgServerFeatureReader.Close line 845 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\feature\ServerFeatureReader.cpp An unclassified exception occurred.
<2008-03-04T09:16:14> Anonymous
Error: Failed to stylize layer: True Marble Aust 250m epsg 4283 B
Cannot create any more connections to the OSGeo.Gdal FDO provider.
- MgMappingUtil.StylizeLayers line 776 file d:\buildforgeprojects\mapguide_open_source_v2.0\build_23.8\mgdev\server\src\services\mapping\MappingUtil.cpp Failed to stylize layer: True Marble Aust 250m epsg 4283 B
Cannot create any more connections to the OSGeo.Gdal FDO provider.
comment:8 by , 17 years ago
Summary: | Server Crash and server unresponsive → Server Crash and Unresponsive using OSGeo.Gdal FDO provider |
---|
comment:9 by , 16 years ago
Version: | 2.0.0 → 2.0.2 |
---|
Problem still occurs with 2.0.2 release & tiled maps
comment:10 by , 16 years ago
Cc: | added |
---|
We did stability tests with the GDAL provider and GeoTIFF and found that it was stable. Can you try your test again with GeoTIFF (you can use the GDAL utilities to convert to this format) and see if the problem occurs again? We know of a memory leak, but it's only a few KB at a time so it should take a while before that causes memory problems.
comment:11 by , 16 years ago
Here is a test case which will crash 2.0.2.
You will need to tweak the path for the raster file, which is unmanaged and is defined as being under c:\data\raster\
http://ennoble.dreamhosters.com/tests//raster_ticket_462.mgp.zip
comment:12 by , 16 years ago
Hi Zac, please add the steps to recreate the problem once the package is loaded.
comment:13 by , 16 years ago
Just open up the basic layout and start to zoom in and pan around.
It doesn't take long for MapGuide to stop rendering tiles.
comment:14 by , 16 years ago
Thanks Zac, I couldn't reproduce this using IE, but once I moved to using Google Chrome, a problem showed up right away. I only needed to do a single zoom and then pan. Steve, this is pretty easy to reproduce. Come see me if you want to see it. The server continues to work though, as I can continue to get in through Studio; it seems that the GDAL provider is hooped because I can't do anything with raster.
This is the error that I get:
<2008-11-25T17:34:24> Ajax Viewer 144.111.170.90 Anonymous
Error: Failed to stylize layer: TrueMarble.8km.5400x2700 layer
An unclassified exception occurred. StackTrace:
- MgMappingUtil.StylizeLayers line 786 file d:\build\mapguide_open_source_v2.0\build_30.11\mgdev\server\src\services\mapping\MappingUtil.cpp Failed to stylize layer: TrueMarble.8km.5400x2700 layer
An unclassified exception occurred.
by , 16 years ago
Attachment: | mapguide_raster_unalloc.5.patch added |
---|
by , 16 years ago
Attachment: | mapguide_raster_stability.patch added |
---|
comment:15 by , 16 years ago
There are (at least) three problems with raster in 2.0.2.
The first is a memory leak that has since been fixed in 2.1.
The second is (my non-technical description) writing to unallocated memory. (see mapguide_raster_unalloc.5.patch)
The third is a defect in the way that MapGuide deals with single-threaded providers. The attachment mapguide_raster_stability.patch provides a workaround for this defect in conjunction with the GDAL provider.
However, this could conceptually happen with other providers. Haris is looking into the problem in more depth, but in the meantime explained it to me as follows, referencing the code around line 660 of MappingUtil.cpp:
Assume two Raster layers accessed at same time.
- Raster connection to Layer 1 created
- ExecuteRasterQuery executed, class Georaster created which keeps a pointer to the connection (not adding a ref count)
- That thread goes into StylizeLayers
- Second thread goes into ExecuteRasterQuery, but is accessing another raster layer so can't use the same connection
- Second thread creates a new connection to the raster provider, but because the pool size for single-threaded providers is limited to 1 (and also because the GDAL provider didn't ref count++) the connection manager deletes the first thread's connection
- First thread, which is now in StylizeGridLayer, finds that its connection was deleted and the pointer is gone
Result: Exception and corrupted memory
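The sequence above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Connection`, `SingleSlotPool`, `ObservingReader`, `OwningReader` and `FirstReaderSurvivesEviction` are hypothetical names, not MapGuide or FDO classes. `std::shared_ptr` stands in for FDO-style reference counting, and the two threads are replayed sequentially rather than run concurrently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an FDO connection (not the real FdoIConnection).
struct Connection {
    std::string layer;
    explicit Connection(std::string name) : layer(std::move(name)) {}
};

// Hypothetical pool that caches a single connection, mimicking a
// single-threaded provider whose pool size is 1.
class SingleSlotPool {
public:
    // Opening a connection replaces, and so releases, the cached one.
    std::shared_ptr<Connection> Open(const std::string& layer) {
        slot_ = std::make_shared<Connection>(layer);
        return slot_;
    }
private:
    std::shared_ptr<Connection> slot_;
};

// Reader that only observes the connection: models "keeps pointer, no ref count".
struct ObservingReader {
    std::weak_ptr<Connection> conn;
    bool ConnectionAlive() const { return !conn.expired(); }
};

// Reader that shares ownership: models a reader that add-refs its connection.
struct OwningReader {
    std::shared_ptr<Connection> conn;
    bool ConnectionAlive() const { return conn != nullptr; }
};

// Replays the interleaving sequentially: reader 1 opens "Layer 1", then a
// second request opens "Layer 2" and evicts the cached connection.
template <typename Reader>
bool FirstReaderSurvivesEviction() {
    SingleSlotPool pool;
    Reader first{pool.Open("Layer 1")};
    assert(first.ConnectionAlive());  // valid before the second request
    pool.Open("Layer 2");             // second layer evicts the first connection
    return first.ConnectionAlive();
}
```

Calling `FirstReaderSurvivesEviction<ObservingReader>()` reports the connection as gone, while the `OwningReader` variant keeps it alive. With a raw pointer in place of `std::weak_ptr`, the first case would be a dangling access, which matches the result described above.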
comment:16 by , 16 years ago
Cc: | added |
---|
comment:17 by , 16 years ago
If the FDO connection manager is deleting a connection that is in use, that is a bug. Let me debug the server to see what is happening, because we only see this issue with GDAL. I would like to know where the bug is: either in the handling of single-threaded providers, or in GDAL not reference counting properly.
comment:18 by , 16 years ago
Cc: | added |
---|
by , 16 years ago
Attachment: | grfp_addref.patch added |
---|
GDAL raster provider patch to addref the FDO connection from the FeatureReader
by , 16 years ago
Attachment: | gdal_nomutex_cacheschema.patch added |
---|
Fix refcounting problem + cache schema
comment:19 by , 16 years ago
Milestone: | 2.0 → 2.1 |
---|
comment:20 by , 16 years ago
Owner: | set to |
---|---|
Status: | reopened → new |
comment:21 by , 16 years ago
Status: | new → assigned |
---|
comment:22 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed.
See changeset r3829.
comment:23 by , 13 years ago
Component: | General → Rendering Service |
---|---|
Milestone: | 2.1 → 2.4 |
Resolution: | fixed |
Status: | closed → reopened |
Summary: | Server Crash and Unresponsive using OSGeo.Gdal FDO provider → GDAL FDO provider stability issues (patch) |
Version: | 2.0.2 |
Re-opening, as there are useful patches which haven't been applied yet.
comment:26 by , 13 years ago
Specifically the MapGuide source patches. FDO source patches should be added to the linked FDO trac ticket directly.
comment:27 by , 12 years ago
Milestone: | 2.4 → 2.5 |
---|
comment:28 by , 12 years ago
Milestone: | 2.5 → 2.6 |
---|
Please try editing your serverconfig.ini file and change the line that reads DataConnectionPoolSizeCustom = to DataConnectionPoolSizeCustom = OSGeo.Gdal:1
and report whether this resolves your problem. Thanks, Tom