MapGuide RFC 116 - Coordinate System Conversion Performance Upgrade
This page contains a change request (RFC) for the MapGuide Open Source project. More MapGuide RFCs can be found on the RFCs page.
Status
RFC Template Version | (1.0) |
Submission Date | 16 June 2011 |
Last Modified | Norm Olsen 16 June 2011 |
Author | Norm Olsen |
RFC Status | Ready For Review |
Implementation Status | pending |
Proposed Milestone | 2.3 |
Assigned PSC guide(s) | Bruce Dechant |
Voting History | (vote date) |
+1 | |
+0 | |
-0 | |
-1 | |
no vote |
Overview
Geodetic coordinate system conversion is involved in a large portion of MapGuide data processing at both the server and client levels. Presuming the acceptance and implementation of CS-MAP RFC #5, it is proposed that the MgCoordinateSystem Application Programming Interface (API) be enhanced to increase the performance of coordinate conversions at both the server and client levels of MapGuide without changing the signature or behavior of any existing member of the API.
Motivation
Regardless of how fast anything on the Internet is, it is never fast enough. Hardware available at both the server and client level is now multi-core capable thus enabling higher performance. Software developments such as cloud computing provide the demand for higher performance. The changes proposed in this RFC are intended to address these issues.
Proposed Solution
The proposed submission would not change any existing method signatures or change any behavior in a substantial way. The proposal introduces five new members to the currently existing MgCoordinateSystemTransform interface. This RFC includes an outline of the recommended usage of the API which will provide the optimum performance of the API.
Using the metric of the number of conversions per second from UTM27-13 to CO83-C in a pure measurement environment (i.e. no coordinate retrieval or delivery code), the underlying CS-MAP library is capable of producing approximately 1 million conversions per second on an average desktop machine. Changes in the MapGuide API, therefore, cannot get us beyond this limit. Thus, in this RFC, we will write of performance in terms of the percentage of this theoretical maximum which the API can/will deliver. The current implementation of the API delivers performance of approximately 80% of this maximum. Research and test implementations indicate that it is not unreasonable to expect an improvement to 91% of the theoretical maximum when using the most efficient of the Transform function overloads.
Achieving this level of improvement is deemed possible by four distinct tasks involving the API.
Removing the Requirement for a Critical Section
Currently, there are specific transformations within the CS-MAP library which are not threadsafe. Therefore, to ensure proper operation in a multi-threaded environment, a critical section is activated for all datum shift calculations. It is the intent of this RFC to remove this requirement. Assuming the acceptance and implementation of OsGeo MetaCRS RFC #5, CS-MAP will enable the API to query CS-MAP and determine if a critical section is necessary for a specific transformation. The API will be modified to use this information and invoke the critical section only as necessary.
Please note that the referenced CS-MAP RFC includes two phases. The first phase includes only the identification of those transformations which are known to be reentrant and those which are not. The first phase is a relatively simple task which is easily completed with available resources. The second phase includes some, perhaps non-trivial, efforts to make threadsafe most, if not all, CS-MAP conversions and transformations which are known to be non-threadsafe. Thus, as this work progresses, further performance enhancements will inure to applications without any code changes required.
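The conditional use of the critical section described above might be sketched as follows. This is only an illustration under stated assumptions: `TransformInfo`, `DatumShiftMutex`, `LockCount` and `ConvertGuarded` are hypothetical names, and the `isReentrant` flag stands in for whatever query CS-MAP RFC #5 ultimately provides.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical stand-in for the information CS-MAP RFC #5 is expected to
// expose: whether a given transformation is known to be reentrant.
struct TransformInfo {
    bool isReentrant;
};

// Process-wide lock standing in for the existing critical section.
inline std::mutex& DatumShiftMutex() {
    static std::mutex m;
    return m;
}

// Counts how often the lock was actually taken (illustration only).
inline int& LockCount() {
    static int n = 0;
    return n;
}

// Runs `convert`, taking the lock only when the transformation is not
// known to be reentrant.
inline void ConvertGuarded(const TransformInfo& info,
                           const std::function<void()>& convert) {
    assert(static_cast<bool>(convert));  // a callable must be supplied
    if (info.isReentrant) {
        convert();  // no serialization needed
        return;
    }
    std::lock_guard<std::mutex> guard(DatumShiftMutex());
    ++LockCount();  // modified only while the lock is held
    convert();
}
```

With this shape, transformations identified as reentrant in the first phase bypass the lock entirely, and any transformation made threadsafe in the second phase would gain the same benefit once its flag changes.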
Refactor the MgCoordinateSystemTransform::Transform Functions
It is proposed that the existing implementation of all of the Transform overloads in the existing MgCoordinateSystemTransform object be refactored for optimum performance. It is contemplated that significant performance enhancements can be achieved by: a) reducing the number of changes the form of a coordinate takes, b) removing some internal function calls by replicating code to a small degree, and c) reducing the overhead implied by several layers of try {} catch blocks. This work will introduce some minor changes in behavior which are considered to be improvements in consistency and usefulness of the API and which only affect behavior in extraordinary cases. These changes are detailed below.
Conversion Status Accumulation
CS-MAP issues warnings for coordinates outside the useful range of the coordinate systems (and the datums referenced by them) used to construct the MgCoordinateSystemTransform object. These are warnings and do not mean that the returned coordinates are invalid. It should not be considered abnormal for a small sub-set of the coordinates in a large conversion to be outside the useful range of a Transformation object. In the event that a large number of coordinates in a conversion are found to be outside the useful range, it is proper to question the validity of the conversion. Such a case is a strong indication that the user may not have selected the proper coordinate system for a specific conversion.
The default behavior of the API is to throw an exception whenever such a warning is returned by the CS-MAP library. This default behavior can be, and often is, modified at run-time using the IgnoreDatumShiftWarning and IgnoreOutsideDomainWarning members of the MgCoordinateSystemTransform interface. For performance reasons, it is recommended that applications using the API disable the exception throwing behavior of the API. It is further proposed that the MgCoordinateSystemTransform object be enhanced to provide a status accumulation feature. By status accumulation, we refer to the concept of: a) counting all source projective CRS warnings issued, b) counting all datum shift warnings issued, and c) counting all target projective CRS warnings issued.
Upon construction, or upon use of the SetSourceAndTarget member function, or upon use of the ResetLastTransformStatus member function, all counters of the status accumulation mechanism will be reset to zero. Each point converted by the MgCoordinateSystemTransform object will cause the appropriate counts to be advanced based on the status of the conversion. Upon completion of the conversion of a map or data source, the application would then query the Transform object and make a determination as to the validity of the result.
For example, a conversion where the target CRS warning count exceeds, say, 20% of the total number of points suggests that the target CRS chosen by the user is inappropriate for the data set being converted. On the other hand, warning counts which are less than, say, 20% of the total point count suggest a normal conversion.
Thus, the following additional member functions to the MgCoordinateSystemTransform object are proposed:
INT32 MgCoordinateSystemTransform::GetSourceWarningCount (void);
Returns the number of warning statuses returned by the source projective CRS phase.
INT32 MgCoordinateSystemTransform::GetDatumWarningCount (void);
Returns the number of warning statuses returned by the datum shift phase.
INT32 MgCoordinateSystemTransform::GetTargetWarningCount (void);
Returns the number of warning statuses returned by the target projective CRS phase.
The above new functions can be used at any time during a large conversion to determine the accumulated status of the conversion. Thus, a very large conversion can be terminated prematurely if, after a significant number of conversions have been performed, an ultimate failure appears likely.
The addition of this improved status monitoring capability is expected to make the disabling of exception processing while performing large conversions an acceptable practice and is, therefore, considered to be an important contribution to providing higher performance levels.
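A minimal sketch of how an application might act on the accumulated counts is shown below. The `WarningCounts` structure and `ConversionLooksSuspect` function are hypothetical; in a real application the three values would be read from the proposed GetSourceWarningCount, GetDatumWarningCount and GetTargetWarningCount members, and the threshold is an application choice rather than part of this RFC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the three proposed counters.
struct WarningCounts {
    std::int32_t source;  // source projective CRS warnings
    std::int32_t datum;   // datum shift warnings
    std::int32_t target;  // target projective CRS warnings
};

// Returns true when any counter exceeds `maxFraction` of the points
// converted so far, suggesting the conversion should be questioned.
inline bool ConversionLooksSuspect(const WarningCounts& c,
                                   std::int32_t pointsConverted,
                                   double maxFraction) {
    assert(pointsConverted >= 0 && maxFraction >= 0.0);
    if (pointsConverted == 0) {
        return false;  // nothing converted yet, nothing to judge
    }
    const double limit = maxFraction * pointsConverted;
    return c.source > limit || c.datum > limit || c.target > limit;
}
```

A check of this kind could be made periodically during a large conversion, so that the application can stop early instead of relying on an exception per point.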
Provide Additional Batch Coordinate Conversion Capability
A batch coordinate conversion capability currently exists in the MgCoordinateSystemTransform object. The performance of this capability is expected to increase due to the refactoring of the Transform code proposed immediately above. However, this function requires that, for example, 3D coordinates are provided in three distinct arrays; specifically the easting/X/Longitude coordinates in one single dimensional array of doubles, the northing/Y/Latitude coordinates in a separate single dimensional array of doubles, and a third separate and distinct array of doubles for the elevation/Z/height coordinate. There are few, if any, applications which maintain or utilize coordinate data in this form.
Thus, to take advantage of the batch conversion facility currently in place, the traditional form of coordinate data (e.g. a two dimensional array of doubles: double [][3]) has to be reformatted (i.e. marshaled) into the distinct array form prior to conversion, and then reformatted back to the traditional form after the conversion has been performed. Thus, what performance improvement is provided by the batch conversion facility is typically consumed, and probably then some, by the formatting and reformatting processes.
It is, therefore, proposed that two new functions be added to the MgCoordinateSystemTransform object which will have signatures suggested by the following:
void MgCoordinateSystemTransform::Transform2D (double [][2],INT32 pointCount);
void MgCoordinateSystemTransform::Transform3D (double [][3],INT32 pointCount);
These new member functions would convert the point arrays in place, and do so without the need for reformatting the coordinate storage.
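The difference between the two calling conventions can be illustrated with the sketch below. It is not the MapGuide implementation: `ConvertPoint` is a hypothetical stand-in (a fixed offset) for a real conversion, `TransformSeparate` mimics the shape of the existing one-array-per-ordinate call, and the free function `Transform2D` mimics the shape of the proposed member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a real conversion: a fixed offset, purely for illustration.
inline void ConvertPoint(double& x, double& y) {
    x += 1000.0;
    y += 2000.0;
}

// Shape of the existing batch call: one array per ordinate.
inline void TransformSeparate(double* x, double* y, std::int32_t pointCount) {
    for (std::int32_t i = 0; i < pointCount; ++i) {
        ConvertPoint(x[i], y[i]);
    }
}

// Shape of the proposed call: interleaved points, converted in place.
inline void Transform2D(double points[][2], std::int32_t pointCount) {
    assert(pointCount >= 0);
    for (std::int32_t i = 0; i < pointCount; ++i) {
        ConvertPoint(points[i][0], points[i][1]);
    }
}

// What a caller holding interleaved data must do with the existing call:
// marshal out, convert, then marshal back.
inline void TransformViaMarshalling(double points[][2], std::int32_t pointCount) {
    assert(pointCount >= 0);
    const std::size_t n = static_cast<std::size_t>(pointCount);
    std::vector<double> xs(n), ys(n);
    for (std::size_t i = 0; i < n; ++i) {
        xs[i] = points[i][0];
        ys[i] = points[i][1];
    }
    TransformSeparate(xs.data(), ys.data(), pointCount);
    for (std::size_t i = 0; i < n; ++i) {
        points[i][0] = xs[i];
        points[i][1] = ys[i];
    }
}
```

Both paths produce the same coordinates; the in-place form simply avoids the two temporary arrays and the two copy loops.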
Implications
Critical Section Still Required
It would be nice to assume that all current CS-MAP coordinate conversion algorithms can be made threadsafe without a serious effect on resources and/or performance, and that all future additions to the CS-MAP library will be implemented in a threadsafe manner. However, the ability to have a non-threadsafe conversion/transformation method in the CS-MAP library is reserved. Thus, we retain the Critical Section to keep multiple threads from using a non-threadsafe conversion or transformation at the same time. Given the implementation of CS-MAP RFC #5, however, we will only need to actually use it when truly necessary.
Thread Safety
The threadsafe behavior of all existing features of the MgCoordinateSystemTransform object remains intact, although it is expected that several minor behavior changes (the author considers them to be improvements) will be made as described below. The new status accumulation feature, however, cannot be made totally threadsafe in the current MapGuide environment due to multi-platform, multi-language support considerations.
Thus, the choice has been made to require that one distinct and separate MgCoordinateSystemTransform object be created for each thread that needs to use one.
Behavior Modifications
A substantial portion of the increased performance to be achieved will be derived from a refactoring of the coordinate system conversion code. Over the years, this code has become somewhat inefficient using several nested function calls with non-trivial signatures. In refactoring this code, the following changes in behavior (more like corrections) will be made:
- In the existing code, the behavior of the API with regard to the status of results in the event of an exception being thrown is inconsistent. In the proposed code, conversion results will always be provided, even in the event of an exception being thrown. Thus, the proposed behavior will provide consistent return results and also contribute to higher performance levels. That is, even in the event of an exception, all coordinates requested to be converted will have been converted.
- The four status values returned in the m_nTransformStatus member of the MgCoordinateSystemTransform object will be adjusted to form a severity level sequence which rates a geodetic datum “outside range” as more severe than a projected “outside range”. The names used will not change, only the numeric values assigned to them; so this should not require any coding changes.
- The overloads of MgCoordinateSystemTransform::Transform which deal with arrays will now always complete the conversion of the entire array before throwing any exception with regard to warning status values encountered in the conversion. Also, these overloads will be modified so that the value of the m_nTransformStatus member, upon return, will always reflect the worst status encountered (per the severity level sequence described in the preceding item) in the transformation of the array (as opposed to the status of the last conversion performed, as is currently done).
- All overloads of the TransformM variety will now always calculate and return the ‘m’ value. Currently, when an exception is thrown, the XYZ coordinate values would be converted, but the ‘m’ value would not always be.
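The "convert everything, report the worst status" behavior described in the list above could be sketched as follows. The enumeration names and numeric values here are hypothetical and are not the actual MapGuide constants; `ConvertOne` is likewise a made-up stand-in for a per-point conversion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical severity ordering, least to most severe.
enum TransformStatus : std::int32_t {
    StatusOk = 0,
    StatusProjectedOutsideRange = 1,
    StatusDatumOutsideRange = 2,  // rated above a projected "outside range"
    StatusFailed = 3
};

// Stand-in per-point conversion that always produces a result and
// reports a status alongside it.
inline TransformStatus ConvertOne(double& x, double& y) {
    x += 1.0;
    y += 1.0;
    if (x > 1000.0) return StatusDatumOutsideRange;
    if (y > 1000.0) return StatusProjectedOutsideRange;
    return StatusOk;
}

// Converts every point in the array, then returns the worst status seen
// rather than the status of the last point converted.
inline TransformStatus ConvertAll(double points[][2], std::int32_t pointCount) {
    assert(pointCount >= 0);
    TransformStatus worst = StatusOk;
    for (std::int32_t i = 0; i < pointCount; ++i) {
        const TransformStatus s = ConvertOne(points[i][0], points[i][1]);
        worst = std::max(worst, s);
    }
    return worst;
}
```

A caller following the proposed behavior would decide whether to raise an exception only after the loop has finished, so every coordinate has been converted by the time the exception is seen.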
Coordinate results provided in the case of an exception will be what CS-MAP considers to be a "rational result". In the case of a datum shift calculation failure, the rational result is either that calculated by the fallback specification or the unshifted input coordinates (if there is no fallback). This is considered rational as datum shifts are rarely more than 100 meters, and usually in the range of 20 meters. Thus, given that the input coordinate is outside the useful range of the datum shift transformation (typically this means outside the coverage provided by grid shift data files), the "rational result" is the unshifted input.
In the case of projective conversions, the "rational result" is based on the nature of the projection. For many of the projections supported, the "rational result" is simply what the projection mathematics produce, even though the coordinate is known to be outside the region which the projection's parameters suggest is the useful range of the conversion. In other cases, the projection will have singularity points, such as either pole in the case of the traditional Mercator. In such cases the "rational result" typically includes one or more ordinates with an unmistakably large value which suggests infinity, but will not cause a floating point exception if the value is used for any normal calculation.
Test Plan
Normal regression testing will apply to ensure that existing numerical results are preserved. Additionally, a multi-core specific test module shall be developed which, given a table of test cases specifically constructed to include several different conversion and transformation objects (directly or indirectly), shall:
- Construct for each unique conversion pair in the test data table a MgCoordinateSystemTransform object, copying the pointer as necessary to provide a pointer for each occurrence in the table.
- Provide a function capable of performing all conversions in the table, in sequence, using the MgCoordinateSystemTransform pointer in each test case.
- The host test application will create threads, causing each individual thread to execute the test conversion function using the exact same test data and using one distinct and separate MgCoordinateSystemTransform instance per thread.
- The host test application will cause up to 16 threads to be active at any given time.
- Continue this test for a specified amount of time, defaulting to 20 seconds.
- In the event of an error, the test application will record the specific transformation which failed.
Successful operation of 16 randomly started threads for a period of 20 seconds (an average of 20 executions of the test sequence by 16 different threads, producing 320 executions of each test sequence) will be considered a success. Clearly, the test would be more conclusive if run on a machine with more than two cores.
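The overall shape of such a test module might resemble the sketch below. It is an outline under stated assumptions, not the proposed test code: `TestCase`, `FakeTransform`, `RunSequence` and `RunThreadedTest` are hypothetical names, the fake transform applies a fixed offset in place of a real MgCoordinateSystemTransform, and a fixed repeat count is used in place of the 20 second time limit so that the result is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical test case: an input point and the expected converted point.
struct TestCase {
    double x, y;
    double expectedX, expectedY;
};

// Hypothetical stand-in for MgCoordinateSystemTransform.
class FakeTransform {
public:
    void Transform(double& x, double& y) const {
        x += 1000.0;
        y += 2000.0;
    }
};

// Runs the whole table once; returns the index of the first failing
// case, or -1 when every case matches.
inline int RunSequence(const FakeTransform& xf, const std::vector<TestCase>& table) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        double x = table[i].x;
        double y = table[i].y;
        xf.Transform(x, y);
        if (x != table[i].expectedX || y != table[i].expectedY) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Starts `threadCount` threads, each with its own transform instance,
// each repeating the sequence `repeats` times. Returns the number of
// failed executions across all threads.
inline int RunThreadedTest(const std::vector<TestCase>& table,
                           int threadCount, int repeats) {
    assert(threadCount > 0 && repeats >= 0);
    const std::size_t n = static_cast<std::size_t>(threadCount);
    std::vector<int> failures(n, 0);  // one slot per thread, never shared
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n; ++t) {
        workers.emplace_back([&table, &failures, t, repeats] {
            FakeTransform xf;  // distinct instance per thread
            for (int r = 0; r < repeats; ++r) {
                if (RunSequence(xf, table) != -1) {
                    ++failures[t];
                }
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    int total = 0;
    for (int f : failures) {
        total += f;
    }
    return total;
}
```

A real harness would additionally record which transformation failed, as the plan requires; `RunSequence` returns the failing index so that this information is available to the caller.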
Funding/Resources
Funding and developer resources to be provided by Autodesk.