Changes between Version 4 and Version 5 of FdoEnhancedSchemaNameSupport


Timestamp:
10/18/07 14:51:51 (17 years ago)
Author:
gregboone
Comment:

--

  • FdoEnhancedSchemaNameSupport

    v4 v5  
    1010The FDO API provides functions for translating FDO Feature Schemas to and from the !OpenGeospatial GML format. These functions are provided to satisfy 4 main use cases:
    1111
    12 • Export/Import: to provide a text based export format for FDO Feature Schemas. This export format actually covers other types of objects such as Spatial Contexts, Schema Overrides and Features. However, only Feature Schemas are pertinent to this document.
    13 
    14 • Schema Exchange: Allow exchange of schemas between FDO and external GML-based applications.
    15 
    16 • WFS Provider: used by the FDO WFS Provider to translate GML schemas, provided by the connected WFS, to FDO Schemas.
    17 
    18 • Publish as WFS: Allow FDO accessible data to be published via a WFS. This is the opposite of the previous use case.
     12        • Export/Import: to provide a text based export format for FDO Feature Schemas. This export format actually covers other types of objects such as Spatial Contexts, Schema Overrides and Features. However, only Feature Schemas are pertinent to this document.
     13
     14        • Schema Exchange: Allow exchange of schemas between FDO and external GML-based applications.
     15
     16        • WFS Provider: used by the FDO WFS Provider to translate GML schemas, provided by the connected WFS, to FDO Schemas.
     17
     18        • Publish as WFS: Allow FDO accessible data to be published via a WFS. This is the opposite of the previous use case.
    1919
    2020One of the big challenges, in translating schemas between FDO and GML, is the converting of schema names.  In order to support the above use cases, these conversions must satisfy the following general requirements:
    2121
    22 • round trip fidelity. If the schema is translated from GML to FDO to GML, the schema name in the resulting GML schema document must be the same as in the original. Similarly, the name must not change when translated from FDO to GML to FDO.
    23 
    24 • name uniqueness must be preserved. Different GML schemas must get different FDO schema names when read into FDO. Conversely different FDO schemas must get different GML schema names when written to GML. If name uniqueness is not preserved, schemas will be unexpectedly merged on read or write.
     22        • round trip fidelity. If the schema is translated from GML to FDO to GML, the schema name in the resulting GML schema document must be the same as in the original. Similarly, the name must not change when translated from FDO to GML to FDO.
     23
     24        • name uniqueness must be preserved. Different GML schemas must get different FDO schema names when read into FDO. Conversely different FDO schemas must get different GML schema names when written to GML. If name uniqueness is not preserved, schemas will be unexpectedly merged on read or write.
    2525
    2626The structure of schema names differs greatly in either format:
    2727
    28 • in FDO, a schema name is a free-form name, containing any character except '.' and ':'. Names tend to be short; more detailed information is typically kept in the schema description.
    29 
    30 • in GML, the schema name must be a valid URI. Most current FDO schema names are valid URI's. However, most GML schema names tend to conform to the http scheme (see glossary), as seen in the following example. FDO Schema names would tend to not fit the http scheme.
     28        • in FDO, a schema name is a free-form name, containing any character except '.' and ':'. Names tend to be short; more detailed information is typically kept in the schema description.
     29
     30        • in GML, the schema name must be a valid URI. Most current FDO schema names are valid URI's. However, most GML schema names tend to conform to the http scheme (see glossary), as seen in the following example. FDO Schema names would tend to not fit the http scheme.
    3131
    3232A typical example might be a Roads schema defined by the municipality "MyCity". The FDO schema might simply be "Roads". However, the GML schema name might look something like this:
    3333
     34        http://www.mycity.on.ca/departments/transportation/Roads
     35
     36where the schema name is qualified by the owning organization. This makes it difficult to perform the schema name conversion in a way that satisfies the abovementioned requirements.
     37
     38The FDO API provides a number of methods to ensure round trip fidelity and preservation of schema name uniqueness. However, these methods are cumbersome for some of the abovementioned use cases. This document looks at alternatives for making schema name translation easier when performed through the FDO API.
     39
     40=== Current API ===
     41
     42Feature Schema translation is provided by 2 functions on !FdoFeatureSchemaCollection:
     43
     44        • !ReadXml() converts GML schemas to FDO[[br]]
     45        • !WriteXml() converts FDO schemas to GML (!WriteXml() is also present on !FdoFeatureSchema to allow the writing of individual schemas).
     46
     47Both of the above functions take optional !FdoXmlFlags parameters, which control how the translation is performed.
     48
     49The following sub-sections look at the various schema name translation options currently provided:
     50
     51==== Default Translation ====
     52
     53===== FDO to GML =====
     54
     55When no !FdoXmlFlags are specified, the FDO schema name is translated by prepending a default osgeo-defined schema prefix (http://fdo.osgeo.org/schemas/feature/) to the schema name and escaping any characters not allowed in a URI. For example, the FDO Schema "Water Service" becomes:
     56
     57http://fdo.osgeo.org/schemas/feature/Water-x20-Service
     58
     59===== GML to FDO =====
     60
     61When no !FdoXmlFlags are specified, GML schema names are translated by dropping any http:// prefix and escaping '.' and ':' to '-dot-' and '-colon-' respectively. This means that the example schema name from the Overview:
     62
    3463http://www.mycity.on.ca/departments/transportation/Roads
    3564
    36 where the schema name is qualified by the owning organization. This makes it difficult to perform the schema name conversion in a way that satisfies the abovementioned requirements.
    37 
    38 The FDO API provides a number of methods to ensure round trip fidelity and preservation of schema name uniqueness. However, these methods are cumbersome for some of the abovementioned use cases. This document looks at alternatives for making schema name translation easier when performed through the FDO API.
    39 
    40 === Current API ===
    41 
    42 Feature Schema translation is provided by 2 functions on !FdoFeatureSchemaCollection:
    43 
    44 • !ReadXml() converts GML schemas to FDO[[br]]
    45 • !WriteXml() converts FDO schemas to GML (!WriteXml() is also present on !FdoFeatureSchema to allow the writing of individual schemas).
    46 
    47 Both of the above functions take optional !FdoXmlFlags parameters, which control how the translation is performed.
    48 
    49 The following sub-sections look at the various schema name translation options currently provided:
    50 
    51 ==== Default Translation ====
    52 
    53 ===== FDO to GML =====
    54 
    55 When no !FdoXmlFlags are specified, the FDO schema name is translated by prepending a default osgeo-defined schema prefix (http://fdo.osgeo.org/schemas/feature/) to the schema name and escaping any characters not allowed in a URI. For example, the FDO Schema "Water Service" becomes:
    56 
    57 http://fdo.osgeo.org/schemas/feature/Water-x20-Service
    58 
    59 ===== GML to FDO =====
    60 
    61 When no !FdoXmlFlags are specified, GML schema names are translated by dropping any http:// prefix and escaping '.' and ':' to '-dot-' and '-colon-' respectively. This means that the example schema name from the Overview:
    62 
    63 http://www.mycity.on.ca/departments/transportation/Roads
    64 
    6565becomes:
    6666
    67 www-dot-mycity-dot-on-dot-ca/departments/transportation/Roads
     67        www-dot-mycity-dot-on-dot-ca/departments/transportation/Roads
    6868
    6969This preserves name uniqueness but leads to a rather messy looking FDO schema name that is not easily human-readable.
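
The default GML to FDO rule described above can be sketched in a few lines of Python. This is an illustration only, not the FDO implementation: the function name `gml_to_fdo_name` is hypothetical, and the sketch covers only the two steps stated in the text (drop a leading http:// and escape '.' and ':').

```python
def gml_to_fdo_name(gml_name: str) -> str:
    """Hypothetical sketch of the default GML -> FDO schema name rule."""
    scheme = "http://"
    # Drop the http:// prefix when present.
    name = gml_name[len(scheme):] if gml_name.startswith(scheme) else gml_name
    # '.' and ':' are not allowed in FDO schema names, so escape them.
    return name.replace(".", "-dot-").replace(":", "-colon-")


if __name__ == "__main__":
    print(gml_to_fdo_name(
        "http://www.mycity.on.ca/departments/transportation/Roads"))
```

Because the escaping is reversible and nothing but the scheme is discarded, two different http GML names still map to two different FDO names, which is why uniqueness survives even though readability does not.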
     
    7171When the GML schema name begins with the default schema prefix (http://fdo.osgeo.org/schemas/feature/), this whole prefix is removed from the schema name. For example:
    7272
    73 http://fdo.osgeo.org/schemas/feature/Roads
     73        http://fdo.osgeo.org/schemas/feature/Roads
    7474
    7575simply becomes:
    7676
    77 Roads.
     77        Roads.
    7878
    7979This is done to preserve round-trip fidelity.
     
    9191There is also a defect which occurs when a schema name, that doesn't start with the default  prefix, is round tripped from GML to FDO to GML. The GML Schema name:
    9292
    93 http://www.mycity.on.ca/departments/transportation/Roads
     93        http://www.mycity.on.ca/departments/transportation/Roads
    9494
    9595becomes:
    9696
    97 www-dot-mycity-dot-on-dot-ca/departments/transportation/Roads
     97        www-dot-mycity-dot-on-dot-ca/departments/transportation/Roads
    9898
    9999in FDO. However, when written back to GML, the default prefix is still prepended to the schema name, giving:
    100100
    101 http://fdo.osgeo.org/schemas/feature/www.mycity.on.ca/departments/transportation/Roads
     101        http://fdo.osgeo.org/schemas/feature/www.mycity.on.ca/departments/transportation/Roads
    102102
    103103Therefore, round trip fidelity is not preserved.
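
The defect can be reproduced with a small Python model of the two default translations. The function names `read_name` and `write_name` are hypothetical stand-ins for the name handling inside !ReadXml() and !WriteXml(); the set of characters treated as URI-safe and the reversal of the -xNN- escapes on read are assumptions made for the sketch.

```python
import re

DEFAULT_PREFIX = "http://fdo.osgeo.org/schemas/feature/"


def read_name(gml_name: str) -> str:
    """Sketch of default GML -> FDO translation (assumed behaviour)."""
    if gml_name.startswith(DEFAULT_PREFIX):
        name = gml_name[len(DEFAULT_PREFIX):]
    elif gml_name.startswith("http://"):
        name = gml_name[len("http://"):]
    else:
        name = gml_name
    # Assumption: -xNN- escapes written by write_name are reversed on read.
    name = re.sub(r"-x([0-9a-f]{2})-", lambda m: chr(int(m.group(1), 16)), name)
    return name.replace(".", "-dot-").replace(":", "-colon-")


def write_name(fdo_name: str) -> str:
    """Sketch of default FDO -> GML translation (assumed behaviour)."""
    name = fdo_name.replace("-dot-", ".").replace("-colon-", ":")
    # Assumption: anything outside this small safe set is escaped as -xNN-.
    name = re.sub(r"[^A-Za-z0-9._~:/-]",
                  lambda m: "-x%02x-" % ord(m.group()), name)
    # The default prefix is always prepended; this is the source of the defect.
    return DEFAULT_PREFIX + name


if __name__ == "__main__":
    original = "http://www.mycity.on.ca/departments/transportation/Roads"
    print(write_name(read_name(original)) == original)  # round trip fails
    print(read_name(write_name("Water Service")))       # round trip succeeds
```

In this model the FDO to GML to FDO direction is stable, while GML names outside the default prefix come back with the prefix added, matching the behaviour described above.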
     
    121121This method works better, for the Publish as WFS use case, than the default method. The customer can control what the GML schema name looks like and can make it reflect the URI's used by their organization. However, there is a limitation in that the same schema prefix gets applied to each schema in the schema collection. For example, if the schema contains a "Roads" and "!WaterService" schema, and the desired GML Schema names are:
    122122
    123         www.mycity.on.ca/departments/transportation/Roads
     123        www.mycity.on.ca/departments/transportation/Roads[[br]]
    124124        www.mycity.on.ca/departments/watersewer/WaterService
    125125
     
    146146The current API handles the Export/Import and WFS Provider use cases fairly well. However, the handling of the Schema Exchange and Publish as WFS use cases is cumbersome for the following reasons:
    147147
    148 • customer must usually supply a schema  prefix and must ensure that consistent prefixes are used when reading and writing the feature schemas.
    149 
    150 • due to a bug, round trip fidelity of schema names is not preserved when the schema name does not start with the schema prefix.
    151 
    152 • For a single !ReadXml() or !WriteXml() operation, there is no way to apply a different schema prefix to each schema.
    153 
    154 • On !ReadXml(), if the GML schema name is not prefixed by the schema prefix, a messy looking FDO schema name is generated.
     148        • customer must usually supply a schema  prefix and must ensure that consistent prefixes are used when reading and writing the feature schemas.
     149
     150        • due to a bug, round trip fidelity of schema names is not preserved when the schema name does not start with the schema prefix.
     151
     152        • For a single !ReadXml() or !WriteXml() operation, there is no way to apply a different schema prefix to each schema.
     153
     154        • On !ReadXml(), if the GML schema name is not prefixed by the schema prefix, a messy looking FDO schema name is generated.
    155155
    156156The main requirement is to streamline the handling of these two use cases. !MapGuide currently supports (or plans to support) publishing FDO data sources through a WFS, so supporting the Publish as WFS use case likely takes priority over Schema Exchange. (TBD: verify !MapGuide's WFS publishing requirements).
     
    160160Section 'Solution Options' below explores a number of solution options. From these options, a number of recommendations can be made:
    161161
    162 • It is recommended that the  Schema Name Attributes option be implemented (see 5.2 Schema Name Attributes). Although not perfect, it is the best option mentioned in this document.
    163 
    164 • it is not currently recommended that we allow '.' and ':' in schema names (see 5.1 Simple Default Translation). However, this option would make schema names more readable so it should be explored further to see if the potential drawbacks can be addressed.
     162        • It is recommended that the  Schema Name Attributes option be implemented (see 5.2 Schema Name Attributes). Although not perfect, it is the best option mentioned in this document.
     163
     164        • it is not currently recommended that we allow '.' and ':' in schema names (see 5.1 Simple Default Translation). However, this option would make schema names more readable so it should be explored further to see if the potential drawbacks can be addressed.
    165165
    166166=== Solution Options ===
     
    172172Under this option, the url setting for !FdoXmlFlags would no longer have a default. When no !FdoXmlFlags are specified, !ReadXml() and !WriteXml() would not usually modify the schema name, meaning that the GML and FDO names would usually be identical.
    173173
    174 ''!WriteXml()'' would only modify the name if it is not a valid URI. In this case, the name would be adjusted by escaping any non-conforming characters.
    175 
    176 ''!ReadXml()'' would only modify the name if it contains any escaped characters, which it would unescape. !ReadXml() would no longer escape '.' and ':'. This has implications on the !FdoFeatureSchema object in that these two characters would have to be allowed in schema names. This can be done by removing the no '.' or ':' restriction from schema element names. However, we'd need a way to indicate when these are literal characters when they appear as qualified names. One possibility is to mandate that each component of a qualified schema element name must be enclosed in double quotes when it contains a '.' or ':'.
     174        ''!WriteXml()'' would only modify the name if it is not a valid URI. In this case, the name would be adjusted by escaping any non-conforming characters.
     175
     176        ''!ReadXml()'' would only modify the name if it contains any escaped characters, which it would unescape. !ReadXml() would no longer escape '.' and ':'. This has implications on the !FdoFeatureSchema object in that these two characters would have to be allowed in schema names. This can be done by removing the no '.' or ':' restriction from schema element names. However, we'd need a way to indicate when these are literal characters when they appear as qualified names. One possibility is to mandate that each component of a qualified schema element name must be enclosed in double quotes when it contains a '.' or ':'.
    177177
    178178When the url !FdoXmlFlag is specified, the behaviour would be as now: !WriteXml() would tack this url prefix onto the schema name and !ReadXml() would remove it if present.
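
To make the quoting proposal concrete, here is a minimal Python sketch of how qualified names might be built and split if components containing '.' or ':' were enclosed in double quotes. The helper names are hypothetical, the use of ':' as the schema/class separator is an assumption made for the example, and names that themselves contain a double quote are not handled.

```python
import re


def quote_component(name: str) -> str:
    """Quote a component only when it contains '.' or ':'."""
    return f'"{name}"' if ("." in name or ":" in name) else name


def qualify(schema: str, class_name: str) -> str:
    """Build a qualified name; ':' as separator is an assumption."""
    return f"{quote_component(schema)}:{quote_component(class_name)}"


# A component is either a double-quoted run or a run without ':' or '"'.
_COMPONENT = re.compile(r'"([^"]*)"|([^:"]+)')


def split_qualified(qualified: str) -> list:
    """Split a qualified name back into its unquoted components."""
    return [m.group(1) if m.group(1) is not None else m.group(2)
            for m in _COMPONENT.finditer(qualified)]


if __name__ == "__main__":
    q = qualify("www.mycity.on.ca/Roads", "Street")
    print(q)
    print(split_qualified(q))
```

The sketch also shows where the compatibility concern comes from: a pre-existing name that begins and ends with a literal double quote would be read as a delimited component under this rule.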
     
    180180'''Pros'''
    181181
    182 • Simpler approach. For the default case, we don't need to worry about tacking on and stripping off prefixes.
    183 
    184 • FDO Schema Names derived from GML are more readable. The names can still be long but at least they won't contain the '-dot-' and '-colon-' sequences.
    185 
    186 • Works well for the Export/Import and Schema Exchange use cases as long as both !ReadXml() and !WriteXml() are done in FDO 3.3.0. Not applicable to the WFS Provider use case.
     182        • Simpler approach. For the default case, we don't need to worry about tacking on and stripping off prefixes.
     183
     184        • FDO Schema Names derived from GML are more readable. The names can still be long but at least they won't contain the '-dot-' and '-colon-' sequences.
     185
     186        • Works well for the Export/Import and Schema Exchange use cases as long as both !ReadXml() and !WriteXml() are done in FDO 3.3.0. Not applicable to the WFS Provider use case.
    187187
    188188'''Cons'''
    189189
    190 • Backward Compatibility. For the Export/Import and Schema Exchange use cases, schema name round trip fidelity is not preserved if !WriteXml() is done from pre-Slate and !ReadXml() from FDO 3.3.0 or vice versa. The reason is that one operation will use a different name translation method from the other. This is more of a problem for Export/Import. The bug mentioned above already introduces round trip fidelity problems for the Schema Exchange use case. The backward compatibility issues could be mitigated if we could detect when one of the operations was done using pre-slate. In this case the operation done in slate would use the pre-slate name translation rules. However, this eliminates the simplicity pro since we still have to keep the old name translation algorithms around.
    191 
    192 • Provider compatibility. Some providers might need changes, since current code is based on the assumption that schema element names do not contain '.' or ':'.
    193 
    194 • Qualified Schema Element name compatibility. This option changes the rules for constructing qualified element names. In pre-slate, double quotes are always literals but in Slate they would be delimiters. It is unlikely that any pre-existing schema names start and end with a double quote but it is possible.
    195 
    196 • GML Schema names are no longer guaranteed to follow the http scheme. Almost every GML schema we've seen so far follows this URI naming scheme, so customers might complain if we don't always follow it. Therefore, this option does not work well for the Publish as WFS use case. Clients doing this sort of publishing will likely start specifying FdoXmlFlags to WriteXml to ensure that the GML Schema name follows the http scheme.
     190        • Backward Compatibility. For the Export/Import and Schema Exchange use cases, schema name round trip fidelity is not preserved if !WriteXml() is done from pre-Slate and !ReadXml() from FDO 3.3.0 or vice versa. The reason is that one operation will use a different name translation method from the other. This is more of a problem for Export/Import. The bug mentioned above already introduces round trip fidelity problems for the Schema Exchange use case. The backward compatibility issues could be mitigated if we could detect when one of the operations was done using pre-slate. In this case the operation done in slate would use the pre-slate name translation rules. However, this eliminates the simplicity pro since we still have to keep the old name translation algorithms around.
     191
     192        • Provider compatibility. Some providers might need changes, since current code is based on the assumption that schema element names do not contain '.' or ':'.
     193
     194        • Qualified Schema Element name compatibility. This option changes the rules for constructing qualified element names. In pre-slate, double quotes are always literals but in Slate they would be delimiters. It is unlikely that any pre-existing schema names start and end with a double quote but it is possible.
     195
     196        • GML Schema names are no longer guaranteed to follow the http scheme. Almost every GML schema we've seen so far follows this URI naming scheme, so customers might complain if we don't always follow it. Therefore, this option does not work well for the Publish as WFS use case. Clients doing this sort of publishing will likely start specifying FdoXmlFlags to WriteXml to ensure that the GML Schema name follows the http scheme.
    197197
    198198Conclusions:
    199199
    200 • If we were starting from scratch, this would be a good option. However, at this stage, it is not really viable, due to backward compatibility issues, since these issues wipe out the advantages in simplicity.
     200        • If we were starting from scratch, this would be a good option. However, at this stage, it is not really viable, due to backward compatibility issues, since these issues wipe out the advantages in simplicity.
    201201
    202202==== Schema Name Attributes ====
     
    206206An xs:schema/fdo:name attribute would be added to the FDO XML format, to specify the equivalent FDO name for the schema.
    207207
    208 !WriteXml() would write this attribute to the GML document.
    209 
    210 when present, !ReadXml() would take this attribute as the FDO schema name. Otherwise, the FDO Schema name would be generated from the GML name as is currently done.
     208        !WriteXml() would write this attribute to the GML document.
     209
     210        when present, !ReadXml() would take this attribute as the FDO schema name. Otherwise, the FDO Schema name would be generated from the GML name as is currently done.
    211211
    212212A globalName attribute would be added to the !FdoFeatureSchema class:
    213213
    214 !ReadXml() would set this attribute.
    215 
    216 when present, !WriteXml() would use this attribute as the GML schema name. Otherwise, the GML schema name would be generated from the FDO schema name, as is currently done. Hopefully, the !GlobalName would be set to a valid URI. If not then !WriteXml() would escape any non-conforming characters. Alternatively, we could restrict the !GlobalName to be a valid URI.
     214        !ReadXml() would set this attribute.
     215
     216        when present, !WriteXml() would use this attribute as the GML schema name. Otherwise, the GML schema name would be generated from the FDO schema name, as is currently done. Hopefully, the !GlobalName would be set to a valid URI. If not then !WriteXml() would escape any non-conforming characters. Alternatively, we could restrict the !GlobalName to be a valid URI.
    217217
    218218From a semantic standpoint, the name attribute for !FdoFeatureSchema would be unique within a particular domain (e.g.: an FDO Datastore, an !FdoFeatureSchemaCollection in an application). The !GlobalName attribute would be intended to be universally unique, or at least unique among all organizations that use the feature schema.
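
The proposed precedence rules can be modelled as follows. This is a sketch of the proposal, not existing FDO code: the `FeatureSchema` class, the `read_schema` and `write_schema` functions, and the simplified fallback name generators are all hypothetical and exist only to show how the fdo:name attribute and globalName would interact.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_PREFIX = "http://fdo.osgeo.org/schemas/feature/"


@dataclass
class FeatureSchema:
    name: str                          # unique within a datastore/collection
    global_name: Optional[str] = None  # intended to be universally unique


def generated_fdo_name(gml_name: str) -> str:
    """Simplified stand-in for the current default GML -> FDO rule."""
    for prefix in (DEFAULT_PREFIX, "http://"):
        if gml_name.startswith(prefix):
            gml_name = gml_name[len(prefix):]
            break
    return gml_name.replace(".", "-dot-").replace(":", "-colon-")


def generated_gml_name(fdo_name: str) -> str:
    """Simplified stand-in for the current default FDO -> GML rule."""
    return DEFAULT_PREFIX + fdo_name.replace(" ", "-x20-")


def read_schema(gml_name: str, fdo_name_attr: Optional[str] = None) -> FeatureSchema:
    # fdo:name wins when present; otherwise fall back to the generated name.
    name = fdo_name_attr if fdo_name_attr else generated_fdo_name(gml_name)
    return FeatureSchema(name=name, global_name=gml_name)


def write_schema(schema: FeatureSchema) -> Tuple[str, str]:
    # globalName wins when present; otherwise fall back to the generated name.
    gml_name = schema.global_name or generated_gml_name(schema.name)
    return gml_name, schema.name  # (GML schema name, fdo:name attribute)


if __name__ == "__main__":
    roads = FeatureSchema(
        "Roads", "http://www.mycity.on.ca/departments/transportation/Roads")
    gml_name, fdo_attr = write_schema(roads)
    print(read_schema(gml_name, fdo_attr) == roads)
```

Under this model both round-trip directions hold as long as the attributes survive between the write and the read, which is exactly the condition the cons below call into question.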
     
    220220Pros:
    221221
    222 • Good for Schema Exchange use case. Customer no longer needs to ensure the same url !FdoXmlFlag is used for both !WriteXml() and !ReadXml(). For the Schema Exchange case, it also opens up the possibility of using the Short Prefix as Schema Name method, since it eliminates the problem of reconstituting the GML schema name on !WriteXml().
    223 
    224 • Flexible. The GML and FDO schema name correspondences can be set on a per-schema basis.
    225 
    226 • Still a relatively simple option. Much simpler than the Schema Overrides and Schema Namespaces options discussed below.
    227 
    228 • Good for Publish as WFS use cases. Publisher can set the GML name for each FDO schema explicitly.
    229 
    230 • GML schema name can be persisted in an FDO Datastore.
    231 
    232 • Alternatively, GML schema name can be set on the Feature Schema just before it is converted to GML. This can easily be done via the FDO API.
     222        • Good for Schema Exchange use case. Customer no longer needs to ensure the same url !FdoXmlFlag is used for both !WriteXml() and !ReadXml(). For the Schema Exchange case, it also opens up the possibility of using the Short Prefix as Schema Name method, since it eliminates the problem of reconstituting the GML schema name on !WriteXml().
     223
     224        • Flexible. The GML and FDO schema name correspondences can be set on a per-schema basis.
     225
     226        • Still a relatively simple option. Much simpler than the Schema Overrides and Schema Namespaces options discussed below.
     227
     228        • Good for Publish as WFS use cases. Publisher can set the GML name for each FDO schema explicitly.
     229
     230        • GML schema name can be persisted in an FDO Datastore.
     231
     232        • Alternatively, GML schema name can be set on the Feature Schema just before it is converted to GML. This can easily be done via the FDO API.
    233233
    234234Cons:
    235235
    236 • There is no guarantee that these new settings will be persisted. This limits the cases where these settings can be effective. If an FDO schema goes through the following steps:
    237 
    238 • Write to GML[[br]]
    239 • Read GML into 3rd party application[[br]]
    240 • Write from 3rd party application to GML[[br]]
    241 • Read from GML[[br]]
    242 
    243 the fdo:schema name will likely be lost when the schema goes through the 3rd party application.
    244 
    245 FDO providers would not necessarily be immediately modified to support the GML namespace !FdoFeatureSchema attribute. Therefore, the following steps:
    246 
    247 • Read from GML[[br]]
    248 • Apply schema to FDO datastore[[br]]
    249 • Describe schema back from datastore[[br]]
    250 • Write schema to GML[[br]]
    251 
    252 will lose the GML Schema name if the datastore's provider does not handle it. However, this particular problem can be mitigated by ensuring that the SDF and RDBMS providers support this attribute.
    253 
    254 This con would be applicable to the Schema Exchange and Publish as WFS use cases. Export/Import and WFS Provider would be unaffected.
    255 
    256 • For cases where the FDO schema name cannot be persisted in the GML document, it would be possible for the application to add it to the document just before it is read into FDO. This is possible to do but not quite straightforward since there is no simple FDO API to do this. It would have to be done via the Xerces DOM classes or by an XSL transformation.
    257 
    258 • There is overlap between the globalName attribute and the targetNamespace attribute on !FdoSchemaMapping (See next section), since both would represent the GML schema name. We could have a precedence rule (e.g. !FdoSchemaMapping.targetNamespace trumps !FdoFeatureSchema.globalName). However, having multiple places where the same attribute can be set makes the FDO API more complicated.
     236        • There is no guarantee that these new settings will be persisted. This limits the cases where these settings can be effective. If an FDO schema goes through the following steps:
     237
     238                • Write to GML[[br]]
     239                • Read GML into 3rd party application[[br]]
     240                • Write from 3rd party application to GML[[br]]
     241                • Read from GML[[br]]
     242
     243        the fdo:schema name will likely be lost when the schema goes through the 3rd party application.
     244
     245        FDO providers would not necessarily be immediately modified to support the GML namespace !FdoFeatureSchema attribute. Therefore, the following steps:
     246
     247                • Read from GML[[br]]
     248                • Apply schema to FDO datastore[[br]]
     249                • Describe schema back from datastore[[br]]
     250                • Write schema to GML[[br]]
     251
     252        will lose the GML Schema name if the datastore's provider does not handle it. However, this particular problem can be mitigated by ensuring that the SDF and RDBMS providers support this attribute.
     253
     254        This con would be applicable to the Schema Exchange and Publish as WFS use cases. Export/Import and WFS Provider would be unaffected.
     255
     256        • For cases where the FDO schema name cannot be persisted in the GML document, it would be possible for the application to add it to the document just before it is read into FDO. This is possible to do but not quite straightforward since there is no simple FDO API to do this. It would have to be done via the Xerces DOM classes or by an XSL transformation.
     257
     258        • There is overlap between the globalName attribute and the targetNamespace attribute on !FdoSchemaMapping (See next section), since both would represent the GML schema name. We could have a precedence rule (e.g. !FdoSchemaMapping.targetNamespace trumps !FdoFeatureSchema.globalName). However, having multiple places where the same attribute can be set makes the FDO API more complicated.
    259259
    260260Conclusions
    261261
    262 • This is a viable option. Although it doesn't help with Schema name translation in all cases, it would still handle a lot of cases.
    263 
    264 • Despite the overlap with GML schema overrides, this would currently be the recommended option, since it is the best one mentioned in this document.
     262        • This is a viable option. Although it doesn't help with Schema name translation in all cases, it would still handle a lot of cases.
     263
     264        • Despite the overlap with GML schema overrides, this would currently be the recommended option, since it is the best one mentioned in this document.