wiki:FDORfc50

Version 26 (modified by oresthalustchak, 14 years ago) ( diff )

--

FDO RFC 50 - Extend FDO API to Select from Multiple Classes and Select Joins

This page contains a request for comments document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the RFCs page.

Status

RFC Template Version1.0
Submission DateJun. 02 2010
Last ModifiedRomica Dascalescu Timestamp
AuthorRomica Dascalescu
ContibutorsOrest Halustchak, Greg Boone
RFC StatusDraft
Implementation StatusPending
Proposed Milestone3.6.0.0
Assigned PSC guide(s)Greg Boone
Voting History(vote date)
+1
+0
-0
-1

Motivation

It is a common requirement for applications to join classes together so that they can process related information that is spread across more than one class, e.g. properties stored in one class and geometry in another class. This can be for modeling purposes such as when geometry in one class is shared between separate feature classes or when properties from another class are needed for rendering purposes. Currently this is achieved either by using client-side join processing as in MapGuide or for RDBMS sources, by defining views as the physical level and exposing them as feature classes at the FDO API level. While these techniques are useful, the first results in very poor performance compared to a native join if the classes come from the same data store and the data store can do native joins as is possible with RDBMS-based providers. Using views is common as well, but requires the user to do this at the physical data store level. Finally, users could use SQL pass-through for providers that support it but that has issues because again the user is dealing with physical schema rather than FDO classes and the result is not a feature class.

This RFC adds the ability to create joined selects using the FDO API directly rather than using the workarounds described above. This adds more power and flexibility to the FDO API for applications that need that capability. This enhancement will be optional for providers and a capabilities function is included for an application to determine if a provider supports it.

Overview

Implementing this new select is optional for all providers, for now we will start with the SQLite provider and in the future we can extend it to other providers as needed. We expect the RDBMS-based providers to support this over time.

The idea is to add support in FDO API for selecting from multiple classes and get a feature reader back.

We have a few options from which we can chose in order to achieve our goal:

  • Add a new select command e.g. FdoIJoinSelect derived from FdoISelect, add a new command type and caller needs to ensure that command exist in the provider command capabilities before calling this interface. A small improvement here we can add two classes FdoIJoinSelect (the interface having only pure methods) and FdoJoinSelect (having a default implementation of those methods). All providers can use the class with a base implementation and in the future in case we want to add a new method we just add a default implementation in FdoJoinSelect, however this adds more complexity to the API since developer would be confused which class should use as base class in his implementation.
    • Pros: will not change the FdoISelect command, all other providers do not have to change anything on their side, and only in case a provider wants to support this command it has to implement it and expose it as supported.
    • Cons: We add a new command and a new type command, we already have a FdoIExtendedSelect for scrollable readers and users can be confused by those two commands. This will add more complexity in the API.
  • Enhance FdoISelect, add two new methods GetFeatureClassNames() and GetJoinCriteria(). Create an FdoSelect base class and add default implementers for these methods. Then each provider would be changed to base their select command implementations on FdoSelect instead of FdoISelect. To allow the join functionality to be fully integrated, two new capability functions will be added to the FDO FdoIConnectionCapabilities Interface so that applications can determine if a provider supports the join criteria and supported join types. These new capabilites would be named: SupportsJoins() & GetJoinTypes(). Again, if future enhancements are ever made to FdoISelect, the need to change each provider would be eliminated.
    • Pros: this solution will not add more complexity to the API.
    • Cons: We need to change all the providers leading to a big effort from everyone.
  • Enhance FdoISelect, add two new methods GetFeatureClassNames() and GetJoinCriteria() with the default implementation of these methods throwing an exception. Providers which want to implement these new methods must override them and provide a detailed implementation. Even though FdoISelect is declared as interface we do not have real interfaces in the FDO C++ API, and abstract methods can co-exist with those that have a default implementation. A good example here are the locking methods on the select command. Most providers do not support locking and have to provide an empty implementation (usually to throw an exception), creating more code on provider side. A better plan would have provided a default implementation in the FdoISelect base class. Following this idea, this alternative will just modify the FdoISelect base class, allowing providers that that will not support joins to remain unchanged. These providers will inherit the default implementation. Looking at FdoISelect there are already two methods which provide default implementation: AddRef() and Release(), so adding default implementation for the new methods will not be something totally new. As with option 2 above: Two new capability functions will be added to the FDO FdoIConnectionCapabilities Interface so that applications can determine if a provider supports the join criteria and supported join types. These new capabilites would be named: SupportsJoins() & GetJoinTypes().
    • Pros: this solution will not add more complexity to the API, we do not need to change any provider.
    • Cons: None.

Proposed Solution

The last option seems to be the best solution since will not enforce other providers to add a blank implementation for those new methods and also will not add a new command. We avoid having two select commands one "extended" and one "join" this way we also eliminate a confusion which might show-up in commands naming. Also it opens a different approach to update a base class without changing all providers to add a blank implementation.

The new methods added on FdoISelect will have a default implementation throwing an exception:

  class FdoISelect : public FdoIBaseSelect
  {
     /// new added methods

     /// \brief
     /// Gets the feature class names collection.
     /// 
     /// \return
     /// Returns the collection.
     /// 
     FDO_API virtual FdoIdentifierCollection* GetFeatureClassNames()
     {
          throw FdoException::Create(FdoException::NLSGetMessage(FDO_NLSID(FDO_3_NOTIMPLEMENTED)));
     }

     /// \brief
     /// Gets the join criteria collection.
     /// 
     /// \return
     /// Returns the collection.
     /// 
     FDO_API virtual FdoJoinCriteriaCollection* GetJoinCriteria()
     {
          throw FdoException::Create(FdoException::NLSGetMessage(FDO_NLSID(FDO_3_NOTIMPLEMENTED)));
     }
  }

Selecting form Multiple Classes

We should be able to select from multiple classes and have filters applied on any class attributes. In order to be able to achieve that objective, we can use FdoISelect::GetFeatureClassNames() and add all classes we want to select from. The filter must be based on properties from the selected classes and filter can be validated at run time when the whole select is built. In this collection we can have:

  • FdoIdentifier’s in case we just need to pass a class name.
  • FdoComputedIdentifier’s having a name and as expression value an FdoIdentifier, this way we can get aliases, e.g. ClassName AS NewClassName. We already use computed identifiers in our API for expressions and we provide a name and an expression value. In this case it will be the same, since an alias is actually an expression: we provide an alias (name) and the real name of the class.
  • FdoClassComputedIdentifier’s which is a FdoISelect having a name. This option might not be supported by all RDBMS providers and it depends of the internal implementation. This class behaves like a computed class which will be evaluated at run time and really depends of the provider implementation. This new class will help to define sub-selects. Refer to Handle Sub-Selects below.

The following sections will use some SQL examples just to illustrate things we are trying to achieve, each RBDMS-based provider will translate FDO selects in server side SQL depending of the server capabilities.

SELECT * FROM class1 AS p1, class2 AS p2 WHERE p1.ID=p2.FeatId AND SPATIAL_COND(p1.GEOM, GEOM_VAL);

Below we added some C++ code on how caller can use this new improvement to achieve above select statement:

      FdoPtr<FdoISelect> sel = static_cast<FdoISelect*>(conn->CreateCommand(FdoCommandType_Select));
      FdoPtr<FdoIdentifierCollection> fclasscoll = = sel->GetFeatureClassNames();

      FdoPtr<FdoIdentifier> fcname1 = FdoIdentifier::Create(L"class1");
      FdoPtr<FdoComputedIdentifier> fcp1 = FdoComputedIdentifier::Create(L"p1", fcname1);
      fclasscoll->Add(fcp1);

      FdoPtr<FdoIdentifier> fcname2 = FdoIdentifier::Create(L"class2");
      FdoPtr<FdoComputedIdentifier> fcp2 = FdoComputedIdentifier::Create(L"p2", fcname2);
      fclasscoll->Add(fcp2);

      FdoPtr<FdoFilter> filter = FdoFilter::Parse(L"p1.ID=p2.FeatId AND Geometry INTERSECTS GeomFromText('CURVESTRING(...)')");
      sel->SetFilter(filter);

      FdoPtr<FdoIFeatureReader> rdr = sel->Execute();

NOTE: In the base class we have a method named GetFeatureClassName, in case caller will use it and the FdoIdentifier is not added into GetFeatureClassNames we can add it at Execute time.

Handling Joins

Below we define new classes to define and use a join criteria used in the join select.

      class FdoJoinCriteria : public FdoIDisposable
      {
      public:
          /// Constructs a default instance of a join criteria.
          FDO_API static FdoJoinCriteria* Create();

          /// Constructs a full defined instance of a join criteria.
          FDO_API static FdoJoinCriteria* Create(FdoString* name, 
                                                 FdoIdentifier* joinClass, 
                                                 FdoJoinType joinType, 
                                                 FdoFilter* filter);

          /// Returns the alias name for the join criteria class
          FDO_API FdoString* GetName();

          /// Sets the alias name for the join criteria class
          FDO_API void SetName(FdoString* value);

          /// Returns the identifier of the right join class
          FDO_API FdoIdentifier* GetJoinClass();

          /// Sets the identifier of the right join class
          FDO_API void SetJoinClass(FdoIdentifier* value);

          /// Returns the join type, e.g. inner join, cross join, left join
          FDO_API FdoJoinType GetJoinType();

          /// Sets the join type, e.g. inner join, cross join, left join
          FDO_API void SetJoinType(FdoJoinType value);

          /// Returns the join filter, e.g. LEFT JOIN rclass ON(filter)
          FDO_API FdoFilter* GetFilter();

          /// Sets the join filter, e.g. LEFT JOIN rclass ON(filter)
          FDO_API void SetFilter(FdoFilter* value);
      };

      /// Join type enumeration used to specify the join type
      enum FdoJoinType
      {
          /// default join type. 
          FdoJoinType_None = 0x01,
          /// inner join
          FdoJoinType_Inner = 0x02,
          /// right outer join
          FdoJoinType_RightOuter = 0x04,
          /// left outer join
          FdoJoinType_LeftOuter = 0x08,
          /// full outer join
          FdoJoinType_FullOuter = 0x10,
          /// cross join
          FdoJoinType_Cross = 0x20
      };

We need to add two new methods in connection capabilities: supports join selects (to avoid an exception when a user will try to use these two new methods from FdoISelect) and also supported join types which can be a collection or a value as OR between values, since all join types cannot be more than we can fit in a int32 using OR e.g.: val1|val2.

      class FdoIConnectionCapabilities
      {
          virtual bool SupportsJoins();
          virtual FdoInt32 GetJoinTypes();
      }

We also need a collection to be defined for join criteria:

      class FdoJoinCriteriaCollection : public FdoCollection<FdoJoinCriteria, FdoCommandException>
      {
      public:
          /// Default constructor
          FDO_API static FdoJoinCriteriaCollection* Create();

          /// Returns FdoJoinCriteria having the name. 
          /// This will throw an exception in case item is not found
          FDO_API FdoJoinCriteria* GetItem(FdoString* name);
        
          /// Returns FdoJoinCriteria having the name or NULL. 
          FDO_API FdoJoinCriteria* FindItem(FdoString* name);

          /// Returns FdoJoinCriteria having the index. 
          FDO_API FdoJoinCriteria* GetItem(FdoInt32 index);
      };

Join criteria will be used to generate join SQL commands, like:

SELECT p.FeatId AS xFID,p.Geometry,pxy.Name,pxy.X,pxy.Y,pex.NameB,pex.Link FROM p
INNER JOIN pxy ON(p.FeatId = pxy.FeatId)
INNER JOIN pex ON(pex.FID = pxy.FeatId)
WHERE p.FeatId >= 0;

Below we added some C++ code on how caller can use this new improvement to achieve above select statement:

      FdoPtr<FdoISelect> sel = static_cast<FdoISelect*>(conn->CreateCommand(FdoCommandType_Select));
      FdoPtr<FdoIdentifierCollection> fclasscoll = = sel->GetFeatureClassNames();
      FdoPtr<FdoJoinCriteriaCollection> jcrit = sel->GetJoinCriteria();

      FdoPtr<FdoIdentifier> fcpoint = FdoIdentifier::Create(L"p");
      fclasscoll->Add(fcpoint);

      FdoPtr<FdoIdentifier> jcCls1 = FdoIdentifier::Create(L"pxy");
      FdoPtr<FdoFilter> flcr1 = FdoFilter::Parse(L"p.FeatId=pxy.FeatId");

      FdoPtr<FdoJoinCriteria> jc1 = FdoJoinCriteria::Create(L"pxy", jcCls1, FdoJoinType_Inner, flcr1);
      jcrits->Add(jc1);

      FdoPtr<FdoIdentifier> jcCls2 = FdoIdentifier::Create(L"pex");
      FdoPtr<FdoFilter> flcr2 = FdoFilter::Parse(L"pex.FID=pxy.FeatId");

      FdoPtr<FdoJoinCriteria> jc2 = FdoJoinCriteria::Create(L"pex", jcCls2, FdoJoinType_Inner, flcr2);
      jcrits->Add(jc2);

      FdoPtr<FdoIFeatureReader> rdr = sel->Execute();

Handle Sub-Selects

Below we define a new class to define sub-select used in the new select.

      class FdoComputedClassIdentifier : public FdoIdentifier
      {
      public:
          /// Constructs a default instance of a computed class identifier.
          FDO_API static FdoComputedClassIdentifier* Create();

          /// Constructs an instance of a computed class identifier 
          /// using the specified arguments.
          FDO_API static FdoComputedClassIdentifier* Create(
                                         FdoString* name, 
                                         FdoISelect* select);

          /// Gets the sub-select of the computed class identifier.
          FDO_API FdoISelect* GetSelect();

          /// Sets the sub-select of the computed class identifier.
          FDO_API void SetSelect(FdoISelect* value);

          /// Returns the type of the expression used to identify the
          /// type and avoid dynamic casts
          FDO_API FdoItemExpressionType GetExpressionType() 
          { 
              return FdoItemExpressionType_ComputedClassIdentifier; 
          }
      };

Named sub-select will be used to generate SQL commands, like:

SELECT * FROM point WHERE FeatId IN (SELECT pt.FeatId FROM pt WHERE pt.FeatId >= 2 AND SPATIAL_COND(pt.GEOM, GEOM_VAL));

In above case we will have two selects having following filters:

Main-Select: FeatId IN(:SUBSEL)
Sub-Select : pt.FeatId >= 2 AND SPATIAL_COND(pt.GEOM, GEOM_VAL)

Below we added some C++ code on how caller can use this new improvement to achieve above select statement:

      FdoPtr<FdoISelect> sel = static_cast<FdoISelect*>(conn->CreateCommand(FdoCommandType_Select));
      FdoPtr<FdoIdentifierCollection> fclasscoll = = sel->GetFeatureClassNames();

      FdoPtr<FdoIdentifier> fcpoint = FdoIdentifier::Create(L"point");
      fclasscoll->Add(fcpoint);

      FdoPtr<FdoISelect> subsel = static_cast<FdoISelect*>(conn->CreateCommand(FdoCommandType_Select));
      FdoPtr<FdoIdentifierCollection> ssfclasscoll = = subsel->GetFeatureClassNames();
      FdoPtr<FdoIdentifier> fcpt = FdoIdentifier::Create(L"pt");
      ssfclasscoll->Add(fcpt);
      FdoPtr<FdoFilter> ssfilter = FdoFilter::Parse(L"pt.FeatId>=2 AND Geometry INTERSECTS GeomFromText('CURVESTRING(...)')");
      subsel->SetFilter(ssfilter);

      FdoPtr<FdoComputedClassIdentifier> fcSubSel = FdoComputedClassIdentifier::Create(L"SUBSEL1", subsel);
      fclasscoll->Add(fcSubSel);

      FdoPtr<FdoFilter> filter = FdoFilter::Parse(L"FeatId IN (:SUBSEL1)");
      sel->SetFilter(filter);

      FdoPtr<FdoIFeatureReader> rdr = sel->Execute();

In order to be able to support sub-selects for IN operator we need provide a way to pass it through FDO Filter and make the parser to recognize the filter, since FDO Filter parser don’t know SQL syntax. The best way to achieve that is to pass a parameter in the filter.

e.g.: FeatId IN (:PARAM).

The :PARAM parameter will be replaced at execution time with the sub-select statement. In this way we allow the FDO IN() operator to handle sub-selects.

As new additions, the FDO API will define three new classes:

  • FdoJoinCriteria
  • FdoJoinCriteriaCollection
  • FdoComputedClassIdentifier

and two enumerations

  • FdoItemExpressionType
  • FdoJoinType

The API will also be updated to include two new command capabilities in order to be able to detect if those methods are supported by the provider. Otherwise caller might get exceptions from the default implementation.

  • SupportsJoins
  • GetJoinTypes

At FdoISelect::Execute time a feature reader will be returned, even though there might be cases when the join class returned by the reader will not contain a primary property or a geometry property. In cases where select commands do not provide a property selection list, all properties should be added to the feature class reader built at runtime. In cases where applications request a subset of properties, only those properties should be added to the feature class, even if that means the returned feature class might then contain no identity properties or geometry properties. In the worst case, a provider implementation can add all identity properties from the selected classes to the resulting identity property list of the joined class definition. Also, in case there is no way to return an identity property, providers can add a non-identity property to the identity property collection of the returned class as a shortcut to allow a primary property to be defined. (NOTE that the returned class will be read-only since there is no way to know if the resulting class can be updated.)

FdoISelectAggregates interface will be updated in the same manner in order to allow performing aggregates selects having sub-selects in the filter side.

Update and delete commands will not be changed to allow such complex filters, however we can update those commands doing a similar RFC in the future if is needed.

Small improvements in FDO

Handling FDO expressions to avoid dynamic casts

In the current API in order to make a difference between FdoIdentifier and FdoComputedIdentifier we need to use dynamic cast and that’s not too fast. There is a simple way to avoid many dynamic casts by adding a new abstract method in FdoExpression named GetExpressionType(). All main derivate classes from FdoExpression need to return a type without adding a new parameter this way we avoid increasing the memory consumption by FDO expression objects. A new enumeration will be added, see below, and classes like FdoIdentifier will have to implement this method.

      /// The FdoExpressionType enumeration defines the expression types available.
      enum FdoItemExpressionType
      {
         /// Identifier e.g. PROP
         FdoItemExpressionType_Identifier,
         /// Identifier e.g. PROP*5 AS EXP
         FdoItemExpressionType_ComputedIdentifier,
         /// Identifier e.g. SELECT * FROM C1 AS CLS
         FdoItemExpressionType_ComputedClassIdentifier,
         /// Parameter e.g. :PROP
         FdoItemExpressionType_Parameter,
         /// Function e.g. AVG()
         FdoItemExpressionType_Function,
         /// Expression e.g. A+B
         FdoItemExpressionType_BinaryExpression,
         /// Expression e.g. -A
         FdoItemExpressionType_UnaryExpression,
         /// Data value e.g. 12, 12.4 'text'
         FdoItemExpressionType_DataValue,
         /// Geometry e.g. 12, 12.4 'text'
         FdoItemExpressionType_GeometryValue
      };

      class FdoIdentifier : public FdoExpression
      {
          /// Returns FdoItemExpressionType
          FDO_API virtual FdoItemExpressionType GetExpressionType() 
          { 
              return FdoItemExpressionType_Identifier; 
          }
      }

Update the FDO Filter to handle custom names

FDO filter parser needs to be updated since we support only Class.Property e.g.: "A.BB >=10". We need to support also formats like:

  • Class."Property"
  • "Class".Property
  • "Class"."Property"

This change is required in order to be able to support properies names having special characters, like: myClass."My Property" > 10. This change will not invlove any API change and will be done like a bug fix.

Managed FDO API

The FDO Managed Interfaces will be updated in a similar manner to reflect the proposed changes.

Provider Implementation

Initially, we need this for SQLite provider. Other provider developers can add it and use it as needed. The providers which decide to use this new select need to implement the new methods from the FdoISelect and return the right command capabilities.

Test Plan

Run existing FDO core unit tests and expand SQLite unit tests to test the proposed enhancements defined above.

Future Enhancements

If needed, similar extensions can be done to FDO Update, Delete, and other commands that use filters so that they can take advantage of sub-selects.

Other providers can be enhanced over time to implement these changes as needed.

Funding/Resources

Autodesk to provide resources / funding.

Note: See TracWiki for help on using the wiki.