This RFC proposes a convenience wrapper API for FDO. The basic features of the API are:

 * Mostly procedural API (unlike the object-oriented, command-based structure of FDO Core).
   This significantly reduces object lifetime issues, removes the need for refcounting
   and eases writing a possible managed wrapper, because a single object controls the database connection lifetime.
 * Shortcuts for commonly used functionality, such as spatial queries, fetching extents and feature counts.
 * Automatic type conversion (a.k.a. duck typing). Unlike core FDO, the API converts automatically between compatible types
   (no more need to switch-case on FdoByte, FdoInt16, FdoInt32, FdoInt64 and FdoDecimal just to get an integer).
 * Thread-safe connection pooling/caching. This is a common feature that developers often have to implement
   from scratch.
 * Possibility to provide an alternative backend to the API (for example, an OGR backend).
 * Possibility to implement common features, like coordinate system transformations,
   in the wrapper layer, in a single common piece of code for all provider backends.
 * Automatic backend data source resolution, similar to OGR
   (i.e. if the connection string is a path to an SHP file, the SHP provider will automatically be loaded and used).


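To make the automatic type conversion concrete, here is a minimal sketch of how a value object could hide its storage type behind conversion accessors. The Value class below, its std::variant storage and the AsInt64() accessor are assumptions made for illustration; the real definitions are in the attached headers. FdoDecimal is modelled as a double, and C++17 is assumed.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical stand-in for a column value; not the wrapper's actual class.
class Value {
public:
    Value() = default;  // holds std::monostate, i.e. a null value
    explicit Value(std::uint8_t v) : data_(v) {}
    explicit Value(std::int16_t v) : data_(v) {}
    explicit Value(std::int32_t v) : data_(v) {}
    explicit Value(std::int64_t v) : data_(v) {}
    explicit Value(double v) : data_(v) {}
    explicit Value(std::wstring v) : data_(std::move(v)) {}

    bool IsNull() const { return std::holds_alternative<std::monostate>(data_); }

    // One accessor for every numeric storage type: the caller never
    // switches on the underlying type. Doubles are truncated, strings are
    // parsed (std::stoll throws if the text is not a number), null yields 0.
    std::int64_t AsInt64() const {
        return std::visit([](const auto& v) -> std::int64_t {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, std::monostate>) {
                return 0;
            } else if constexpr (std::is_same_v<T, std::wstring>) {
                return std::stoll(v);
            } else {
                return static_cast<std::int64_t>(v);
            }
        }, data_);
    }

    // String conversion for any storage type; null becomes an empty string.
    std::wstring AsString() const {
        return std::visit([](const auto& v) -> std::wstring {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, std::monostate>) {
                return std::wstring();
            } else if constexpr (std::is_same_v<T, std::wstring>) {
                return v;
            } else {
                return std::to_wstring(v);
            }
        }, data_);
    }

private:
    std::variant<std::monostate, std::uint8_t, std::int16_t, std::int32_t,
                 std::int64_t, double, std::wstring> data_;
};
```

With such a class, a caller that needs an integer calls AsInt64() regardless of whether the column was stored as a byte, a 16-bit integer or a string holding digits.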
== High Level Architecture ==

There are four main objects in the API, reflecting the database-oriented nature of most FDO providers:
 * Database (top level object)
 * Table (a Database can have several Tables)
 * Row (a Table consists of many Rows)
 * Value (a Row consists of many column Values)

The Database is the top level object, and its lifetime controls the lifetime of all other objects.
The API user only has to manage the lifetime of the Database object directly. It can be obtained from
and released to the thread-safe connection pool automatically using an RAII-style DbHandle object,
which avoids manual object disposal completely.

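The handle-and-pool relationship described above can be sketched as follows. This is a simplified model under stated assumptions, not the code in DbPool.h/cpp: the Database struct is an empty stand-in, the pool is an ordinary instance rather than the static DbPool::GetConnection() used in the examples, and OpenedCount() is a hypothetical helper added only to make the reuse observable. The pool must outlive every handle it gives out.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an open connection.
struct Database {
    explicit Database(std::wstring p) : path(std::move(p)) {}
    std::wstring path;
};

class DbPool;

// RAII handle: gives the connection back to the pool when it goes out of scope.
class DbHandle {
public:
    DbHandle(DbPool* pool, std::unique_ptr<Database> db)
        : pool_(pool), db_(std::move(db)) {}
    DbHandle(DbHandle&&) = default;
    DbHandle(const DbHandle&) = delete;
    DbHandle& operator=(const DbHandle&) = delete;
    ~DbHandle();

    explicit operator bool() const { return db_ != nullptr; }
    Database* operator->() const { return db_.get(); }
    Database& operator*() const { return *db_; }

private:
    DbPool* pool_;
    std::unique_ptr<Database> db_;
};

class DbPool {
public:
    // Reuse an idle connection for this path if there is one, else open a new one.
    DbHandle GetConnection(const std::wstring& path) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::unique_ptr<Database>>& idle = idle_[path];
        std::unique_ptr<Database> db;
        if (!idle.empty()) {
            db = std::move(idle.back());
            idle.pop_back();
        } else {
            db = std::make_unique<Database>(path);  // stands in for opening a connection
            ++opened_;
        }
        return DbHandle(this, std::move(db));
    }

    // Called by ~DbHandle; user code never calls this directly.
    void Release(std::unique_ptr<Database> db) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::unique_ptr<Database>>& idle = idle_[db->path];
        idle.push_back(std::move(db));
    }

    // Hypothetical helper: number of connections actually opened so far.
    int OpenedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return opened_;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::wstring, std::vector<std::unique_ptr<Database>>> idle_;
    int opened_ = 0;
};

inline DbHandle::~DbHandle() {
    if (db_) pool_->Release(std::move(db_));  // a moved-from handle owns nothing
}
```

Because the handle is the only owner visible to user code, leaving the scope is enough to return the connection; a second request for the same path then reuses it instead of opening a new one.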
== Examples ==

The detailed API is available in the header files attached to this document. Here are some examples
that illustrate the basics.

=== Example 1: Read and print the contents of a data source ===

This code sample iterates over all features of all tables of a database
and prints out the value of each column. Note the automatic data type conversion
to string and the simple access to each column by index (via operator[]).
Column value access by name is also possible using operator[].

void PrintFile(const wchar_t* path)
{
    printf("Data source: %ls \n", path);

    //get a connection from the pool
    DbHandle srcdb = DbPool::GetConnection(path);

    if (!srcdb)
        return;

    //for all feature classes
    for (int q=0; q<srcdb->Count(); q++)
    {
        Table& srctbl = (*srcdb)[q];

        //table name
        printf("Table: %ls\n", srctbl.Def().Name());

        //show the overall extent
        double ext[4];
        srctbl.GetExtent(ext);
        printf("Bounds: %.8g, %.8g, %.8g, %.8g\n", ext[0], ext[1], ext[2], ext[3]);

        //number of features
        printf("Total feature count: %lld\n", srctbl.GetRowCount());

        //no spatial reordering -- just run through a full table select
        srctbl.Query();

        while (srctbl.Next())
        {
            const Row& r = srctbl.At();

            printf("\tFeature: %lld\n", r.ID());

            for (int j=0; j<r.Def().ColumnCount(); j++)
            {
                //skip the fid which we printed above
                if (j == r.Def().IndexOfID())
                    continue;

                printf("\t\t%ls :\t%ls\n", r.Def()[j].Name(), r[j].AsString());
            }
        }

        srctbl.EndQuery(); //not strictly needed
    }
}

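The two forms of operator[] mentioned for Example 1 (by index and by name) can be modelled with a pair of overloads. The Row class below is a hypothetical simplification: it stores column names next to string values and throws on an unknown name, neither of which is claimed to match the attached BaseWrap.h.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row model; the real Row keeps its column definitions in Def().
class Row {
public:
    void Add(std::wstring name, std::wstring value) {
        names_.push_back(std::move(name));
        values_.push_back(std::move(value));
    }

    // Access by index: a direct, bounds-checked lookup.
    const std::wstring& operator[](std::size_t index) const {
        return values_.at(index);
    }

    // Access by name: a linear search for the column, then the same lookup.
    const std::wstring& operator[](const std::wstring& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return values_[i];
        }
        throw std::out_of_range("no such column");
    }

private:
    std::vector<std::wstring> names_;
    std::vector<std::wstring> values_;
};
```

In this model the by-name overload pays for a name search on every call, which is one reason a caller in a tight loop might prefer to look up the index once and then use the by-index form.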
=== Example 2: Bulk copy from one data source to another (e.g. SHP to SQLite conversion) ===

This example copies the source data verbatim into the destination file.
All that is needed is to call Insert() on the destination table with the source
row. The Insert() call takes a flag indicating whether to preserve the FID column or not,
and optionally a pointer that is filled with the FID of the newly created feature in the
target data store.

void ConvertFDOToFDO(const wchar_t* src, const wchar_t* dst)
{
    printf("%ls ---> %ls\n\n", src, dst);

    //create target data store
    if (!DatabaseFdo::Create(dst))
        return;

    //open source and target connections
    DbHandle srcdb = DbPool::GetConnection(src);
    DbHandle dstdb = DbPool::GetConnection(dst);

    if (!srcdb)
        return;

    if (!dstdb)
        return;

    //copy the schema and coord sys defs
    dstdb->SetSchema(srcdb->Def());

    //for all feature classes
    for (int q=0; q<dstdb->Count(); q++)
    {
        Table& dsttbl = (*dstdb)[q]; //get destination table
        Table& srctbl = *(*srcdb)[dsttbl.Def().Name()]; //get corresponding source table by name

        int count = 0;

        //basic "select *" kind of query
        srctbl.Query();
        //double bbox[4] = { 0, 0, 10000, 10000 };
        //srctbl.Query(bbox); //query with bounding box

        while (srctbl.Next())
        {
            //directly insert source row into target table
            //without any transformation
            dsttbl.Insert(srctbl.At(), NULL, true);

            count++;

            if (count % 10000 == 0) printf("# processed: %d\n", count);
        }

        srctbl.EndQuery();

        printf("\n\nTotal feature count : %d\n", count);
    }
}


== Performance Implications ==

As with every wrapper, there will be some performance overhead to using this wrapper instead of FDO directly.
For simple queries accessing multiple columns, this overhead is not very large, but it can be significant
in cases where the caller needs only one or two columns out of a very wide row. This is fundamentally
because the wrapper API always pre-fills all column values before returning a row, something that
can be fixed at the price of added code complexity.

With OGR as the backend, the performance overhead is very small, because the proposed wrapper maps almost 1:1 to the
underlying OGR API. The main overhead with OGR is geometry conversion from WKB to FGF format.

The above weakness (pre-filling of the row) has the potential to become a strength if a direct backend
is ever implemented, for example for SQLite. Under normal use cases, using FDO requires 2*N virtual function
calls to retrieve N column values of a row, while the wrapper API can do the same with a single virtual function call.
For very fast backends this can make a huge difference, in particular if column values are accessed by index (and not by name).


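The 2*N versus 1 comparison can be illustrated with a small call-counting model. The sketch assumes that the 2*N figure comes from one null check plus one typed getter per column; ColumnReader, RowReader and CountingBackend are hypothetical names and are not part of FDO or of the attached wrapper.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Per-column interface: every value costs a null check and a getter call.
struct ColumnReader {
    virtual ~ColumnReader() = default;
    virtual bool IsNull(std::size_t col) = 0;
    virtual long long GetInt64(std::size_t col) = 0;
};

// Row-at-once interface: one call fills all column values.
struct RowReader {
    virtual ~RowReader() = default;
    virtual void Fill(std::vector<long long>& out) = 0;
};

// Backend that implements both styles and counts virtual calls.
struct CountingBackend : ColumnReader, RowReader {
    explicit CountingBackend(std::vector<long long> row) : row_(std::move(row)) {}

    bool IsNull(std::size_t) override { ++calls; return false; }
    long long GetInt64(std::size_t col) override { ++calls; return row_[col]; }
    void Fill(std::vector<long long>& out) override { ++calls; out = row_; }

    int calls = 0;

private:
    std::vector<long long> row_;
};

// Reads n columns one by one: 2 * n virtual calls.
long long SumPerColumn(ColumnReader& reader, std::size_t n) {
    long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (!reader.IsNull(i)) sum += reader.GetInt64(i);
    }
    return sum;
}

// Reads the pre-filled row: 1 virtual call, then plain vector access.
long long SumPrefilled(RowReader& reader) {
    std::vector<long long> values;
    reader.Fill(values);
    long long sum = 0;
    for (long long v : values) sum += v;
    return sum;
}
```

For a row of four columns, the per-column path makes eight virtual calls and the pre-filled path makes one. The model ignores the cost of copying values the caller never reads, which is the overhead described for wide rows.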
== Attachments ==

Attached is a rough implementation of the wrapper which supports the examples shown above.
For all base API definitions, refer to BaseWrap.h.
Working backends are included for both FDO (FdoWrap.h/cpp) and OGR (OgrWrap.h/cpp).
The FDO backend recognizes SHP, SDF and SQLite connections only.
The OGR backend recognizes all data sources supported by OGR, but does not yet implement data store creation. It does implement insert/update on existing data stores.
The connection pooling is implemented in DbPool.h/cpp.
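
One simple way to recognize SHP, SDF and SQLite connections from a path is to dispatch on the file extension. The sketch below is an assumption about how such a resolution could work, not a description of FdoWrap.cpp: the Backend enum, the ResolveBackend() function and the extensions treated as SQLite ("sqlite" and "db") are all chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical backend identifiers for the three file formats named above.
enum class Backend { Shp, Sdf, Sqlite, Unknown };

// Picks a backend from the file extension of a path, ignoring case.
Backend ResolveBackend(const std::wstring& path) {
    const std::size_t dot = path.find_last_of(L'.');
    if (dot == std::wstring::npos) return Backend::Unknown;

    // A dot that belongs to a directory name is not a file extension.
    if (path.find_first_of(L"/\\", dot) != std::wstring::npos) {
        return Backend::Unknown;
    }

    std::wstring ext = path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](wchar_t c) {
        return static_cast<wchar_t>(std::towlower(c));
    });

    if (ext == L"shp") return Backend::Shp;
    if (ext == L"sdf") return Backend::Sdf;
    if (ext == L"sqlite" || ext == L"db") return Backend::Sqlite;  // assumed extensions
    return Backend::Unknown;
}
```

A connection string that is not a plain file path (for example, a key=value string for a server-based provider) would fall through to Unknown in this sketch and need separate handling.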