wiki:FDORfc16

Version 28 (modified by traianstanev, 15 years ago) ( diff )

Make wording clearer.

FDO RFC 16 - FDO Provider for SQLite

This page contains a request for comments document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the RFCs page.

Status

RFC Template Version(1.0)
Submission Date2008-03-18
Last ModifiedTraian Stanev 2008-03-29
AuthorTraian Stanev
RFC StatusAdopted
Implementation Statusunder development
Proposed Milestone3.4.0.0
Assigned PSC guide(s)Jason Birch
Voting History2008-04-03
+1Greg, Orest, Jason, Mateusz, Haris, Bob, Frank
+0
-0
-1

Updates

  • April 20, 2008 (jbirch) - Modified to explicitly specify the contents of the spatial_ref_sys table. Changed VARCHAR references to TEXT because VARCHAR is only a valid SQLite type through inference.

Overview

This RFC proposes a new FDO provider for accessing (spatial) data stored in SQLite database files.

Motivation

SQLite is an efficient and popular way to store data. Support for accessing SQLite databases via FDO would expand the range of data sources supported by FDO. A SQLite format for storing spatial data would make a good interchange format, similar to, but more flexible than Shapefiles.

What makes a valid spatial SQLite database?

The goal of this simple specification is to allow access to SQLite databases in as open and standard way as possible. This document does not conform to the OGC Simple Features for SQL implementation specification (06-104r3), but it does leverage the specification where practical.

Here is the minimum set of characteristics expected in SQLite database:

  • well-defined subset of the geometry_columns table (as per SF spec, section 7.1.3.2, 7.1.3.3); see below for specific column requirements,
  • spatial_ref_sys table (as per SF spec, section 7.1.2.2, 7.1.2.3). The unique SRID and valid WKT spatial reference text MUST be defined for each unique coordinate system used in the database.
  • geometry_format column extending the geometry_columns table, set to one of WKB, WKT, WKB12, WKT12 or FGF. If the column is not present or the value is null, WKB will be assumed.
  • one or more tables to store the geometry and associated features. Although in general type does not matter in SQLite, BLOB type is recommended at least for WKB and FGF formats as it does not perform character translation on the data.

Note that this implementation does not currently include support for the enhanced multi-dimensional WKB and WKT formats that are defined in the OGC Simple Feature Access specification 06-103r3. The tags WKB12 and WKT12 are reserved for future use in support of this enhanced format. FGF refers to the FDO Binary Geometry Format.

geometry_columns Subset

SQLite databases require the following fields in their geometry_columns table. Other fields are permitted, but will not affect data access. For detailed explanations of all of the standard fields (everything but geometry_format), see the SFSQL 06-104r3 document, section 7.1.3.3.

Column Name Type Notes
f_table_name TEXT name of a feature table with a geometry column.
f_geometry_column TEXT name of a geometry column in the feature table
geometry_type INTEGER integer value representing the form of geometry (1=POINT, 2=LINESTRING, etc). See Table 4 on Page 29-31 of the SFSQL 06-104r3 spec for a full list.
coord_dimension INTEGER dimensionality of the geometric features: 2, 3 or 4
srid INTEGER foreign key into the spatial_ref_sys table
geometry_format TEXT One of "WKB", "WKT", "WKB12", "WKT12", "FGF"

The suggested definition is:

CREATE TABLE geometry_columns (
            f_table_name TEXT, 
            f_geometry_column TEXT, 
            geometry_type INTEGER, 
            coord_dimension INTEGER, 
            srid INTEGER,
            geometry_format TEXT )

spatial_ref_sys

SQLite databases require the following fields in their spatial_ref_sys table.

Column Name Type Notes
srid INTEGER an integer value that uniquely identifies each Spatial Reference System within a database
auth_name TEXT the name of the standard or standards body that is being cited for this reference system. EPSG would be an example of a valid AUTH_NAME.
auth_srid INTEGER the ID of the Spatial Reference System as defined by the Authority cited in AUTH_NAME. 26910 would be an example of a valid EPSG srid (UTM Zone 10 NAD830).
srtext TEXT The Well-known Text Representation of the Spatial Reference System.

The suggested definition is:

CREATE TABLE spatial_ref_sys (
             srid INTEGER UNIQUE,
             auth_name TEXT,
             auth_srid INTEGER,
             srtext TEXT  )

Proposed FDO Provider

The proposed SQLite provider will have the following features:

  • Data stored and consumed in native SQLite format, and therefore accessible to other SQLite-based applications;
  • Geometry will be serialized in FDO geometry format (FGF), WKB, or WKT and stored in a BLOB column. The type of geometry encoding will be specified in the geometry_columns table in a column named geometry_format. In addition, the provider may support point geometry with coordinates stored in non-geometric columns, via a connection string setting.
  • A non-persistent spatial index usable for BBOX queries, built on the fly and valid for the duration of the FDO connection;
  • Execution of arbitrary SQL commands;
  • Metadata about feature geometry and coordinate systems will be stored in OGC geometry_columns and spatial_ref_sys tables;
  • A tentative list of supported FDO commands : DescribeSchema, Select, SelectAggregates, Insert, Update, Delete, GetSpatialContexts, arbitrary SQL, ApplySchema, CreateDataStore;

FDO features the provider will initially *not* support:

  • Feature class inheritance;
  • Geometry filters other than BBOX;
  • FDO Transactions -- but you can still manually execute BEGIN/COMMIT around other FDO commands;
  • Associations;
  • Constraints;
  • Any other exotic feature you can think of.

Implications

This is a new provider, so there will be no issues with backwards compatibility. We need to coordinate with the GDAL/OGR project (and any other data access libraries that express an interest) to make sure the metadata and geometry formats are usable cross-platform.

Implementation Plan

This section goes into some detail about how FDO constructs will be implemented with a SQLite back end.

  1. Spatial Context

In FDO schemas, the spatial context association of geometry properties is the string name of the spatial context. This will have to be mapped to an SRID integer, which is what the spatial_ref_sys table and geometry_columns table use to link geometry column to its spatial ref sys.

  1. FDO Data type mappings

SQLite only supports integer, real, string and blob data types. All integer FDO data types will map to SQLite integers. Double and Single will map to Real. DateTime will map to a string, using ISO 8601 standard date time format. Decimal maps to Real. BLOB and geometry map to BLOBs and strings map to string.

  1. Insert command

The provider will feature the following peculiarity of the Fdo Insert command implementation: if you reuse the same FdoIInsert instance to insert several features, these will be inserted within the same SQLite transaction, which closes when the command object is disposed. Inserts are extremely slow in SQLite unless they are batched into a single transaction, and this behavior of insert will help bulk feature inserts.

  1. Select and Select Aggregate filters

Initially, FDO filters will be converted to SQL where clauses directly using the ToString() method. Therefore, any functions used within the clause need to be available or registered with the SQLite SQL engine. SQLite does have a way to bind external functions to the SQL engine, which is how we can implement most of the FDO expression functions (like sin, cos etc).

There will be code which detects and extracts BBOX spatial conditions from the filter before doing the above conversion to string. The provider will use its spatial index evaluate the spatial component of the filter. The SQLite SQL engine will be used for evaluating the remaining part of the filter.

This approach is the same as the one used in the OGR provider.

  1. Multithreading

The provider will not be multi-threaded. For simulataneous access, to the same database, one would have to open two connections.

  1. Spatial Index

The provider will initially use an in-memory spatial index, built up during the first spatial filter request. Serialization of the spatial index is possible, but will not be initially implemented. Once we have spatial index serialization, multiple connections will be able to use the same index via memory mapping. Until then, each connection will have its own spatial index. Due to 5., this may end up consuming more memory than it should.

  1. ApplySchema

The initial implementation of ApplySchema will allow addition of new feature classes, but not modification of existing ones. If a feature class specified in the ApplySchema schema already exists, it will be skipped.

  1. CreateDataStore

This command will create a new sqlite database and also create empty spatial_ref_sys and geometry_columns table. If the database file already exists, the two tables will be created in the existing database.

  1. Autogenerated ID

The provider will not support composite keys. The recommended key will be auto-increment integer (32 or 64 bit). This will map directly to the internal SQLite row id. It will be possible to use other key types via the SQL UNIQUE constraint on the key column.

  1. SQLite library

Some modifications to and creative uses of SQLite's VDBE (SQL engine) will be necessary in order to improve read performance. Therefore the provider will use its own copy of SQLite. Note that this does not affect the file format in any way, just improves read performance when reading from within an FdoIFeatureReader.

Test Plan

There will be a small test suite that tests often used functionality. Arbitrary SQL execution will not be tested extensively.

Funding/Resources

References

Attachments (1)

Download all attachments as: .zip

Note: See TracWiki for help on using the wiki.