Google Summer of Code 2021
Introduction
So you are interested in becoming a Google Summer of Code student. What should you do to improve your chances of being selected? We recommend the following:
- Read OSGeo's GSoC Recommendations for Students
- We currently have one project in mind, with a mentor ready to help: see Idea 1 below. We are open to other ideas, but Idea 1 takes precedence because it already has a mentor.
- Join the PostGIS Developers list and describe your proposed project (or your willingness to work on Idea 1). We will let you know whether we think the project is worthwhile and doable within the time allotted.
- If you are looking for additional ideas, refer to our past GSoC pages.
Improving your chances
For most projects involving PostGIS you will eventually need the following:
- Know how to install PostgreSQL
- Know how to install PostGIS in PostgreSQL
- Know how to compile PostgreSQL code
- Know how to compile PostGIS code and run tests
- Some basic knowledge of git -- at least how to do a git clone, git push, git pull and pull requests
While you can learn to do these things and ask questions, we would prefer students to know these before starting on a PostGIS project.
Idea 1: Augment PostGIS 3.2 with the GiST support added in PostgreSQL 14
Expected outcome: Speed up GiST index building in PostGIS
Skills required: C (or willingness to learn it), ability to compile PostgreSQL code, ability to compile PostGIS code; some familiarity with PostGIS / PostgreSQL is preferable
Mentors: Giuseppe Broccolo, Regina Obe
Difficulty: Medium
Student Test:
- git clone the PostGIS code from one of the Git repos
- git clone the code from the PostgreSQL git repo (master branch)
- Compile both and install the PostGIS 3.2 (master branch) extension into a PostgreSQL 14 dev database
- Set up a public fork of the PostGIS repo for your work
Additional detail:
A recent patch <https://commitfest.postgresql.org/29/2276/> that adds more infrastructure to GiST has been included in PostgreSQL 14. It should speed up the build of a GiST index by first running a (fast) presort of the data to be indexed. Tests with PostgreSQL's built-in point type, which uses a Z-order sort of the points as the fast presort, showed builds up to 5 times faster.
We need to find a suitable implementation for PostGIS data types as well, which means finding the best algorithm for presorting the geometries before the geospatial index is built. Basically, this would require adding this support function <https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L37> and this other one <https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L38> to the operator classes that use the GiST infrastructure (e.g. this one <https://github.com/postgis/postgis/blob/8b13c3e2f8366d902dbf516ec17de09ae84361f4/postgis/postgis.sql.in#L781>).
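To make the presorting idea more concrete, here is a minimal standalone C sketch of one candidate ordering: sorting bounding boxes by the Z-order (Morton) code of their centers. It is only an illustration and not PostGIS or PostgreSQL code. The Box2D struct, the function names, and the assumption that the data extent is known in advance are all hypothetical; a real GiST sort support function receives index keys through PostgreSQL's own interfaces and would not normally know the extent.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 2D bounding box; the real PostGIS index key type differs. */
typedef struct {
    float xmin, ymin, xmax, ymax;
} Box2D;

/* Sort key paired with the position of the box in the input array. */
typedef struct {
    uint32_t key;
    size_t index;
} KeyedItem;

/* Spread the low 16 bits of v so a zero bit sits between each of them. */
static uint32_t spread_bits16(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Interleave two 16-bit grid coordinates: x in even bits, y in odd bits. */
static uint32_t morton2d(uint16_t x, uint16_t y)
{
    return spread_bits16(x) | (spread_bits16(y) << 1);
}

/* Map v from [lo, hi] onto a 16-bit grid, clamping out-of-range values. */
static uint16_t quantize(double v, double lo, double hi)
{
    double t;

    if (hi <= lo)
        return 0;
    t = (v - lo) / (hi - lo);
    if (t < 0.0)
        t = 0.0;
    if (t > 1.0)
        t = 1.0;
    return (uint16_t)(t * 65535.0);
}

/* Z-order key of the box center, relative to an assumed known extent. */
static uint32_t box_sort_key(const Box2D *b, const Box2D *extent)
{
    double cx = ((double)b->xmin + (double)b->xmax) / 2.0;
    double cy = ((double)b->ymin + (double)b->ymax) / 2.0;

    return morton2d(quantize(cx, extent->xmin, extent->xmax),
                    quantize(cy, extent->ymin, extent->ymax));
}

/* Order by key; ties are broken by input position to keep the sort stable. */
static int cmp_keyed(const void *a, const void *b)
{
    const KeyedItem *ka = a;
    const KeyedItem *kb = b;

    if (ka->key != kb->key)
        return ka->key < kb->key ? -1 : 1;
    return (ka->index > kb->index) - (ka->index < kb->index);
}

/*
 * Fill order[0..n-1] with the indexes of boxes in Z-order of their centers.
 * Returns 0 on success, -1 if memory allocation fails.
 */
int zorder_presort(const Box2D *boxes, size_t n, const Box2D *extent,
                   size_t *order)
{
    KeyedItem *items;
    size_t i;

    if (n == 0)
        return 0;
    assert(boxes != NULL && extent != NULL && order != NULL);

    items = malloc(n * sizeof *items);
    if (items == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        items[i].key = box_sort_key(&boxes[i], extent);
        items[i].index = i;
    }
    qsort(items, n, sizeof *items, cmp_keyed);
    for (i = 0; i < n; i++)
        order[i] = items[i].index;

    free(items);
    return 0;
}
```

Calling zorder_presort on an array of boxes yields an index permutation in which spatially close boxes tend to be adjacent, which is the property a sorted index build relies on. Whether box centers on a Z-order curve are the best choice for geometries, compared with for example a Hilbert curve, is exactly the kind of question the project would need to benchmark.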
Tests are needed to quantify the performance improvement in index build time across the different geometry types.