Introduction
Idea
In this project, I plan to implement a pre-sorting method in z-order or Hilbert order to improve the performance of GiST index building.
Project proposal
My proposal for GSoC 2021 can be found at https://docs.google.com/document/d/1_mY_F2hPDk3vmXH5PPp2z9BuQWt-ZMORk6KxtdVQ3HY/edit?usp=sharing.
Link to GitHub repository: https://github.com/HanwGeek/postgis
Timeline
17th May - 7th June
Community bonding period:
- Introduced myself on the channel and shared my proposal on the mailing list for suggestions.
- Communicated with mentors and learned about the community and how it works. It was a great experience talking with experts in the domain.
- Forked the PostGIS repository: https://github.com/HanwGeek/postgis
- Updated my wiki user page and added my personal information: https://wiki.osgeo.org/wiki/User:HanwGeek
- Currently studying the codebase.
Coding Week 1 (7th June - 13th June)
Coding Phase :
- Create sort support function bindings for the `geometry` data type
- Implement an `abbrev_convert` with the current Hilbert hash function
- Test a random point data set with the sort support function
- Communicate with mentors and the Postgres/PostGIS community for more information and help
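To illustrate what a Hilbert hash computes, here is a minimal Python sketch of the textbook Hilbert-curve mapping from a grid cell to its distance along the curve. It is not the PostGIS implementation (which is written in C); the names `hilbert_index` and `quantize` are hypothetical, and the assumption is that coordinates are first quantized to an integer grid before the key is computed.

```python
def hilbert_index(order, x, y):
    """Return the Hilbert-curve distance of cell (x, y) on a 2**order x 2**order grid.

    Hypothetical helper for illustration; follows the standard iterative algorithm.
    """
    n = 1 << order
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("cell outside the grid")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Each quadrant contributes s*s cells; (3*rx) ^ ry is the quadrant's rank.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-curve in this quadrant has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def quantize(value, lo, hi, order):
    """Map a coordinate in [lo, hi] to an integer cell in [0, 2**order - 1] (assumed scheme)."""
    n = 1 << order
    cell = int((value - lo) / (hi - lo) * n)
    return min(max(cell, 0), n - 1)
```

Sorting points by this key places cells that are consecutive on the curve next to each other in the sorted input, which is the property the pre-sort relies on.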
Plans for next week:
- Push the current version of the code to my branch
- Prepare more complex test data
Coding Week 2 (14th June - 20th June)
Coding Phase :
- Fix the signature bug in the sort support function
- Test random data of 10,000 points with the pre-sort support function
- Communicate with mentors and the Postgres/PostGIS community for more information and help
Plans for next week:
- Ask the PostGIS community for suggestions about the Hilbert function signature
- Refactor the code structure
- Create the signature of z-order hash function
Coding Week 3 (21st June - 27th June)
Coding Phase :
- Refactor the code structure, moving the hash function to an `inline` module
- Create a pull request and receive suggestions from the community and mentors
- Finish the Morton hash function
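For reference, a Morton (z-order) key is formed by interleaving the bits of the two cell coordinates. The Python sketch below shows the idea only; `morton_index` and its `bits` parameter are hypothetical names, and the actual PostGIS code is written in C and may use a different bit-spreading technique.

```python
def morton_index(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one z-order key.

    Bit i of x goes to bit 2*i of the key, bit i of y goes to bit 2*i + 1.
    Hypothetical helper for illustration.
    """
    if x < 0 or y < 0 or x >> bits or y >> bits:
        raise ValueError("coordinate does not fit in the requested bit width")
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key
```

Unlike the Hilbert order, consecutive Morton keys are not always neighbouring cells: keys 1 and 2 correspond to cells (1, 0) and (0, 1), which touch only diagonally. The Morton key is cheaper to compute, which is the trade-off between the two pre-sort orders.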
Plans for next week:
- Create larger random data or use real-world data for testing
- Evaluate the stability and efficiency of hash functions
- Search for a more efficient hash function
Coding Week 4 (28th June - 4th July)
Coding Phase :
- Finish the hash function
- Create a `FlameGraph` for CPU time analysis
- Prepare the IO access test
Plans for next week:
- Check the performance of hash function in detail
- Prepare for evaluation of performance and logic
Coding Week 5 (5th July - 11th July)
Coding Phase :
- Study the original `GiST` paper in detail
- Study the implementation of `GiST` in PostgreSQL
- Prepare for the evaluation of IO access
Plans for next week:
- Run the IO (buffer) access test
- Prepare for GSoC evaluation
- Confirm the proper hash function
Coding Week 6 (12th July - 18th July)
Coding Phase :
- Prepare for GSoC evaluation
- Test the IO access with `EXPLAIN`
Plans for next week:
- Confirm the proper hash function
- Do more research on hash function
- Go deeper into `GiST` in PostgreSQL
Coding Week 7 (19th July - 25th July)
Coding Phase :
- Check the implementation of `GiST` in PostgreSQL
- Test the `Buffer hit` and `Execution Time` of index query performance with random data
Plans for next week:
- Run more detailed tests to find the reasons behind the performance test results
- Modify the hash functions
Coding Week 8 (26th July - 1st August)
Coding Phase :
- Complete the tests of `Buffer hit` and `Execution Time` of index query performance with random data
- Write a [document](https://docs.google.com/document/d/1m4oxBAsKCyjAnYmkCmQ0X_ltiid5tliFwF3rtdzlKsc/edit?usp=sharing) with the test results
Plans for next week:
- Consult mentors and the community to decide on the pre-sort method
- Check the page status of the different pre-sort methods
Coding Week 9 (2nd August - 8th August)
Coding Phase :
- Apply `pageinspect` to debug the pre-sorting methods
- Do more traversal tests
- Open an issue about the `gist_page_items` function
Plans for next week:
- Fix the bugs in the pages of pre-sorting methods
- Do more detailed tests
- Organize existing code and test cases
Student's Biography
My name is Han WANG. I am a first-year graduate student majoring in GIS at Peking University, and I will receive my Master's degree in 2023. My GitHub profile is https://github.com/HanwGeek and my LinkedIn profile is https://www.linkedin.com/in/hanwgeek/. I am interested in all cool things, and it is very exciting to join the open source community! My research interests include massive spatio-temporal data management and analysis. Currently, I am working on a machine learning project based on big trajectory data, which is stored in a PostgreSQL database and managed by PostGIS.
Mentors
- Regina Obe https://wiki.osgeo.org/wiki/User:Robe
- Giuseppe Broccolo
[[Category: Google Summer of Code]]