Changes between Version 9 and Version 10 of GSoC/2021/RasterParallelization


Timestamp:
08/22/21 03:19:13
Author:
aaronsms
Comment:

--

  • GSoC/2021/RasterParallelization

    v9 v10  
    206206- r.patch - https://github.com/OSGeo/grass/pull/1782
    207207
    208 The work output was lower than what was proposed, as the implementation turned out to be more challenging than I had expected.
     208Firstly, I greatly underestimated the complexity of the work. Up to 20 modules were initially proposed, but after the second week it became clear that we had to cut down the number of target modules and focus more on designing the algorithms. The modules we targeted behave differently from modules that had received OpenMP support in the past, such as r.sun. In particular, they need to keep their low memory footprint even after parallelization, unlike r.sun, which loads the entire raster map into memory.
     209
     210During the first half of GSoC, through discussion with the mentors, we came up with three different approaches to introducing parallel support to r.neighbors. After benchmarking their performance and taking their memory and disk usage into account, we settled on the last approach, which requires adding an extra parameter, memory, so that users can adjust memory consumption. With this approach, the modules process the raster map in chunks. Once we had settled on the design, we started applying the same approach to other similar modules with low memory footprints. For more information regarding the implementation, see Raster Parallelization with OpenMP.
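The chunked approach described above can be sketched roughly as follows. This is only an illustration in Python, not the C/OpenMP code in the modules: `rows_per_chunk`, `mean_3x3_row`, `filter_in_chunks`, the 8-bytes-per-cell sizing rule, and the one-row halo are all assumptions made for the example, and the thread pool merely stands in for a parallel loop over the rows of a chunk.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def rows_per_chunk(memory_mb, ncols, bytes_per_cell=8):
    """Rows that fit in the memory budget (hypothetical sizing rule)."""
    budget = memory_mb * 1024 * 1024
    return max(1, budget // (ncols * bytes_per_cell))


def mean_3x3_row(window, local_r):
    """3x3 mean for one row of `window`, clipped at the window edges."""
    nrows, ncols = len(window), len(window[0])
    out = []
    for c in range(ncols):
        cells = [window[rr][cc]
                 for rr in range(max(0, local_r - 1), min(nrows, local_r + 2))
                 for cc in range(max(0, c - 1), min(ncols, c + 2))]
        out.append(sum(cells) / len(cells))
    return out


def filter_in_chunks(raster, chunk_rows, nprocs=2):
    """Filter the raster chunk by chunk; only one chunk is held at a time."""
    nrows = len(raster)
    result = []
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        for start in range(0, nrows, chunk_rows):
            end = min(start + chunk_rows, nrows)
            # Include one halo row on each side so neighborhoods stay complete.
            lo, hi = max(0, start - 1), min(nrows, end + 1)
            window = raster[lo:hi]  # stands in for reading rows from disk
            local_rows = range(start - lo, end - lo)
            # Rows within a chunk are independent, so they can run in parallel.
            result.extend(pool.map(partial(mean_3x3_row, window), local_rows))
    return result


raster = [[float(r * 5 + c) for c in range(5)] for r in range(7)]
whole = filter_in_chunks(raster, chunk_rows=7, nprocs=1)
chunked = filter_in_chunks(raster, chunk_rows=2, nprocs=3)
```

In this sketch the output does not depend on the chunk size or the number of workers, because each output row is computed from the same cells in the same order; a smaller memory budget only means more, smaller chunks.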
     211
     212Furthermore, test scripts were included with the modules to ensure the consistency of the results. Benchmark scripts were added so that users can easily measure the speedup from parallelization on their own machines. User documentation was also updated with sections detailing how to use the newly added features.
     213
     214In the future, more raster modules can be parallelized using a similar approach. Then, we can consider tackling more complex modules such as r.watershed and r.mapcalc. We could also explore 3D raster modules.
     215
     216Furthermore, while implementing parallelization for r.univar, we noticed that modules that produce statistics through floating-point arithmetic can show discrepancies when summing many values. Because floating-point addition is not associative, computing with a different number of threads changes the order of operations and can therefore produce slightly different results. One idea would be to introduce the Kahan summation algorithm to reduce these discrepancies. However, this still would not guarantee the consistency of results.
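To make the summation issue concrete, here is a small Python sketch, not taken from r.univar. The helper `chunked_sum` is hypothetical: it mimics a per-thread reduction by summing contiguous chunks separately and then combining the partial sums. The input values are contrived to make the effect visible.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point summation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan (compensated) summation."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for x in values:
        y = x - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


def chunked_sum(values, n_threads, summer):
    """Sum contiguous chunks separately, then combine the partial sums
    (hypothetical stand-in for a per-thread reduction)."""
    size = -(-len(values) // n_threads)  # ceiling division
    partials = [summer(values[i:i + size])
                for i in range(0, len(values), size)]
    return summer(partials)


# The same data summed with one and with two simulated threads.
cancelling = [1e16, 1.0, -1e16, 1.0]
one_thread = chunked_sum(cancelling, 1, naive_sum)
two_threads = chunked_sum(cancelling, 2, naive_sum)
print(one_thread, two_threads)

# Many small terms added to a large one: compensation recovers most of them.
small_terms = [1.0] + [1e-16] * 100
reference = math.fsum(small_terms)  # correctly rounded reference sum
naive_error = abs(naive_sum(small_terms) - reference)
kahan_error = abs(kahan_sum(small_terms) - reference)
print(naive_error, kahan_error)
```

On the second data set the compensated sum lands much closer to the reference than the naive sum. On the first data set, however, the one-thread and two-thread results differ for the naive sum and still differ when `kahan_sum` is passed in instead, which matches the caveat that Kahan summation reduces the discrepancy without guaranteeing identical results across thread counts.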
    209217
    210218GSoC Submission - https://aaronsms.github.io/gsoc/2021.html