Changes between Version 7 and Version 8 of GSoC/2021/RasterParallelization


Timestamp:
08/02/21 07:09:22 (3 years ago)
Author:
aaronsms
  • GSoC/2021/RasterParallelization

    v7 v8  
    293293No, it has been good so far.
    294294
     295=== Week 5 ===
     296
     297'''1) What did I get done this week?'''
     298
     299To benchmark both the r.mfilter and r.neighbors implementations, I used the recently merged benchmark library on randomly generated rasters created with r.surf.fractal.
     300
     301Preliminary results were collected for both modules, with time in seconds on the y-axis and nprocs on the x-axis (benchmarked on my local workstation).
     302Furthermore, I compared the performance on the master branch against the new implementation with nprocs = 1, and the results are comparable.
     303
     304Both implementations rely heavily on disk I/O: they write to a temporary file buffer before transferring the data to the final raster file format. This is the default behavior in r.mfilter, but it was introduced explicitly in r.neighbors to allow for parallelization. After discussion with the mentors, we decided to make better use of memory instead of disk. Ideally, the user will be able to specify how much memory to use for the buffer. r.mfilter, however, will keep its original temporary-file buffer.
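
As a rough illustration of the user-specified memory buffer mentioned above, the following sketch estimates how many raster rows each thread could hold in memory. The function name, the MiB unit, and the even split between threads are assumptions made for this example only; they are not taken from the r.neighbors code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of raster rows that fit into each thread's
 * share of a user-supplied memory budget given in MiB. */
static size_t rows_per_buffer(size_t memory_mib, size_t ncols,
                              size_t cell_size, int nprocs)
{
    assert(ncols > 0 && cell_size > 0 && nprocs > 0);

    size_t budget = memory_mib * 1024 * 1024;    /* total bytes */
    size_t per_thread = budget / (size_t)nprocs; /* bytes per thread */
    size_t row_bytes = ncols * cell_size;        /* bytes per raster row */
    size_t rows = per_thread / row_bytes;

    /* Always buffer at least one row so that progress is possible. */
    return rows > 0 ? rows : 1;
}
```

With a 300 MiB budget, 10000 columns of 8-byte cells and 4 threads, each thread would buffer 983 rows in this sketch.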
     305
     306'''2) What do I plan on doing next week?'''
     307
     308    - Complete rework of r.neighbors implementation
     309    - Compare benchmark between the two implementations
     310
     311'''3) Am I blocked on anything?'''
     312
     313No major roadblock, but I need to catch up a bit to rework my r.neighbors implementation.
     314
     315=== Week 6 ===
     316
     317'''1) What did I get done this week?'''
     318
     319r.neighbors
     320
     321The main goal I accomplished was a complete rework of the r.neighbors implementation (PR: [https://github.com/OSGeo/grass/pull/1724]). A benchmark script is available under the 'benchmark' directory so that users can test the performance on their local machine. The performance is comparable to the previous implementation, which used temporary files (on SSD) as a buffer instead of memory. The benchmark results from my local machine (12 cores) are posted under the PR.
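
To illustrate the general idea of a row-parallel neighborhood operation that works on memory buffers, here is a minimal sketch of a 3x3 mean filter using OpenMP. It is not the code from the PR: the function name, the fixed window size, and the edge handling are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* 3x3 mean over an in-memory, row-major raster. Edge cells average only
 * the neighbours that exist. Each output row is written by exactly one
 * thread, so no locking is needed. */
static void mean3x3(const double *in, double *out, int nrows, int ncols)
{
    assert(in != NULL && out != NULL && in != out);

#pragma omp parallel for schedule(static)
    for (int row = 0; row < nrows; row++) {
        for (int col = 0; col < ncols; col++) {
            double sum = 0.0;
            int count = 0;

            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    int r = row + dr;
                    int c = col + dc;

                    if (r < 0 || r >= nrows || c < 0 || c >= ncols)
                        continue; /* skip neighbours outside the raster */
                    sum += in[(size_t)r * (size_t)ncols + (size_t)c];
                    count++;
                }
            }
            out[(size_t)row * (size_t)ncols + (size_t)col] = sum / count;
        }
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result, which makes it easy to compare the serial and parallel outputs.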
     322
     323r.mfilter
     324
     325Issues were reported when working with raster files > 2GB (PR: [https://github.com/OSGeo/grass/pull/1708]). They were promptly addressed in commit 4caa96; the cause was an overflow in a multiplication. This PR is ready, and a benchmark script is provided for local benchmarking as well.
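
The kind of overflow described above typically appears when a byte offset is computed in 32-bit int arithmetic. The sketch below shows one common way to avoid it by widening before multiplying; the function name and parameters are hypothetical, and this is not necessarily how the commit fixes the problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the first cell of `row` in a row-major buffer.
 * Widening to 64 bits BEFORE multiplying avoids the overflow that
 * `row * ncols * cell_size` can hit in plain int arithmetic once the
 * product exceeds 2^31 - 1 (about 2 GB). */
static int64_t row_offset(int row, int ncols, size_t cell_size)
{
    assert(row >= 0 && ncols >= 0);
    return (int64_t)row * (int64_t)ncols * (int64_t)cell_size;
}
```

For example, row 20000 of a raster with 20000 columns of 8-byte cells starts at byte 3200000000, which no longer fits in a signed 32-bit integer.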
     326
     327'''2) What do I plan on doing next week?'''
     328
     329    - Introduce an environment variable that overrides the default value of the nprocs parameter, which is currently 1, so that users do not need to pass nprocs explicitly.
     330    - Implement r.resamp.filter/r.resamp.interp parallelization
     331
     332'''3) Am I blocked on anything?'''
     333No major issues.
     334
     335=== Week 7 ===
     336
     337'''1) What did I get done this week?'''
     338
     339- Introduce an environment variable that overrides the default value of the nprocs parameter, which is currently 1, so that users do not need to pass nprocs explicitly.
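
A minimal sketch of how such an environment variable could supply the default for nprocs is shown below. The variable name `SKETCH_NPROCS`, the upper bound of 1024, and the function name are placeholders invented for this example; the name actually used in GRASS is not stated here.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical variable name, chosen only for this sketch. */
#define NPROCS_ENV_VAR "SKETCH_NPROCS"

/* Return the default thread count: the value of the environment
 * variable if it is a valid positive integer, otherwise 1. */
static int default_nprocs(void)
{
    const char *value = getenv(NPROCS_ENV_VAR);

    if (value == NULL || *value == '\0')
        return 1; /* variable not set: keep the current default */

    char *end = NULL;
    errno = 0;
    long parsed = strtol(value, &end, 10);

    /* Reject trailing garbage, out-of-range values and non-positive counts. */
    if (errno != 0 || *end != '\0' || parsed < 1 || parsed > 1024)
        return 1;
    return (int)parsed;
}
```

An explicit nprocs option given by the user would still take precedence; the function only decides what to use when the option is omitted.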
     340
     341r.resamp.filter
     342
     343    - Implement parallelization
     344    - Add test cases
     345
     346'''2) What do I plan on doing next week?'''
     347
     348    - Implement parallelization for r.slope.aspect with testing and benchmarking
     349
     350'''3) Am I blocked on anything?'''
     351
     352No major issues.
     353
     354=== Week 8 ===
     355
     356'''1) What did I get done this week?'''
     357
     358r.resamp.interp [https://github.com/OSGeo/grass/pull/1771]
     359
     360- Implement parallelization
     361
     362r.slope.aspect [https://github.com/OSGeo/grass/pull/1767]
     363
     364- Implement parallelization
     365
     366Both implementations above follow the same approach as r.neighbors [https://github.com/OSGeo/grass/pull/1724]. r.slope.aspect keeps track of global statistics such as min/max, so an additional reduction of these variables is required on top of the map computation. Benchmarks for the modules will be added to the PRs.
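
The reduction of global statistics such as min/max can be sketched with OpenMP's built-in reduction clauses. This is only an illustration under assumed names and a flat array of values; the actual r.slope.aspect code may combine per-thread results differently.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the global minimum and maximum of `values` in parallel.
 * Each thread keeps private copies of `lo` and `hi`; OpenMP combines
 * them with the initial values when the loop finishes. */
static void min_max(const double *values, long n,
                    double *out_min, double *out_max)
{
    assert(values != NULL && n > 0);

    double lo = values[0];
    double hi = values[0];

#pragma omp parallel for reduction(min : lo) reduction(max : hi)
    for (long i = 1; i < n; i++) {
        if (values[i] < lo)
            lo = values[i];
        if (values[i] > hi)
            hi = values[i];
    }

    *out_min = lo;
    *out_max = hi;
}
```

The min and max reduction operators require OpenMP 3.1 or newer in C; without OpenMP the pragma is ignored and the loop runs serially.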
     367
     368'''2) What do I plan on doing next week?'''
     369
     370    - Refactor r.univar
     371    - Implement parallelization for r.series, r.patch
     372    - Revisit r.proj to decide on implementation
     373
     374'''3) Am I blocked on anything?'''
     375
     376No major issues.
     377
    295378
    296379== Final report ==