Opened 15 years ago
Closed 15 years ago
#873 closed enhancement (fixed)
r.univar: allow percentile= to be doubles
Reported by: | hamish | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 6.5.0 |
Component: | Raster | Version: | svn-develbranch6 |
Keywords: | r.univar, stats | Cc: | |
CPU: | All | Platform: | All |
Description
Hi,
for r.univar
it is useful to have sub-integer percentile output, e.g. for calculating confidence intervals/sigma at 68.2689%, 95.45%, and 99.73%. Currently r.univar
only accepts whole integers for the percentile=. Actually it's worse: it accepts floats on the command line but they quietly get truncated into integers.(!)
see http://article.gmane.org/gmane.comp.gis.grass.user/20744
currently r.colors.stddev
is hit by this and probably anyone doing stats on map values.
the attached patch (vs. 6.5svn) attempts to fix this. please review. R-bridge users please compare with R's answers.
#spearfish g.region rast=elevation.10m r.univar elevation.10m perc=68,68.2689,95,95.45,99,99.73 -e
Hamish
Attachments (1)
Change History (5)
by , 15 years ago
Attachment: | runiv_perc_dbl.diff added |
---|
follow-up: 2 comment:1 by , 15 years ago
Replying to hamish:
for
r.univar
it is useful to have sub-integer percentile output, e.g. for calculating confidence intervals/sigma at 68.2689%, 95.45%, and 99.73%. Currentlyr.univar
only accepts whole integers for the percentile=. Actually it's worse: it accepts floats on the command line but they quietly get truncated into integers.(!)
See http://article.gmane.org/gmane.comp.gis.grass.user/20744
Currently
r.colors.stddev
is hit by this and probably anyone doing stats on map values.
Note that r.quantile
supports non-integer percentiles, works with large maps, and is significantly more efficient.
comment:2 by , 15 years ago
Replying to glynn:
Note that
r.quantile
supports non-integer percentiles, works with large maps, and is significantly more efficient.
IMHO, it would still be nice if r.univar
supports floating point percentiles. The r.univar
manual could mention that r.quantile
is the better choice if only quantiles and/or percentiles are requested, e.g. under NOTES where it already says that r.univar
can use large amounts of system memory for extended statistics. I would suggest to also add r.quantile
to the SEE ALSO section.
Markus M
comment:3 by , 15 years ago
patch committed to 6.5svn with r40939. (will port to trunk soon)
R-bridge users please compare with R's answers.
I'd still like verification before considering if this should go in 6.4 or not. I'd settle for the list of commands to test it myself.
mmetz:
The r.univar manual could mention that r.quantile is the better choice if only quantiles and/or percentiles are requested
added in the patch.
Hamish
comment:4 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
backported to relbr64 with r41377.
Tested in 6.5svn against R's quantile(). mean residuals less than 3mm for Spearfish's elevation.10m.
test method follows.
GRASS 6.5svn:
#spearfish g.region rast=elevation.10m Q=`seq -s, 0 5 100` r.univar -e elevation.10m percentile=$Q -g n=2654802 null_cells=0 cells=2654802 min=1061.064087 max=1846.743408 range=785.679321 mean=1348.37144603811 mean_of_abs=1348.37144603811 stddev=175.494309916948 variance=30798.2528132259 coeff_var=13.0152793158443 sum=3579659211.6848597527 first_quartile=1196.8 median=1309.37 third_quartile=1480.29 percentile_0=1061.06 percentile_5=1123.09 percentile_10=1153.04 percentile_15=1171.19 percentile_20=1184.32 percentile_25=1196.8 percentile_30=1210.82 percentile_35=1229.45 percentile_40=1251.94 percentile_45=1280.39 percentile_50=1309.37 percentile_55=1346.4 percentile_60=1378.71 percentile_65=1410.04 percentile_70=1440.23 percentile_75=1480.29 percentile_80=1525.85 percentile_85=1568.51 percentile_90=1613.6 percentile_95=1671.23 percentile_100=1846.74 Q=`seq -s, 0 0.1 1` r.univar -e elevation.10m percentile=$Q -g n=2654802 null_cells=0 cells=2654802 min=1061.064087 max=1846.743408 range=785.679321 mean=1348.37144603811 mean_of_abs=1348.37144603811 stddev=175.494309916948 variance=30798.2528132259 coeff_var=13.0152793158443 sum=3579659211.6848597527 first_quartile=1196.8 median=1309.37 third_quartile=1480.29 percentile_0= 1061.06 percentile_0_1=1073.82 percentile_0_2=1078.85 percentile_0_3=1083.27 percentile_0_4=1085.54 percentile_0_5=1087.8 percentile_0_6=1090.37 percentile_0_7=1092.09 percentile_0_8=1093.77 percentile_0_9=1095.2 percentile_1= 1096.48
R 2.7.1:
library(spgrass6) G <- gmeta6() spear <- rast.get6("elevation.10m", cat = FALSE, ignore.stderr = TRUE) summary(spear) Q5 <- quantile(spear$elevation.10m, probs=seq(0,1,0.05), na.rm=TRUE, names=TRUE, type=8) cat(Q5, sep="\n") 1061.064 1123.092 1153.037 1171.192 1184.320 1196.800 1210.821 1229.449 1251.937 1280.389 1309.368 1346.401 1378.709 1410.040 1440.232 1480.286 1525.850 1568.510 1613.603 1671.230 1846.743 Q1 <- quantile(spear$elevation.10m, probs=seq(0,0.01,0.001), na.rm=TRUE, names=TRUE, type=8) cat(Q1, sep="\n") 1061.064 1073.824 1078.849 1083.274 1085.538 1087.803 1090.373 1092.091 1093.775 1095.205 1096.479
Q5_G = [ ... 1061.06 1123.09 1153.04 1171.19 1184.32 1196.8 1210.82 1229.45 1251.94 1280.39 1309.37 1346.4 1378.71 1410.04 1440.23 1480.29 1525.85 1568.51 1613.6 1671.23 1846.74 ]; Q5_R = [ ... 1061.064 1123.092 1153.037 1171.192 1184.320 1196.800 1210.821 1229.449 1251.937 1280.389 1309.368 1346.401 1378.709 1410.040 1440.232 1480.286 1525.850 1568.510 1613.603 1671.230 1846.743 ]; Q1_G = [ ... 1061.06 1073.82 1078.85 1083.27 1085.54 1087.8 1090.37 1092.09 1093.77 1095.2 1096.48 ]; Q1_R = [ ... 1061.064 1073.824 1078.849 1083.274 1085.538 1087.803 1090.373 1092.091 1093.775 1095.205 1096.479 ]; %summarize residuals >> mean ( abs(Q5_G - Q5_R) ) ans = 0.001 meter >> std( abs(Q5_G - Q5_R) ) ans = 0.0014 >> mean ( abs(Q1_G - Q1_R) ) ans = 0.003 meter >> std( abs(Q1_G - Q1_R) ) ans = 0.0015
Hamish
patch to allow r.univar to use percentile=double