#198 closed defect (invalid)
v.in.ascii: column scanning is borked
Reported by: | hamish | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 6.4.2 |
Component: | Vector | Version: | svn-develbranch6 |
Keywords: | v.in.ascii | Cc: | martinl |
CPU: | All | Platform: | All |
Description
Hi,
this bug is related to the old RT bugs 2763 and 5209.
http://intevation.de/rt/webrt?serial_num=2763
http://intevation.de/rt/webrt?serial_num=5209
and the clumsy empty last-column work-around in v.in.gpsbabel:
http://trac.osgeo.org/grass/browser/grass/trunk/scripts/v.in.gpsbabel/v.in.gpsbabel#L298
"FIXME: if last field (comments) is empty it causes a not-enough fields error in v.in.ascii"
The column type scanning step in v.in.ascii's points mode no longer accepts empty columns as NULL, and imported tables have columns truncated. Note that passing empty values in double columns works in GRASS 6.2.3!
It would be nice to allow numeric columns to be empty or 'NULL' for an empty record, and to allow "nan" or "inf" without the scanning function deciding that the column contains strings. (For varchar columns, however, the word 'NULL' should not be stripped.)
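The requested behaviour can be sketched outside GRASS. The following is only an illustration of the idea and not the v.in.ascii scanning code: the function name `scan_column_types`, the regular expressions, and the "empty" label are all assumptions made for this example. Empty fields and the literal NULL are treated as "no evidence", and nan/inf count as doubles, so a missing value never forces a column to string.

```shell
# Hypothetical sketch, NOT the v.in.ascii implementation.
# Reads '|'-separated rows on stdin and reports a guessed type per column.
scan_column_types() {
    awk -F'|' '
    function kind(v) {
        if (v == "" || v == "NULL") return 0        # missing value: no evidence
        if (v ~ /^[-+]?[0-9]+$/) return 1           # integer
        if (v ~ /^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/) return 2   # double
        if (tolower(v) ~ /^[-+]?(nan|inf)$/) return 2                  # nan/inf stay numeric
        return 3                                    # anything else: string
    }
    {
        if (NF > ncols) ncols = NF
        for (i = 1; i <= NF; i++) {
            k = kind($i)
            if (k > coltype[i]) coltype[i] = k      # keep the widest type seen so far
        }
    }
    END {
        label[0] = "empty"; label[1] = "integer"; label[2] = "double"; label[3] = "string"
        for (i = 1; i <= ncols; i++)
            printf "Column: %d type: %s\n", i, label[coltype[i] + 0]
    }'
}

# Two sample rows; the second leaves the last two fields empty.
printf '1|2.3|4.5|Foo|3.1415|4\n2|2.4|4.6|Bar||\n' | scan_column_types
```

With these two sample rows, columns 5 and 6 keep the types seen in the first row (double and integer) instead of falling back to string. A varchar column containing the word NULL is still reported as string as long as any other value in it is non-numeric, and the scan itself never alters the values.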
Input file:
cat << EOF > test.dat
cat|x|y|name|value|count
1|2.3|4.5|Foo|3.1415|4
2|2.4|4.6|Bar|||
EOF
Import without column declaration:
G64svn> v.in.ascii in=test.dat out=test_null_import skip=1 \
    cat=1 x=2 y=3 --verbose
Scanning input for column types...
Maximum input row length: 25
Maximum number of columns: 6
Minimum number of columns: 6
Column: 1 type: integer
Column: 2 type: double
Column: 3 type: double
Column: 4 type: string length: 3
Column: 5 type: string length: 0
Column: 6 type: string length: 0
Importing points...
Populating table...
Building topology for vector map <test_null_import>...
2 primitives registered
Building areas: 100%
0 areas built
0 isles built
Attaching islands:
Attaching centroids: 100%
Topology was built
Number of nodes : 2
Number of primitives: 2
Number of points : 2
Number of lines : 0
Number of boundaries: 0
Number of centroids : 0
Number of areas : 0
Number of isles : 0
v.in.ascii complete.

G64svn> v.info -c test_null_import
Displaying column types/names for database connection of layer 1:
INTEGER|int_1
DOUBLE PRECISION|dbl_1
DOUBLE PRECISION|dbl_2
CHARACTER|str_1
- what happened to columns 5 and 6?
Column: 5 type: string length: 0
Column: 6 type: string length: 0
- Columns 5 and 6 incorrectly scanned as (empty) "string" type.
Also, I am not sure if hiding the column scanning result behind --verbose mode is advisable, given that it is buggy and it is the first line of defense when the input file contains typos.
Import with column declaration:
G64svn> v.in.ascii in=test.dat out=test_null_import skip=1 \
    cat=1 x=2 y=3 --verbose \
    columns='cat int, x double, y double, name varchar(10), value double, count int'
Scanning input for column types...
Maximum input row length: 25
Maximum number of columns: 6
Minimum number of columns: 6
Column: 1 type: integer
Column: 2 type: double
Column: 3 type: double
Column: 4 type: string length: 3
Column: 5 type: string length: 0
Column: 6 type: string length: 0
WARNING: Table <test_null_import> linked to vector map <test_null_import> does not exist
ERROR: Column number 5 defined as double has string values
- In addition to the previous errors, the meaning of the "table does not exist" warning is a mystery.
Changing the empty "||" to "|NULL|" doesn't help: the scanning step declares it a string column (length: 4) and refuses to continue.
this is important code, so tread with greatest care.....
Hamish
Attachments (2)
Change History (14)
by , 15 years ago
Attachment: | v.in.ascii.patch added |
---|
comment:1 by , 15 years ago
Try the attached patch for the missing values problem. NULL, nan, and inf are still not recognized, and there is still a nonsense warning for completely empty columns declared double, but the import is successful.
Markus M
follow-up: 5 comment:2 by , 15 years ago
Milestone: | 6.4.0 → 6.4.1 |
---|
patch applied in 6.5 and 7; looks like it's fine but deferring backport to relbr64 until 6.4.1 to allow more testing.
Hamish
comment:4 by , 15 years ago
Milestone: | → 6.4.1 |
---|
follow-up: 6 comment:5 by , 14 years ago
Cc: | added |
---|
Replying to hamish:
patch applied in 6.5 and 7; looks like it's fine but deferring backport to relbr64 until 6.4.1 to allow more testing.
it's already in relbr64. So can we close the ticket?
comment:6 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
follow-up: 8 comment:7 by , 14 years ago
Milestone: | 6.4.1 → 6.4.2 |
---|---|
Resolution: | fixed |
Status: | closed → reopened |
Still not working. Test data are LiDAR laz data available here
The file I used is srs.laz
The commands
las2txt -i srs.laz -o srs.ascii --parse xyztiaunrcCpedRGB --delimiter "|"

# check ascii file
head srs.ascii
289814.15|4320978.61|170.76|499450.80599405|260|||6|0|2|Ground|0|0|0|0|0|0
289814.64|4320978.84|170.76|499450.80600805|280|||6|0|2|Ground|0|0|0|0|0|0
289815.12|4320979.06|170.75|499450.80602205|280|||6|0|2|Ground|0|0|0|0|0|0

# import in GRASS
las2txt -i srs.laz --stdout --parse xyztiaunrcCpedRGB --delimiter "|" | v.in.ascii in=- out=srs_ascii -z x=1 y=2 z=3 --o

# only the first 5 columns were imported
# check table contents
v.db.select srs_ascii where="cat = 1"
cat|dbl_1|dbl_2|dbl_3|dbl_4|int_1
1|289814.15|4320978.61|170.76|499450.80599405|260
Markus M
follow-up: 9 comment:8 by , 13 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
Replying to mmetz:
Still not working. Test data are LiDAR laz data available here
The file I used is srs.laz
[snip]
# only the first 5 columns were imported
It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.
Closing as invalid.
follow-up: 10 comment:9 by , 13 years ago
Replying to mmetz:
It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.
Hi,
instead of piping to v.in.ascii can you save to a file which we can have a peek at in hexdump? what version of las2txt? does the same happen with the snake lidar sample data from the grass wiki lidar page or just this dataset?
(if you found it others probably will too)
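A quick way to check a saved file for suspicious bytes without reading a full hexdump is to count NUL characters directly. This is a minimal sketch using only standard `tr`, `wc`, and `od`; the helper name `count_nul` is made up for this example, and the sample line is a shortened stand-in for real las2txt output.

```shell
# Hypothetical helper: count NUL bytes arriving on stdin.
count_nul() {
    # Delete everything except NUL, count what is left, trim wc padding.
    tr -cd '\000' | wc -c | tr -d ' '
}

# Shortened stand-in for one las2txt output line with two NUL-filled fields.
printf '260|\000|\000|6\n' | count_nul

# The same bytes shown character by character; NULs appear as \0.
printf '260|\000|\000|6\n' | od -c

# On a saved file (file name taken from the commands in this ticket):
#   count_nul < srs.ascii
#   head -n 1 srs.ascii | od -c
```

A non-zero count means fields that look empty in a terminal actually contain NUL bytes.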
Hamish
follow-up: 11 comment:10 by , 13 years ago
Replying to hamish:
Replying to mmetz:
It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.
Hi,
instead of piping to v.in.ascii can you save to a file which we can have a peek at in hexdump? what version of las2txt? does the same happen with the snake lidar sample data from the grass wiki lidar page or just this dataset?
las2txt version: libLAS 1.6.1 with GeoTIFF 1.3.0 GDAL 1.8.0 LASzip 1.2.0
The same happens with "Serpent Mound Model LAS Data.las" from the grass wiki lidar page.
Attached is the las2txt output for srs.laz.
I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.
Markus M
follow-up: 12 comment:11 by , 13 years ago
Replying to mmetz:
Attached is the las2txt output for srs.laz.
I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.
correct. columns 6 and 7 are not empty.
As viewed in less:
289814.15|4320978.61|170.76|499450.80599405|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289814.64|4320978.84|170.76|499450.80600805|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289815.12|4320979.06|170.75|499450.80602205|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289815.60|4320979.28|170.74|499450.80603605|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289816.08|4320979.50|170.68|499450.80605005|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289816.56|4320979.71|170.66|499450.80606405|240|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289817.03|4320979.92|170.63|499450.80607806|240|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289817.53|4320980.16|170.62|499450.80609206|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289818.01|4320980.38|170.61|499450.80610606|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289818.50|4320980.59|170.58|499450.80612006|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0
^@ means the null char.
I think it is reasonable for G_getl2() to stop on null terminators, and there's nothing more to do here but file a bug with las2txt.
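As a stopgap only, and not a substitute for fixing the las2txt output or choosing a narrower --parse list, the NUL bytes could be filtered out of the stream before it reaches v.in.ascii. The helper name `strip_nul` is made up for this sketch, and whether the resulting empty fields are then imported as NULL depends on the v.in.ascii version in use; that part has not been verified here.

```shell
# Hypothetical helper: drop NUL bytes from a stream so that line readers
# which treat NUL as a string terminator see the whole line.
strip_nul() {
    tr -d '\000'
}

# Shortened stand-in for one las2txt output line with two NUL-filled fields.
printf '260|\000|\000|6\n' | strip_nul

# Possible use with the pipeline from this ticket (untested assumption):
#   las2txt -i srs.laz --stdout --parse xyztiaunrcCpedRGB --delimiter "|" \
#       | strip_nul | v.in.ascii in=- out=srs_ascii -z x=1 y=2 z=3 --o
```

After filtering, the two affected fields become genuinely empty, so the line keeps its full field count.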
Hamish
comment:12 by , 13 years ago
Replying to hamish:
Replying to mmetz:
Attached is the las2txt output for srs.laz.
I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.
correct. columns 6 and 7 are not empty.
[snip]
I think it is reasonable for G_getl2() to stop on null terminators, and there's nothing more to do here but file a bug with las2txt.
I would rather call this a user error on my part, because the proper way to do it would be to investigate the .la[s|z] file first with lasinfo, decide which attributes I want to import based on the attributes available, and then set the --parse options accordingly. Or use v.in.lidar, which does it all automatically ;-)
Markus M
patch for missing values