Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#1957 closed defect (fixed)

v.in.ascii (points.c) does not import some numbers in attached example

Reported by: ychemin Owned by: grass-dev@…
Priority: normal Milestone: 6.5.0
Component: Vector Version: svn-trunk
Keywords: v.in.ascii Cc:
CPU: x86-64 Platform: Linux

Description

v.in.ascii input=9.csv output=rain_$(echo 9.csv | sed 's/\.csv//g') separator=comma

Real number of columns: 4020


Maximum number of columns: 1366 Minimum number of columns: 1317

Attachments (1)

9.csv (11.7 KB ) - added by ychemin 12 years ago.


Change History (9)

by ychemin, 12 years ago

Attachment: 9.csv added

comment:1 by hamish, 12 years ago

Milestone: 7.0.0 → 6.5.0

(g6 filenames) In vector/v.in.ascii/points.c, change buflen from 4000 to 24000 or some other large enough value. Also consider increasing BUFFSIZE from 128 in a2b.c and char buf[1000]; in in.c.

the dbf is still shortened, but that's a start.

Hamish
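To illustrate why a fixed line buffer produces wrong column counts, here is a rough Python analogue. It is a sketch only, not the GRASS C code: it assumes the reader hands back at most buflen characters per call, which is how `readline(size)` behaves, and the row of 4020 identical values is made up for the demonstration.

```python
import io


def column_counts(text, buflen=None, sep=","):
    """Count fields per 'line' when each read is capped at buflen characters."""
    stream = io.StringIO(text)
    counts = []
    while True:
        # readline(size) stops at a newline or after size characters,
        # whichever comes first, much like a fixed-size line buffer.
        chunk = stream.readline() if buflen is None else stream.readline(buflen)
        if not chunk:
            break
        chunk = chunk.rstrip("\n")
        if chunk:
            counts.append(len(chunk.split(sep)))
    return counts


# Hypothetical test row: 4020 comma-separated values on one line.
row = ",".join(["12.5"] * 4020) + "\n"

print(column_counts(row))                # one line, 4020 fields
print(column_counts(row, buflen=4000))   # same line, seen as several short fragments
```

With the cap in place the single row is seen as several fragments, each with far fewer than 4020 fields, which is the kind of mismatch a too-small buflen can cause.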

comment:2 by hamish, 12 years ago

max dbf columns somewhere 255-1028+

http://compgroups.net/comp.lang.clipper/maximum-fields-per-record-dbf/888510

for sqlite: "The default setting for SQLITE_MAX_COLUMN is 2000. You can change it at compile time to values as large as 32767."

http://www.sqlite.org/limits.html

PgSQL, "250 - 1600 depending on column types",

http://www.postgresql.org/about/

MySQL, "There is a hard limit of 4096 columns per table, but the effective maximum may be less for a given table."

http://dev.mysql.com/doc/refman/4.1/en/column-count-limit.html

How many met stations? If fewer than the number of time records, maybe consider inverting the array and organizing the data by time instead of position? Or splitting it up into multiple files, e.g. by year?

Hamish
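The limits quoted in comment 2 can be put side by side with the 4020 columns of the attached file. This is a minimal sketch: the numbers are only those quoted above, ranges are reduced to their lower bound as a conservative assumption, and the function name is made up for illustration.

```python
# Column limits as quoted in this ticket; ranges are reduced to their lower bound.
QUOTED_LIMITS = {
    "DBF": 255,         # "somewhere 255-1028+"
    "PostgreSQL": 250,  # "250 - 1600 depending on column types"
    "SQLite": 2000,     # default SQLITE_MAX_COLUMN
    "MySQL": 4096,      # hard limit; the effective maximum may be less
}


def backends_that_fit(n_columns, limits=QUOTED_LIMITS):
    """Return the backends whose quoted limit is at least n_columns."""
    return sorted(name for name, limit in limits.items() if n_columns <= limit)


print(backends_that_fit(4020))  # ['MySQL']
```

Even for the one backend that passes this check, the quote above warns that the effective maximum may be lower, so the result is a hint rather than a guarantee.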

comment:3 by hamish, 12 years ago

hmm, aside from hardcoded sscanf() buffers, is there a max number of columns in the grass 5 sites format?

comment:4 by hamish, 12 years ago

probably best to write a script to create one vector map per row, with columns 3-inf rotated into a single column; and the timestamps in a second column, matching the data by row number.

Hamish
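The rotation suggested in comment 4 could be sketched as below. The file layout is an assumption, since it is not described in the ticket: a header row whose columns 3 onward hold timestamps, and data rows that start with x and y. The function name and sample data are hypothetical.

```python
import csv
import io


def rows_to_series(csv_text, n_coord_cols=2):
    """Yield (x, y, [(timestamp, value), ...]) for each data row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Assumed layout: header columns after the coordinates are timestamps.
    timestamps = header[n_coord_cols:]
    for row in reader:
        if not row:
            continue  # skip blank lines
        coords = row[:n_coord_cols]
        values = row[n_coord_cols:]
        # Pair each value with its timestamp, matching them by position.
        yield coords[0], coords[1], list(zip(timestamps, values))


# Made-up sample: two stations, two time steps.
sample = (
    "x,y,2012-01-01,2012-01-02\n"
    "10.0,20.0,0.5,1.2\n"
    "11.0,21.0,0.0,3.4\n"
)

for x, y, series in rows_to_series(sample):
    print(x, y, series)
```

Each yielded series is a narrow two-column table, so one vector map per input row would need only a handful of attribute columns regardless of how many time steps there are.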

in reply to:  description ; comment:5 by ychemin, 12 years ago

Replying to ychemin:

v.in.ascii input=9.csv output=rain_$(echo 9.csv | sed 's/\.csv//g') separator=comma

Real number of columns: 4020


Maximum number of columns: 1366 Minimum number of columns: 1317

Merge Ticket 1958:

When importing a .csv made of several rows of 4020 columns each, the import stops at row 18; looking into the file, that is near the 20000th character.

Same behavior in 6.4.2 (Ubuntu stable version)

After setting (points.c):

76 buflen = 50000;

then error becomes:

Number of columns: 4020 <- THIS IS GOOD

DBMI-SQLite driver error: Error in sqlite3_prepare(): too many columns on rain_9 <- :-(

in reply to:  5 comment:6 by neteler, 12 years ago

Replying to ychemin:

Replying to ychemin:

v.in.ascii input=9.csv output=rain_$(echo 9.csv | sed 's/\.csv//g') separator=comma

Real number of columns: 4020

...

After setting (points.c):

76 buflen = 50000;

For easier inspection: http://trac.osgeo.org/grass/browser/grass/branches/releasebranch_6_4/vector/v.in.ascii/points.c#L76

then error becomes:

Number of columns: 4020 <- THIS IS GOOD

DBMI-SQLite driver error: Error in sqlite3_prepare(): too many columns on rain_9 <- :-(

As mentioned, SQLite doesn't support more than 2000 columns out of the box, for hints see

http://grasswiki.osgeo.org/wiki/GRASS_GIS_Performance#Maximum_Number_of_Attribute_Columns

While the buffer in points.c could probably be enlarged, the latter is a problem not related to GRASS GIS.
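Whether a given SQLite build accepts a table of a certain width can be probed outside GRASS. The following is a small sketch using Python's standard sqlite3 module with an in-memory database; the table name, column names and helper function are made up, and the result for 4020 columns depends on how the local SQLite library was compiled.

```python
import sqlite3


def can_create_table(n_columns):
    """Try to create an in-memory table with n_columns REAL columns."""
    columns = ", ".join(f"c{i} REAL" for i in range(n_columns))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE wide ({columns})")
        return True
    except sqlite3.Error:
        # With the default SQLITE_MAX_COLUMN this is where
        # "too many columns" would surface.
        return False
    finally:
        conn.close()


print(can_create_table(10))
print(can_create_table(4020))  # depends on the SQLITE_MAX_COLUMN of this build
```

A False result for 4020 would confirm that the limit sits in the SQLite library rather than in v.in.ascii.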

comment:7 by ychemin, 12 years ago

Resolution: fixed
Status: new → closed

I am closing this ticket, enough information is there for people to choose how to handle it re: sqlite.

comment:8 by hamish, 12 years ago

It is likely that even 255 columns will not be supported by grass, just because the buffers are too small. Better to make the bottleneck the DB backend not the GRASS frontend.

As a ballpark estimate, say 2 columns x,y at 10 chars wide, + 253 columns of varchar(255), with field seps, plus a DOS newline,

10 + 1 + 10 + 1 + 253*255 + 252 + 2 = 64791

I think it's also worth thinking about the solutions of transposing the array and creating a script to make each data row its own map, with the constant-step time series as a single column, not a series of individual rows. Work around the DB limitation by thinking of the problem in a different way.

Hamish
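The ballpark estimate in comment 8 can be reproduced and varied with a few lines. This is only a sketch of that arithmetic; the function and parameter names are made up, and the defaults are the figures from the comment.

```python
def line_length_estimate(n_coord_cols=2, coord_width=10,
                         n_attr_cols=253, attr_width=255, newline_len=2):
    """Worst-case characters in one input line, including the newline."""
    n_cols = n_coord_cols + n_attr_cols
    separators = n_cols - 1  # one field separator between adjacent columns
    return (n_coord_cols * coord_width
            + n_attr_cols * attr_width
            + separators
            + newline_len)


print(line_length_estimate())  # 64791
```

A C buffer holding such a line as a string would need one more byte for the terminating NUL.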
