Opened 16 years ago
Closed 7 years ago
#542 closed enhancement (fixed)
grass7 vector libraries modifications
Reported by: | mmetz | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 8.0.0 |
Component: | Vector | Version: | svn-trunk |
Keywords: | Cc: | martinl | |
CPU: | All | Platform: | All |
Description
I want to suggest some more profound changes to the vector model for grass7. These changes would affect topology, spatial index and maybe category index, but not the coor file. That means that there will be limited forward/backward compatibility: topology would need to be rebuilt before vectors can be accessed. Vector modules would not need to be rewritten, but more efficient library functions could be made available.
My general idea/complaint is that the current topology layout is not tailored towards vector object types; instead several (very) different types (points, lines, boundaries, centroids, faces, kernels) are stored in the same structure. Working with one particular type is a bit inefficient because the desired type has to be selected out of everything stored in this universal structure every single time. I am sure that a lot of time and space can be safed with a redesigned topology layout and vector libraries that make use of it. As an example, what I want to get rid of is
for (line = 0; line < nlines; line++) { if (!Vect_line_alive(Map, line)) continue; type = Vect_read_line(map, points, cats, line); if (!(type & otype)) continue; /* process line */ }
The whole coor file is read, in the worst case e.g. just to get the few centroids in it. This can not always be avoided or changed, but could often be replaced with e.g.
for (centroid = 1; centroid < ncentroids; centroid++) { /* process centroid */ }
The current implementation has some consequences of which I am not sure if they are actually desired. E.g. when cleaning a vector with tool=snap (snapping vertices of lines and boundaries), lines and boundaries may be snapped together at the same time: a boundary may be snapped to a line and vice versa. Maybe this is sometimes desired, but maybe this should be avoided? Another example is removing duplicates: currently it is possible to do that for points and centroids together, and if there are a point and a centroid with identical coordinates, one of them is deleted (random selection).
With the changes I have in mind, the size of support structures should generally go down, most for point datasets, least for areas. Massive point datasets like LIDAR could be easier processed on level 2 with topology, because support structures for massive point datasets would be reduced in size by about 70% (rough estimates: spatial index reduced down to 25%, topology reduced down to 40%).
There are however some problems with my suggestions: 1) IMHO nobody should decide on that alone, 2) the coding is too much for one person alone, e.g. I can't do all that without help, 3) I'm not really a programmer, 4) I don't know enough about vector geometry algorithms.
Below are more technical details:
Status quo
the coor file holds lines (better: primitives) of types
point
line
boundary
centroid
face (3D boundary, not yet implemented)
kernel (3D centroid, not yet implemented)
structures derived from these types are
nodes
areas
isles
edges (3D areas, not yet implemented)
volumes (3D shapes, not yet implemented)
holes (3D volumes within volumes, like isles in areas, not yet implemented)
topology holds information about
nodes
lines
areas
isles
where lines can be points, lines, boundaries, centroids, faces, or kernels
see http://trac.osgeo.org/grass/browser/grass/trunk/include/vect/dig_structs.h#L440
points, lines, boundaries, centroids, faces, kernels are obviously different things, but the current topology layout squeezes all of them into the same structure with information about:
start node (assigned for all types, but not needed for points, centroids, kernels)
end node (used for lines and boundaries, otherwise unused)
area to left (for boundary, area for centroid, unused for all other types)
area to right (for boundary, unused for all other types)
3D bounding box (completely redundant for points, centroids, kernels)
offset (into coor file)
type (point, line, boundary, centroid, face, or kernel)
Proposed new layout
the coor file would hold the same types as before. To avoid confusion, all coordinate strings would be referred to as primitives (like in the output of current v.build), but that's just naming. IMHO anything but line is fine. A line can be a line or boundary or point or ... is too philosophical for my taste.
topology would have a separate data structure for each of
points
lines
boundaries
nodes (only needed for lines, boundaries, and faces)
centroids
areas
isles
faces
edges
volumes
holes
An additional small data structure would be needed that would be a boiled down replacement of current P_Line with information about primitives.
Similarly, a separate spatial index would be created for each type separately, instead of lumping all points, lines, boundaries, centroids, faces, and kernels into the same spatial index. It is more efficient with regard to time and space if separate spatial indices are maintained.
I'm reaching limits on what I can change in the vector libs without breaking compatibility, and I'm sometimes getting frustrated with the waste of time and space for large vectors. IIUR grass7 is an opportunity to introduce changes like these, so I hope to initiate a discussion and for more ideas on how to improve grass vector handling.
Regards,
Markus M
Change History (9)
comment:1 by , 16 years ago
comment:2 by , 15 years ago
Cc: | added |
---|---|
Priority: | minor → major |
follow-up: 4 comment:3 by , 15 years ago
Replying to mmetz:
face (3D boundary, not yet implemented)
edges (3D areas, not yet implemented)
Just very little note: shouldn't be
- face 3D area
and
- edge 3D boundary
?
Thanks, Martin
follow-up: 5 comment:4 by , 15 years ago
Replying to martinl:
Replying to mmetz:
face (3D boundary, not yet implemented)
edges (3D areas, not yet implemented)
Just very little note: shouldn't be
- face 3D area
and
- edge 3D boundary
?
You're right, yes. The documentation is a bit contradicting, e.g. in GRASS7 programmer's manual it says "Face and kernel are 3D equivalents of boundary and centroid...", but edge is commonly used as 2D/3D boundary (Vector network analysis, general graph theory). Thus for 3D we would have edges in the coor file, faces (3D areas) are going to be constructed from edges and volumes are going to be constructed from faces and kernels. Is this nomenclature ok?
Markus M
comment:5 by , 15 years ago
You're right, yes. The documentation is a bit contradicting, e.g. in GRASS7 programmer's manual it says "Face and kernel are 3D equivalents of boundary and centroid...", but edge is commonly used as 2D/3D boundary (Vector network analysis, general graph theory). Thus for 3D we would have edges in the coor file, faces (3D areas) are going to be constructed from edges and volumes are going to be constructed from faces and kernels. Is this nomenclature ok?
Seems to be reasonable to me.
Martin
follow-up: 9 comment:6 by , 9 years ago
I am not sure if it is still relevant to GRASS 7. Probably moving this ticket to milestone GRASS 8 would make better sense?
comment:7 by , 9 years ago
Milestone: | 7.0.0 → 7.0.5 |
---|
comment:8 by , 8 years ago
Milestone: | 7.0.5 → 8.0.0 |
---|
comment:9 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Replying to martinl:
I am not sure if it is still relevant to GRASS 7. Probably moving this ticket to milestone GRASS 8 would make better sense?
The proposed changes for 2D topology have been implemented in GRASS 7. A new ticket should be opened for 3D topology if the need arises. Closing ticket.
As no one has ever reacted to this, I will now: I'm not a real programmer, so won't be able to say much about the details, but from my limited experience, what you propose sounds very reasonable and I encourage you to go on in that direction.
Moritz