Opened 13 years ago
Closed 9 years ago
#1384 closed defect (fixed)
Issue with geocoding addresses where place name doesn't match
Reported by: | robe | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.2.0 |
Component: | tiger geocoder | Version: | master |
Keywords: | Cc: | woodbri |
Description
Example: the boroughs of New York City. TIGER just has these as New York, NY.
This geocodes fast:
select pprint_addy(addy), ST_AsText(geomout), rating FROM geocode('2601 24TH AVE, NY 111022337',2);
This geocodes slowly and gives the wrong answer:
select pprint_addy(addy), ST_AsText(geomout), rating FROM geocode('2601 24TH AVE, ASTORIA, NY 111022337',2);
More examples on #1382
Attachments (1)
Change History (6)
comment:1 by , 13 years ago
by , 13 years ago
Attachment: | astoriaF.png added |
---|
comment:2 by , 13 years ago
Milestone: | PostGIS 2.0.0 → PostGIS 2.1.0 |
---|
comment:3 by , 12 years ago
Cc: | added |
---|
comment:4 by , 12 years ago
Milestone: | PostGIS 2.1.0 → PostGIS Future |
---|
comment:5 by , 9 years ago
Milestone: | PostGIS Future → PostGIS 2.2.0 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Okay, this actually geocodes fine even though the place name doesn't match and the zip code is longer than 5 digits (presumably it should have a hyphen). I checked Google, and it thinks the Astoria street number should be 26-01. That's a separate issue: we don't handle hyphenated street numbers.
So with TIGER 2015 data (I have MA, MN, NY, PA, KS, RI loaded on my Windows 7 64-bit PostgreSQL 9.4 desktop) I get:
test_tiger=# select pprint_addy(addy), ST_AsText(geomout), rating FROM geocode('2601 24TH AVE, NY 111022337',2);
          pprint_addy           |                 st_astext                 | rating
--------------------------------+-------------------------------------------+--------
 0 24th Ave, New York, NY 11102 | POINT(-73.9211006425579 40.7761925056417) |     18
(1 row)

Time: 22.624 ms

test_tiger=# select pprint_addy(addy), ST_AsText(geomout), rating FROM geocode('2601 24TH AVE, ASTORIA, NY 111022337',2);
         pprint_addy          |                 st_astext                 | rating
------------------------------+-------------------------------------------+--------
 24th Ave, New York, NY 11214 | POINT(-73.9883851176818 40.6000773220411) |     18
 24th Ave, New York, NY 11204 | POINT(-73.9743513111828 40.6135516510796) |     21
(2 rows)

Time: 8311.949 ms

test_tiger=# select pprint_addy(addy), ST_AsText(geomout), rating FROM geocode('26-01 24TH AVE, ASTORIA, NY 11102',2);
          pprint_addy           |                 st_astext                 | rating
--------------------------------+-------------------------------------------+--------
 0 24th Ave, New York, NY 11102 | POINT(-73.9211006425579 40.7761925056417) |     17
(1 row)

Time: 17.237 ms
So the second takes much longer to process but still gives more or less the right answer (given we can't handle hyphenated street numbers).
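The 9-digit zip in the first two inputs could be rewritten to ZIP+4 form before the address is passed to geocode(). A minimal sketch using only PostgreSQL's built-in regexp_replace follows. The pre-processing step is an assumption for illustration, not something the tiger geocoder does, and whether it changes the geocode() result was not tested here. It also assumes standard_conforming_strings is on, so backslashes in the pattern are taken literally.

```sql
-- Hypothetical pre-processing, not part of the tiger geocoder:
-- insert a hyphen into a trailing run of exactly 9 digits
-- (a ZIP+4 written without its hyphen).
SELECT regexp_replace(
         '2601 24TH AVE, ASTORIA, NY 111022337',
         '(^|\D)(\d{5})(\d{4})$',   -- non-digit (or start), 5 digits, 4 digits, end of string
         '\1\2-\3'                  -- keep the leading character, hyphenate the zip
       ) AS cleaned_address;
-- cleaned_address: 2601 24TH AVE, ASTORIA, NY 11102-2337
```

The pattern leaves 5-digit zips and already hyphenated ZIP+4 values untouched, since neither ends in nine consecutive digits.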
For comparison:
Google can't handle the first version of the address and returns nothing.
The second it corrects and says
-73.918206, 40.7744398
and that the address is: 26-01 24th Ave Queens, NY 11102
So the tiger geocoder is in the right ballpark, if only we could do the right thing with the numbers. The last answer (where we feed in the correct representation of the address) is pretty close to what Google returns.
Attached is a visual result of an experiment with Astoria, NY data. The red dots are authoritative addresses and locations. The blue dots are the locations returned by geocode() and carry two labels: the blue label is the address as supplied to geocode(), and the green label is the pprint_addy result from geocode().