Opened 7 years ago
Closed 7 years ago
#3968 closed defect (worksforme)
postgis geocode bug - invalid input syntax for type numeric
Reported by: | hammeryosi | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.4.3 |
Component: | tiger geocoder | Version: | 2.4.x |
Keywords: | Cc: |
Description
I am using postgis with the tiger extension to geocode addresses in our database at Dataline. I have successfully used it for tens of millions of addresses before running into this issue with a specific address in North Carolina. The geocoder function fails when I specify max_results=1.
Here’s what happens without specifying it:
select geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'))
'("(6709,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD10000027ACEC8778C053C09480511ED27F4140,5)' '("(198,,Sarah,Ln,,,Rockfish,NC,28306,t,,)",0101000020AD1000007EBC9E12A8C353C07DC16057347D4140,71)'
And with max_results=1
select geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'), 1)
ERROR: invalid input syntax for type numeric: " " CONTEXT: PL/pgSQL function geocode_address(norm_addy,integer,geometry) line 204 at assignment PL/pgSQL function geocode(norm_addy,integer,geometry) line 14 at FOR over SELECT rows SQL state: 22P02
This does not happen for other addresses, e.g.,
select geocode(normalize_address('6922 SIMS SCHOOL RD, SIMS, NC 27880'), 1)
'("(6922,,""Sims School"",Rd,,,Sims,NC,27880,t,,)",0101000020AD100000D0EB95D6808353C0364C7D2CFEDF4140,0)'
My system info:
'PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit' 'POSTGIS="2.4.1 r16012" PGSQL="100" GEOS="3.6.2-CAPI-1.10.2 4d2925d" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.2, released 2017/09/15" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER'
Any help would be appreciated.
Thanks!
Yosi Hammer,
Data scientist at Dataline (http:
www.datalinedata.com)
Change History (4)
comment:1 by , 7 years ago
comment:2 by , 7 years ago
Hammer,
I loaded NC tiger 2017 data and tested, and not seeing an issue.
SELECT geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'), 1);
outputs:
geocode -------------------------------------------------------------------------------- ------------------------ ("(6922,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD100000B4AED5924DC1 53C0DFB3212DF17E4140,0) (1 row)
SELECT geocode(normalize_address('6922 SURREY RD, FAYETTEVILLE, NC 28306'));
outputs
geocode -------------------------------------------------------------------------------- ------------------------ ("(6709,,Surrey,Rd,,,Fayetteville,NC,28306,t,,)",0101000020AD10000027ACEC8778C0 53C09480511ED27F4140,5) ("(198,,Sarah,Ln,,,Rockfish,NC,28306,t,,)",0101000020AD1000007EBC9E12A8C353C07D C16057347D4140,71) (2 rows)
I'm running:
PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit POSTGIS="2.4.2 r16113" PGSQL="100" GEOS="3.6.2-CAPI-1.10.2 4d2925d" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.2, released 2017/09/15" LIBXML="2.7.8" LIBJSON="0.12" LIBPROTOBUF="1.2.1" RASTER
Very similar to yours, so I'm guessing you might be running an unpatched tiger geocoder. To check which version you are running, check what this outputs:
SELECT name, installed_version FROM pg_available_extensions WHERE name = 'postgis_tiger_geocoder';
name | installed_version ------------------------+------------------- postgis_tiger_geocoder | 2.4.1 (1 row)
I thought maybe it was something I fixed in 2.4.2, but as you can see I neglected to even upgrade my tiger to 2.4.2, so in theory I should be running the same version as you.
If you see something less than 2.4.1, then run this:
ALTER EXTENSION postgis_tiger_geocoder UPDATE;
and that should bring you up to date. Hopefully will fix your issue.
comment:3 by , 7 years ago
Hi, Thanks for your answer and sorry it took me a while to reply. Haven't got back to this until now.
Since posting this I set up another database with all states included. (Before I used different databases for different states to improve performance working on millions of addresses) The new database does not have this issue. The only difference between the setup process was using a different version of wget for downloading the census data. Could this have been something to do with a bad line in one of the census tables? I don't see how that would cause such a strange bug appearing only when specifying max_results=1.
In any case, since this bug is not reproducible even on my machine. I think we can close it.
Thanks again!
Yosi
comment:4 by , 7 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
hammeryosi,
No shouldn't have anything to do with wget. Perhaps your two different systems had two different versions of postgis_tiger_geocoder. Like I said, this bug looked familiar and one I had fixed.
I'll take a look at this hopefully in next 2 days. I thought this issue I had already fixed, but might be another instance of it. The issue is data specific probably some strange stuff in one of the street numbers in addr table it happens to hit during filtering.