Figure 2: Sources and Fixes for Geocoding Errors
| Potential Source of Error | Approach or Solution to Resolve |
|---|---|
| Addresses may be duplicated. More than one “Main Street” may exist in a state. | Use city names or ZIP codes to narrow the area searched for an address match. |
| Addresses may be misspelled or inaccurately represented. For example, “Maine Street” would not match “Main Street.” | Software can be adjusted to overlook these minor differences in spelling (although this creates a risk when these really are different streets). |
| Address files may include post office boxes rather than street addresses. Prefixes or suffixes may be missing altogether. Institutional names (e.g., a nursing home) or unit numbers (e.g., apartment numbers) may not be included. | No easy fix. Need to find or generate an actual address. Adopting an address standard that requires certain fields to be filled in could also help. |
| “Northwest” in the address file will not match “NW” in the road database. | Develop an “alias table” that tells the software that “Northwest” and “NW” mean the same thing. |
| Road databases may not be geographically accurate. The accuracy depends primarily on how the road data were collected (e.g., via GPS, digitized from a map, hand drawn). | GPS tends to produce the most accurate geographic coordinates. Small-scale maps (of a state or the nation) are much less accurate geographically than large-scale maps (of a neighborhood or city). The analysis being conducted determines the geographic accuracy needed. |
| Roads may be missing (e.g., new subdivisions). | Determine the currency of the data and review metadata. Examine recent aerial photographs to identify missing features. |
| Road databases may have incorrect attributes. Street names may not be accurately encoded in the road database (missing or misspelled). Rural route addresses are not typically included in road databases. | Clean up the road database to meet the needs of the analysis. |
| ZIP code boundaries can change frequently. | Know the dates of both the address files and road databases and ensure they cover compatible timeframes for geocoding. |
| Geocoding against address ranges can introduce positional errors because the software assumes addresses are evenly distributed along a block. This can be an issue in rural areas, where residences are not evenly distributed, or in urban areas that have significantly different lot sizes on a block. | Encoding exact addresses via GPS is one solution. |
| Geocoding software is based on proprietary approaches that use various assumptions to solve address-matching problems. The approaches are not all the same, meaning that different coordinates may result when address files are geocoded with different software packages. | Know the vendor and the assumptions being made (algorithms being used) in the software. |
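
The alias-table and spelling rows in Figure 2 can be illustrated with a minimal Python sketch. The alias entries, the function names `normalize` and `match_street`, and the 0.85 similarity cutoff are all assumptions chosen for illustration and are not taken from any particular geocoding package. The fuzzy comparison uses the standard library's `difflib`.

```python
import difflib

# Hypothetical alias table: maps spelled-out forms to the abbreviations
# assumed to be used in the road database.
ALIASES = {
    "NORTHWEST": "NW",
    "NORTHEAST": "NE",
    "SOUTHWEST": "SW",
    "SOUTHEAST": "SE",
    "STREET": "ST",
    "AVENUE": "AVE",
}


def normalize(address):
    """Uppercase, drop periods and commas, and apply the alias table per token."""
    tokens = address.upper().replace(".", "").replace(",", "").split()
    return " ".join(ALIASES.get(tok, tok) for tok in tokens)


def match_street(name, candidates, cutoff=0.85):
    """Return the closest candidate street name, or None if none is close enough.

    A lower cutoff tolerates more misspellings but raises the risk of
    matching two streets that really are different.
    """
    normalized = {normalize(c): c for c in candidates}
    hits = difflib.get_close_matches(
        normalize(name), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[hits[0]] if hits else None


print(normalize("123 Main Street Northwest"))  # 123 MAIN ST NW
print(match_street("Maine Street", ["Main Street", "Oak Avenue"]))  # Main Street
print(match_street("Elm Street", ["Main Street", "Oak Avenue"]))  # None
```

In practice the candidate list would first be restricted by city name or ZIP code, as the first row of the figure suggests, so that a tolerant cutoff is less likely to pull in a similarly named street from another area.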
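
The positional error from address-range geocoding described in Figure 2 is easier to see with a small example. The sketch below assumes simple linear interpolation along a straight street segment with planar coordinates. The function name `interpolate_address` and its parameters are hypothetical, and real packages may also handle odd/even sides of the street, offsets from the centerline, and curved segments.

```python
def interpolate_address(house_number, low, high, start, end):
    """Estimate a point for house_number on a segment whose range is low..high.

    start and end are (x, y) coordinates of the segment endpoints. The
    calculation assumes addresses are spaced evenly along the segment,
    which is the assumption that introduces positional error.
    """
    if not low <= house_number <= high:
        raise ValueError("house number outside the segment's address range")
    if high == low:
        fraction = 0.5  # single-address range: place the point mid-segment
    else:
        fraction = (house_number - low) / (high - low)
    x = start[0] + fraction * (end[0] - start[0])
    y = start[1] + fraction * (end[1] - start[1])
    return (x, y)


# A block running from (0, 0) to (100, 0) with address range 100-200.
print(interpolate_address(150, 100, 200, (0.0, 0.0), (100.0, 0.0)))  # (50.0, 0.0)
```

Number 150 is placed at the midpoint of the block regardless of where the building actually stands. If the block holds a few large lots at one end and many small lots at the other, the estimate can be off by a substantial share of the block length.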