Geocoded address conflicts with full address #53

Open
cbeddow opened this issue Dec 11, 2024 · 1 comment

Comments

@cbeddow
Contributor

cbeddow commented Dec 11, 2024

This could be improved in the future. In cases where the full address has a street name followed by a house number, we see an OSM-converted result that keeps the same street name in "addr:street" but gives a different "addr:housenumber". Sometimes both the street and the house number differ. The model should really just assume Overture is correct and only use the geocoder to decide where to split the string; the geocoder does not help if it generates a totally different number and street. Those values may be the literal location, but the main goal should be to reformat the full string into a street and house number that match it, rather than to suggest something different.

One idea might be to take the geocoder result, look at the pattern in it, such as the order in which the street name and house number appear, and use that pattern to split the original Overture address string properly.
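A rough Python sketch of that idea, not the project's actual code: the geocoder is consulted only for the order of number and street, and the values themselves always come from the Overture string. The function names, the `geocoded_line` input (assumed to be a short "street plus number" line built from the geocoder result) and the `HOUSE_NUMBER` pattern are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical house-number shape: digits with an optional letter suffix,
# e.g. "12", "12a", "12-14". Real data will need a broader pattern.
HOUSE_NUMBER = r"\d+[A-Za-z]?(?:[-/]\d+[A-Za-z]?)?"


def number_position(geocoded_line: str, geocoded_number: str) -> Optional[str]:
    """Report whether the geocoder puts the house number first or last."""
    line = geocoded_line.strip()
    number = geocoded_number.strip()
    if not number:
        return None
    if line.startswith(number + " "):
        return "leading"
    if line.endswith(" " + number):
        return "trailing"
    return None


def split_overture_address(full_address: str, position: Optional[str]) -> Optional[dict]:
    """Split the Overture string using only the order learned from the geocoder."""
    text = full_address.strip()
    if position == "leading":
        match = re.fullmatch(rf"({HOUSE_NUMBER})[,\s]+(.+)", text)
        if match:
            return {"addr:housenumber": match.group(1), "addr:street": match.group(2)}
    elif position == "trailing":
        match = re.fullmatch(rf"(.+?)[,\s]+({HOUSE_NUMBER})", text)
        if match:
            return {"addr:street": match.group(1), "addr:housenumber": match.group(2)}
    # No confident split: return nothing rather than guess.
    return None


# The geocoder returned a different number ("7"), but only its order is used.
position = number_position("Hauptstraße 7", "7")
print(split_overture_address("Hauptstraße 12", position))
```

Because the geocoder's own street and number are never copied into the output, a wrong geocoding result can at worst cause a missed split, not a different address.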

image

image

@pvdosev
Collaborator

pvdosev commented Dec 18, 2024

I decided to expose both the Overture and Nominatim outputs because about 1 in 10 points of interest have either instructions or junk text. I couldn't think of a good way to separate those from cases where the geocoding gives back the wrong address entirely and the street names are different.

I agree that in the majority of cases, looking for a number on the left or right and splitting the address there would work. When there is no number, the string is often the address name as well. But completely replacing the geocoder output would be annoying in cases like this:

This has a description of the location:

This has the region name in the freeform input. I'm not sure that the street has a name in this case?

I looked up this address and it's on 6th Street. A naive algorithm would treat the 6 as a house number.

I've also seen "streetName streetNumber postcode", just the city name, and other ways of writing out addresses. That's not to say it's impossible to have a generally good algorithm for this, but it'd be good to expose the geocoder output as well for edge cases. Those can be pretty common, especially outside the United States and western Europe.
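To make the trade-off concrete, here is a minimal Python sketch of the naive "number on the left or right" split, with the geocoder output kept alongside it instead of being replaced. It is an illustration only: `naive_split`, `build_review_record` and the record keys are hypothetical names, not anything from the project.

```python
import re
from typing import Optional

# Hypothetical house-number shape: digits plus at most one letter ("12", "12a").
# Ordinals such as "6th" do not match, so "6th Street" is left unsplit.
NUMBER = re.compile(r"\d+[A-Za-z]?")


def naive_split(full_address: str) -> Optional[dict]:
    """Take a number-like first or last token as the house number."""
    tokens = full_address.split()
    if len(tokens) < 2:
        return None  # e.g. just a city name
    for index in (0, -1):
        token = tokens[index].strip(",")
        if NUMBER.fullmatch(token):
            rest = tokens[1:] if index == 0 else tokens[:-1]
            return {
                "addr:housenumber": token,
                "addr:street": " ".join(rest).strip(","),
            }
    return None


def build_review_record(full_address: str, geocoder_tags: dict) -> dict:
    """Keep both candidates so a reviewer can pick; flag whether they agree."""
    split = naive_split(full_address)
    agrees = split is not None and all(
        split[key].casefold() == geocoder_tags.get(key, "").casefold()
        for key in split
    )
    return {
        "overture_raw": full_address,
        "overture_split": split,
        "geocoder": geocoder_tags,
        "agrees": agrees,
    }


print(naive_split("12 Main Street"))
print(naive_split("Main Street 12 90210"))  # the postcode is mistaken for the number
```

The second call shows the "streetName streetNumber postcode" failure mode: the trailing postcode is picked up as the house number, which is the kind of edge case where having the geocoder output next to the split is useful.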
