duplicated results with different descriptions #3
Comments
This is a problem in the extracted data, where the same URI is used for different entries. A SPARQL query then returns several solutions that appear to be duplicates in the UI but differ in, for instance, their descriptions or images.
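To illustrate why a shared URI multiplies results, here is a toy model in plain Python, not the project's actual data or query. The URI `ex:property/1`, the predicate names, and the `solutions` helper are all made up for the example; the helper only imitates how a graph pattern yields one row per combination of matching values.

```python
from itertools import product

# Hypothetical extracted data: two different listings were assigned the same
# URI, so that URI ends up with two descriptions and two images.
triples = [
    ("ex:property/1", "description", "2 bedroom flat"),
    ("ex:property/1", "description", "Land for sale"),
    ("ex:property/1", "image", "a.jpg"),
    ("ex:property/1", "image", "b.jpg"),
]

def solutions(triples, uri, predicates):
    """Imitate a basic graph pattern: one row per combination of values."""
    values = [
        [o for s, p, o in triples if s == uri and p == pred]
        for pred in predicates
    ]
    return [(uri, *combo) for combo in product(*values)]

rows = solutions(triples, "ex:property/1", ["description", "image"])
for row in rows:
    print(row)
```

Under these assumptions a single URI produces four rows, which a UI would show as four near-identical results with different descriptions or images.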
Ok, great. Does that mean the duplicates go away once we fix the issue with shared URIs? I am not 100% convinced. For example, right now queries such as "houses in headington" say "using fallback" and then return "Horton Hill, Horton Cum Studley, OX33" 7 times, then "Land For SalePortland Road, Milcombe, Banbury, OX15" 6 times, and so on. That seems like more than the URI overlap could explain.
Yes, there are also duplicates in the Lucene index, which is used as the fallback. I have to check why this happens.
Ok, the duplicates in the fallback Lucene index occur because of the duplicates in the extracted data. I now avoid this by indexing only one document per distinct URI, but this does lower recall.
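A minimal sketch of the "one document per distinct URI" idea, in plain Python rather than the actual Lucene indexing code. The `dedupe_by_uri` function, the document fields, and the sample URIs are assumptions for illustration only.

```python
def dedupe_by_uri(documents):
    """Keep the first document seen for each URI and collect the rest."""
    seen = set()
    kept, dropped = [], []
    for doc in documents:
        if doc["uri"] in seen:
            dropped.append(doc)  # would not be indexed
        else:
            seen.add(doc["uri"])
            kept.append(doc)     # would be indexed
    return kept, dropped

# Hypothetical input: two different listings share one URI.
documents = [
    {"uri": "ex:property/1", "text": "Water Eaton Road, Summertown OX2"},
    {"uri": "ex:property/1", "text": "Divinity Road, Cowley OX4"},
    {"uri": "ex:property/2", "text": "Portland Road, Milcombe"},
]

kept, dropped = dedupe_by_uri(documents)
print(len(kept), "indexed,", len(dropped), "dropped")
```

The dropped list shows where recall is lost: in this made-up input, a search for "Cowley" would no longer match anything, because the only document containing that text was discarded.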
The query "houses in Summertown" returns the following two properties several times:
Water Eaton Road, Summertown OX2
£399,950.00
Street: Water Eaton Road, Summertown OX2
bedrooms: 2
bathrooms: 1
Divinity Road, Cowley OX4
£399,950.00
Street: Water Eaton Road, Summertown OX2
bedrooms: 2
bathrooms: 1
This could be a problem in the extracted data itself or in the visualization.
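One way to tell the two cases apart would be to check the extracted records directly for URIs that carry conflicting values. The sketch below assumes records are available as dictionaries with `uri`, `title`, and `street` keys; those names, the sample URI, and the `conflicting_uris` helper are hypothetical.

```python
from collections import defaultdict

def conflicting_uris(records, fields=("title", "street")):
    """Return, per URI, the fields that have more than one distinct value."""
    values = defaultdict(lambda: defaultdict(set))
    for rec in records:
        for field in fields:
            if field in rec:
                values[rec["uri"]][field].add(rec[field])
    report = {}
    for uri, by_field in values.items():
        conflicts = {f: sorted(v) for f, v in by_field.items() if len(v) > 1}
        if conflicts:
            report[uri] = conflicts
    return report

# Hypothetical records modelled on the two results listed above.
records = [
    {"uri": "ex:property/1",
     "title": "Water Eaton Road, Summertown OX2",
     "street": "Water Eaton Road, Summertown OX2"},
    {"uri": "ex:property/1",
     "title": "Divinity Road, Cowley OX4",
     "street": "Water Eaton Road, Summertown OX2"},
]

report = conflicting_uris(records)
print(report)
```

If such a check reports conflicts, the problem is already in the extracted data; if it reports none, the visualization would be the more likely place to look.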