Quirks and edge cases

zverok edited this page Apr 9, 2016 · 1 revision

The data Reality provides is only as good as the data in Wikipedia/Wikidata and our parsers/processors for that data. So don't rely on it for precise scientific computations or important business tasks.

Some examples of limitations and edge cases:

  • the #cities method of each country extracts data from the Wikipedia page named "List of cities in %countryname%" -- for some countries that page lists only the top 20 cities, for others there are hundreds, sometimes even all of them;
  • Wikipedia's concept of "continent" is vague:
Reality.continents.last
# => #<Reality::Entity(Australia (continent)):continent>
Reality.continents.last.countries
# => #<Reality::List[]>
# ????
# Let's see...
a = Reality::Entity('Australia')
# => #<Reality::Entity(Australia):country>
a.continent
# => #<Reality::Entity?(Oceania)>
# Hmmmm....
a.continent.countries
# => #<Reality::List[Australia?, Fiji?, Kiribati?, Marshall Islands?, Federated States of Micronesia?, Nauru?, New Zealand?, Palau?, Papua New Guinea?, Samoa?, Solomon Islands?, Tonga?, Tuvalu?, Vanuatu?]>

That's a rather weird property of Wikipedia data: the "Countries by continents" and "List of continents" pages seem to blur the concepts of "continent" and "part of the world".

  • Or, look at this one:
Reality::Entity('Spain').continent
# => #<Reality::Entity?(Africa)>
# ???

In fact, Spain (including some of its remote parts) spans two continents: Europe and Africa. Wikidata lists both under the "continent" predicate, and Reality just takes the first value for most predicates... And here we are. Of course, there should be some smart workaround, but currently there is none.
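
To make the Spain case concrete, here is a minimal plain-Ruby sketch of the "first value wins" behaviour, next to one possible workaround that keeps every value. It does not use Reality itself: the `SPAIN_CLAIMS` hash and the `first_value` / `all_values` helpers are hypothetical names made up for illustration, and the order of the values is simply assumed to match the output shown above.

```ruby
# Hypothetical stand-in for the values Wikidata returns for one entity.
# The hash layout and helper names are assumptions, not Reality's internals.
SPAIN_CLAIMS = {
  'continent' => ['Africa', 'Europe']
}.freeze

# Mimics the "take the first value" behaviour described above.
def first_value(claims, predicate)
  Array(claims[predicate]).first
end

# One possible workaround: keep every value, so the caller sees the ambiguity.
def all_values(claims, predicate)
  Array(claims[predicate])
end

puts first_value(SPAIN_CLAIMS, 'continent')        # prints Africa
puts all_values(SPAIN_CLAIMS, 'continent').inspect # prints ["Africa", "Europe"]
```

Returning all values pushes the decision to the caller, which is probably the only honest option when the source data itself is ambiguous.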