Currently, we import all the targets into a MariaDB database.
From a technical viewpoint, switching to a document database like MongoDB would have many advantages.
Here is a list of them, from most to least important (in my opinion):
1. In workflow.py we perform some joins to gather all the information for an entity in a target. With a document database, they wouldn't be necessary.
2. In `extract_features` in workflow.py, we already check whether the columns are present. A document database would require the same check, so nothing changes there.
3. We tried to find a common schema across all the data sources and we failed. A document database would save a lot of the space now spent on null fields and short words.
4. A document database would let us store strings of variable length, which would fix all the import errors caused by fields that are too small.
5. We would have more flexibility to add data that is available in only one data source.
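To make the points about joins and field checks concrete, here is a minimal sketch in plain Python, with no database driver involved. It contrasts an entity spread over several tables with the same entity stored as one document. All table names, field names, and both helper functions are hypothetical and not taken from workflow.py.

```python
# Hypothetical data: names and fields are illustrative, not the project's schema.

# Relational layout: one entity spread over several tables, joined on "id".
entities = [{"id": 1, "name": "Ada Lovelace", "born": "1815"}]
links = [{"id": 1, "url": "https://example.org/ada"}]
texts = [{"id": 1, "description": None}]  # a null field still occupies a column


def join_entity(entity_id):
    """Emulate the joins: gather the rows of one entity from every table."""
    merged = {}
    for table in (entities, links, texts):
        for row in table:
            if row["id"] == entity_id:
                merged.update(row)
    return merged


# Document layout: the same entity as one self-contained record.
# Absent fields are simply omitted, and strings have no fixed length.
document = {
    "id": 1,
    "name": "Ada Lovelace",
    "born": "1815",
    "links": ["https://example.org/ada"],
    "occupation": "mathematician",  # a field only one source provides
}


def available_fields(record, wanted):
    """Return the wanted fields that are present and not null."""
    return [field for field in wanted if record.get(field) is not None]


wanted = ["name", "born", "description", "occupation"]
joined = join_entity(1)
print(available_fields(joined, wanted))    # ['name', 'born']
print(available_fields(document, wanted))  # ['name', 'born', 'occupation']
```

The presence check is the same in both cases. What changes is that the document needs no join step and can carry a source-specific field without a schema change.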
We would still need to keep the ontology mapping consistent while importing the data, but that is nothing new.
The only obstacle I see is on the infrastructure side.
I couldn't find anything about document database hosting on Wikitech. We should probably ask them. Maybe @marfox knows something.