Simplify warehouse list #958
Comments
OK, just so we're clear: it's going to be a difficult task, because the mirrors use rsync and there is no such thing as renaming there. So if it's not properly coordinated (and we're talking about 12 different people), it could result in huge transfers: deleting everything and re-downloading everything, for instance.
Could someone provide documentation or an explanation of the intent behind these warehouse paths, so that we are all on the same page before making any decision?
A number of users, including ourselves, have always used it to find and download ZIM files. It used to be this or the wiki. Now all readers (except kiwix-serve) have a built-in downloader, and we have library.kiwix.org, which offers downloads as well. I personally use it exclusively but have never been attached to the folders.
What the warehouse folders are is arbitrary and, to a large extent, should not be that important (for end users). The problem here is that it is "confusing" for Zimfarm editors, and this is IMHO primarily a UI problem. We could choose almost automatically where to store the ZIM files, based on the scraper and on the chosen "collection". The "collection" basically means: in which library the produced ZIM should appear. For now we formally have only one collection, but once this is properly handled in the CMS we will have many of them. I'm still a bit unsure about what exactly the separation of duties between the Zimfarm and the CMS should look like.
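To make the "choose almost automatically" idea concrete, here is a minimal sketch, not the Zimfarm's actual logic. Everything in it is an assumption made up for illustration: the scraper names, the folder names, the `main` default collection, the fallback folder, and the idea that the collection becomes a path component.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from scraper name to its default warehouse folder.
# These are placeholders, not real Zimfarm scrapers or warehouse paths.
DEFAULT_FOLDER_BY_SCRAPER = {
    "scraper_a": "folder_a",
    "scraper_b": "folder_b",
}

# Hypothetical folder used when a scraper has no explicit mapping.
FALLBACK_FOLDER = "other"


def warehouse_path(scraper: str, collection: str = "main") -> str:
    """Derive a warehouse path from the scraper and the target collection.

    Assumes (for illustration only) that the collection is the first
    path component and the scraper's folder is the second.
    """
    folder = DEFAULT_FOLDER_BY_SCRAPER.get(scraper, FALLBACK_FOLDER)
    return str(PurePosixPath("/") / collection / folder)
```

With something along these lines, an editor would only pick the scraper and the collection in the UI, and the path would follow from them instead of being chosen by hand.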
The current list of warehouse paths sort of goes along the lines of the various scrapers we are using, but not really, and this gets confusing, particularly as it seems to give prominence to some content for no obvious reason (e.g. vikidia, psiram). I also understand that, besides providing a modicum of ordering for the available files, this system may be used by various mirrors to pick and choose which content they want to mirror (in practice, however, only the Wikimedia Foundation restricts its mirroring to Wikimedia-related content).
I suggest simplifying the list of warehouses to be more congruent with our scrapers, i.e.:
(not discussing the /.hidden folders that have their own, clearly-defined purpose)
The naming is not 100% ideal, as we need to force a distinction between WMF and non-WMF wikis, but other than that it seems like a move in the right direction.