
Create a super-seeder docker image/container #43

Open
kelson42 opened this issue Jan 2, 2019 · 16 comments · May be fixed by #214


kelson42 commented Jan 2, 2019

It looks like not all BitTorrent clients can deal properly with our Web seeds yet. Having a complete, always-running super-seeder would help solve that problem. We could run it on a mirror (the files are already there). Additionally, this Docker image might be interesting to a few Kiwix supporters, who would then have an easy way to support the project by sharing a bit of their bandwidth.

Using rsync (see https://download.kiwix.org/README) and rtorrent, that should not be too complicated.


nemobis commented Jun 2, 2019

Do you think the Docker image should assume disk space in the TBs, to automatically seed everything, or something more conservative?


kelson42 commented Jun 2, 2019

@nemobis The whole of download.kiwix.org is around 10 TB. It is difficult to assume that a seeder has that much space. One solution to that problem would be to let the user pass (as a Docker environment variable) a list of path regular expressions to filter what they want to seed from download.kiwix.org.
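
To make the idea concrete, such a filter could be parsed roughly like this (a Python sketch only; the `SEED_FILTERS` variable name and its semicolon-separated format are hypothetical, not an agreed interface):

```python
import os
import re

def seed_filters(default=r".*"):
    """Parse the hypothetical SEED_FILTERS Docker environment variable:
    a semicolon-separated list of path regular expressions."""
    raw = os.environ.get("SEED_FILTERS", default)
    return [re.compile(expr) for expr in raw.split(";") if expr]

def should_seed(path, filters):
    """A path is seeded when it matches at least one filter expression."""
    return any(f.search(path) for f in filters)
```

With, say, `SEED_FILTERS="zim/wikipedia/.*;zim/gutenberg/.*_fr_.*"`, only Wikipedia ZIMs and French Gutenberg ZIMs would be seeded.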


kelson42 commented Jun 5, 2019

An old attempt can be found in this repo in the files:

  • kiwix_superseed.sh
  • .rtorrent.rc

These files should be removed once the implementation is finished.


stale bot commented Aug 4, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Aug 4, 2019
@kelson42 kelson42 self-assigned this Sep 12, 2020
@stale stale bot removed the stale label Sep 12, 2020
@kelson42

Here is a base to work from: https://gitlab.com/adrienandrem/kiwix-torrent-watcher


rgaudin commented Sep 14, 2020

I see this is very recent. What's the status of this? Is there any need beyond that?


stale bot commented Dec 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Dec 24, 2020
@kelson42

I have been thinking about this ticket over the last weeks and I think I now know the best way to do it.

First of all, I plan to use a pre-existing Docker image, https://github.com/linuxserver/docker-qbittorrent, because:

  • It exists and is maintained
  • qBittorrent is a well-known BitTorrent client
  • The image is based on qbittorrent-nox, the headless version
  • qBittorrent has offered an API for many years, which gives us a proper way to drive it
  • We can easily control it from outside the container
  • Many client tools and libraries exist to deal with this API
  • I have verified that it works and offers the options I believe we need

Assuming we reuse the linuxserver/qbittorrent Docker image, we still need a solution to synchronise (add/remove torrents) with https://download.kiwix.org (or maybe, even better, https://library.kiwix.org?). I plan to do it as follows:

  • Build a dedicated Docker image based on a simple Bash script running in cron
  • The script will retrieve the list of ZIMs to mirror in the super-seeder from the OPDS feed (so the user can set a filter if needed)
  • Based on the feed data (parsed with gron), the script will ask the qBittorrent client via its API (using https://github.com/fedarovich/qbittorrent-cli/) to download new content
  • Content which is no longer in the feed will be deleted after a configurable delay.
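
As a rough illustration of the add/remove logic above, here is a minimal sketch (Python purely for illustration; the plan names a Bash/gron/qbittorrent-cli stack, and the `name_filter` parameter and URLs below are assumptions):

```python
import re

def plan_sync(feed_names, seeding_names, name_filter=r".*"):
    """Compute which torrents to add and which to schedule for removal.

    feed_names: ZIM file names currently listed in the OPDS feed
    seeding_names: names of the torrents currently loaded in qBittorrent
    name_filter: user-supplied regular expression restricting what to seed
    """
    pattern = re.compile(name_filter)
    wanted = {name for name in feed_names if pattern.search(name)}
    to_add = sorted(wanted - set(seeding_names))
    # Actual removal would only happen after the configurable delay above.
    to_remove = sorted(set(seeding_names) - wanted)
    return to_add, to_remove

# Hypothetical usage with the qBittorrent Web API (requires the third-party
# qbittorrent-api package and a running client; placeholders, not executed):
#
#   import qbittorrentapi
#   client = qbittorrentapi.Client(host="localhost", port=8080,
#                                  username="admin", password="adminadmin")
#   seeding = [t.name for t in client.torrents_info()]
#   to_add, to_remove = plan_sync(feed_names, seeding, name_filter=r"wikipedia_en")
#   for name in to_add:
#       client.torrents_add(urls=f"https://download.kiwix.org/zim/{name}.torrent")
```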

@stale stale bot removed the stale label Feb 19, 2022
@kelson42 kelson42 linked a pull request Feb 20, 2022 that will close this issue

stale bot commented Apr 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.


benoit74 commented Oct 3, 2023

I like the idea of using linuxserver/qbittorrent and its Docker image; the project is very active and mature, and the web API is very useful.

I don't understand whether we want to:

  1. download missing content locally to be able to serve it
  2. take advantage of existing local content, meaning it has to be installed on a mirror

It looks like the initial idea was option 2, but we have switched to option 1, and I don't know why.

I see lots of advantages in option 2 because:

  • we already have tooling to mirror files; I don't see the benefit of re-implementing it with BitTorrent instead of rsync
  • it might save storage space and bandwidth for those who already have a mirror (and this is our case)
  • it will avoid complexities in our tooling (filtering what we want to download, deciding what has to be purged after which delay, ...)
  • it will avoid potential conflicts if installed somewhere a mirroring tool is already running (otherwise both the mirroring tool and the super-seeder would need write access to the same location)
  • it will work even for hidden ZIM files / non-ZIM content if installed on download.kiwix.org

The drawbacks of option 2 are that:

  • we need to detect which files are available locally to add them to qBittorrent (I've checked, it is capable of handling already existing files)
  • we don't need to use the OPDS feed (I assume qBittorrent will check the file hash in any case before seeding it)
  • we need to detect which files have been removed so we can remove them from qBittorrent (but qBittorrent probably already handles this to some degree; it is quite common that users move downloaded files once the download is finished, and usually - at least in Transmission - the client does not restart the download)
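
For the first of these drawbacks, detecting locally available files is mostly a directory walk. A Python sketch (it assumes the local layout mirrors download.kiwix.org and that each file's torrent is published at the file URL plus a `.torrent` suffix, which is worth verifying):

```python
from pathlib import Path

def local_zim_torrents(mirror_root, base_url="https://download.kiwix.org/zim"):
    """Pair each ZIM file found under a local mirror with its assumed
    .torrent URL, so the pair can be handed to qBittorrent for
    hash-checking and seeding instead of downloading."""
    root = Path(mirror_root)
    pairs = []
    for zim in sorted(root.rglob("*.zim")):
        relative = zim.relative_to(root).as_posix()
        pairs.append((str(zim), f"{base_url}/{relative}.torrent"))
    return pairs
```

When adding these to qBittorrent, pointing the torrent's save path at the directory that already contains the file should make the client re-check the existing data rather than download it again.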

Looking at the existing proposal, I can't comment much on it; it's a shell script and that's clearly not a language I can comment on in depth. The overall logic is simple, so it looks like it will work. I don't know how many subtleties we might discover once running in real conditions.

I wonder if we should instead write this additional tooling in Python, because:

  • the qbittorrent CLI project is not maintained anymore, while the Python library (https://github.com/rmartin16/qbittorrent-api) is maintained and very active (already supporting Python 3.12 and the latest qBittorrent release)
  • it is easier (for me at least) to develop / test / maintain

@stale stale bot removed the stale label Oct 3, 2023

rgaudin commented Oct 3, 2023

I think what this ticket misses is an (up-to-date) description of the problem this should solve, with example user scenarios.
The discussion already highlights that storage/selection/cleanup is core. It's important to clarify how our needs and Kiwix enthusiasts' needs align, for instance.


kelson42 commented Oct 14, 2023

> I don't understand whether we want to:
>
> 1. download missing content locally to be able to serve it
> 2. take advantage of existing local content, meaning it has to be installed on a mirror
>
> It looks like the initial idea was option 2, but we have switched to option 1, and I don't know why

We should be able to do both because:

> I see lots of advantages in option 2 because:
>
> * we already have tooling to mirror files; I don't see the benefit of re-implementing it with BitTorrent instead of rsync
> * it might save storage space and bandwidth for those who already have a mirror (and this is our case)
> * it will avoid complexities in our tooling (filtering what we want to download, deciding what has to be purged after which delay, ...)
> * it will avoid potential conflicts if installed somewhere a mirroring tool is already running (otherwise both the mirroring tool and the super-seeder would need write access to the same location)
> * it will work even for hidden ZIM files / non-ZIM content if installed on download.kiwix.org

Creating an HTTP mirror requires a lot more infrastructure effort than creating a BitTorrent super-seeder, the most obvious requirements being large, stable bandwidth and a fixed IP address. This is why both solutions don't really compete in the same field.

> The drawbacks of option 2 are that:
>
> * we need to detect which files are available locally to add them to qBittorrent (I've checked, it is capable of handling already existing files)

Yes, this should be trivial. I'm ready to reconsider the requirements if not.

> * we don't need to use the OPDS feed (I assume qBittorrent will check the file hash in any case before seeding it)

True, but that part is already implemented in the BitTorrent tracker; this is not really new work.

> * we need to detect which files have been removed so we can remove them from qBittorrent (but qBittorrent probably already handles this to some degree; it is quite common that users move downloaded files once the download is finished, and usually - at least in Transmission - the client does not restart the download)

True, I wonder whether this part is not also handled by the BitTorrent tracker!

> Looking at the existing proposal, I can't comment much on it; it's a shell script and that's clearly not a language I can comment on in depth. The overall logic is simple, so it looks like it will work. I don't know how many subtleties we might discover once running in real conditions.

If I remember correctly, I was almost done with the work and was just lacking time. I don't remember facing big challenges linked to subtleties.

> I wonder if we should instead write this additional tooling in Python, because:
>
> * the qbittorrent CLI project is not maintained anymore, while the Python library (https://github.com/rmartin16/qbittorrent-api) is maintained and very active
> * it is easier (for me at least) to develop / test / maintain

Nothing against this; it should be fairly easy. I wrote it in Bourne shell because I didn't want to impose Perl, as I cannot write Python myself. Actually, this is probably even a good idea.

@kelson42

For the following reasons, I believe the effort of completing this PR would be really helpful:

  • We have already, and recently, invested a significant amount of resources to improve download speed
  • The audience grows and we have HTTP mirrors which struggle
  • Kiwix Desktop (in particular the downloader) has made a significant quality jump with version 2.4.0. We would be ready there to better support BitTorrent (both download and upload).
  • Not all BitTorrent clients support Web seeds properly yet

For all these reasons, I believe we should now complete the super-seeder to guarantee that downloads via BitTorrent work as well as we could expect.


nemobis commented Oct 27, 2024

Nice to see some movement. I'm happy to help test this but I'll need some suggestions on what files to seed (the last times I tried to seed Kiwix torrents I failed to reach any meaningful ratio).

What I'd personally really like is a ruTorrent/transmission/other plugin to handle the addition and removal of torrents. That would be easy to install on top of any existing installation method, be it a web UI or a docker image.


kelson42 commented Oct 27, 2024

> Nice to see some movement. I'm happy to help test this but I'll need some suggestions on what files to seed (the last times I tried to seed Kiwix torrents I failed to reach any meaningful ratio).

For around two years now, we have had our own BitTorrent tracker. Therefore, for any "famous" ZIM file you will find peers to share bits with, not even counting the Web seeds.

This issue is only there to offer a guarantee that there is at least one peer (to download from).

> What I'd personally really like is a ruTorrent/transmission/other plugin to handle the addition and removal of torrents. That would be easy to install on top of any existing installation method, be it a web UI or a docker image.

To me this belongs to another issue, which is yet to be opened. I was not even aware it was possible to create such a plugin for Transmission.


nemobis commented Oct 27, 2024

> Therefore, for any "famous" ZIM file you will find peers to share bits with, not even counting the Web seeds.

OK. I've always had trouble finding peers via DHT or the public trackers. I just never get anyone leeching these days, probably because the web seeds are so fast. So I have no idea what to seed.

> To me this belongs to another issue, which is yet to be opened.

Maybe. On the other hand, it's compatible with the idea you wrote above:

> Script will retrieve the list of ZIM to mirror in the superseeder based on the OPDS feed

The "script" can be implemented as something that uses the transmission/rtorrent/other RPC.
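
As an illustration of that last point, the watcher logic can stay independent of the client by coding against a tiny RPC surface (a sketch only; `transmission_rpc` is a third-party package, and the host/port are placeholders):

```python
def sync_torrents(client, wanted_urls):
    """Add any missing torrents and return the names no longer wanted.

    `client` is any object exposing get_torrents() and add_torrent(url),
    e.g. a transmission_rpc.Client; keeping it generic means the same
    logic could sit on top of qBittorrent or rTorrent RPC as well.
    """
    def name_of(url):
        # ".../wikipedia_en_all.zim.torrent" -> "wikipedia_en_all.zim"
        return url.rsplit("/", 1)[-1].removesuffix(".torrent")

    current = {t.name for t in client.get_torrents()}
    for url in wanted_urls:
        if name_of(url) not in current:
            client.add_torrent(url)
    wanted_names = {name_of(url) for url in wanted_urls}
    return sorted(current - wanted_names)

# Hypothetical use with Transmission (not executed here):
#
#   from transmission_rpc import Client
#   client = Client(host="localhost", port=9091)
#   stale = sync_torrents(client, wanted_urls)
```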
