Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide torrent creation web API #19498

Closed
wants to merge 2 commits into from
Closed

Conversation

rcarpa
Copy link
Contributor

@rcarpa rcarpa commented Aug 23, 2023

Implement a new api controller to handle generation of .torrent files.
Add a new TorrentCreationManager singleton which keeps tracks of
torrent creation tasks. Use a separate ThreadPool in the Manager.
It acts as a queueing mechanism for creation tasks to avoid too many
tasks running in parallel.

Slightly adapt the TorrentCreator for the new needs: allow to return
the content of the generated .torrent file without requiring to write
it to disk at all. If no savePath is passed, the created torrent file
will be kept in memory. Also, communicate the actual pieceSize used for
generation (if it's set to 0, libtorrent selects it automatically).

Closes #5614.

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 23, 2023

Hi team. I would like to know your opinion about this change. I'm aware that the commit is not of a mergeable quality yet, but I would like to get opinions before spending too much time on it. Unfortunately, my C++ and Qt knowledge is quite rudimentary.

@glassez
Copy link
Member

glassez commented Aug 23, 2023

@rcarpa
I would approve it if it is brought to a suitable state.

I would strongly recommend that you divide the work into several atomic parts (within separate PRs):

  1. changing the torrent creator,
  2. adding a "create torrent" API.

The main claims to the currently suggested API (which caught the eye with a quick review):

  1. Separate action = separate API method,
  2. It is better to add a separate controller for this task. You can take a Search controller for example, because it has similar behavior: creating an asynchronous task, getting the status of some of the running tasks, etc.

You shouldn't touch any of .ts files.

@glassez
Copy link
Member

glassez commented Aug 23, 2023

Also please read https://github.com/qbittorrent/qBittorrent/blob/master/CODING_GUIDELINES.me and try to follow it as accurately as possible.

@luzpaz

This comment was marked as resolved.

@glassez

This comment was marked as resolved.

@rcarpa

This comment was marked as resolved.

@glassez

This comment was marked as resolved.

@luzpaz

This comment was marked as resolved.

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 24, 2023

I've reworked the code to create a new controller for the task. I'm not sure about the name metafile though. As requested, I also split the change into 2 separate commits: one which reworks the TorrentCreatorThread into TorrentCreator. The second adds the new end-points. I tried to follow the codding guidelines, but I don't use QtCreator, so i had to do it manually. Hope that not too many things slipped through my eyes and fingers.
As stated previously, I don't have a lot of experience with C++, so please be careful reviewing. I may be doing something very sub-optimal without even realizing.

@glassez
Copy link
Member

glassez commented Aug 24, 2023

As requested, I also split the change into 2 separate commits

It was requested to provide separate Pull Requests in order to simplify reviewing. When the one containing torrent creator changes is approved and merged we can start reviewing the next one that provides WebAPI.

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 24, 2023

Sorry, I misunderstood that. Please refer to #19500

@@ -204,14 +205,22 @@ void TorrentCreator::run()
entry["info"]["source"] = m_params.source.toStdString();

checkInterruptionRequested();
if (!m_params.savePath.isEmpty())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intended to allow empty save path?

Copy link
Contributor Author

@rcarpa rcarpa Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, whoever creates a torrent via the API doesn't necessarily want the torrent on the disk. This allows the workflow "create a torrent" -> "retrieve the torrent via api" -> "push the torrent via api some other servers" -> "let the transfer happen between all servers". In this use-case, temporary .torrent files are just an additional un-needed hassle to manage.

The current PR came from my desire to implement Bittorrent as an additional wire synchronization protocol between storage elements in rucio

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reality, the workflow will be more complicated. The workflow which I described will only used for files for which we don't (yet) know the bittorrent v2 merkle root and piece layers. But this avoids having to implement a custom agent to compute this data and communicate it to Rucio if the bittorrent client already runs on the storage server anyway. I proposed a related change to deluge too (deluge-torrent/deluge#430)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe it isn't needed to create torrent file if client need the data?
I would omit additional parameter and use only savePath to determine whether file or data to produce.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, if the client will try to fetch the file via the getFile action, it will open the file on disk? And fail if the file was removed ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to use create/status/result(singular)/delete

👍

It feels somehow strange to have a "start" and "stop" action for torrent creation.

I have never insisted on exact copying. The main thing is to catch the essence.

I added a separate TorrentCreationManager singleton.

👎
Well, I don't even want to look there. I believe in advance that this is unnecessary. (I don't want to sound arrogant, but my experience says so.)
Why do you need yet another manager? Isn't the controller itself enough for you to have "managing" logic there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭 this was a compete miss-understanding then, I thought that's what you were telling me to do in answer to

As for torrent creation jobs. You are correct. The jobs are bound to an api/session and they should get destroyed when the api/session is closed, but by default (I believe so?) bittorent keeps the session indefinitely. Do you have a suggested solution? I was thinking initially to keep track of jobs in the base/bittorrent/session singleton. But I'm not sure it's a good idea. Any suggestions?

When you said to check the SearchController. By storing the torrent creation tasks in a singleton rather than inside the controller, they are not bound to a web session anymore. Having them managed by a web session doesn't allow to enforce any limits on the number of total torrent creation tasks allowed globally in parallel: the user can always bypass the limit by logging in parallel to a new Web Session and submitting new tasks under the new session.

That being said: I don't have any strong preferences for any of the two versions:

  • I can rollback to the previous version, where everything was handled inside the controller, but had the limitations of impossibility to enforce limits + the limitation that creation tasks where not visible across 2 different WebSession (if we ever add the functionality to the web ui, the users will be surprised if the torrent creation jobs disappear after a disconnect).
  • I can keep the new version with the Manager which has some advantages, but introduces a new singleton. This seems to be non-desired according to your last comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcarpa
As I said before I still didn't deal with it carefully so I can miss some important details.
Could you first provide some general description of how you think it should be designed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glassez , for my own requirements, I just need a way to create a torrent file via the api. I really don't have any preferences concerning technical aspects of the implementation. I try to do my best to implement things in a way similar to what already exists and to follow your pointers to achieve a better-quality result, to fulfill the needs of the broader community of this project. However, it's my first contribution to the project and I don't have near enough high-level perspective to design it in a way which fits the long term vision of the project. It will be very useful if somebody with a long experience in the project will give it a thought and will tell what's desired and what's not desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcarpa
Hmm... it seems that there are some aspects of this case that I didn't realize at first. I need to think about it. Then I will try to summarize some of my judgments so that we can continue the discussion more productively.

@rcarpa rcarpa changed the title enable torrent creation via the api. #5614 Enable torrent creation via the api Aug 25, 2023
@glassez glassez changed the title Enable torrent creation via the api Provide torrent creation web API Aug 26, 2023
@glassez
Copy link
Member

glassez commented Aug 26, 2023

IMO, the main problem is that the handling of torrent creation tasks does not fit with "regular" use case of qBittorrent Web API, especially because of the fact that creating a torrent is a time-consuming process with an indefinite duration, which does not go well with the session-based model we use to access API. It may have several solutions, and it looks like they are all imperfect (maybe someone else will offer a better solution?).

  1. Limit it to a session.
    1.1. Make the user responsible for keeping the session until the end of the torrent creation process (for example, by regularly requesting the creation job status).
    1.2. Automatically keep the session until the end of the torrent creation process.

However, we should still have a global limit on the number of threads running simultaneously.

  1. Implement a global (application wide) torrent creation manager. Then no one will be required to keep a web session from the beginning to the end of the torrent creation process. You can close the web session after starting torrent creation and open new one in order to check its status or obtain the result.
    Its disadvantage is that the lifetime of the torrent creation task is not limited by anything other than the lifetime of the application, so if the client side forgets to delete some completed tasks, they will continue to hang until the application exits. Perhaps this could be improved a little by adding a limit on the number of existing tasks, so that if it is exceeded, creating a new one would delete the old one.

Both options above are not suitable for the use case when the user just wants to start creating a torrent, but does not control its process and result, at least by means of the API itself. (I don't know how possible this use case is.) However, the second option can be expanded with a parameter specifying that the "task" should be automatically deleted after completion.

What do you think?
@Chocobo1?

P.S. Could someone tell about any other aspects that need to be taken into account when solving this problem?

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 28, 2023

I just rebased the PR. No other changes.

Concerning your analysis, to me it sounds like (1) is the version before the last push; before introducing the Manager. While (2) is the current version. As I have both versions almost finalized, I don't have any particular preferences. It's your's and @Chocobo1 call to say what's better.
If my opinion is desired, I don't see having the lifetime of the torrent creation task unlimited as a problem in itself. It's not like they are created automatically in background. If the user chooses to never explicitly cleanup the old jobs and this eats all his memory: how is this different from any other user choice which can overload his resources? Also: there is always the option to implement a configuration flag and additional protection if really desired.
For non-power-users, which only use the WebUi, additional protection can be added on the UI side if ever this API is exposed on the WebUI.

@Chocobo1
Copy link
Member

Note that I didn't have enough free time to investigate it thoroughly, just some basic ideas.

  1. Implement a global (application wide) torrent creation manager.

I would chose this.

Both options above are not suitable for the use case when the user just wants to start creating a torrent, but does not control its process and result, at least by means of the API itself.

I'm not sure if 'task' concept is still required after the torrent creation ends. I would imagine if a user wants to check the results he can just look at the log (assuming we logged the result). The 'task' is still useful when the torrent creation is running which let the user to cancel/stop the operation.

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 29, 2023

I'm not sure if 'task' concept is still required after the torrent creation ends

How would a user retrieve the created .torrent file ?

Edit: I assume you suggest to directly add all created torrents to the session? While this is a possibility, I'm, for example, interested in the use-case when a torrent file is created without adding it to the session.

@glassez
Copy link
Member

glassez commented Aug 29, 2023

I'm, for example, interested in the use-case when a torrent file is created without adding it to the session.

👍
I don't think we shouldn't provide such an opportunity.

I would imagine if a user wants to check the results he can just look at the log (assuming we logged the result).

At first glance, this is not so trivial for an pure WebAPI (not WebUI) user.

@glassez
Copy link
Member

glassez commented Aug 29, 2023

Note that I didn't have enough free time to investigate it thoroughly, just some basic ideas.

Personally, I am interested in general questions. I could help with the details myself.

@Chocobo1
Copy link
Member

How would a user retrieve the created .torrent file ?

Currently there is no mechanism to send generic files or at least it wouldn't be a positive user experience with the current architecture. So for now I won't consider it.

Generally speaking, if a web user only wants to create a .torrent file, there are existing browser side implementation: https://kimbatt.github.io/torrent-creator/ and I don't mind seeing it embedded into qbt.
As for the story of server-side torrent creation, I reckon it is mainly for users that already have the data on the server and wish to seed it. Of course the user can choose not to add it to session after creation but it also acceptable for me to let the user to find the created .torrent file on server/target path manually without providing an explicit 'creation result'.

@glassez
Copy link
Member

glassez commented Aug 29, 2023

As for the story of server-side torrent creation, I reckon it is mainly for users that already have the data on the server and wish to seed it.

I'm starting to lean towards the same opinion. Indeed, what could be the purpose of creating a torrent file for existing data if they are not going to seed it?
Really, if the user only wants to create a torrent file, then he can use some existing third-party tool.

@glassez
Copy link
Member

glassez commented Aug 29, 2023

@Chocobo1
But even if we allow only such a use case (creating a torrent with adding it to the session), how could the user check the result of the operation, other than parsing the log? (this refers more to an unsuccessful creation)

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 29, 2023

I have an use-case when the ability to create the torrent on the remote server without immediately adding it to the session will come in handy. This is the use-case which triggered my work on this particular PR.

I'm a developer for a data-management application(rucio. It its essence, it's a catalog of many large files. These files are distributed on hundreds of storage servers around the globe. Files have to be regularly moved between storage servers. For example, to ensure redundancy; or data locality: move multiple related files close to a compute cluster to perform a data analysis on these files. Files are generated once, but moved many times. They are frequently quite big: 1GB+.

I'm willing to add BitTorrent support as one of possible ways to execute the transfers between (a subset of) servers. The idea is to have a BitTorrent agent (ie: qBittorrent) run on each server which desires to use this protocol for data transfer. I'm restricting myself to only using bittorent v2. Files are transferred frequently, so generation of the merkle sha256 tree and piece layers (.torrent file content) of each file, each time, is counter productive. This information will thus be stored in our metadata catalog for each file and re-used for each transfer. However, when a new file is encountered, this information has to be computed. It has to be done on the storage server (as the only place which actually has the file). And I don't necessarily want to start seeding the file immediately. I may want to:

  • create a torrent with multiple files before adding to the session (by embedding into it additional files, for which I already have a pre-computed merkle sha256 tree)
  • add a tls certificate to the torrent file (https://libtorrent.org/manual-ref.html#ssl-torrents) before adding it to the session

I agree that it's possible to use a third-party tool for that. But this means having to add an additional tool. And setting up one more communication channel with the storage server to retrieve the output of that tool. At the same time, qBittorrent is already running on the server and is equipped with everything needed to build the torrent and allow me to fetch it.

That being said, I don't want to push my use-case onto the whole qBittorrent community if it's judged way too specific to serve anybody's else's needs. But it gives one possible answer to :

what could be the purpose of creating a torrent file for existing data if they are not going to seed it?

@glassez
Copy link
Member

glassez commented Aug 29, 2023

@Chocobo1 But even if we allow only such a use case (creating a torrent with adding it to the session), how could the user check the result of the operation, other than parsing the log? (this refers more to an unsuccessful creation)

So, considering that we can make "torrent creation task" objects quite lightweight, we can still store a certain number of completed "tasks", having limited them with something like a circular buffer.

@glassez
Copy link
Member

glassez commented Aug 29, 2023

How would a user retrieve the created .torrent file ?

Currently there is no mechanism to send generic files or at least it wouldn't be a positive user experience with the current architecture. So for now I won't consider it.

So we could not make the function of receiving files to be general purpose, but limit it only to the possibility of receiving files created within some of "torrent creation task".

@rcarpa
Copy link
Contributor Author

rcarpa commented Aug 29, 2023

Actually, there is already a function which does a similar action . It fetches the torrent from the session and exports it. (I plan to use this work-around if my try to support creation without adding to session is not accepted 😁 )

void TorrentsController::exportAction()

How will an action to fetch the result of a creation task be different?

@rcarpa rcarpa force-pushed the master branch 2 times, most recently from 7778669 to b906ba5 Compare September 11, 2023 16:42
@rcarpa
Copy link
Contributor Author

rcarpa commented Sep 11, 2023

I implemented a limit on the maximum number of tasks. For that, I relied on boost::multi_index as a container. I hope this isn't problematic. I saw that boost is already used by the project, but I don't know how OK it is to use all its functionalities. Thanks to it, tasks can be accessed by two indexes: either by the task id; or by the completion status and date. This avoids the need for maintaining two separate data structures and synchronize state between them.

I slightly deviated from the agreement above. Instead of limiting the number of completed tasks, I limited the total number of tasks (including incomplete). However, when the limit is reached, but a new task is submitted, the oldest completed task will be automatically removed. If all tasks are incomplete, an error is returned to the user. This is because I wasn't able to find a way to implement a limit on completed tasks without keeping track of them in a separate data structure (or, alternatively, iterating over all tasks each time in O(n)). If you have better ideas on how to achieve the goal, I'd be happy to learn about them.

@luzpaz
Copy link
Contributor

luzpaz commented Oct 28, 2023

bumping for progress

@luzpaz
Copy link
Contributor

luzpaz commented Nov 5, 2023

Testers needed here

@nostrus-dominion
Copy link

I'll take a look at this when I get a chance but work has me buried up to my neck. Looks promising.

Copy link

github-actions bot commented Jan 6, 2024

This PR is stale because it has been 60 days with no activity. This PR will be automatically closed within 7 days if there is no further activity.

@github-actions github-actions bot added the Stale label Jan 6, 2024
@zotabee
Copy link

zotabee commented Jan 6, 2024

bumping

@github-actions github-actions bot removed the Stale label Jan 7, 2024
Copy link

@nostrus-dominion nostrus-dominion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not the best reader for C but everything here looks good for me.

@@ -0,0 +1,50 @@
/*
* Bittorrent Client using Qt and libtorrent.
* Copyright (C) 2018 Thomas Piccirello <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcarpa
Your copyright should be here. And in those files that you have changed, your copyright should be added to the existing ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed them in a new commit. Feel free to squash it into the original one when you'll be working on finalizing the PR.
I see however that these copyrights aren't quite kept up to date. Aren't they just duplicating information which is available in the git history anyway, while introducing risks of merge conflicts?

@glassez glassez self-assigned this Jan 24, 2024
@glassez glassez added Core WebAPI WebAPI-related issues/changes labels Jan 24, 2024
@glassez glassez marked this pull request as draft January 24, 2024 03:31
Implement a new api controller to handle generation of .torrent files.
Add a new TorrentCreationManager singleton which keeps tracks of
torrent creation tasks. Use a separate ThreadPool in the Manager.
It acts as a queueing mechanism for creation tasks to avoid too many
tasks running in parallel.

Slightly adapt the TorrentCreator for the new needs: allow to return
the content of the generated .torrent file without requiring to write
it to disk at all. If no savePath is passed, the created torrent file
will be kept in memory. Also, communicate the actual pieceSize used for
generation (if it's set to 0, libtorrent selects it automatically).

By default, the created torrents will be added to the session. This
behavior can be disabled by setting `startSeeding = false`.

The maximum number of tasks in the manager is bounded by a configuration
value. If this limit is reached, the manager will automatically remove
the oldest completed task. If no such task exists (all tasks are pending
or active), an error will be returned to the user.

Closes qbittorrent#5614.
@rcarpa rcarpa force-pushed the master branch 2 times, most recently from 69e15b8 to ee71824 Compare January 24, 2024 07:59
@glassez
Copy link
Member

glassez commented Feb 3, 2024

Superseded by #20366.

@glassez
Copy link
Member

glassez commented Feb 5, 2024

@rcarpa
Would you mind to test #20366?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Core WebAPI WebAPI-related issues/changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support creating new torrents using the WebUI
7 participants