Globus integration: improve handling of downloads #11057
After another collection of NESE-stored, Globus-accessible datasets was published at IQSS, the combination of a large volume of data, large numbers of files per dataset, and the apparent popularity of the data with actual users has exposed some limitations of the Globus download framework.
(starting a list of issues that need to be addressed below; work in progress)
The same problem that was addressed for uploads in "Improved handling of Globus uploads (experimental async framework)" #10781 (merged in Sept.) clearly affects downloads as well. Even when everything is working as it should, the current implementation's reliance on continuous looping for the duration of the remote transfer is bound to cause problems. So the same async framework, in which the state of an ongoing upload is saved in the database, needs to be implemented for downloads as well.
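To illustrate the idea (this is not the Dataverse implementation), here is a minimal Python sketch of saving task state in a database and checking it in short, periodic passes instead of looping for the whole transfer. The table name `download_task`, the functions `open_store`, `register_task` and `poll_once`, and the `fetch_status` / `revoke_rule` callbacks are all hypothetical; it assumes the remote status is one of the Globus task states such as `ACTIVE`, `SUCCEEDED` or `FAILED`.

```python
import sqlite3

TERMINAL = ("SUCCEEDED", "FAILED")

def open_store(path=":memory:"):
    """Open (or create) the hypothetical table that tracks ongoing downloads."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS download_task ("
        "task_id TEXT PRIMARY KEY, rule_id TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn

def register_task(conn, task_id, rule_id):
    """Record a new download task so it survives restarts and needs no loop."""
    conn.execute(
        "INSERT INTO download_task VALUES (?, ?, 'ACTIVE')", (task_id, rule_id)
    )
    conn.commit()

def poll_once(conn, fetch_status, revoke_rule):
    """One short pass over unfinished tasks; meant to be run from a timer.

    fetch_status(task_id) returns the current remote status string;
    revoke_rule(rule_id) removes the access permission. Both are
    caller-supplied stand-ins for the real Globus calls.
    """
    rows = conn.execute(
        "SELECT task_id, rule_id FROM download_task "
        "WHERE status NOT IN ('SUCCEEDED', 'FAILED')"
    ).fetchall()
    for task_id, rule_id in rows:
        status = fetch_status(task_id)
        if status in TERMINAL:
            # Only a finished task has its permission removed.
            revoke_rule(rule_id)
        conn.execute(
            "UPDATE download_task SET status = ? WHERE task_id = ?",
            (status, task_id),
        )
    conn.commit()
```

Each call to `poll_once` does a bounded amount of work and returns, so no thread is held for the duration of a transfer, and a restart loses nothing because the pending tasks are read back from the table.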
When problems arise (such as network problems, or issues with the Globus service on the data storage end, as may have been the case last week), Dataverse handles them poorly in a number of ways. The simplest example: when Globus Service gets back
"status": "ACTIVE", … "nice_status": "CONNECT_FAILED"
when checking on an ongoing download task, it assumes that the task has failed beyond recovery and proceeds to remove the permission. (All the response means is that there was a failure to connect to the remote Globus service; "ACTIVE" means just that: the remote client will keep trying to reconnect, which is bound to keep failing once the permission has been permanently removed, even if the service becomes available.)
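A minimal sketch of the intended decision logic, in Python for brevity rather than as Dataverse code: the top-level `status` field alone decides whether the task is finished, and `nice_status` is treated as diagnostic detail only. The `Action` enum and the `decide` function are hypothetical names; treating `INACTIVE` and unrecognized values as "keep waiting" is an assumption, not something stated in this issue.

```python
from enum import Enum

class Action(Enum):
    KEEP_WAITING = "keep_waiting"      # leave the permission in place
    FINISH_SUCCESS = "finish_success"  # transfer done; permission can go
    FINISH_FAILURE = "finish_failure"  # transfer failed for good

def decide(task: dict) -> Action:
    """Decide what to do with a download task from its status response.

    Only the top-level "status" is considered. A "nice_status" such as
    "CONNECT_FAILED" on an ACTIVE task means the remote client is still
    retrying, so the permission must not be removed yet.
    """
    status = task.get("status")
    if status == "SUCCEEDED":
        return Action.FINISH_SUCCESS
    if status == "FAILED":
        return Action.FINISH_FAILURE
    # ACTIVE, INACTIVE, or anything unrecognized: assume still in progress.
    return Action.KEEP_WAITING
```

With this rule, the response quoted above (`"status": "ACTIVE"`, `"nice_status": "CONNECT_FAILED"`) maps to `Action.KEEP_WAITING`, so the permission stays in place while the remote client retries.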