purge Galaxy users and their data/histories/data libraries - recommended approach? #6603

Closed
AjitPS opened this issue Aug 15, 2018 · 6 comments

@AjitPS

AjitPS commented Aug 15, 2018

As the Admin of the Galaxy instance at our institute, I am trying to free up storage space by removing users who have left the institute or no longer use Galaxy (i.e., whose last login was more than 3 years ago).

I can go into the Admin panel and delete a user, then purge them. But that doesn't remove all the data they used, so the disk space/quota is not freed (similar to #930).

After purging some users, I tried running the clean-up scripts (in the recommended order), but that doesn't seem to free up space either: https://galaxyproject.org/admin/config/performance/purge-histories-and-datasets/#deleting-datasets-purging-dataset-instances

What is the recommended way to permanently remove a user account along with their data/histories? Thanks.

@mvdbeek
Member

mvdbeek commented Aug 15, 2018

So these instructions always require histories or HistoryDatasetAssociations to be deleted before any space is freed by deleting the actual dataset. I am not sure that deleting a user also deletes their histories, but that would be reasonable, I think.
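
To check whether this is what is happening, a query along the following lines could list histories that still belong to purged users but are not themselves marked deleted. This is only a sketch: it assumes the Galaxy database has `galaxy_user` and `history` tables with `deleted`/`purged` flags and a `history.user_id` column, and it runs against a throwaway in-memory SQLite database with made-up rows rather than a real Galaxy database.

```python
import sqlite3

# Toy stand-in for the Galaxy database. Table and column names are
# assumptions about the schema; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE galaxy_user (id INTEGER PRIMARY KEY, email TEXT,
                          deleted BOOLEAN, purged BOOLEAN);
CREATE TABLE history (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT,
                      deleted BOOLEAN, purged BOOLEAN);
INSERT INTO galaxy_user VALUES (1, 'gone@example.org', 1, 1),
                               (2, 'active@example.org', 0, 0);
INSERT INTO history VALUES (10, 1, 'old analysis', 0, 0),
                           (11, 1, 'already deleted', 1, 0),
                           (12, 2, 'current work', 0, 0);
""")

# Histories owned by purged users that are not flagged as deleted.
ORPHANED_HISTORIES = """
SELECT h.id, h.name, u.email
FROM history h
JOIN galaxy_user u ON u.id = h.user_id
WHERE u.purged AND NOT h.deleted
ORDER BY h.id
"""

orphaned = conn.execute(ORPHANED_HISTORIES).fetchall()
for row in orphaned:
    print(row)
```

If such a query returns rows on a real instance, that would be consistent with the cleanup scripts skipping those histories.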

@mvdbeek
Member

mvdbeek commented Aug 15, 2018

So one way to do this is to mark the deleted users' histories as deleted, and I will see if we can add an option for that to the cleanup scripts.

@AjitPS
Author

AjitPS commented Aug 15, 2018

Thanks. It would be useful if, when an Admin deletes a user via the Admin Panel and then purges the account, all of that user's histories and datasets were marked as deleted so that the clean-up scripts can remove them.

@AjitPS
Author

AjitPS commented Aug 15, 2018

What I currently have are 30+ deleted and purged user accounts (including some using 4-5 TB of disk space) on our institute's Galaxy instance, but their disk space is still shown as in use.

If I now undelete them via the admin panel, I can impersonate them but find no data/history in their account.

However, the clean-up script doesn't free up any of the space we expected. I guess the user is marked deleted and purged, but the history/dataset is left as is in /database/files/. I thought it would count as a userless history, so that the clean-up script would remove it, but that doesn't seem to be the case.

Any advice on how to proceed and free up space would be helpful. Thanks.

@AjitPS
Author

AjitPS commented Aug 17, 2018

If the histories of all purged users were set to be userless, then running the clean-up script scripts/cleanup_datasets/delete_userless_histories.sh would free up all their disk space.

@hexylena
Member

This will be fixed in #8309

I guess the current workaround would be to write some SQL to mark those users' histories as deleted, and then run the cleanup scripts.
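
A minimal sketch of what that SQL might look like, run here against an in-memory SQLite database with invented rows so it can be tried safely. The table and column names (`galaxy_user.purged`, `history.user_id`, `history.deleted`) are assumptions about the Galaxy schema and should be checked against the actual database first; on PostgreSQL the boolean literal would be `true` rather than `1`.

```python
import sqlite3

# Toy stand-in for the Galaxy database; schema names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE galaxy_user (id INTEGER PRIMARY KEY, email TEXT,
                          deleted BOOLEAN, purged BOOLEAN);
CREATE TABLE history (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT,
                      deleted BOOLEAN, purged BOOLEAN);
INSERT INTO galaxy_user VALUES (1, 'gone@example.org', 1, 1),
                               (2, 'active@example.org', 0, 0);
INSERT INTO history VALUES (10, 1, 'old analysis', 0, 0),
                           (11, 1, 'already deleted', 1, 0),
                           (12, 2, 'current work', 0, 0);
""")

# Flag every not-yet-deleted history owned by a purged user as deleted.
MARK_DELETED = """
UPDATE history
SET deleted = 1
WHERE NOT deleted
  AND user_id IN (SELECT id FROM galaxy_user WHERE purged)
"""

# "with conn" commits on success and rolls back if an exception is raised.
with conn:
    updated = conn.execute(MARK_DELETED).rowcount

print(f"histories newly marked deleted: {updated}")
```

On a real instance it would be prudent to back up the database and run the update inside a transaction before running the cleanup scripts afterwards.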
