Cross-volume cloning of storage vfolders #3432

Open
lizable opened this issue Jan 13, 2025 — with Lablup-Issue-Syncer · 0 comments

lizable commented Jan 13, 2025

Motivation  

Current Limitation

Cloning large vfolders (hundreds of GiB to several TiB) across storage hosts (volumes) requires special attention to resumability after interruptions, because in high-load systems we frequently observe storage failures such as inadvertent unmounts of storage volumes.

Also, it is impossible to use per-backend acceleration features such as tree copies within the same filesystem, because the data must be transferred over the network.

Therefore, we currently do not support cross-volume vfolder cloning. This should be actively disabled/discouraged in the WebUI and Control Panel until we properly implement the feature.

Goals

  • Define an interface to prepare, perform, and inspect the cloning procedure & progress.
    • We should also design the UI for displaying such long-running, resumable background jobs with progress reporting.
  • Make it resumable.
  • Enable progress tracking (though it may be difficult to implement “perfect” progress).
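
To make the goals above more concrete, here is a minimal sketch of what a prepare / perform / inspect interface could look like. All names (`CloneJob`, `CloneStatus`, `prepare_clone`, `perform_clone`, `inspect_job`) are hypothetical and are not part of the existing storage-proxy API. Resumption is only file-granular here: a file is skipped when a destination file of the same size already exists, which is an assumption a real implementation would likely need to strengthen (e.g. with checksums or mtimes).

```python
from __future__ import annotations

import enum
import os
import shutil
from dataclasses import dataclass
from pathlib import Path


class CloneStatus(enum.Enum):
    PREPARED = "prepared"
    RUNNING = "running"
    INTERRUPTED = "interrupted"
    DONE = "done"


@dataclass
class CloneJob:
    """Hypothetical job record; a real one would be persisted somewhere."""
    src: Path
    dst: Path
    total_bytes: int
    copied_bytes: int = 0
    status: CloneStatus = CloneStatus.PREPARED


def prepare_clone(src: Path, dst: Path) -> CloneJob:
    # Scan the source once so that progress can be reported as a byte ratio.
    total = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())
    return CloneJob(src=src, dst=dst, total_bytes=total)


def perform_clone(job: CloneJob) -> CloneJob:
    # Safe to call again after an interruption: finished files are skipped.
    job.status = CloneStatus.RUNNING
    job.copied_bytes = 0
    try:
        for path in sorted(job.src.rglob("*")):
            target = job.dst / path.relative_to(job.src)
            if path.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            if not path.is_file():
                continue
            size = path.stat().st_size
            if not (target.is_file() and target.stat().st_size == size):
                target.parent.mkdir(parents=True, exist_ok=True)
                # Copy to a temporary name first so that an interrupted copy
                # never leaves a file that looks complete.
                tmp = target.with_name(target.name + ".part")
                shutil.copy2(path, tmp)
                os.replace(tmp, target)
            job.copied_bytes += size
    except OSError:
        job.status = CloneStatus.INTERRUPTED
        raise
    job.status = CloneStatus.DONE
    return job


def inspect_job(job: CloneJob) -> dict:
    # Progress is approximate: it counts whole files only.
    ratio = 1.0 if job.total_bytes == 0 else job.copied_bytes / job.total_bytes
    return {"status": job.status.value, "progress": min(ratio, 1.0)}
```

A caller would run `prepare_clone()` once, call `perform_clone()` (again after any failure), and poll `inspect_job()` from the UI. Keeping the three steps separate is one way to let a background task framework own the long-running part.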

Considerations / Approaches

The simplest implementation would be shutil.copytree() in the storage proxy, where the host has local mounts of both storage volumes. However, this has severe limitations on resumability and progress tracking.
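
For reference, the baseline could look roughly like the sketch below, using `shutil.copytree()` with a counting `copy_function`. The helper name `copytree_with_progress` is made up for illustration. It shows the limitations mentioned above: progress is only known per finished file, and re-running after a failure copies every file again because nothing is skipped.

```python
import shutil
from pathlib import Path


def copytree_with_progress(src: str, dst: str) -> int:
    """Copy src into dst and return the number of bytes copied."""
    copied = 0

    def counting_copy(s: str, d: str) -> str:
        nonlocal copied
        result = shutil.copy2(s, d)
        # Progress only advances after a whole file has been copied.
        copied += Path(s).stat().st_size
        return result

    # dirs_exist_ok=True (Python 3.8+) allows re-running into an existing
    # destination, but every file is still copied again from scratch.
    shutil.copytree(src, dst, copy_function=counting_copy, dirs_exist_ok=True)
    return copied
```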

  • Idea: Incorporate rsync to synchronize two local paths on the storage-proxy host, offloading our implementation effort for resuming and progress tracking. (ref: https://chatgpt.com/share/6784ad0b-3774-8000-8ace-50af8e0fb2a4)
    • Preferably, we could ship a statically built rsync binary together with the storage proxy.
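
A rough sketch of the rsync idea, assuming rsync 3.1 or later is available on the storage-proxy host (for `--info=progress2`). The helper names `build_rsync_command`, `parse_progress_line`, and `run_clone` are hypothetical. Re-running the same command skips files that were already fully transferred, and `--partial` keeps partially transferred files instead of deleting them; whether that is enough for resuming inside a single huge file would still need to be verified against the rsync options we end up choosing.

```python
from __future__ import annotations

import re
import subprocess
from pathlib import Path
from typing import Callable, Optional, Tuple

# Matches overall-progress lines such as:
#   "  1,234,567  45%  12.34MB/s    0:00:12 (xfr#3, to-chk=10/20)"
_PROGRESS_RE = re.compile(r"^\s*([\d,]+)\s+(\d+)%")


def build_rsync_command(src: Path, dst: Path) -> list[str]:
    # The trailing slash on the source means "copy the contents of src".
    return ["rsync", "-a", "--partial", "--info=progress2", f"{src}/", f"{dst}/"]


def parse_progress_line(line: str) -> Optional[Tuple[int, int]]:
    """Return (bytes_transferred, percent) or None for non-progress lines."""
    m = _PROGRESS_RE.match(line)
    if m is None:
        return None
    return int(m.group(1).replace(",", "")), int(m.group(2))


def run_clone(src: Path, dst: Path, on_progress: Callable[[int, int], None]) -> int:
    """Run rsync and report progress; returns rsync's exit code."""
    proc = subprocess.Popen(
        build_rsync_command(src, dst),
        stdout=subprocess.PIPE,
        text=True,  # universal newlines: rsync's "\r" updates arrive as lines
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        parsed = parse_progress_line(line)
        if parsed is not None:
            on_progress(*parsed)
    return proc.wait()
```

A non-zero exit code from `run_clone()` would mark the job as interrupted, and calling it again with the same paths resumes the synchronization.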
@lizable lizable changed the title Need to implement clone on heterogeneous storage host Cross-volume cloning of storage vfolders Jan 13, 2025