fix: backup and restore progress updates in a more meaningful way. #3124

Open
wants to merge 12 commits into
base: develop

kl3jvi
Contributor

@kl3jvi kl3jvi commented Nov 23, 2024


PR Submission Checklist for internal contributors

  • The PR Title

    • conforms to the style of semantic commit messages¹ supported in Wire's Github Workflow²
    • contains a reference JIRA issue number like SQPIT-764
    • answers the question: If merged, this PR will: ... ³
  • The PR Description

    • is free of optional paragraphs and you have filled the relevant parts to the best of your ability

What's new in this PR?

Issues

Progress not updating truthfully

Causes (Optional)

The previous implementation reported fake progress: it counted up with a fixed delay regardless of the actual work, even when only a small file was generated.
I tested the new implementation by compressing a 1 GB file, and the progress updates smoothly.

Solutions

Implemented a progress tracking system for file compression, which calculates the overall progress by combining the uncompressed and compressed data written. The progress is updated based on both the total bytes of uncompressed files and the size of the compressed output, with a weighted emphasis on the compressed data.
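
The weighted combination described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the function name, the parameters, and the 0.7 weight are assumptions chosen for illustration, since the description does not state the actual values.

```kotlin
// Hypothetical weight given to the compressed output; the PR does not state the real value.
const val COMPRESSED_WEIGHT = 0.7f

// Fraction of work done, clamped to 0..1. An unknown or zero total counts as no progress.
private fun fraction(done: Long, total: Long): Float =
    if (total <= 0L) 0f else (done.toFloat() / total).coerceIn(0f, 1f)

// Combines "uncompressed bytes read" and "compressed bytes written" into one value in 0..1.
fun overallProgress(
    uncompressedRead: Long,
    uncompressedTotal: Long,
    compressedWritten: Long,
    estimatedCompressedTotal: Long, // assumption: the compressed size has to be estimated up front
    compressedWeight: Float = COMPRESSED_WEIGHT
): Float {
    require(compressedWeight in 0f..1f) { "compressedWeight must be in 0..1" }
    val readFraction = fraction(uncompressedRead, uncompressedTotal)
    val writtenFraction = fraction(compressedWritten, estimatedCompressedTotal)
    val combined = compressedWeight * writtenFraction + (1f - compressedWeight) * readFraction
    return combined.coerceIn(0f, 1f)
}
```

Clamping both fractions keeps the reported value monotonic at the edges even if the compressed-size estimate turns out to be too small.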

Dependencies (Optional)

If there are some other pull requests related to this one (e.g. new releases of frameworks), specify them here.

Needs releases with:

Testing

Test Coverage (Optional)

  • I have added automated tests to this contribution

How to Test

Briefly describe how this change was tested and if applicable the exact steps taken to verify that it works as expected.

Notes (Optional)

Specify here any other facts that you think are important for this issue.

Attachments (Optional)

Attachments like images, videos, etc. (drag and drop in the text box)


PR Post Submission Checklist for internal contributors (Optional)

  • Wire's Github Workflow has automatically linked the PR to a JIRA issue

PR Post Merge Checklist for internal contributors

  • If any sort of configuration variable was introduced by this PR, it has been added to the relevant documents and the CI jobs have been updated.

References
  1. https://sparkbox.com/foundry/semantic_commit_messages
  2. https://github.com/wireapp/.github#usage
  3. E.g. feat(conversation-list): Sort conversations by most emojis in the title #SQPIT-764.

@MohamadJaara
Member

@kl3jvi can you please fill in the template with a description of how it is supposed to work now?

Member

@vitorhugods vitorhugods left a comment

Thanks for the initiative!
We are in the process of refactoring the way we perform/restore backups, and while the progress indication isn't our top priority, it is definitely something we could easily improve :)

I just have a few concerns:

  1. The backpressure and cancellation issues I mentioned: I think a Flow would be more flexible here.
  2. You mentioned "backup and restore", but I don't see changes in the Restore part. This is fine, we can just rename the PR title/commit so it matches what's actually in the diff.
  3. There may be another significant slow step that is being ignored, namely exportToPlainDB.

I think progress could at least be split into three steps:

  • Exporting/Importing
  • Zipping/Unzipping (covered by this PR)
  • Encrypting/Decrypting (covered by this PR)

And, ideally, we should report progress for each of these steps.

Our current exportToPlainDB is actually a single SQL query, so we are unable to track progress. However, the next version of the Backup format will be done in chunks, which should make it possible.
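
One way to fold the three steps listed above into a single overall value is sketched below. The step names, their relative weights, and the function name are assumptions made for illustration; they are not taken from the PR or from Kalium.

```kotlin
// Hypothetical steps with illustrative relative weights (not measured values).
enum class BackupStep(val weight: Int) {
    EXPORTING(5),
    ZIPPING(3),
    ENCRYPTING(2)
}

// Maps "progress within the current step" (0..1) to overall progress (0..1).
// All steps before the current one are counted as fully done.
fun stagedProgress(current: BackupStep, stepProgress: Float): Float {
    val steps = BackupStep.values()
    val totalWeight = steps.sumOf { it.weight }
    val completedWeight = steps.filter { it.ordinal < current.ordinal }.sumOf { it.weight }
    val partialWeight = current.weight * stepProgress.coerceIn(0f, 1f)
    return ((completedWeight + partialWeight) / totalWeight).coerceIn(0f, 1f)
}
```

With the export currently being a single SQL query, the EXPORTING step could only report 0 and then 1 under this scheme; a chunked export would let it report intermediate values as well.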

})
} finally {
databaseExporter.deleteBackupDBFile()
override suspend operator fun invoke(password: String, onProgress: (Float) -> Unit): CreateBackupResult =
Member

I guess a callback works, but it could be a bit safer to instead return a Flow that emits progress.

A Flow can be configured to handle backpressure more gracefully. For example, what if the backup creation calls onProgress faster than it can be executed?

I think it would be nicer to just return a Flow<BackupProgress> instead.
And BackupProgress could be something like:

sealed interface BackupProgress {
    data class Complete(val result: CreateBackupResult): BackupProgress
    data class Ongoing(val progress: Float): BackupProgress
}

Another argument for Flow:
if we change the implementation of CreateBackupUseCase and it runs on a different scope, this different scope would keep a reference to onProgress, which could lead to a leak.

It's easier to just return a flow and if it is cancelled, it is cancelled :)
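
A Flow-based version of this suggestion might look roughly like the sketch below, assuming kotlinx.coroutines is available. CreateBackupResult is replaced by a hypothetical stand-in, and createBackup, its totalChunks parameter, and collectBackupEvents are invented for illustration; none of them are the real Kalium API.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.conflate
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the project's CreateBackupResult.
sealed interface CreateBackupResult {
    data class Success(val path: String) : CreateBackupResult
    data class Failure(val reason: String) : CreateBackupResult
}

sealed interface BackupProgress {
    data class Ongoing(val progress: Float) : BackupProgress
    data class Complete(val result: CreateBackupResult) : BackupProgress
}

// Hypothetical use case: emits one Ongoing value per processed chunk, then the final result.
fun createBackup(totalChunks: Int): Flow<BackupProgress> = flow {
    require(totalChunks > 0) { "totalChunks must be positive" }
    for (chunk in 1..totalChunks) {
        // The real zipping/encrypting work for this chunk would happen here.
        emit(BackupProgress.Ongoing(chunk.toFloat() / totalChunks))
    }
    emit(BackupProgress.Complete(CreateBackupResult.Success("backup.zip")))
}

// Blocking helper for demonstration only; real callers would collect in their own scope.
// With conflated = true, a slow collector skips stale values and only sees the latest ones.
fun collectBackupEvents(totalChunks: Int, conflated: Boolean = false): List<BackupProgress> =
    runBlocking {
        val events = createBackup(totalChunks)
        (if (conflated) events.conflate() else events).toList()
    }
```

Because the flow is cold, nothing runs until it is collected, and cancelling the collecting coroutine stops the backup at the next emit, so no callback reference outlives the caller.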

Contributor Author

My initial implementation was done using a Flow. I need to ask about the iOS implementation, though. The actual function returns Either, so changing that would require implementing the same functionality on iOS.

I know that is the best approach, and I will certainly try to implement it.

Thanks for the input, man.

Member

iOS and Web do not use the code in :kalium:logic and they're a long way from this at the moment. They have their own code, their own databases, etc.

We are creating a Kotlin Multiplatform library for Backup that will live inside Kalium and it should be the first one.

It is going to be a new module called backup, which we will publish as a stand-alone library for Web and iOS.
It will not be responsible for exporting the data, only for creating the file, zipping, encrypting, and reversing these operations.

Web, iOS and :kalium:logic will call this library with the data they want to back up, and call it with the file to get the restored data.

Contributor Author

Amazing, then I will refactor this to use a Flow 👍🏻

deletePreviousBackupFiles(backupFilePath)

val plainDBPath =
databaseExporter.exportToPlainDB(securityHelper.userDBOrSecretNull(userId))?.toPath()
Member

question: Doesn't this exportToPlainDB also consume a long time?

I really don't know 😅.
This is the step that literally takes all the relevant content in the user's DB and exports it, which can be a ton of data.

Contributor Author

I can only test with our chats, as I don't have a large dataset to test with.
Is there any way I can generate a lot of random conversations through scripting or API calls?
