Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ElementR | backup: call expensive roomKeyCounts less often #4015

Merged
merged 16 commits into from
Jan 22, 2024

Conversation

BillCarsonFr
Copy link
Member

@BillCarsonFr BillCarsonFr commented Jan 18, 2024

Fixes element-hq/element-web#26893

Avoid calling the expensive room count at each backup iteration

Checklist

  • Tests written for new code (and old code if feasible)
  • Linter and other CI checks pass
  • Sign-off given on the changes (see CONTRIBUTING.md)

Here's what your changelog entry will look like:

🐛 Bug Fixes

numFailures = 0;
if (this.stopped) break;
// Keys count performance (`olmMachine.roomKeyCounts()`) can be pretty bad on some configurations (big database in FF).
// We detected on some M1 mac that when the object store reach a threshold, the count performances stops growing in O(n) and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be useful to note what that threshold roughly is, for future readers

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added here ac383aa

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions

src/rust-crypto/backup.ts Outdated Show resolved Hide resolved
return;
}

try {
await this.outgoingRequestProcessor.makeOutgoingRequest(request);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit confusing that the call to roomKeyCounts now happens before the call to makeOutgoingRequest. I feel like it would be simpler the other way around:

                    await this.outgoingRequestProcessor.makeOutgoingRequest(request);
                    numFailures = 0;
                    if (this.stopped) break;

                    // ... comments about performance ...
                    if (!isFirstIteration) {
                        if(remainingToUploadCount === null) {
                            const keyCount = await this.olmMachine.roomKeyCounts();
                            remainingToUploadCount = keyCount.total - keyCount.backedUp;
                        } else {
                            const keysCountInBatch = this.keysCountInBatch(request);
                            remainingToUploadCount = Math.max(remainingToUploadCount - keysCountInBatch, 0);
                        }
                        this.emit(CryptoEvent.KeyBackupSessionsRemaining, remainingToUploadCount);
                    }

WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, done here 3e7fa56

Comment on lines 434 to 436
for (const entry of Object.entries(parsedBody.rooms)) {
count += Object.keys(entry[1].sessions).length;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (const entry of Object.entries(parsedBody.rooms)) {
count += Object.keys(entry[1].sessions).length;
}
for (const entry of Object.values(parsedBody.rooms)) {
count += Object.keys(entry.sessions).length;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here 418074a

src/rust-crypto/backup.ts Outdated Show resolved Hide resolved
if (this.stopped) break;
if (remainingToUploadCount !== null) {
const keysCountInBatch = this.keysCountInBatch(request);
remainingToUploadCount = Math.max(remainingToUploadCount - keysCountInBatch, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while the backup loop runs, new megolm sessions could arrive (especially if you're backing up 400K sessions or so), so presumably memoizing the count like this could mean remainingToUploadCount gets stuck at 0 for quite a while at the end (while it backs up the additional keys that arrived). Worst case, the user might think that the progress has got stuck at zero and cancel it somehow. I wonder if it's worth spelling this edge case out at least as a comment, that the KeyBackupSessionsRemaining event may now lie and keep emitting 0 for a bit at the end of the backup loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, worth explaining. Added comment here 8dc1ff7

Co-authored-by: Richard van der Hoff <[email protected]>
src/rust-crypto/backup.ts Outdated Show resolved Hide resolved
import { defer } from "../../../src/utils";
import { RustBackupManager } from "../../../src/rust-crypto/backup";

describe("PerSessionKeyBackupDownloader", () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this... doesn't seem to be testing PerSessionKeyBackupDownloader ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd suggest renaming this file to backup.spec.ts. It's a good idea to have the test files match the files that they are testing, otherwise we end up with tests split between two files, which is very confusing.

Comment on lines 73 to 76
afterEach(() => {
fetchMock.reset();
jest.useRealTimers();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend a call to jest.restoreAllMocks here, to make sure that tests don't depend on mocks set up in other tests.

await rustBackupManager.checkKeyBackupAndEnable(false);
await jest.runAllTimersAsync();

await lastKeysCalled.promise;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we need this await? Don't think it adds anything except complexity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, was usefull before I started to use Faketimers, removed

spec/unit/rust-crypto/BackupKeyLoop.spec.ts Outdated Show resolved Hide resolved
expect(remainingEmitted[5]).toEqual(0);
});

it("Should not call expensive roomKeyCounts for small chunks", async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the size of the chunk, but the fact there is only one chunk, right?

});
});

// We want several batch of keys to check that we don't call expensive room key count several times
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but there's only one batch here?

BillCarsonFr and others added 2 commits January 22, 2024 14:22
Co-authored-by: Richard van der Hoff <[email protected]>
Co-authored-by: Richard van der Hoff <[email protected]>
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise

import { RustBackupManager } from "../../../src/rust-crypto/backup";

describe("PerSessionKeyBackupDownloader", () => {
describe("Import keys from backup", () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it exporting keys to backup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, look like I was confused by my recent work on backup import

@BillCarsonFr BillCarsonFr added this pull request to the merge queue Jan 22, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 22, 2024
@BillCarsonFr BillCarsonFr added this pull request to the merge queue Jan 22, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 22, 2024
@richvdh
Copy link
Member

richvdh commented Jan 22, 2024

@BillCarsonFr if this is failing due to flaky playwright tests please make sure that there are issues open to track the flaky tests.

@BillCarsonFr BillCarsonFr added this pull request to the merge queue Jan 22, 2024
Merged via the queue into develop with commit c993785 Jan 22, 2024
23 checks passed
@BillCarsonFr BillCarsonFr deleted the valere/element-r/backup_cache_keys_count_calls branch January 22, 2024 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants