Refactor data structure representing account candidates for cleaning #2296

dmakarov · 2024-07-26T01:58:17Z

Problem

AccountsDB::clean_accounts makes unnecessary copies of large number of
pubkeys and accompanying information to find and operate on the
accounts that can be deleted from the accounts index.

Summary of Changes

With this change the candidates for deletion are organized in a single data structure
with necessary information being updated in-place, thus reducing
memory requirements of the cleaning procedure.

Fixes #

accounts-db/src/accounts_db.rs

brooksprumo · 2024-07-26T20:05:12Z

accounts-db/src/accounts_db.rs

+                        store.accounts.scan_pubkeys(|key| {
+                            let index = self.accounts_index.bin_calculator.bin_from_pubkey(key);
+                            let mut pubkeys_bin = pubkeys[index].write().unwrap();
+                            pubkeys_bin.insert(*key, CleaningInfo::default());


We could also create per-thread binned pubkeys and then reduce that at the end. It may reduce lock contention.

I think what we have here is likely good as a v0. The map-reduce suggestion could be done in a subsequent PR.

brooksprumo · 2024-07-26T20:07:17Z

accounts-db/src/accounts_db.rs

+                    let index = self.accounts_index.bin_calculator.bin_from_pubkey(key);
+                    let mut pubkeys_bin = pubkeys[index].write().unwrap();
+                    pubkeys_bin.insert(*key, CleaningInfo::default());


And the same idea here too.

accounts-db/src/accounts_db.rs

jeffwashington · 2024-07-26T20:42:31Z

accounts-db/src/accounts_db.rs

@@ -2742,86 +2755,95 @@ impl AccountsDb {
    /// 1. one of the pubkeys in the store has account info to a store whose store count is not going to zero
    /// 2. a pubkey we were planning to remove is not removing all stores that contain the account
    fn calc_delete_dependencies(
-        purges: &HashMap<Pubkey, (SlotList<AccountInfo>, RefCount)>,
+        purges: &Vec<RwLock<HashMap<Pubkey, CleaningInfo>>>,


we should drop the RwLock at some point when it is no longer necessary, probably:
let purges = purges.into_iter().map(|x| x.into_inner().unwrap()).collect::<Vec<_>>()
This simplifies fns like this one.

we could also look for places where we are not executing in parallel where perhaps we should be. and maybe the rwlocks help us, but probably not.

Maybe if we separate (key, value) into separate vectors as Haoran suggested we don't need RwLock at all.

we should drop the RwLock at some point when it is no longer necessary, probably: let purges = purges.into_iter().map(|x| x.into_inner().unwrap()).collect::<Vec<_>>() This simplifies fns like this one.

I think it only works like this

purges.into_iter().map(|x| { let y = x.read().unwrap(); y.clone() }).collect::<Vec<_>>()

because x is a shared reference and RwLock::into_inner cannot move out of a shared reference.

this works and is reasonable (doesn't copy the hashmaps):

let purges = candidates.into_iter().map(|x| std::mem::take(&mut *x.write().unwrap())).collect::<Vec<_>>();

I tried Box::into_inner(candidates), but it complained about not knowing the size at compile time. I don't have a lot of experience with Box. I'm only saying that at some point we can get rid of the RwLock in the candidates. They were convenient early but no downstream code uses them at the moment. It only adds complexity and the possibility of deadlocks and poisoned rwlocks...

This isn't necessary for this pr, I don't think. And, I think the rwlocks could enable better threading/higher performance in the future.

fwiw, the separating keys/values is an interesting idea. I'd have to think more. I think what we have is fine for now.

accounts-db/src/accounts_db.rs

jeffwashington · 2024-07-26T20:48:17Z

looks like the right direction to me! I see my comments overlapped brooks. I knew he would have a lot to say.

dmakarov · 2024-07-26T20:50:58Z

I really appreciate the feedback, and will make the necessary changes. Thank you!

jeffwashington · 2024-08-06T19:01:06Z

fyi, I pulled this into the editor and worked through it. There was 1 more place we needed to filter. Then, there were several places with code duplication and several places where cargo fmt is apparently not correct. I made all these changes in individual commits and pushed them.

dmakarov · 2024-08-06T21:35:19Z

What should I do with this PR? Is more work required within this PR or other issues can be addressed in follow-up work?

accounts-db/src/accounts_db.rs

brooksprumo · 2024-08-06T21:38:05Z

accounts-db/src/accounts_db.rs

+                if slot_list.is_empty() {
+                    continue; // seems simpler than filtering. `candidates` contains all the pubkeys we original started with
                }


Is this a correctness or a performance change?

correctness. we previously would have not gotten in this loop for any pubkey that was not a zero lamport account. I don't know what happens if we have no slot list here. I don't want to find out.

but please verify my assertion!

Yeah. Nice catch!

jeffwashington · 2024-08-06T21:45:40Z

What should I do with this PR? Is more work required within this PR or other issues can be addressed in follow-up work?

once we test it once more, I think this pr is good. and once brooks/haoran review (and you look at it again). Bugs here can be very tricky, so this is a great exercise in really digging into this code.

brooksprumo · 2024-08-06T21:53:10Z

I'm concerned we're changing behavior in addition to the refactoring/reshaping of the data types. If possible, I'd find it preferable to have separate PRs for changing behavior. Maybe these changes are due to the refactoring/reshaping. I'll do another more substantive review.

accounts-db/src/accounts_db.rs

HaoranYi

I went over the PR again. It looks good to me.
Just one nit for clean up, which can be done either in this PR or in a follow-up PR.

jeffwashington · 2024-08-07T14:37:21Z

accounts-db/src/accounts_db.rs

@@ -3130,10 +3129,6 @@ impl AccountsDb {

        timings.delta_key_count = Self::count_pubkeys(&candidates);

-        let mut hashset_to_vec = Measure::start("flat_map");
-        hashset_to_vec.stop();
-        timings.hashset_to_vec_us += hashset_to_vec.as_us();


ha. it was never dumped out anyway.

jeffwashington · 2024-08-07T15:12:19Z

This morning, i started 2 mnb and 1 testnet validator with this change rebased to master.

accounts-db/src/accounts_db.rs

dmakarov · 2024-08-07T21:59:40Z

I tried to address all recent comments.

brooksprumo

Thank you for answering all my questions and addressing all my comments!

jeffwashington

lgtm. Thanks for working through all the issues. We want future debugging to be easier, not harder. I think we've all learned new things about clean while also making it use less memory and run faster.
I pulled down the full change again and verified every line.
I have 2 mnb and 1 testnet validator running on this. All are running fine. We verified this on a ledger that was ooming at startup. No longer ooming.

dmakarov force-pushed the oom branch 4 times, most recently from 9f4a90e to 7c5cd45 Compare July 26, 2024 12:47

dmakarov changed the title ~~Fix OOM~~ Refactor data structure representing account candidates for cleaning Jul 26, 2024

dmakarov force-pushed the oom branch 9 times, most recently from 8a14193 to 77b9935 Compare July 26, 2024 19:42

dmakarov requested review from brooksprumo, jeffwashington and HaoranYi July 26, 2024 19:43

jeffwashington reviewed Jul 26, 2024

View reviewed changes

accounts-db/src/accounts_db.rs Outdated Show resolved Hide resolved

jeffwashington reviewed Jul 26, 2024

View reviewed changes

accounts-db/src/accounts_db.rs Outdated Show resolved Hide resolved

brooksprumo reviewed Jul 26, 2024

View reviewed changes

jeffwashington reviewed Jul 26, 2024

View reviewed changes

accounts-db/src/accounts_db.rs Show resolved Hide resolved