Drop flat caches #3009

Closed
wants to merge 15 commits into from

@dapplion dapplion commented Aug 24, 2021

Motivation

CachedBeaconState duplicates much of the data already held in the state. Lodestar already uses too much memory, so it is imperative that each piece of data be represented only once.

Description

WIP


codeclimate bot commented Aug 24, 2021

Code Climate has analyzed commit 984c318 and detected 4 issues on this pull request.

Here's the issue category breakdown:

| Category | Count |
| --- | --- |
| Complexity | 1 |
| Duplication | 3 |

View more on Code Climate.

@ChainSafe ChainSafe deleted a comment from lgtm-com bot Aug 25, 2021
@ChainSafe ChainSafe deleted a comment from lgtm-com bot Aug 25, 2021
@ChainSafe ChainSafe deleted a comment from lgtm-com bot Aug 25, 2021
@ChainSafe ChainSafe deleted a comment from lgtm-com bot Aug 25, 2021

lgtm-com bot commented Aug 26, 2021

This pull request introduces 1 alert when merging b11e6e8 into 9fb1c42 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class


github-actions bot commented Aug 26, 2021

⚠️ Performance Alert ⚠️

Possible performance regression was detected for some benchmarks.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold.

| Benchmark suite | Current: c4c95f9 | Previous: 241540f | Ratio |
| --- | --- | --- | --- |
| getPubkeys - persistent - req 1000 vs - 250000 vc | 5.1941 ms/op | 22.033 us/op | 235.74 |

🚀🚀 Significant benchmark improvement detected

| Benchmark suite | Current: c4c95f9 | Previous: 241540f | Ratio |
| --- | --- | --- | --- |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 39.428 ms/op | 3.4839 s/op | 0.01 |
| phase0 processSlashings - 250000 worstcase | 6.9683 ms/op | 100.08 ms/op | 0.07 |

Full benchmark results

| Benchmark suite | Current: c4c95f9 | Previous: 241540f | Ratio |
| --- | --- | --- | --- |
| state hashTreeRoot - No change | 777.00 ns/op | 878.00 ns/op | 0.88 |
| state hashTreeRoot - 1 full validator | 776.00 ns/op | 868.00 ns/op | 0.89 |
| state hashTreeRoot - 32 full validator | 771.00 ns/op | 883.00 ns/op | 0.87 |
| state hashTreeRoot - 512 full validator | 739.00 ns/op | 873.00 ns/op | 0.85 |
| state hashTreeRoot - 1 validator.effectiveBalance | 752.00 ns/op | 866.00 ns/op | 0.87 |
| state hashTreeRoot - 32 validator.effectiveBalance | 768.00 ns/op | 848.00 ns/op | 0.91 |
| state hashTreeRoot - 512 validator.effectiveBalance | 744.00 ns/op | 878.00 ns/op | 0.85 |
| state hashTreeRoot - 1 balances | 748.00 ns/op | 868.00 ns/op | 0.86 |
| state hashTreeRoot - 32 balances | 779.00 ns/op | 868.00 ns/op | 0.90 |
| state hashTreeRoot - 512 balances | 755.00 ns/op | 851.00 ns/op | 0.89 |
| state hashTreeRoot - 250000 balances | 768.00 ns/op | 882.00 ns/op | 0.87 |
| processSlot - 1 slots | 62.649 us/op | 64.710 us/op | 0.97 |
| processSlot - 32 slots | 2.8962 ms/op | 3.1689 ms/op | 0.91 |
| getCommitteeAssignments - req 1 vs - 250000 vc | 5.1063 ms/op | 6.6336 ms/op | 0.77 |
| getCommitteeAssignments - req 100 vs - 250000 vc | 7.5203 ms/op | 9.2732 ms/op | 0.81 |
| getCommitteeAssignments - req 1000 vs - 250000 vc | 8.4459 ms/op | 9.6479 ms/op | 0.88 |
| altair processBlock - 250000 vs - 7PWei normalcase | 160.66 ms/op | 142.61 ms/op | 1.13 |
| altair processBlock - 250000 vs - 7PWei worstcase | 377.92 ms/op | 342.08 ms/op | 1.10 |
| altair processEpoch - pyrmont_e62330 | 1.6495 s/op | 747.24 ms/op | 2.21 |
| pyrmont_e62330 - altair beforeProcessEpoch | 362.20 ms/op | 123.62 ms/op | 2.93 |
| pyrmont_e62330 - altair processJustificationAndFinalization | 138.10 us/op | 132.25 us/op | 1.04 |
| pyrmont_e62330 - altair processInactivityUpdates | 76.505 ms/op | 86.294 ms/op | 0.89 |
| pyrmont_e62330 - altair processRewardsAndPenalties | 107.43 ms/op | 180.41 ms/op | 0.60 |
| pyrmont_e62330 - altair processRegistryUpdates | 27.897 us/op | 12.213 us/op | 2.28 |
| pyrmont_e62330 - altair processSlashings | 6.0700 us/op | 3.6660 us/op | 1.66 |
| pyrmont_e62330 - altair processEth1DataReset | 5.8720 us/op | 3.0810 us/op | 1.91 |
| pyrmont_e62330 - altair processEffectiveBalanceUpdates | 818.25 ms/op | 30.376 ms/op | 26.94 |
| pyrmont_e62330 - altair processSlashingsReset | 30.324 us/op | 16.377 us/op | 1.85 |
| pyrmont_e62330 - altair processRandaoMixesReset | 43.430 us/op | 23.443 us/op | 1.85 |
| pyrmont_e62330 - altair processHistoricalRootsUpdate | 7.7440 us/op | 3.8510 us/op | 2.01 |
| pyrmont_e62330 - altair processParticipationFlagUpdates | 45.611 ms/op | 53.844 ms/op | 0.85 |
| pyrmont_e62330 - altair processSyncCommitteeUpdates | 5.0740 us/op | 2.5080 us/op | 2.02 |
| pyrmont_e62330 - altair afterProcessEpoch | 202.25 ms/op | 149.46 ms/op | 1.35 |
| altair processInactivityUpdates - 250000 normalcase | 208.40 ms/op | 292.83 ms/op | 0.71 |
| altair processInactivityUpdates - 250000 worstcase | 211.27 ms/op | 290.45 ms/op | 0.73 |
| altair processParticipationFlagUpdates - 250000 anycase | 83.820 ms/op | 107.39 ms/op | 0.78 |
| altair processRewardsAndPenalties - 250000 normalcase | 190.83 ms/op | 361.20 ms/op | 0.53 |
| altair processRewardsAndPenalties - 250000 worstcase | 189.73 ms/op | 393.33 ms/op | 0.48 |
| altair processSyncCommitteeUpdates - 250000 | 417.93 ms/op | 494.75 ms/op | 0.84 |
| Tree 40 250000 create | 489.23 ms/op | 710.92 ms/op | 0.69 |
| Tree 40 250000 get(125000) | 256.37 ns/op | 1.8362 us/op | 0.14 |
| Tree 40 250000 set(125000) | 1.7247 us/op | 1.9296 us/op | 0.89 |
| Tree 40 250000 toArray() | 41.463 ms/op | 48.023 ms/op | 0.86 |
| Tree 40 250000 iterate all - toArray() + loop | 40.546 ms/op | 48.148 ms/op | 0.84 |
| Tree 40 250000 iterate all - get(i) | 105.87 ms/op | 514.92 ms/op | 0.21 |
| MutableVector 250000 create | 22.868 ms/op | 30.630 ms/op | 0.75 |
| MutableVector 250000 get(125000) | 14.214 ns/op | 17.025 ns/op | 0.83 |
| MutableVector 250000 set(125000) | 689.45 ns/op | 848.56 ns/op | 0.81 |
| MutableVector 250000 toArray() | 7.8583 ms/op | 10.788 ms/op | 0.73 |
| MutableVector 250000 iterate all - toArray() + loop | 7.8008 ms/op | 11.250 ms/op | 0.69 |
| MutableVector 250000 iterate all - get(i) | 4.2874 ms/op | 4.8219 ms/op | 0.89 |
| Array 250000 create | 6.1262 ms/op | 7.1532 ms/op | 0.86 |
| Array 250000 clone - spread | 2.7747 ms/op | 5.0036 ms/op | 0.55 |
| Array 250000 get(125000) | 0.90200 ns/op | 1.6940 ns/op | 0.53 |
| Array 250000 set(125000) | 0.95500 ns/op | 1.0100 ns/op | 0.95 |
| Array 250000 iterate all - loop | 138.58 us/op | 209.91 us/op | 0.66 |
| aggregationBits - 2048 els - readonlyValues | 247.70 us/op | 324.58 us/op | 0.76 |
| aggregationBits - 2048 els - zipIndexesInBitList | 55.685 us/op | 40.774 us/op | 1.37 |
| ssz.Root.equals | 1.6440 us/op | 1.8000 us/op | 0.91 |
| ssz.Root.equals with valueOf() | 1.8230 us/op | 2.0150 us/op | 0.90 |
| byteArrayEquals with valueOf() | 1.7250 us/op | 1.9280 us/op | 0.89 |
| phase0 processBlock - 250000 vs - 7PWei normalcase | 16.653 ms/op | 18.496 ms/op | 0.90 |
| phase0 processBlock - 250000 vs - 7PWei worstcase | 107.56 ms/op | 112.05 ms/op | 0.96 |
| phase0 afterProcessEpoch - 250000 vs - 7PWei | 325.20 ms/op | 300.73 ms/op | 1.08 |
| phase0 beforeProcessEpoch - 250000 vs - 7PWei | 934.29 ms/op | 819.64 ms/op | 1.14 |
| phase0 processEpoch - mainnet_e58758 | 1.3111 s/op | 1.2654 s/op | 1.04 |
| mainnet_e58758 - phase0 beforeProcessEpoch | 833.89 ms/op | 675.01 ms/op | 1.24 |
| mainnet_e58758 - phase0 processJustificationAndFinalization | 132.64 us/op | 130.34 us/op | 1.02 |
| mainnet_e58758 - phase0 processRewardsAndPenalties | 113.62 ms/op | 255.13 ms/op | 0.45 |
| mainnet_e58758 - phase0 processRegistryUpdates | 100.61 us/op | 143.50 us/op | 0.70 |
| mainnet_e58758 - phase0 processSlashings | 6.2150 us/op | 3.1810 us/op | 1.95 |
| mainnet_e58758 - phase0 processEth1DataReset | 5.8680 us/op | 3.1100 us/op | 1.89 |
| mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 35.007 ms/op | 32.417 ms/op | 1.08 |
| mainnet_e58758 - phase0 processSlashingsReset | 44.548 us/op | 19.907 us/op | 2.24 |
| mainnet_e58758 - phase0 processRandaoMixesReset | 55.024 us/op | 27.815 us/op | 1.98 |
| mainnet_e58758 - phase0 processHistoricalRootsUpdate | 8.5330 us/op | 3.7160 us/op | 2.30 |
| mainnet_e58758 - phase0 processParticipationRecordUpdates | 37.289 us/op | 16.688 us/op | 2.23 |
| mainnet_e58758 - phase0 afterProcessEpoch | 289.54 ms/op | 239.52 ms/op | 1.21 |
| phase0 processEffectiveBalanceUpdates - 250000 normalcase | 30.061 ms/op | 54.146 ms/op | 0.56 |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 39.428 ms/op | 3.4839 s/op | 0.01 |
| phase0 processRegistryUpdates - 250000 normalcase | 110.75 us/op | 71.990 us/op | 1.54 |
| phase0 processRegistryUpdates - 250000 badcase_full_deposits | 4.4748 ms/op | 6.3869 ms/op | 0.70 |
| phase0 processRegistryUpdates - 250000 worstcase 0.5 | 2.9396 s/op | 3.1595 s/op | 0.93 |
| phase0 getAttestationDeltas - 250000 normalcase | 64.208 ms/op | 59.576 ms/op | 1.08 |
| phase0 getAttestationDeltas - 250000 worstcase | 63.557 ms/op | 58.934 ms/op | 1.08 |
| phase0 processSlashings - 250000 worstcase | 6.9683 ms/op | 100.08 ms/op | 0.07 |
| shuffle list - 16384 els | 13.861 ms/op | 16.521 ms/op | 0.84 |
| shuffle list - 250000 els | 198.80 ms/op | 238.73 ms/op | 0.83 |
| getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 2.6477 ms/op | 2.4955 ms/op | 1.06 |
| getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 8.4779 ms/op | 547.73 us/op | 15.48 |
| getPubkeys - persistent - req 1000 vs - 250000 vc | 5.1941 ms/op | 22.033 us/op | 235.74 |
| BLS verify - blst-native | 2.4524 ms/op | 2.3246 ms/op | 1.06 |
| BLS verifyMultipleSignatures 3 - blst-native | 5.3278 ms/op | 4.7670 ms/op | 1.12 |
| BLS verifyMultipleSignatures 8 - blst-native | 11.353 ms/op | 10.294 ms/op | 1.10 |
| BLS verifyMultipleSignatures 32 - blst-native | 39.234 ms/op | 37.278 ms/op | 1.05 |
| BLS aggregatePubkeys 32 - blst-native | 52.295 us/op | 50.523 us/op | 1.04 |
| BLS aggregatePubkeys 128 - blst-native | 221.07 us/op | 194.79 us/op | 1.13 |
| getAttestationsForBlock | 87.324 ms/op | 96.716 ms/op | 0.90 |
| CheckpointStateCache - add get delete | 13.841 us/op | 25.551 us/op | 0.54 |
| validate gossip signedAggregateAndProof - struct | 5.9145 ms/op | 5.5563 ms/op | 1.06 |
| validate gossip signedAggregateAndProof - treeBacked | 6.1840 ms/op | 5.5206 ms/op | 1.12 |
| validate gossip attestation - struct | 2.6394 ms/op | 2.6041 ms/op | 1.01 |
| validate gossip attestation - treeBacked | 2.8565 ms/op | 2.6225 ms/op | 1.09 |

by benchmarkbot/action
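
A note on reading the Ratio column: the reported values are consistent with the current time divided by the previous time, after converting both readings to the same unit, so a ratio above 1 means the current commit is slower. The sketch below reproduces a few of the reported ratios. It is only an illustration: the helper names `parseReading` and `ratio` are hypothetical and are not taken from the benchmark action, and the alert threshold is not stated in this thread, so no threshold is assumed here.

```typescript
// Conversion factors from each reported unit to nanoseconds per operation.
const NS_PER_UNIT: Record<string, number> = {
  "ns/op": 1,
  "us/op": 1e3,
  "ms/op": 1e6,
  "s/op": 1e9,
};

// Hypothetical helper: parse a reading such as "5.1941 ms/op" into ns/op.
function parseReading(reading: string): number {
  const [value, unit] = reading.trim().split(/\s+/);
  const factor = NS_PER_UNIT[unit];
  if (factor === undefined) {
    throw new Error(`Unknown unit in reading: ${reading}`);
  }
  return Number(value) * factor;
}

// Hypothetical helper: ratio = current / previous.
// Above 1 means the current commit is slower; below 1 means it is faster.
function ratio(current: string, previous: string): number {
  return parseReading(current) / parseReading(previous);
}

// Readings copied from the tables in this comment.
const regression = ratio("5.1941 ms/op", "22.033 us/op");
const improvement = ratio("39.428 ms/op", "3.4839 s/op");
const sameUnit = ratio("777.00 ns/op", "878.00 ns/op");

console.log(regression.toFixed(2), improvement.toFixed(2), sameUnit.toFixed(2));
```

Normalizing units first matters because the tables mix ns, us, ms and s readings within a single row.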

Simplify command

Fix PR issues

Use .rc

Add more accurate epoch perf tests

Re-arrange epoch tests with a real state

Fix perf type issues

Ensure state is valid

Review all spec tests

Fix hash computations

Fix processRewardsAndPenalties

Fix perf titles

Re-org imports

Fix get state script

Perf test hashing

Fix altair step

Review stfn caches and performance

Add SLOW CODE comments and whitespace

Cache effectiveBalances

Process altair attestations in batch

Don't keep withdrawalCredentials and pubkey in validators flat

Move flat arrays to epoch process only

Add exitQueue churn cache

Cache churnLimit

Optimize beacon state transition

Move exitQueue cached values to epochCtx

Improve error message in processAttestation

Add comments to processBlock functions

Allow to set validators and balances

Convert from struct to tree properly

Silence effectiveBalances type warnings

Add data representation tradeoffs

Add more docs

Fix type errors

More fixes

Use better state assertion in spec tests

Re-add delete condition

Fix type issues

Update perf tests

Fix lint warnings
@philknows philknows added the scope-memory label (Issues to reduce and improve memory usage.) Nov 23, 2021
@philknows philknows mentioned this pull request Nov 29, 2021
10 tasks
@philknows philknows added the prio-medium label (Resolve this some time soon (tm).) Nov 29, 2021

Replaced by #3760

@dapplion dapplion closed this Feb 15, 2022
@dapplion dapplion deleted the dapplion/drop-flat-caches branch August 6, 2022 08:11