
Bugfix: trimmed metadata hash comparison fails in is_codegen_valid_for #1306

Merged
merged 7 commits into master from tadeohepperle/bugfix-trimmed-metadata-enums on Dec 6, 2023

Conversation

tadeohepperle
Contributor

Fixes #1286

What is the bug:

Metadata can be trimmed with Metadata::retain(..). This is done by the CLI tool; e.g. subxt metadata --url wss://rpc.polkadot.io --pallets "ElectionProviderMultiPhase,System" retains only the specified pallets.
When comparing the hash of trimmed metadata (metadata that contains only a limited set of pallets/runtime APIs) with the hash of the original metadata computed by a hasher configured via only_these_pallets / only_these_runtime_apis, the hashes were different. This comparison is made in the generated is_codegen_valid_for function.
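
As a rough sketch of the failing check (hedged: only_these_pallets and MetadataHasher::new(..).hash() are named in this PR and its diff, but the exact signatures used here are assumptions, not the real API):

use subxt_metadata::{Metadata, MetadataHasher};

// Returns true if the two ways of hashing agree. Before this fix they did not.
fn hashes_agree(full_metadata: &Metadata, trimmed_metadata: &Metadata) -> bool {
    let pallets = ["ElectionProviderMultiPhase", "System"];

    // Hash of the full metadata, restricted to the given pallets via the hasher:
    let restricted_hash = MetadataHasher::new(full_metadata)
        .only_these_pallets(&pallets)
        .hash();

    // Hash of metadata that was trimmed up front (e.g. by the CLI via Metadata::retain):
    let trimmed_hash = MetadataHasher::new(trimmed_metadata).hash();

    // Before this fix this returned false, so the generated is_codegen_valid_for
    // check rejected metadata trimmed with `subxt metadata --pallets ...`.
    restricted_hash == trimmed_hash
}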

Reason for the bug

In Metadata::retain(..), three things happen:

  • the unneeded pallets/APIs are removed at the top level
  • all types that were used only by those unneeded pallets/APIs are removed as well
  • the top-level enums (call, event and error) are Variant types; the variants relating to unneeded pallets/APIs are removed. This happens in retain_pallets_in_runtime_outer_types() (see the sketch after this list).
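
The sketch below illustrates that last step conceptually (hedged: this is not the actual retain code; it assumes scale-info's public fields, and keep_pallet is a hypothetical predicate):

use scale_info::{PortableRegistry, TypeDef};

// Strip the variants of one top-level enum (call/event/error) down to the pallets
// we keep. The variant names of these enums correspond to pallet names.
fn strip_outer_enum_variants(
    registry: &mut PortableRegistry,
    outer_enum_id: u32,
    keep_pallet: impl Fn(&str) -> bool, // hypothetical predicate
) {
    let Some(ty) = registry.types.get_mut(outer_enum_id as usize) else {
        return;
    };
    if let TypeDef::Variant(variant) = &mut ty.ty.type_def {
        variant.variants.retain(|v| keep_pallet(&v.name));
    }
}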

The last point is what caused the bug:
When specifying only_these_pallets for a MetadataHasher, the variant types of the top-level enums (call, event and error) were modified to discard unneeded pallets before the hash was calculated. However, the same types can also appear in other places in the metadata. In @niklasad1's case with the staking-miner, the "Events" storage entry of the "System" pallet had the type Vec<Event>, where Event is the top-level event type. We did not go through all types of the metadata and check, for each one, whether it is one of these top-level enums that should be modified accordingly before its hash is computed.
==> This caused the bug.

Solution

To solve the bug, we need to change how specific_pallets is used in the MetadataHasher::hash(..) function.
We compute the hashes of the 3 outer enums first, trimming them down to only the variants that refer to the pallets in specific_pallets. In the rest of the function we often compute the hash of some type. Every time this happens, we pass the type ids and hashes of the 3 outer enums down into these recursive hash computations. Whenever an outer enum type is encountered, we simply use the hash computed at the start.
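
A condensed sketch of that flow (hedged: the OuterEnumHashes struct mirrors the one added in this PR's diff, but the HASH_LEN value and the recursion internals here are simplified placeholders):

const HASH_LEN: usize = 32; // assumed length for this sketch; the real constant lives in the hashing module
type Hash = [u8; HASH_LEN];

// Precomputed hashes of the three outer enums, keyed by their type ids.
struct OuterEnumHashes {
    call_hash: (u32, Hash),
    error_hash: (u32, Hash),
    event_hash: (u32, Hash),
}

impl OuterEnumHashes {
    // If `id` is one of the outer enums, return its precomputed hash.
    fn resolve(&self, id: u32) -> Option<Hash> {
        [self.call_hash, self.error_hash, self.event_hash]
            .into_iter()
            .find(|(enum_id, _)| *enum_id == id)
            .map(|(_, hash)| hash)
    }
}

// Every recursive hashing step consults the precomputed hashes first, so the
// outer enums hash identically no matter where they appear in the metadata.
fn get_type_hash_recurse(id: u32, outer_enum_hashes: &OuterEnumHashes) -> Hash {
    if let Some(hash) = outer_enum_hashes.resolve(id) {
        return hash;
    }
    // ...otherwise resolve the type from the registry and hash it as before,
    // passing `outer_enum_hashes` down into every nested call (omitted here).
    todo!()
}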

Other change: I severely limited all caching (used to prevent infinite recursion). A new cache is now created every time we resolve the hash of some type. This means less cache reuse than before, so performance is somewhat worse but the results are more consistent.

Alternative solution (not implemented)

We could also just clone and trim the metadata whenever MetadataHasher::hash() is called with specific pallets/APIs set.
This is probably less performant, but it might be simpler and more consistent in the end.
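
A hedged sketch of what that alternative could look like (the retain predicate signature and Metadata's Clone impl are assumptions based on the Metadata::retain(..) mention above; the MetadataHasher usage mirrors the test in the diff below):

use subxt_metadata::{Metadata, MetadataHasher};

// Trim a clone of the metadata first, then hash it with no special outer-enum handling.
fn hash_of_trimmed(metadata: &Metadata, pallets: &[&str]) -> [u8; 32] {
    // Assumption: 32-byte hash, matching [u8; HASH_LEN] in the hashing module.
    let mut trimmed = metadata.clone();
    // Keep only the requested pallets; keep all runtime APIs.
    trimmed.retain(|pallet| pallets.contains(&pallet), |_api| true);
    MetadataHasher::new(&trimmed).hash()
}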

Testing

Added a test to confirm this works and manually tested it for the example Niklas provided.

* subxt: Remove unstable lints that cause compile warnings

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Switch to workspace lints

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Fix codec package at root level

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Move profiles to the root level

Signed-off-by: Alexandru Vasile <[email protected]>

* Fix lightclient and metadata crates

Signed-off-by: Alexandru Vasile <[email protected]>

* Revert "cargo: Fix codec package at root level"

This reverts commit cdf9e16.

* Fix complexity clippy

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Remove lints to be replaced by `cargo machete`

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Remove unused dependencies (detected by machete)

Signed-off-by: Alexandru Vasile <[email protected]>

* ci: Add machete step

Signed-off-by: Alexandru Vasile <[email protected]>

* cargo: Bump rust version

Signed-off-by: Alexandru Vasile <[email protected]>

* ci: Rename machete step

Signed-off-by: Alexandru Vasile <[email protected]>

* ci: Rename cargo machete step

Signed-off-by: Alexandru Vasile <[email protected]>

---------

Signed-off-by: Alexandru Vasile <[email protected]>
@tadeohepperle tadeohepperle marked this pull request as ready for review December 4, 2023 16:51
@tadeohepperle tadeohepperle requested a review from a team as a code owner December 4, 2023 16:51
@@ -75,6 +75,7 @@ fn get_field_hash(
registry: &PortableRegistry,
field: &Field<PortableForm>,
cache: &mut HashMap<u32, CachedHash>,
precomputed: Option<&HashMap<u32, [u8; HASH_LEN]>>,
Member

@niklasad1 Dec 5, 2023

I find it a bit hard to distinguish between "cache" and "precomputed" here; to me those seem very similar, and I wonder if it makes sense to merge them into a single type instead?

It looks like cache is an empty hashmap in plenty of places and precomputed is an Option... it would be nice to unify that if possible 🙏

Collaborator

I agree that it's confusing, although I wouldn't merge it with the cache (see comment below :))

Member

Also, a type alias for the hash would be nice, to avoid repeating [u8; HASH_LEN]

@@ -207,11 +215,35 @@ impl CachedHash {
}

/// Obtain the hash representation of a `scale_info::Type` identified by id.
pub fn get_type_hash(
pub fn get_type_hash(registry: &PortableRegistry, id: u32) -> [u8; HASH_LEN] {
// Dev-Note: If you have trimmed metadata and want to propagate the hashes of the outer enums,
Member

This comment could be moved up into the function docs themselves

)
}

bytes
}

/// Obtain the hash representation of the `frame_metadata::v15::OuterEnums`.
/// Also returns a HashMap containing the individual hashes of the 3 otuer enums.
Member

Suggested change
/// Also returns a HashMap containing the individual hashes of the 3 otuer enums.
/// Also returns a HashMap containing the individual hashes of the 3 outer enums.

registry,
signed_extension.additional_ty,
outer_enums_hashes,
),
)
}

bytes
}

/// Obtain the hash representation of the `frame_metadata::v15::OuterEnums`.
Member

I would prefer:

/// Obtain the hash representation of the `frame_metadata::v15::OuterEnums` 
/// and individual hashes of the outer enums.
///
/// The individual hashes may be used to calculate recursive hash computations
///

@jsdw
Collaborator

jsdw commented Dec 5, 2023

Generally looks good to me, but just a few bits:

Right now we have get_type_hash, get_type_hash_with_precomputed and get_type_hash_recurse. My ideal state would be that there weren't so many ways to get a type hash, but failing that, the short named one should be the default IMO, so I'd rename get_type_hash => get_type_hash_without_precomputed and get_type_hash_with_precomputed => get_type_hash.

As Niklas said above, the precomputed Option<HashMap<..>> sitting next to the cache is confusing I think. I wouldn't merge it with the cache (we want to make a cache only once and not provide a way to create a cache such that it'll be reused if possible). What I'd do is make a new type (which also doesn't need to be a hashmap; we could optimise it since there are exactly 3 entries). Maybe something a bit like:

struct TopLevelEnumHashes(Option<[(u8, [u8; HASH_LEN]); 3]>);

impl TopLevelEnumHashes {
    pub fn new(
        // or take in metadata and this struct works out the hashes from there; that's prob cleaner...
        error_hash: (u8, [u8; HASH_LEN]),
        event_hash: (u8, [u8; HASH_LEN]),
        call_hash: (u8, [u8; HASH_LEN])
    ) -> Self { .. }

    pub fn empty() -> Self {
        Self(None)
    }

    pub fn get(&self, idx: u8) -> Option<[u8; HASH_LEN]> {
        self.0.and_then(|hashes|
            hashes.iter().find(|(i, _hash)| *i == idx).map(|(_i, hash)| *hash)
        )
    }
}

Then it's clear in all of the function signatures that use it that it's different from the cache, and if you had an empty() version of it then you could maybe scrap the niche get_type_hash_without_precomputed and in those cases just pass empty in to them.

@jsdw
Collaborator

jsdw commented Dec 5, 2023

Side note: weird test failure! I hope it doesn't relate to my unstable backend submit_extrinsic change! Hopefully it will pass on a rerun, but we should keep a close eye on that!

Comment on lines 311 to 318
/// Hash representations of the `frame_metadata::v15::OuterEnums`.
pub struct OuterEnumHashes {
call_hash: (u32, Hash),
error_hash: (u32, Hash),
event_hash: (u32, Hash),
}

impl OuterEnumHashes {
Collaborator

Small thing; perhaps move this to a separate outer_enum_hashes.rs file, just so that the "pub"s are respected :)

Looks great otherwise though!

.hash();
let hash_trimmed = MetadataHasher::new(&trimmed_metadata).hash();

assert_eq!(hash, hash_trimmed);
Collaborator

Would this test have failed before the changes in this PR? I thought you'd need some recursive types or something to make it fail.

Contributor Author

This test would have failed before the PR. The reason it would have failed is that the metadata constructed here contains a Vec<Event> storage entry (where Event is one of the outer enums at the same time).

The recursive types issue we talked about is actually kind of separate from this issue. It only happened to surface in the solution I used in the first iteration of this PR (which I have now closed): #1301

So hopefully this PR solves both issues at once. We have always had a test for the recursive types, but it only passed because it did not cover critical cases, so it just happened to be a bad test.

Collaborator

Aah I see, gotcha! Is there anything we should add to the recursive test to cover the fix here too in that case? :D

outer_enum_hashes: &OuterEnumHashes,
) -> Hash {
// If the type is part of `precomputed` hashes, return the precomputed hash instead:

Member

Suggested change

) -> [u8; HASH_LEN] {
outer_enum_hashes: &OuterEnumHashes,
) -> Hash {
// If the type is part of `precomputed` hashes, return the precomputed hash instead:
Member

perhaps tweak this comment now that "precomputed hashes" are called "outer_enum_hashes"?

Member

@niklasad1 left a comment

Looks great now

@jsdw
Collaborator

jsdw commented Dec 5, 2023

Almost ready to merge I think; a couple of small comments from Niklas to check out, move the outer_enum_hashes to a new file, and confirm whether the test actually failed before or not, and then it's all good from me :)

}

impl OuterEnumHashes {
/// Constructs new `OuterEnumHashes` from metadata. If `only_these_variants` is set, the enums are stripped down to only these variants, before their hashes are calculated.
Collaborator

nit: could break this doc comment over multiple lines

@tadeohepperle
Contributor Author

Thank you guys for reviewing! Hmm, interestingly the Unstable Backend test timed out again after 6h.

@jsdw
Collaborator

jsdw commented Dec 6, 2023

Thank you guys for reviewing! Hmm, interestingly the Unstable Backend test timed out again after 6h.

I'm going to look into that now a bit and see if I can find out what's going on there! It may be a more general issue that we'd rather not have in a release!

Collaborator

@jsdw left a comment

Good stuff!

Collaborator

Personally I'd have just had utils/outer_enum_hashes rather than the extra folder but it's all good!

@jsdw jsdw merged commit c3e02e7 into master Dec 6, 2023
10 of 11 checks passed
@jsdw jsdw deleted the tadeohepperle/bugfix-trimmed-metadata-enums branch December 6, 2023 15:41
@jsdw jsdw mentioned this pull request Dec 7, 2023
Successfully merging this pull request may close these issues:

codegen: is_codegen_valid_for doesn't work as expected for "trimmed metadata"