Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage-manager): replication supports wildcard updates #1499

Merged
merged 7 commits into from
Oct 11, 2024

Conversation

J-Loudet
Copy link
Contributor

@J-Loudet J-Loudet commented Oct 2, 2024

This PR adds support for Wildcard updates (be it deletion or addition) when replicating storage.

⚠️ Commits are meant to be reviewed individually (once the PR is no longer a draft).
⚠️ This PR is meant to be rebased not squashed.

Copy link

github-actions bot commented Oct 2, 2024

PR missing one of the required labels: {'breaking-change', 'bug', 'internal', 'new feature', 'dependencies', 'documentation', 'enhancement'}

@J-Loudet J-Loudet added the new feature Something new is needed label Oct 2, 2024
@J-Loudet J-Loudet force-pushed the feat/storage-manager/replication-support-wildcard-update branch 11 times, most recently from bd6bf44 to baaffcc Compare October 7, 2024 13:47
@J-Loudet J-Loudet marked this pull request as ready for review October 7, 2024 13:48
J-Loudet added a commit that referenced this pull request Oct 8, 2024
- [ ] Unit test: changes in #1499 will simplify adding Wildcard Updates

The previous implementation was always performing a `get` on the Storage
when checking if a received publication could be overridden by a
Wildcard Update.

This commit changes that behaviour and removes the call to the
Storage. In addition to potentially creating more network exchanges,
this was also generating "warning" logs during the alignment process
when Wildcard Updates are involved.

* plugins/zenoh-plugin-storage-manager/src/storages_mgt/mod.rs
* plugins/zenoh-plugin-storage-manager/src/storages_mgt/service.rs
* plugins/zenoh-plugin-storage-manager/src/storages_mgt/tests/service.tests.rs

Signed-off-by: Julien Loudet <[email protected]>
J-Loudet added a commit that referenced this pull request Oct 8, 2024
- [ ] Unit test: changes in #1499 will simplify adding Wildcard Updates

The previous implementation was always performing a `get` on the Storage
when checking if a received publication could be overridden by a
Wildcard Update.

This commit changes that behaviour and removes the call to the
Storage. In addition to potentially creating more network exchanges,
this was also generating "warning" logs during the alignment process
when Wildcard Updates are involved.

* plugins/zenoh-plugin-storage-manager/src/storages_mgt/mod.rs
* plugins/zenoh-plugin-storage-manager/src/storages_mgt/service.rs
* plugins/zenoh-plugin-storage-manager/src/storages_mgt/tests/service.tests.rs

Signed-off-by: Julien Loudet <[email protected]>
Comment on lines +295 to +296
"Failed to retrieve data associated to key < {:?} >: {e:?}",
event_to_retrieve.key_expr()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log currently looks something like

2024-10-08T14:58:30.624185Z ERROR tokio-runtime-worker ThreadId(02) zenoh_plugin_storage_manager::replication::core::aligner_query: Failed to retrieve data associated to key < Some("wildcard/delete/a/1") >: "Key Some(\"wildcard/delete/a/1\") is not present"

Maybe we should unwrap if possible, for a cleaner error message.

Copy link
Contributor

@oteffahi oteffahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should fix the issue we found during our in-person review: the "reanimation of a key" if it was deleted and followed by a wildcard put that matches it.

@J-Loudet J-Loudet force-pushed the feat/storage-manager/replication-support-wildcard-update branch 2 times, most recently from 3e35dac to 902f984 Compare October 10, 2024 13:35
This change makes clear that we are using Zenoh's Result.

* plugins/zenoh-plugin-storage-manager/src/replication/log.rs: renamed
  `Result` to `ZResult`.

Signed-off-by: Julien Loudet <[email protected]>
This commit changes the signature of the `prefix` function to accept
`Option` for both the prefix and the key expression to
prefix. Conversely, this function now returns an error if both are set
to `None`.

This change is in anticipation of the support for Wildcard updates when
replicating Storage: in order to check if a Wildcard should be applied
on a key expression, we need to `prefix` the key. This change avoids
repeating the checks and raising this error at all the call sites.

* plugins/zenoh-plugin-storage-manager/src/lib.rs: changed the signature
  of the `prefix` function to accept two `Option` and return a `Result`.
* plugins/zenoh-plugin-storage-manager/src/storages_mgt/service.rs:
  updated the call to `prefix`.

Signed-off-by: Julien Loudet <[email protected]>
This commit changes the data structure used to store the `SubInterval`s
within an `Interval` from a `HashMap` to a `BTreeMap`.

The main purpose of this change is to diminish the number of operations
required to search for events: given indexes for `Interval` and
`SubInterval`, a wildcard update or delete need to be applied
"backward", i.e. on any event that belongs in an `Interval` of lower or
equal index (and in the `Interval` of equal index, in any `SubInterval`
with a lower or equal index). As a `BTreeMap` stores its entries in
order, this allows us to stop considering `Interval` / `SubInterval` the
moment the index is greater than that of the considered event.

* plugins/zenoh-plugin-storage-manager/src/replication/classification.rs:
  changed, in the `Interval` structure, the data structure used to store
  the `sub_intervals` to a `BTreeMap`.

Signed-off-by: Julien Loudet <[email protected]>
The purpose of this change is to ensure that we don't manipulate an
Interval or SubInterval without updating its Fingerprint (a mistake I
was able to do…).

Hence, accessors were added to these two structures.

* plugins/zenoh-plugin-storage-manager/src/replication/classification.rs:
  - Make the inner fields of the `Interval` and `SubInterval` private.
  - Added accessors for both.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_query.rs:
  used the accessors.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_reply.rs:
  used the accessors.

Signed-off-by: Julien Loudet <[email protected]>
* plugins/zenoh-plugin-storage-manager/src/replication/core.rs:
  - Log the `DigestDiff` when a misalignment is detected.
  - Removed the log indicated that an alignment query is received as it
    is redundant with later logs.

* plugins/zenoh-plugin-storage-manager/src/replication/log.rs: added a
  trace when an event is added to the Replication Log.

* plugins/zenoh-plugin-storage-manager/src/storages_mgt/service.rs:
  - Only print the key expression of the sample being processed.
  - Remove the redundant log indicating the the sample is being
    processed.

Signed-off-by: Julien Loudet <[email protected]>
This commit refactors, renames and changes the method `reply_events` to
`reply_event_retrieval`: it now only sends a single
`AlignmentReply::Retrieval` to the Replica that requested data.

The code of the method was also reworked to make it clearer and less
error-prone: there are no `continue;` statement in the middle which
makes the flow less "jumpy".

These changes prepare the introduction of the Wildcard Update.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_query.rs:
  refactor, rename and change the method `reply_events` to
  `reply_event_retrieval`.

Signed-off-by: Julien Loudet <[email protected]>
@J-Loudet J-Loudet force-pushed the feat/storage-manager/replication-support-wildcard-update branch 3 times, most recently from 76f46cd to 300b7ab Compare October 10, 2024 16:14
Copy link
Contributor

@oteffahi oteffahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unnecessary logs. Aside from that it LGTM.

@J-Loudet J-Loudet force-pushed the feat/storage-manager/replication-support-wildcard-update branch from 300b7ab to 39b055b Compare October 10, 2024 16:17
This commit adds support for Wildcard Updates (be it Delete or Put) when
aligning replicated Storage.

To make this feature possible the following changes were made:

- A new enumeration `Action` was introduced. This enumeration builds on
  top of `SampleKind` and adds two variants: `WildcardPut` and
  `WildcardDelete`, each of these variants contains their full key
  expression.
  We track the full key expression to not have to deal with `Option` and
  the `strip_prefix`: there are cases where the `strip_prefix` cannot be
  removed from a Wildcard Update — yet the Wildcard Update should be
  recorded.
  For instance, if a storage subscribes to "test/replication/**" and has
  its strip prefix set to "test/replication", a delete on "test/**"
  should be recorded while the `strip_prefix` cannot be removed.

- The `Event` and `EventMetadata` structures were reworked to avoid
  repetitions and to prevent modifications without updating the
  Fingerprint.

- The keys of the Bloom Filter, `LogLatest` and `LatestUpdates` was
  changed from `Option<OwnedKeyExpr>` to the new structure
  `LogLatestKey`. This is because we need to remember, for Wildcard
  Updates, the last Put and the last Delete on the exact same key
  expression. This modification impacted many parts of the code base,
  the signature of many functions that were accepting a key expression
  and a Timestamp now requires an `Event` such that we can generate the
  associated `LogLatestKey` -- through the dedicated method `log_key()`.

- In addition to the change above, and to still support keeping track of
  the last Put and last Delete on the same Wildcard Update, the
  `tombstones` and `wildcard_updates` fields of the `StorageService`
  were, respectively, renamed `wildcard_deletes` and
  `wildcard_puts`. The purpose of the former `tombstones` was changed to
  keep track of the Wildcard Delete.
  As a consequence the `is_deleted` method was removed. This futuristic
  deletion check will however still be performed, by checking the
  `wildcard_deletes`.

- As a consequence of the change above, the logic of the method
  `overriding_wild_update` was completely reworked: the check for an
  overriding Wildcard Delete is different to the check for an overriding
  Wildcard Put. Plus, they don't apply to all `Action`.

- The `LogLatest`, `Interval` and `SubInterval` structures now have the
  the method `remove_events_overridden_by_wildcard_update`: this method
  removes all the Events that are impacted by a Wildcard Update and
  updates the Fingerprint of the impacted Interval / SubInterval.

- The core `Replication` structure now keeps an `Arc` over the
  `StorageService` such that it can add Wildcard Updates to it when it
  receives some.

- The visibility of the following field / structures were changed to
  `pub(crate)` such that the Replication can access them:
  - The `Update` structure that contains the payload and timestamp
    associated with a Wildcard Update.
  - The `configuration`, `wildcard_puts`, and `wildcard_deletes` fields
    of the `StorageService` structure.

- The signature of the method `register_wildcard_update` was changed: it
  no longer extracts all the metadata from the `sample` argument as,
  when called from the Replication Aligner, these values are not
  "correct" (they are linked to a reply and not to the contained
  payload).

* plugins/zenoh-plugin-storage-manager/src/replication/classification.rs:
  - Added the method `remove_events_overridden_by_wildcard_update` to
    the `Interval` and `SubInterval` structures.
  - Added the method `remove_event` to the `Interval` and `SubInterval`
    structures.
  - Updated the keys of the Hash-based structures to use the
    `LogLatestKey` instead of an `Option<OwnedKeyExpr>`.
  - Updated methods to pass an `&EventMetadata` or `&Event` to be able
    to obtain their `LogLatestKey`.

* plugins/zenoh-plugin-storage-manager/src/replication/configuration.rs:
  added the `prefix` accessor.

* plugins/zenoh-plugin-storage-manager/src/replication/core.rs:
  - Changed the `Replication` structure to keep track of the
    `StorageService`. The latter is used to access and add/remove
    Wildcard Updates.
  - Added the method `remove_events_overridden_by_wildcard_updates`
    that takes a `HashMap` as this logic is shared for the latest cache
    and the Replication Log.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_query.rs:
  updated the `reply_event_retrieval` method to handle Wildcard Updates.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_reply.rs:
  - Added some traces to ease debugging.
  - Added the method `needs_further_processing` that, given an
    EventMetadata, checks if it needs to be processed, processes it if it
    can and otherwise returns true -- indicating that further processing
    is needed.
  - Added the method `apply_wildcard_update` that applies a Wildcard
    Update.
  - Added the method `is_overridden_by_wildcard_update` that checks if
    there is a Wildcard Update that overrides the provided key
    expression for the provided timestamp.
  - Added the method `store_event_overridden_by_wildcard_update` that
    overrides the provided EventMetadata with the Wildcard Update and
    stores the result.

* plugins/zenoh-plugin-storage-manager/src/replication/log.rs:
  - Added the `Action` enumeration.
  - Added the `ActionKind` enumeration.
  - Added the `LogLatestKey` structure.
  - Reworked the `Event` and `EventMetadata` structures.
  - Added the method `remove_events_overridden_by_wildcard_update` to
    the `LogLatest` structure.

* plugins/zenoh-plugin-storage-manager/src/replication/mod.rs: make the
  `Action` and `LogLatestKey` visible from the `replication` module.

* plugins/zenoh-plugin-storage-manager/src/replication/service.rs:
  changed the `spawn_start` function to take an `Arc<StorageService>`
  instead of a reference.

* plugins/zenoh-plugin-storage-manager/src/replication/tests/classification.test.rs:
  updated unit tests due to changes on the methods tested.

* plugins/zenoh-plugin-storage-manager/src/replication/tests/log.test.rs:
  added new unit tests to ensure that the generation of an Event happens
  as expected.

* plugins/zenoh-plugin-storage-manager/src/storages_mgt/mod.rs: wrap the
  `StorageService` inside of an `Arc` such that the `Replication` can use
  it in its tasks.

* plugins/zenoh-plugin-storage-manager/src/storages_mgt/service.rs:
  - Changed the visibility of the `Update` structure to `pub(crate)` and
    added accessors.
  - Changed the visibility of the `configuration` and `wildcard_updates`
    fields of the `StorageService` structure to `pub(crate)` as the
    `Replication` uses them when dealing with Wildcard Updates.
  - Changed the signature of the method `register_wildcard_update` to
    not extract the metadata from the `Sample` -- as the values are not
    correct when processing an Alignment reply.
  - Changed the visibility of the methods `register_wildcard_update` and
    `overriding_wild_update` to `pub(crate)` as they are used by the
    Replication to deal with Wildcard Updates.
  - Reworked the logic of the `overriding_wild_update` to handle
    Wildcard Delete and Wildcard Put separately. In addition, this
    method no longer queries the database every time an overriding
    Wildcard Update is found.
  - Reworked the `register_wildcard_update` method.
  - Removed the `mark_tombstone` method.

Signed-off-by: Julien Loudet <[email protected]>
@J-Loudet J-Loudet force-pushed the feat/storage-manager/replication-support-wildcard-update branch from 39b055b to 51b795e Compare October 11, 2024 07:37
@J-Loudet J-Loudet removed the request for review from Charles-Schleich October 11, 2024 07:51
@gabrik gabrik merged commit ae9719f into main Oct 11, 2024
24 checks passed
@gabrik gabrik deleted the feat/storage-manager/replication-support-wildcard-update branch October 11, 2024 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature Something new is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants