Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server][dvc] Drop partitions asynchronously #1310

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

kvargha
Copy link
Contributor

@kvargha kvargha commented Nov 15, 2024

Summary, imperative, start upper case, don't end with a period

When a storage node is transitioning from LEADER -> STANDBY -> OFFLINE -> DROPPED, a race condition can occur. Specifically, the DROPPED state transition may be executed synchronously before the other state transitions are processed (LEADER -> STANDBY -> OFFLINE are executed asynchronously). This results in the store partition being deleted prematurely. Consequently, when the LEADER -> STANDBY message is eventually processed, it triggers a PersistenceFailureException since the storage partition no longer exists.

The solution for this is to drop the store partition asynchronously by adding a DROP_PARTITION message to the consumerActionsQueue if the ingestion task is still running. In the case that it's not running (this can happen if it was killed), the store partition will be dropped synchronously.

How was this PR tested?

Added unit and integration tests.

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

final String topic = veniceStore.getStoreVersionName();

if (isPartitionConsuming(topic, partitionId)) {
throw new VeniceException("Tried to drop storage partition that is still consuming");
Copy link
Contributor

@lluwm lluwm Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception could cause the ST to be in the ERROR state, is that right? I read function stopConsumptionAndWait and, today, we simply log a warning message if consumption couldn't be stopped in time. This sounds like a behavior change in the new PR and we probably want to be careful about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STANDBY->OFFLINE issues an UNSUBSCRIBE message, and so will stopConsumptionAndWait.

By the time SIT processes the DROP_PARTITION message, it should have been already unsubscribed.

I think it's safe to remove this check. What do you think?


try (AutoCloseableLock ignore = topicLockManager.getLockForResource(topic)) {
StoreIngestionTask ingestionTask = topicNameToIngestionTaskMap.get(topic);
if (ingestionTask != null && ingestionTask.isRunning()) {
Copy link
Contributor

@lluwm lluwm Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking of a race condition that after we add DROP_PARTITION actions to the queue, then SIT terminates due to some exceptions (as we see several cases today) before executing all the remaining actions from the queue and it could probably cause some partition leaks. If this race is possible, we probably need to add some logic in the SIT to make sure that all DROP_PARTITION actions have to be executed before it can terminate itself, or maybe some other measures to avoid it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants