Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock service #315

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Lock service #315

wants to merge 17 commits into from

Conversation

yfei-z
Copy link
Contributor

@yfei-z yfei-z commented Oct 25, 2024

A RAFT implementation of lock service

@jabolina
Copy link
Member

Hey, @yfei-z. I'll take a look. Some things missing are a design document in the ./doc/design/ folder and expanding the Java docs at the LockService class. It should include information such as the guarantees, what happens in case of failures (node holding a lock, leader loss, majority loss, etc.), and, if necessary, usage patterns.

On a side note, I am skeptical about distributed locks, even in consensus. I am not the author, but I agree with https://belaban.blogspot.com/2020/11/i-hate-distributed-locks.html

@yfei-z
Copy link
Contributor Author

yfei-z commented Oct 28, 2024

Yes, I will complete the docs.

@yfei-z
Copy link
Contributor Author

yfei-z commented Nov 6, 2024

Hi @jabolina. I have finished the initial doc, and I will continue to improve it if I think of anything else, you can take a look.

@jabolina
Copy link
Member

jabolina commented Nov 7, 2024

Thanks for the work, @yfei-z! I'll take a look.

Copy link
Member

@jabolina jabolina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, @yfei-z, I've done passing over the LockService class. I'll look into the tests next.

src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
Comment on lines 371 to 373
protected void cleanup() {
lockStatus.forEach((k, v) -> notifyListeners(k, v, NONE, true));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this notification is submitted when the current node leaves or the majority is lost. To identify the real cause, we should include two different method in the Listener interface.

Copy link
Contributor Author

@yfei-z yfei-z Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention is to make both of them real, that means no matter the member is disconnected or partition into a minority subgroup, it will also be unlocked by reset command immediately or eventually. But I found a problem when I am writing the design document, I'm working on it, hope it could be fixed otherwise I will consider to separate the notifications, but it's very different from what I intent to do, so it will be a big change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has been fixed, and I adjust the test as well. This method has been renamed to resign, although it notifies listeners not based on what really happened in state machine, but it will immediately or eventually happen in state machine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrrrmm, a bit tricky. I believe we would want to be notified in these cases (shutdown, partitioning) but doing so from outside the state machine seems a recipe for much more work and patching corner cases. Would it be easier to handle with dedicated methods? Notifications come from the state machine and the other methods to notify of what just happened.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, the state machine should release the lock if the holder leaves the cluster unexpectedly, otherwise the lock service wouldn't be that useful, A left member can't know what really happen in state machine, perhaps the leader has released the lock it was holding, or the cluster has lost its leader, one way or another the member can only assume that the lock it was holding may has its new holder, therefore, regardless of how many types of notifications there are, the only thing the user can do is treat it as unlocked. so I think the real question is how to maintain a consistent state between state machine and the left member, although there are a lot of corner cases. I have some analysis in the design document, maybe it doesn't cover all cases, but I think it is a pessimistic approach, which means the state machine will release the lock as much as possible, the holding status will only be retained under certain conditions.
Because the notification mechanism has changed to response-triggered, I added some query scenarios to deal with the inconsistent state caused by over-pessimism, the purpose is to synchronize the states after the members are reconnected.

src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
src/org/jgroups/raft/blocks/LockService.java Outdated Show resolved Hide resolved
Comment on lines 700 to 726
if (curr != HOLDING && holder != null) {
status = curr;
var handler = unlockHandler;
if (handler != null) try {
handler.accept(this);
} catch (Throwable e) {
log.error("Error occurred on unlock handler", e);
}
} else if (curr != NONE && acquirers.get() == 0) {
status = curr;
var handler = lockHandler;
if (handler != null) try {
handler.accept(this);
} catch (Throwable e) {
log.error("Error occurred on lock handler", e);
}
} else if (prev == WAITING) {
delegate.lock();
try {
if (status == WAITING) {
status = curr;
notWaiting.signalAll();
}
} finally {
delegate.unlock();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be handling some corner cases I wouldn't expect ever to happen. I would expect this method to just assert prev is equal to the Mutex's state and do a switch with the curr value.

IIUC, the inconsistencies can exist if the user utilize the Mutex API and the LockService simultaneously? If so, I would say to make the methods in the LockService private and expose them only through the Mutex class. You could also push the Mutex outside and create a RaftLock or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notified status could be far behind returned status from command executing, it can't be directly set. But WAITTING status is special, if mutex is in WAITTING status, then all calling threads will wait the notification, before that no further commands will be executed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutex is just a Lock representation of lock service. I think from user's point of view, his requirement is either exclusive thread or exclusive process, even if both are used, they can be separated by different lockIds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State inconsistency is inevitable, it may be caused by a disconnection or partition, etc. not like lock service it can be done by a notification to the listener, it more like a error to mutex, so there are unexpected handlers.
Like I mentioned status can not be set directly by the notification of lock service, so above set status code do have problem. I will fix it, maybe set the status via notification only?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to make this update only via notifications from the LockService when handling commands. This might still be tricky when installing/recovering from the persistent state. The delegate lock might not be held, which could lead to some internal inconsistency. The mutex must be completely deterministic, regardless of whether or not commands are handled from the log or the user. I believe that making the updates come from the LockService might help.

Copy link
Contributor Author

@yfei-z yfei-z Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the notifications source from logs applying to command responses, logs applying is just a tip for the client to query the latest state from server, a read-only QUERY command is added. It becomes a pure client mode I think, local status and notifications are based only on command responses. This is based on the fact that command response callbacks are synchronous and ordered, no matter it's leader or follower.
Currently, the lockStatus method in lock service returns the latest status from the client's perspective, just like the status field in Mutex. This is based on that client's previous status will be cleared before it connect or reconnect or resume from partition.
For the mutex, the status change will be notified just before the executing command complete with the same status, so the notified status is just the value that about to be set if the command is sent by the mutex.

src/org/jgroups/raft/blocks/LockService.java Show resolved Hide resolved
Copy link
Member

@jabolina jabolina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another pass, but this new query command is making things more fuzzy. I see the value of having it to know the current status, but I don't understand why using it to trigger a local notification in the Mutex.

For instance, applying an UNLOCK command that could make the local node the new holder is delayed until the QUERY command. You could have many commands in between. Additionally, the mix of variables, using a variable updated by the client within the state machine will make it non-deterministic. When you invoke the notify method locally, it could be in any state!

Am I missing some piece as to why the QUERY is needed?

Comment on lines +76 to +77
* The {@link Mutex} is a distributed implementation of {@link Lock}. It based on the lock service, a thread is holding
* the mutex also means the member is holding the lock in the lock service. There is only one {@link Mutex} instance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a thread is holding the mutex also means the member is holding the lock in the lock service.

I guess not necessarily. It could hold the mutex locally and be waiting for the global lock.

Copy link
Contributor Author

@yfei-z yfei-z Dec 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting doesn't means holding. Holding internal local lock doesn't means holding the mutex. Before return from lock methods all states are internal.

Comment on lines +244 to +246
if (address.equals(member)) {
onCommit(lockId, prev, next, false);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the combinations of prev and next in this method triggers the query method. We could safely remove this call.

}
}

protected LockStatus notifyListeners(long lockId, LockStatus curr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this method is now called with a .thenApply in the request CF. I don't think you'll have any ordering guarantee anymore, it could notify the client with any status combination at this point, even ones that don't make sense. Isn't this required for correctness?

Copy link
Contributor Author

@yfei-z yfei-z Dec 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In leader, the calling thread will be the RAFT working thread. In follower it's the thread that pass up the message that send from leader, I think they are both ordered and synchronized even stable because that's what JGroups do. A lot of things depend on that, like ELECTION.

if (prev == curr) return;
// In followers, logs and responses of multiple commands can be included in one batch message, since RAFT is
// below REDIRECT in protocol stack, the commands applying may occur before the responses completing.
if (curr == HOLDING && (prev == WAITING || prev == null && lockStatus.get(lockId) != HOLDING)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check on the lockStatus won't be deterministic. I see the values on that map are added by the client after receiving a response.

Copy link
Contributor Author

@yfei-z yfei-z Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It filter out the unnecessary query when commit with a snapshot, can you give an example of how it goes wrong?
I think the first thing to ensure is not to miss the query for the status change, but redundancy is acceptable.

Comment on lines +339 to +340
LockStatus status = lockStatus.get(lockId);
if (curr == NONE && status != null || curr == HOLDING && status != HOLDING) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check as well. It depends on a variable updated outside the state machine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as long as it doesn't cause the client to miss any queries.

@yfei-z
Copy link
Contributor Author

yfei-z commented Dec 31, 2024

I think the problem of the query command is that it could be failed. I will reconsider it. I am quit busy recently, I will do it as soon as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants