Release in-memory log space based on entries size #71
Alternatively, prove that the current shrinking strategy is safe.

But it doesn't look like we're safe as is. Say, the max size is 64 MB. Consider the following sequence of appends / truncations (numbers are entry sizes in MB):

First, 17 entries of size 2 MB are appended, total is 34 MB (below the limit), and the slice does not shrink. Then a truncate of 2 MB, followed by a sequence of 8 × (append 4 MB, truncate 2 × 2 MB). During this sequence the shrink condition never fires, so the truncated entries stay pinned by the backing array. Then a sequence of 4 × (append 8 MB, truncate 2 × 4 MB) follows. Same thing. Eventually, we fill up the backing array and reallocate. The total memory usage of this slice by the time we reallocate is ~32 MB × 5. More generally, an example can be constructed with O(B log B) bytes usage, for a configured max size B. We would like to reduce this to O(B), something like 2 × B.
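The slice behavior behind this example can be checked directly in plain Go (no raft code involved): truncating a prefix by re-slicing shrinks `len` and `cap` together, so a len-vs-cap check can stay satisfied indefinitely while garbage accumulates at the front of the backing array.

```go
package main

import "fmt"

func main() {
	// A backing array with 32 slots, 17 of them live entries.
	s := make([]int, 17, 32)
	fmt.Println(len(s), cap(s)) // 17 32

	// Truncate the first 2 entries by re-slicing, as the log does.
	s = s[2:]

	// len and cap shrink together: the truncated slots are invisible,
	// yet the backing array still pins all 32 of them.
	fmt.Println(len(s), cap(s)) // 15 30

	// A len*2 < cap shrink heuristic therefore does not fire, even
	// though garbage is accumulating at the front of the array.
	fmt.Println(len(s)*2 < cap(s)) // false
}
```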
I am taking a look at this issue. However, from what I've learned, I don't think there is a way to get the length of the entire backing array of a slice from the slice itself. That means we will also have to maintain that capacity ourselves: e.g. if we assign `u.entries = u.entries[num:]`, the slice loses all information about the first `num` slots, so the original capacity has to be recorded before the re-slice. We need to do the same thing to calculate the accurate sum size of the "garbage" entries. I wonder if I am correct and if this approach is expected.

BTW, there's a typo: there are 6 × 8 MB entries in the figure. I think there should be only 4 of them.
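A minimal sketch of the extra bookkeeping this comment describes (all names hypothetical, not the raft API): record the backing array's capacity when it is allocated, and a running byte sum of truncated-but-retained entries, since neither can be recovered from the slice after a re-slice.

```go
package main

import "fmt"

// entry stands in for raftpb.Entry; Size models the marshaled byte size.
type entry struct{ size int }

func (e entry) Size() int { return e.size }

// entryLog keeps the counters a plain slice cannot recover: the capacity
// of the whole backing array, and the total size of truncated entries
// that the array still pins. Hypothetical names, for illustration only.
type entryLog struct {
	entries     []entry
	backingCap  int // cap of the backing array at allocation time
	garbageSize int // sum of Size() of truncated-but-retained entries
}

func (l *entryLog) truncateFront(num int) {
	for _, e := range l.entries[:num] {
		l.garbageSize += e.Size()
	}
	// Re-slicing hides the first num slots from len and cap, which is
	// exactly why backingCap must be tracked separately.
	l.entries = l.entries[num:]
}

func (l *entryLog) append(es ...entry) {
	before := cap(l.entries)
	l.entries = append(l.entries, es...)
	if cap(l.entries) != before {
		// append reallocated: the old array, garbage included, is now
		// collectible, and the counters start fresh.
		l.backingCap = cap(l.entries)
		l.garbageSize = 0
	}
}

func main() {
	l := &entryLog{entries: make([]entry, 0, 4), backingCap: 4}
	l.append(entry{2}, entry{2}, entry{2})
	l.truncateFront(2)
	fmt.Println(len(l.entries), cap(l.entries), l.backingCap, l.garbageSize) // 1 2 4 4
}
```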
hi @pavelkalinnikov @CaojiamingAlan, is this issue still open?
@CaojiamingAlan Yes, I think you're correct, some kind of explicit bookkeeping will be needed.
@Aditya-Sood Still open, would you like to take a stab at it? @CaojiamingAlan Are you working on this?
@pavelkalinnikov yes, I'd like to try. If @CaojiamingAlan is unable to reply by Sunday, I'll get started.
hi @pavelkalinnikov, I'll get started on this |
When entries exit the `unstable` log structure (raft/log_unstable.go, line 156 in 3e6cb62), it shrinks itself based on the entries slice length and cap (raft/log_unstable.go, lines 162 to 179 in 3e6cb62).
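The shrinking logic referenced above is roughly the following (a simplified sketch, with `raftpb.Entry` replaced by a placeholder type): reallocate once live entries occupy less than `1/lenMultiple` of the slice's capacity.

```go
package main

import "fmt"

// Entry is a stand-in for raftpb.Entry.
type Entry struct{ Data []byte }

type unstable struct {
	entries []Entry
}

// shrinkEntriesArray discards the underlying array used by u.entries if
// most of it isn't being used, by copying the live entries into a
// right-sized allocation.
func (u *unstable) shrinkEntriesArray() {
	const lenMultiple = 2
	if len(u.entries) == 0 {
		u.entries = nil
	} else if len(u.entries)*lenMultiple < cap(u.entries) {
		newEntries := make([]Entry, len(u.entries))
		copy(newEntries, u.entries)
		u.entries = newEntries
	}
}

func main() {
	u := &unstable{entries: make([]Entry, 3, 16)}
	u.shrinkEntriesArray() // 3*2 < 16, so the array is reallocated
	fmt.Println(len(u.entries), cap(u.entries)) // 3 3
}
```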
Firstly, the check `len(u.entries)*lenMultiple < cap(u.entries)` does not necessarily do the right thing. After shifting the slice start in `u.entries = u.entries[num:]`, the first `num` entries "disappear" from both the `len` and the `cap` of this slice: https://go.dev/play/p/ao1xaL0Edla. So it's possible that `len >= cap/2` holds all the time, even though only a small portion of the backing slice is used. We should take the `cap`
of the entire backing slice into account instead.

Secondly, this heuristic only takes the `len/cap` ratio of the slice into consideration, but not the size of the entries referenced by it. It could be beneficial to, in addition, shrink the slice if the sum size of the "garbage" entries is more than a half. This would keep the overall memory consumption more controlled. Doing so would require maintaining a running sum of entry sizes in the `unstable`
struct (which we do anyway in other places for rate-limiting purposes).

The same heuristic could be applied in the `MemoryStorage.Compact` method to smooth out the allocation/copying cost of log truncations. Frequent truncations may incur a quadratic cost here, while the heuristic allows capping it at O(N).
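One hypothetical shape for the size-based variant of this heuristic (a sketch, not the raft implementation; all names invented): keep running byte sums for live and truncated entries, and reallocate once the garbage bytes outweigh the live bytes, bounding retained memory at roughly 2× the live size, i.e. O(B).

```go
package main

import "fmt"

// Entry is a stand-in for raftpb.Entry.
type Entry struct{ Data []byte }

func size(e Entry) int { return len(e.Data) }

// sizeAwareLog sketches the proposal above: shrink not only when the
// slice is mostly empty slots, but also when the bytes pinned by
// truncated ("garbage") entries exceed the bytes of live entries.
type sizeAwareLog struct {
	entries     []Entry
	liveSize    int // running sum of sizes of live entries
	garbageSize int // running sum of sizes of truncated, still-pinned entries
}

func (l *sizeAwareLog) truncateFront(num int) {
	for _, e := range l.entries[:num] {
		l.liveSize -= size(e)
		l.garbageSize += size(e)
	}
	l.entries = l.entries[num:]
	l.maybeShrink()
}

func (l *sizeAwareLog) maybeShrink() {
	// Reallocate when garbage bytes outweigh live bytes, so the old
	// backing array (and everything it pins) becomes collectible.
	if l.garbageSize > l.liveSize {
		newEntries := make([]Entry, len(l.entries))
		copy(newEntries, l.entries)
		l.entries = newEntries
		l.garbageSize = 0
	}
}

func main() {
	l := &sizeAwareLog{}
	for i := 0; i < 4; i++ {
		e := Entry{Data: make([]byte, 2)}
		l.entries = append(l.entries, e)
		l.liveSize += size(e)
	}
	l.truncateFront(3) // garbage 6 > live 2, so the array is reallocated
	fmt.Println(len(l.entries), cap(l.entries), l.garbageSize) // 1 1 0
}
```

The same shrink condition could, in principle, be checked inside a `Compact`-style truncation to amortize copying, though the right thresholds would need benchmarking.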