LRU cache utility #1090
base: master
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@             Coverage Diff             @@
##           master    #1090      +/-   ##
==========================================
- Coverage   94.31%   93.98%   -0.33%
==========================================
  Files         159      162       +3
  Lines       17221    17491     +270
==========================================
+ Hits        16242    16439     +197
- Misses        979     1052      +73
```
Force-pushed from 013084f to 738e0d7.
Adds a generic single-threaded implementation of a least recently used (LRU) cache. This is to be used to cache code and code analysis.
I sadly can't complain about anything. Let's ask Copilot then. :)
```cpp
using LRUList = std::list<LRUEntry>;
using LRUIterator = typename LRUList::iterator;
using Map = std::unordered_map<Key, LRUIterator>;
```
Would it be possible/better to store the values in the map entries? Then there'd be just one link: LRUEntry => MapNode. Now it's double-direction: LRUEntry::key => MapNode::key and MapNode::value -> LRUEntry.
Ah I see a comment below that there's no performance difference. But it seems more convoluted to me.
Yes, this is possible. Then the LRU list has only the pointer/reference/iterator into the map, and the map is Key => (LRUIterator, Value).
I'm kind of on your side here, but because the performance difference was negligible I decided to keep the "classic" layout for now. There is also a small nuance: you cannot have std::list<const Key&>, so you need to wrap the reference or use a pointer.
This is to be revisited in case we are going to use an intrusive list and/or map iterators.
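For what it's worth, a rough layout sketch of that alternative (type and member names are illustrative, not taken from this PR): the value lives in the map node, and the list stores a pointer to the map-owned key because std::list<const Key&> is not allowed.

```cpp
#include <list>
#include <unordered_map>

// Illustrative only: the map owns the values; the LRU list refers back to the map keys.
template <typename Key, typename Value>
class alt_lru_layout
{
    // std::list<const Key&> is ill-formed, so store a pointer to the key owned by the map.
    using LRUList = std::list<const Key*>;
    using LRUIterator = typename LRUList::iterator;

    struct MapEntry
    {
        Value value;
        LRUIterator lru_it;  // back-link for O(1) move-to-back on access
    };

    using Map = std::unordered_map<Key, MapEntry>;

    LRUList lru_list_;  // front = least recently used, back = most recently used
    Map map_;
};
```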
```cpp
Map map_;

/// Marks an element as the most recently used by moving it to the back of the LRU list.
void move_to_back(LRUIterator it) noexcept { lru_list_.splice(lru_list_.end(), lru_list_, it); }
```
Nit: the name kinda leaks the internal representation (order of elements in the list). Maybe something like touch(), use(), access() would be better?
Actually, I didn't notice it's a private method; then it's fine.
I think I thought about it, e.g. naming it bump_usage(). In the end I decided not to because we also use lru_list_ directly in other places. E.g. should we also replace lru_list_.emplace with something like new_use()?
The main motivation was to wrap splice(), which usually takes me some time to figure out.
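For readers unfamiliar with it, here is a small standalone illustration (not code from this PR) of the std::list::splice overload that move_to_back() wraps: it relinks a single element to the end of the same list in O(1), without copying elements or invalidating iterators.

```cpp
#include <cassert>
#include <iterator>
#include <list>

int main()
{
    std::list<int> lru{1, 2, 3, 4};
    const auto it = std::next(lru.begin());  // refers to the element 2

    // splice(pos, other, it): relink the single node *it so it sits just before pos.
    // Here both lists are the same list, so this simply moves 2 to the back.
    lru.splice(lru.end(), lru, it);

    assert((lru == std::list<int>{1, 3, 4, 2}));
    assert(*it == 2);  // the iterator stays valid and still refers to 2
}
```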
```cpp
auto node = map_.extract(lru_it->key);
swap(node.key(), key);
if (auto [it, inserted, node2] = map_.insert(std::move(node)); !inserted)
{
    // Failed re-insertion means the element with the new key is already in the cache.
    // Rollback the eviction by re-inserting the node with original key back.
    swap(key, node2.key());
    map_.insert(std::move(node2));

    // Returned iterator points to the element matching the key
    // which value must be updated.
    lru_it = it->second;
}
lru_it->value = std::move(value);  // Replace/update the value.
move_to_back(lru_it);
```
The juggling of keys is quite confusing; I'd maybe add more comments, like:
```cpp
auto node = map_.extract(lru_it->key);  // node.key() is LRU key
swap(node.key(), key);                  // node.key() is new key, key is LRU key
if (auto [it, inserted, node2] = map_.insert(std::move(node)); !inserted)
{
    // node2.key() is new key (already in cache)
    // Failed re-insertion means the element with the new key is already in the cache.
    // Rollback the eviction by re-inserting the node with original key back.
    swap(key, node2.key());  // key is new key (already in cache), node2.key() is LRU key
    map_.insert(std::move(node2));
    // Returned iterator points to the element matching the key
    // which value must be updated.
    lru_it = it->second;
}
lru_it->value = std::move(value);  // Replace/update the value.
move_to_back(lru_it);
```
Yeah, this value-update-for-existing-key feature is quite annoying to implement, and in many cases it is never used... I will try to improve the comments.
I probably should mention O(1) get/put complexity.
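For context, a minimal standalone sketch of the classic layout (illustrative only, not the code in this PR) showing why get/put are O(1) on average: the hash map gives constant-time lookup/insert/erase, and std::list gives constant-time reordering via splice.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Illustrative sketch of an LRU cache with the "classic" layout:
// a list of (key, value) entries ordered by recency plus a map from key to list iterator.
template <typename Key, typename Value>
class lru_cache_sketch
{
    struct LRUEntry
    {
        Key key;
        Value value;
    };
    using LRUList = std::list<LRUEntry>;
    using LRUIterator = typename LRUList::iterator;

    std::size_t capacity_;
    LRUList lru_list_;  // front = least recently used, back = most recently used
    std::unordered_map<Key, LRUIterator> map_;

public:
    explicit lru_cache_sketch(std::size_t capacity) : capacity_{capacity} {}

    std::optional<Value> get(const Key& key)
    {
        const auto it = map_.find(key);  // O(1) average
        if (it == map_.end())
            return std::nullopt;
        lru_list_.splice(lru_list_.end(), lru_list_, it->second);  // O(1): mark as most recent
        return it->second->value;
    }

    void put(Key key, Value value)
    {
        if (const auto it = map_.find(key); it != map_.end())  // O(1) average
        {
            it->second->value = std::move(value);  // update the existing entry
            lru_list_.splice(lru_list_.end(), lru_list_, it->second);
            return;
        }
        if (map_.size() >= capacity_)  // full: evict the least recently used element
        {
            map_.erase(lru_list_.front().key);  // O(1) average
            lru_list_.pop_front();              // O(1)
        }
        lru_list_.push_back({key, std::move(value)});  // new entry becomes most recent
        map_.emplace(std::move(key), std::prev(lru_list_.end()));
    }
};

int main()
{
    lru_cache_sketch<int, std::string> cache{2};
    cache.put(1, "one");
    cache.put(2, "two");
    assert(cache.get(1) == "one");  // 1 becomes the most recently used
    cache.put(3, "three");          // evicts 2, the least recently used
    assert(!cache.get(2).has_value());
    assert(cache.get(1) == "one");
    assert(cache.get(3) == "three");
}
```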