Commit
Lower memory requirements for cached pressed pages
ETag caching was introduced for Facia `PressedPage` JSON downloading with #26338 in order to improve scalability and address #26335, but a limiting factor was the number of `PressedPage` objects that could be stored in the cache. With a max `PressedPage` size of 22MB and a memory budget of 4GB, a cautious max cache size limit of only 180 `PressedPage` objects was set. As a result, the cache hit rate was relatively low, and we saw elevated GC, probably because objects were continually being evicted from the small cache: #26338 (comment)

The change in this new commit dramatically reduces the combined size of the `PressedPage` objects held in memory, taking the average retained size per `PressedPage` from 4MB to 0.5MB (based on a sample of 125 `PressedPage` objects held in memory at the same time).

It does this by deduplicating the `Tag` objects held by the `PressedPage`s. Previously, as the `Tag`s for different `PressedPage`s were deserialised from JSON, many identical tags would be created over and over again and held in memory. After deduplication, those different `PressedPage`s all reference the _same_ `Tag` object for a given tag.

The deduplication is done as the `Tag`s are deserialised - a new cache (gotta love caches!) holds `Tag`s keyed by their hashcode and tag id, and if a new `Tag` is created with a matching key, it's thrown away and the old one is used instead. Thus we end up with just one instance of that `Tag`, instead of many duplicated ones.

See also:

* https://en.wikipedia.org/wiki/String_interning - a similar technique used by Java for Strings: https://www.geeksforgeeks.org/interning-of-string/
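The deduplication described above can be illustrated with a minimal Java sketch. It is not the code from this commit: the `Tag` class here is a hypothetical two-field stand-in for the real model, `TagInterner` and `intern` are invented names, and the cache is an unbounded map chosen for brevity. It only shows the idea of keying freshly deserialised tags by tag id plus hashcode and reusing the first instance seen.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

class TagInterner {
    // Hypothetical, simplified stand-in for the real Tag model (assumption).
    static final class Tag {
        final String id;
        final String webTitle;

        Tag(String id, String webTitle) {
            this.id = id;
            this.webTitle = webTitle;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Tag)) return false;
            Tag t = (Tag) o;
            return id.equals(t.id) && webTitle.equals(t.webTitle);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, webTitle);
        }
    }

    // Cache of canonical instances, keyed by tag id plus the hashcode of the whole Tag.
    // Including the hashcode means a tag whose content changes gets a different key.
    private static final ConcurrentHashMap<String, Tag> CACHE = new ConcurrentHashMap<>();

    // Returns the canonical instance for this tag: the first one seen with the same key.
    // A later duplicate is simply dropped and becomes eligible for garbage collection.
    static Tag intern(Tag fresh) {
        String key = fresh.id + "#" + fresh.hashCode();
        Tag existing = CACHE.putIfAbsent(key, fresh);
        return existing != null ? existing : fresh;
    }

    public static void main(String[] args) {
        // Two pages deserialise the same tag independently...
        Tag fromPageA = intern(new Tag("uk/uk", "UK news"));
        Tag fromPageB = intern(new Tag("uk/uk", "UK news"));
        // ...but both end up holding a reference to one shared object.
        System.out.println(fromPageA == fromPageB);
    }
}
```

Calling `intern` at the point where each `Tag` is deserialised means duplicates never survive past construction, which is what reduces the retained size per page. A real cache would normally also bound or expire its entries, which this sketch omits.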