mv flexcache

Matthew Von-Maszewski edited this page Sep 24, 2013 · 29 revisions

Status

  • merged to master
  • code complete September 22, 2013
  • development started

History / Context

leveldb contains two caches per database (Riak vnode): file cache and block cache. The user establishes the size of each cache via the Options structure passed to leveldb upon database open. The two cache sizes are then fixed while the database is open. With Riak, the two cache sizes are actually fixed across all databases until Riak is restarted.

Riak is a very dynamic user of leveldb. The number of open databases can vary widely, and Riak can run for long periods of time, using hundreds to thousands of leveldb .sst table files. Both of these dynamics lead to suboptimal sizing of the two caches, because the sizes must fit the worst case rather than the most likely case, or a mathematically estimated scenario rather than the actual runtime usage.

This branch automates the sizing and resizing of the file cache and block cache during operation while maximizing total memory usage across a dynamic number of databases. The variable items considered:

  • varying number of open databases (vnodes) as Riak performs handoffs
  • prioritized memory demands of the file cache versus the block cache, since a miss in the file cache costs much more than a miss in the block cache
  • file cache entries can become stale (not actively accessed) and therefore reduce the block cache size available to active files
  • Riak has two tiers of databases, user databases (vnodes) and internal databases (active anti-entropy management, AAE), that should get different allocations of memory
  • the total memory available to virtual servers can change dynamically in some VM environments, and it would be helpful if leveldb could adjust without a full restart

Branch description

This branch adds code to track the number of open databases (vnodes) and whether each database holds user data or internal data (active anti-entropy feature). Each Open() operation adds the new database to a std::list and informs the master cache object, FlexCache, of the total memory allocation available to all databases. The FlexCache object then calls every existing database object to inform it that the per-database allocation has changed. If internal databases exist, they split a 20% allocation of all memory. User databases split either 80% if internal databases exist, or 100% if no internal databases exist (AAE disabled).
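The split described above can be sketched as a small function. This is only an illustration of the arithmetic, not the branch's code: the name `PerDatabaseAllocation` and its parameters are hypothetical, and the sketch assumes each pool is divided evenly among its databases and that internal databases stay limited to their 20% pool even when no user databases are open.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of leveldb): bytes of shared cache one
// database would receive, given the counts of open databases.
// Assumption: each pool is divided evenly among its members.
std::uint64_t PerDatabaseAllocation(std::uint64_t total_bytes,
                                    std::size_t user_dbs,
                                    std::size_t internal_dbs,
                                    bool is_internal) {
  if (is_internal) {
    if (internal_dbs == 0) return 0;  // no internal databases are open
    // Internal (AAE) databases split 20% of all memory.
    return (total_bytes * 20 / 100) / internal_dbs;
  }
  if (user_dbs == 0) return 0;  // no user databases are open
  // User databases split 80% when internal databases exist, else 100%.
  const std::uint64_t user_pool =
      (internal_dbs > 0) ? total_bytes * 80 / 100 : total_bytes;
  return user_pool / user_dbs;
}
```

Because the per-database figure depends on both counts, every Open() changes the result for all databases already open, which is why FlexCache has to notify each existing database object.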

Each database object, DBImpl, contains a master cache object, DoubleCache. DoubleCache manages both the file cache and the block cache, and gives memory priority to the file cache. All memory not used by the file cache is available to the block cache, which is guaranteed a 2Mbyte minimum size.
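A minimal sketch of that priority rule follows. The function and constant names are hypothetical, and the sketch assumes "2Mbyte" means 2 * 1024 * 1024 bytes; the real DoubleCache may compute its capacities differently.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: "2Mbyte" is read as 2 * 1024 * 1024 bytes.
constexpr std::uint64_t kMinBlockCacheBytes = 2ULL * 1024 * 1024;

// Hypothetical helper: the block cache gets whatever the file cache is
// not using out of this database's allocation, but never less than the
// guaranteed minimum.
std::uint64_t BlockCacheCapacity(std::uint64_t db_allocation_bytes,
                                 std::uint64_t file_cache_usage_bytes) {
  // Guard against unsigned underflow when the file cache uses more
  // than the allocation.
  const std::uint64_t remaining =
      db_allocation_bytes > file_cache_usage_bytes
          ? db_allocation_bytes - file_cache_usage_bytes
          : 0;
  return std::max(remaining, kMinBlockCacheBytes);
}
```

With a 100 Mbyte allocation and 30 Mbytes held by the file cache, the block cache would get 70 Mbytes; if the file cache grew to nearly the whole allocation, the block cache would still keep its 2 Mbyte floor.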

The Open API can fail if the new database object would receive less than 18Mbytes of shared cache.
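The threshold above amounts to a simple comparison against the allocation the new database would receive. The names below are hypothetical, the sketch assumes "18Mbytes" means 18 * 1024 * 1024 bytes, and it does not show how the branch actually reports the failure from Open().

```cpp
#include <cstdint>

// Assumption: "18Mbytes" is read as 18 * 1024 * 1024 bytes.
constexpr std::uint64_t kMinSharedCacheBytes = 18ULL * 1024 * 1024;

// Hypothetical check: true when the allocation the new database would
// receive meets the minimum, false when Open() would be expected to fail.
bool HasEnoughSharedCache(std::uint64_t per_db_allocation_bytes) {
  return per_db_allocation_bytes >= kMinSharedCacheBytes;
}
```

Since the per-database allocation shrinks as more databases open, this check in effect caps how many databases a given total memory setting can support.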

Two new unit tests support the branch's changes: cache2_test and flexcache_test.
