From 798e0a59eaf8edb2c408f3552246fedfda04cf66 Mon Sep 17 00:00:00 2001
From: Lorina Poland
Date: Thu, 9 Nov 2023 12:26:24 -0800
Subject: [PATCH] remove ucs from trunk

---
 .../managing/operating/compaction/ucs.adoc | 132 ------------------
 1 file changed, 132 deletions(-)
 delete mode 100644 doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc

diff --git a/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc b/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
deleted file mode 100644
index ec0158f863bc..000000000000
--- a/doc/modules/cassandra/pages/managing/operating/compaction/ucs.adoc
+++ /dev/null
@@ -1,132 +0,0 @@
-[[ucs]]
-= Unified Compaction Strategy (UCS)
-
-The `UnifiedCompactionStrategy` (UCS) is recommended for most workloads, and especially for vertically scaled, high-density nodes.
-Consider UCS when the legacy strategies STCS and LCS are not optimal.
-UCS behaves similarly to both but improves on them by utilizing sharding: data is partitioned on token boundaries and shards are compacted in parallel.
-The one case where UCS is not an optimal choice is time-series data; UCS is not meant to mirror TWCS's append-only time-series compaction.
-
-UCS provides configurable read and write amplification guarantees, reduces disk space overhead, and is stateless.
-Its parameters can be reconfigured at any time.
-
-Use UCS to adjust the balance between the number of SSTables that must be consulted to serve a read (the database's read amplification, RA) and the number of times a piece of data must be rewritten during its lifetime (the database's write amplification, WA).
-Configure UCS with a single scaling parameter that determines both the compaction mode (positive values select tiered mode, negative values select leveled mode) and the compaction hierarchy's fan factor F.
-The strategy organizes data in a hierarchy of levels defined by SSTable size: each level accepts SSTables that are between a given size and F times that size.
-In leveled mode, compaction is triggered as soon as more than one SSTable is present on a level.
-In tiered mode, compaction is triggered when the number of SSTables on a level reaches or exceeds the threshold F.
-The results differ per mode: at each level in leveled mode, UCS recompacts data multiple times with the goal of maintaining one SSTable per level, while in tiered mode UCS compacts only once but maintains multiple SSTables.
-
-Change the behavior of UCS to be more like STCS or more like LCS with the single scaling factor (W) parameter.
-W may be set per level, so levels can behave differently.
-For example, level zero (0) could behave like STCS while higher levels increasingly behave like LCS.
-
-Figure 2. UCS scaling factor W
-
-The W scaling factor is an integer parameter that determines the read (RA) and write (WA) amplification bounds and the STCS-like versus LCS-like behavior:
-
-* W < 0: leveled behavior; for example, W = -8 emulates an improved LCS mode (leveled compactions with high WA but low RA).
-* W = 0: middle ground (leveled and tiered compactions behave identically).
-* W > 0: tiered behavior; for example, W = 2 emulates an improved STCS mode (tiered compactions with low WA but high RA).
-
-The default is W = 2 for all levels.
-UCS tries not to recompact existing data when a scaling factor changes; make changes incrementally to minimize the impact.
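-A minimal sketch of setting the scaling factor through CQL follows.
-The `scaling_parameters` option name and the `T`/`L`/`N` shorthand (fan factor F = |W| + 2, so `T4` corresponds to W = 2 and `L10` to W = -8) are assumptions based on the CEP-26 implementation; verify them against your Cassandra version.
-
-[source,cql]
-----
-// Hypothetical table; one value per level, lowest level first,
-// with the last value assumed to apply to all higher levels.
-ALTER TABLE cycling.events
-WITH compaction = {
-  'class': 'UnifiedCompactionStrategy',
-  'scaling_parameters': 'T4, T4, N, L10'
-};
-----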
-The parameters of UCS can be changed at any time.
-For example, an operator may decide to:
-
-* decrease the scaling parameter, lowering the read amplification at the expense of more compaction work, when a table is read-heavy and could benefit from reduced latencies.
-* increase the scaling parameter, reducing the write amplification, when compaction cannot keep up with the number of writes to a table.
-
-Any such change initiates only the compactions necessary to bring the hierarchy into a state compatible with the new configuration.
-Any additional work already done (for example, when switching from a negative parameter to a positive one) is advantageous and is incorporated.
-
-In addition, UCS splits data on specific token boundaries when the data exceeds a set size, forming an independent compaction hierarchy in each shard.
-This reduces the size of the largest SSTables on the node: for example, instead of one 1-TB SSTable there are 10 shards of 100 GB each.
-Sharding also permits UCS to execute compactions in each shard independently and in parallel, which is crucial to keep up with the demands of high-density nodes.
-UCS can also apply whole-table expiration, which is useful for time-series data with time-to-live constraints.
-
-UCS selection of SSTable levels is more stable and predictable than that of SizeTieredCompactionStrategy (STCS): STCS groups SSTables by size, whereas UCS groups them by timestamp.
-This lets UCS efficiently track time order and whole-table expiration.
-
-UCS and LCS end with similar results, but UCS handles the problem of space amplification by sharding on specific token boundaries.
-LCS splits SSTables at a fixed size, with boundaries that usually fall inside SSTables on the next level, which kicks off compaction more frequently than necessary.
-UCS therefore allows tighter control of write amplification.
-
-Parameters control the choice between different ratios of read and write amplification.
-Choose options that favor leveled compaction to improve reads at the expense of writes, or tiered compaction to favor writes at the expense of reads, as in the sketch below.
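-The following sketch shows both directions of that trade-off.
-The option names (`scaling_parameters`, `target_sstable_size`) and values are assumptions based on the CEP-26 implementation, shown for illustration rather than as the authoritative API.
-
-[source,cql]
-----
-// Read-heavy table: shift toward leveled compaction (lower RA, higher WA).
-ALTER TABLE cycling.reads_mostly
-WITH compaction = {
-  'class': 'UnifiedCompactionStrategy',
-  'scaling_parameters': 'L4'  // W = -2
-};
-
-// Write-heavy table: shift toward tiered compaction (lower WA, higher RA),
-// capping SSTable size so shards compact independently and in parallel.
-ALTER TABLE cycling.writes_mostly
-WITH compaction = {
-  'class': 'UnifiedCompactionStrategy',
-  'scaling_parameters': 'T8',       // W = 6
-  'target_sstable_size': '100GiB'   // e.g., ~10 shards for 1 TB of data
-};
-----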
-
-// LCS alleviates some of the read operation issues with STCS.
-// This strategy works with a series of levels, where each level contains a set of SSTables.
-// When data in memtables is flushed, SSTables are written in the first level (L0), where SSTables are not guaranteed to be non-overlapping.
-// LCS compaction merges these first SSTables with larger SSTables in level L1.
-// Each level is by default 10x the size of the previous one.
-// Once an SSTable is written to L1 or higher, the SSTable is guaranteed to be non-overlapping with other SSTables in the same level.
-// If a read operation needs to access a row, it only needs to look at one SSTable per level.
-
-// To accomplish compaction, all overlapping SSTables are merged into a new SSTable in the next level.
-// For L0 -> L1 compactions, almost all L1 SSTables need to be included, since most L0 SSTables cover the full range of partitions.
-// LCS compacts SSTables from one level to the next, writing partitions to fit a defined SSTable size.
-// In addition, each level has a prescribed size, so that compaction is triggered when a level reaches its size limit.
-// Creating new SSTables in one level can trigger compaction in the next level, and so on, until all levels have been compacted based on the settings.
-
-// There is a failsafe if too many SSTable reads are being done in the L0 level.
-// An STCS compaction will be triggered in L0 if there are more than 32 SSTables in L0.
-// This compaction quickly merges SSTables out of L0 and into L1, where they will be compacted into non-overlapping SSTables.
-
-// LCS is not as disk hungry as STCS, needing only approximately 10% of disk headroom to execute, but it is more IO- and CPU-intensive.
-// For ongoing minor compactions in a read-heavy workload, the amount of compaction is reasonable.
-// It is not a good choice for write-heavy workloads, though, because it causes a lot of disk IO and CPU usage.
-// Major compactions are not recommended for LCS.
-
-// == Bootstrapping
-
-// During bootstrapping, SSTables are streamed from other nodes.
-// Because many SSTables will be flushed from new writes to memtables as well as streamed from remote nodes, the new node will have many SSTables in L0.
-// To avoid a collision of the flushing and streaming SSTables, only STCS is executed in L0 until bootstrapping is complete.
-
-// == Starved SSTables
-
-// If the leveling is not optimal, LCS can end up with starved SSTables.
-// High-level SSTables can be stranded and not compacted, because SSTables in lower levels are not getting merged and compacted.
-// For example, this situation can make it impossible for lower levels to drop tombstones.
-// If these starved SSTables are not resolved within a defined number of compaction rounds, they will be included in other compactions.
-// This situation generally occurs if a user lowers the `sstable_size` setting.
-
-// include::cassandra:partial$default-compaction-strategy.adoc[]
-
-// [[lcs_options]]
-// == LCS options
-
-// [cols="1,2"]
-// |===
-// | Subproperty | Description
-
-// | enabled
-// | Enables background compaction.
-// Default value: true
-// // See Enabling and disabling background compaction.
-
-// | fanout_size
-// | The target size of levels increases by this `fanout_size` multiplier.
-// You can reduce the space amplification by tuning this option.
-// Default: 10
-
-// | log_all
-// | Activates advanced logging for the entire cluster.
-// Default value: false
-
-// | sstable_size_in_mb
-// | The target size for SSTables.
-// Although SSTable sizes should be less than or equal to `sstable_size_in_mb`, compaction can produce a larger SSTable when the data for a given partition key is exceptionally large.
-// The {cassandra} database does not split the data into two SSTables.
-// Default: 160
-
-// | tombstone_compaction_interval
-// | The minimum number of seconds after an SSTable is created before {cassandra} considers the SSTable for tombstone compaction.
-// An SSTable is eligible for tombstone compaction if the table exceeds the `tombstone_threshold` ratio.
-// Default value: 86400
-
-// | tombstone_threshold
-// | The ratio of garbage-collectable tombstones to all contained columns.
-// If the ratio exceeds this limit, {cassandra} starts compaction on that table alone, to purge the tombstones.
-// Default value: 0.2
-
-// | unchecked_tombstone_compaction
-// | If set to `true`, allows {cassandra} to run tombstone compaction without pre-checking which tables are eligible for this operation.
-// Even without this pre-check, {cassandra} checks an SSTable to make sure it is safe to drop tombstones.
-// Default value: false
-// |===
-
-// LCS also supports a startup option, `-Dcassandra.disable_stcs_in_l0=true`, which disables STCS in L0.
\ No newline at end of file