Version 1.2.0
This release adds a new tool, permute, that re-orders (and possibly reverse-complements) the strings in an input (weighted) collection to minimize the number of runs in the abundances and, hence, optimize their encoding.
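As a toy illustration of why the order and orientation of the strings matter, the sketch below counts runs in the concatenated abundances before and after a hand-picked re-ordering. It is not the algorithm used by permute: the helper `count_runs`, the three abundance lists, and the chosen order are all made up for this example, and it assumes that reverse-complementing a string simply reverses the order of its k-mer abundances.

```python
from itertools import groupby

def count_runs(abundances):
    """Count maximal runs of equal consecutive values."""
    return sum(1 for _ in groupby(abundances))

# Hypothetical per-string abundance lists (one value per k-mer).
a = [1, 1, 3]
b = [1, 2, 2]
c = [3, 3, 2]

# Input order: a, b, c.
original = a + b + c

# Hand-picked re-ordering: a, c, then b reversed
# (assumed effect of reverse-complementing string b).
permuted = a + c + b[::-1]

runs_before = count_runs(original)   # 1,1,3,1,2,2,3,3,2
runs_after = count_runs(permuted)    # 1,1,3,3,3,2,2,2,1
print(runs_before, runs_after)
```

In this toy case the number of runs drops from 6 to 4, because equal abundance values now meet at the string boundaries.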
The abundances are encoded in O(r) space on top of the space for a SSHash dictionary, where r is the number of runs (i.e., maximal substrings formed by a single abundance value) in the abundances.
The i-th abundance in the sequence, corresponding to the k-mer of identifier i, is retrieved in O(log r) time.
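The space and time bounds above can be illustrated with a minimal run-length sketch: store one (start position, value) pair per run, then binary-search the start positions to find the run containing identifier i. This only shows the general idea; the actual SSHash encoding may use different, more compact structures, and the names `encode_runs` and `abundance_at` are hypothetical.

```python
from bisect import bisect_right
from itertools import groupby

def encode_runs(abundances):
    """Return (starts, values): one entry per run, so O(r) entries."""
    starts, values = [], []
    pos = 0
    for value, group in groupby(abundances):
        starts.append(pos)             # position where this run begins
        values.append(value)           # abundance shared by the whole run
        pos += sum(1 for _ in group)   # advance by the run length
    return starts, values

def abundance_at(starts, values, i):
    """Abundance of the k-mer with identifier i, via binary search.

    Assumes 0 <= i < n, where n is the length of the original sequence;
    only the lower bound is checked here.
    """
    if i < 0:
        raise IndexError("identifier must be non-negative")
    # Index of the last run whose start position is <= i: O(log r).
    return values[bisect_right(starts, i) - 1]

abundances = [1, 1, 3, 3, 3, 2, 2, 2, 1]
starts, values = encode_runs(abundances)
print(starts, values)
print(abundance_at(starts, values, 4))
```

For this input there are r = 4 runs, so only four pairs are kept regardless of how long each run is.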