Exhaustive list of data structures here.
import cats.collection._
List of data structures: BitSet, Dequeue, Diet, Discrete, DisjointSets, Heap, ISet, AvlMap, PairingHeap, PartiallyOrderedSet, Range, AvlSet, TreeList
A Bitset is a specialized type of set that tracks the Int
values
it contains: for each integer value, a BitSet uses a single bit to
track whether the value is present (1) or absent (0). Bitsets are
often sparse, since "missing" bits can be assumed to be zero.
Unlike scala's default immutable this BitSet does not do a full copy on each added value.
Interally the implementation is a tree. Each leaf uses an Array[Long] value to hold up to 2048 bits, and each branch uses an Array[BitSet] to hold up to 32 subtrees (null subtrees are treated as empty).
Bitset treats the values it stores as 32-bit unsigned values, which
is relevant to the internal addressing methods as well as the order
used by iterator
.
The benchmarks suggest this bitset is MUCH faster than Scala's built-in bitset for cases where you may need many modifications and merges, (for example in a BloomFilter).
Binary tree based on Vladimir Kostyukov. The key stored in each node is either >= or <= to the keys in the node's children according to some total order.
Operations: add, remove (the min value), min, toList.
Double-ended queue. Offers a constant time cons, snoc and amortized constant time uncons and unsnoc'.
Operations: cons(add on left), snoc(add on right), uncons, unsnoc.
Diet is a binary searh tree where each node contains a set of continuous values. The stored type must have a total order.
Best case scenario: no holes in the stored set, so the tree contains only one node (a single interval). In this case, inserting, deleting, finding O(1).
Worst case scenario: hole of size one between each node and each node in fact a set on size one. The operations require O(n).
Operations: add, addRange, remove, removeRange, contains, constainsRange, (-) remove range, (&) intersection, (|) and (++) union, min, max, intervals, toList.
Haskell implementation here
succ(x): successor value of x
pred(x): predecessor value of x
adj(i, j): succ(i) == j
An Union-Find structure is a set of sets where the intersection of any two sets is empty.
DS = {S0, S1, ..., Sn} ∀i,j ∈ {0,1,...,n}, Sij = ∅
Fast unions ( average O(1)). Perfect tool for clustering algorithm such as calculating the connected components in a graph.
Operations: add, union (fusion two disjoin sets), find, toSets.
Each element starts as a disjoint set. A set with two or more elements is always the result of one or several union operations. A multi-element set is composed of set of just one element called components. The components are linked.
Forest of tree-sets.
Reduce the depth of each tree (rank) representing a set (the lower the tree height, the fewer operations are needed to find one element). Two heuristics:
- Union by rank: avoid incrementing maxh
- Path compression
Set is a tree-based set which stores Orderable elements in AVL balanced binary tree.
Membership in a ISet is defined by a predicate function that returns true for any value which is member of the set.
Range [x, y] that can be generate by using discrete operations Discrete.
Operations: start, end, toList, contains(value), contains(range), reverse, (-) difference with Range, toString.