Skip to content

Ascending bitset index

the-very requested to merge github/fork/ya-goodfella/ascending_bitset_index into master

Created by: ya-goodfella

Introducing ascending bit set implementation for IndexToIndexMultiMap.

It has O(1) complexity for all document retrieving operations (e.g. getFrom, getTo, getBetween and so on).

Benchmarks shows huge performance improvement on set with 8192 keys over list based and plain bit set implementations (except for single get operation, which is 2x times slower comparing to plain bit set implementation)

Benchmark                                         Mode  Cnt         Score         Error  Units
IndexToIndexMultiMapBenchmark.bitSetFrom         thrpt    5       605,149 ±      28,475  ops/s
IndexToIndexMultiMapBenchmark.listSetFrom        thrpt    5     32295,509 ±    1360,385  ops/s
IndexToIndexMultiMapBenchmark.ascendingSetFrom   thrpt    5   1715115,552 ±   61549,874  ops/s
IndexToIndexMultiMapBenchmark.bitSetUntil        thrpt    5       513,266 ±     244,579  ops/s
IndexToIndexMultiMapBenchmark.listSetUntil       thrpt    5     33561,396 ±    1423,644  ops/s
IndexToIndexMultiMapBenchmark.ascendingSetUntil  thrpt    5   2821149,446 ±  525605,807  ops/s
IndexToIndexMultiMapBenchmark.ascendingSetGet    thrpt    5    852227,919 ±   31605,255  ops/s
IndexToIndexMultiMapBenchmark.bitSetGet          thrpt    5   1687927,367 ±  303933,743  ops/s
IndexToIndexMultiMapBenchmark.listSetGet         thrpt    5  13252488,403 ± 1801286,259  ops/s

Benchmark code included.

Main idea of this implementation is store not individual bit sets for corresponding keys, but all documents that have keys less than this key. Additionally we add an extra bit set that contains all documents that was associated with this index (e.g non-null values).

The only caveat is that this implementation cannot be used with multi-value documents (when one document assigns more than one key for this index).

Merge request reports