Trie based index for key storage
Created by: ya-goodfella
Trie-based implementation for keys storage. Provides an average complexity O(|key|) for all operations with worst-case performance proportional to the depth of compressed trie. Unfortunately, it can be used only for filterable indexes, but it makes lookup operations much faster. Space requirments remained reasonable comparing to other implementations and depends on dataset:
100K random strings with const prefix:
fixed: 2500008 bytes
variable: 3300012 bytes
trie: 2707918 bytes
64K random 8 byte data:
fixed: 524296 bytes
variable: 1048588 bytes
trie: 1961066 bytes
~100K 4 byte integers from range [20000, 250000000):
fixed: 388180 bytes
variable: 1164528 bytes
trie: 1917660 bytes
Benchmarks below:
Benchmark Mode Cnt Score Error Units
ByteArraySortedSetBenchmarks.price_fixed thrpt 25 12,137 ± 1,851 ops/s
ByteArraySortedSetBenchmarks.price_trie thrpt 25 30,216 ± 1,428 ops/s
ByteArraySortedSetBenchmarks.price_variable thrpt 25 8,071 ± 0,871 ops/s
ByteArraySortedSetBenchmarks.random_8bytes_64K_fixed thrpt 25 18,333 ± 0,523 ops/s
ByteArraySortedSetBenchmarks.random_8bytes_64K_trie thrpt 25 46,772 ± 2,117 ops/s
ByteArraySortedSetBenchmarks.random_8bytes_64K_variable thrpt 25 10,669 ± 0,271 ops/s
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_fixed thrpt 25 5,767 ± 0,175 ops/s
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_trie thrpt 25 14,527 ± 0,671 ops/s
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_variable thrpt 25 4,304 ± 0,116 ops/s
ByteArraySortedSetBenchmarks.price_fixed avgt 25 0,077 ± 0,009 s/op
ByteArraySortedSetBenchmarks.price_trie avgt 25 0,035 ± 0,002 s/op
ByteArraySortedSetBenchmarks.price_variable avgt 25 0,101 ± 0,002 s/op
ByteArraySortedSetBenchmarks.random_8bytes_64K_fixed avgt 25 0,054 ± 0,001 s/op
ByteArraySortedSetBenchmarks.random_8bytes_64K_trie avgt 25 0,023 ± 0,001 s/op
ByteArraySortedSetBenchmarks.random_8bytes_64K_variable avgt 25 0,096 ± 0,002 s/op
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_fixed avgt 25 0,180 ± 0,003 s/op
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_trie avgt 25 0,066 ± 0,002 s/op
ByteArraySortedSetBenchmarks.random_prefixed_string_100K_variable avgt 25 0,221 ± 0,012 s/op
Additionally I fixed bug in LongArrayBitSet.arraySize() and made possible comparision of empty UnsignedByteArray/Buffer.