@hyc
Last active July 20, 2017 11:19
More fun with InfluxDB
I believe this is directly comparable to the results published here:
http://influxdb.com/blog/2014/06/20/leveldb_vs_rocksdb_vs_hyperleveldb_vs_lmdb_performance.html
My laptop has 8GB of RAM, but I pared that down to 4GB by turning off swap and creating a file in tmpfs large enough to drop free RAM to 4GB.
These results are from the code prior to using Sorted Duplicates. The RocksDB performance is amazingly poor.
violino:/home/software/influxdb> /usr/bin/time -v ./benchmark-storage -path=/home/test/db -points=100000000 -series=500000
################ Benchmarking: lmdb
Writing 100000000 points in batches of 1000 points took 10m0.973944912s (6.009739 microsecond per point)
Querying 100000000 points took 4m0.062210717s (2.400622 microseconds per point)
Size: 6.1G
Took 1m19.017894891s to delete 50000000 points
Took 2.095us to compact
Querying 50000000 points took 1m29.304574146s (1.786091 microseconds per point)
Size: 7.6G
Writing 50000000 points in batches of 1000 points took 9m36.931839789s (11.538637 microsecond per point)
Size: 7.7G
################ Benchmarking: leveldb
Writing 100000000 points in batches of 1000 points took 39m50.903262204s (23.909033 microsecond per point)
Querying 100000000 points took 2m49.339779425s (1.693398 microseconds per point)
Size: 2.7G
Took 5m48.831738377s to delete 50000000 points
Took 6m17.357548286s to compact
Querying 50000000 points took 1m0.168453865s (1.203369 microseconds per point)
Size: 1.4G
Writing 50000000 points in batches of 1000 points took 16m14.040395323s (19.480808 microsecond per point)
Size: 2.6G
################ Benchmarking: rocksdb
Writing 100000000 points in batches of 1000 points took 3h25m10.762258086s (123.107623 microsecond per point)
Querying 100000000 points took 2m26.217626808s (1.462176 microseconds per point)
Size: 37G
Took 8m45.677135051s to delete 50000000 points
Took 2m55.372818028s to compact
Querying 50000000 points took 1m1.570714964s (1.231414 microseconds per point)
Size: 37G
Writing 50000000 points in batches of 1000 points took 2h1m51.42641092s (146.228528 microsecond per point)
Size: 58G
################ Benchmarking: hyperleveldb
Writing 100000000 points in batches of 1000 points took 9m9.924859094s (5.499249 microsecond per point)
Querying 100000000 points took 9m32.667573668s (5.726676 microseconds per point)
Size: 3.3G
Took 5m47.830141963s to delete 50000000 points
Took 6m39.712762331s to compact
Querying 50000000 points took 1m22.704782776s (1.654096 microseconds per point)
Size: 1.6G
Writing 50000000 points in batches of 1000 points took 4m24.807726459s (5.296155 microsecond per point)
Size: 3.5G
Command being timed: "./benchmark-storage -path=/home/test/db -points=100000000 -series=500000"
User time (seconds): 22667.93
System time (seconds): 6365.04
Percent of CPU this job got: 98%
Elapsed (wall clock) time (h:mm:ss or m:ss): 8:12:30
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 27072656
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1177869
Minor (reclaiming a frame) page faults: 90486770
Voluntary context switches: 669563529
Involuntary context switches: 14246002
Swaps: 0
File system inputs: 63595816
File system outputs: 590122424
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
This was another run on the same machine with the same code. Strangely, RocksDB's result was completely different this time, and the run ended with the program crashing.
violino:/home/software/influxdb> /usr/bin/time -v ./benchmark-storage -path=/home/test/db -points=100000000 -series=500000
################ Benchmarking: lmdb
Writing 100000000 points in batches of 1000 points took 12m14.22774633s (7.342277 microsecond per point)
Querying 100000000 points took 6m1.96092097s (3.619609 microseconds per point)
Size: 7.6G
Took 6m23.96630963s to delete 50000000 points
Took 1.048us to compact
Querying 50000000 points took 4m0.083501501s (4.801670 microseconds per point)
Size: 7.6G
Writing 50000000 points in batches of 1000 points took 1h28m28.45283235s (106.169057 microsecond per point)
Size: 8.2G
################ Benchmarking: leveldb
Writing 100000000 points in batches of 1000 points took 39m32.39700747s (23.723970 microsecond per point)
Querying 100000000 points took 3m6.89910029s (1.868991 microseconds per point)
Size: 2.7G
Took 5m39.404872895s to delete 50000000 points
Took 6m14.918991943s to compact
Querying 50000000 points took 1m0.488077474s (1.209762 microseconds per point)
Size: 1.4G
Writing 50000000 points in batches of 1000 points took 16m28.047675968s (19.760954 microsecond per point)
Size: 2.6G
################ Benchmarking: rocksdb
Writing 100000000 points in batches of 1000 points took 3h45m57.166233904s (135.571662 microsecond per point)
Querying 100000000 points took 3m3.470915689s (1.834709 microseconds per point)
Size: 41G
Took 8m33.237626533s to delete 50000000 points
Took 3m47.826396787s to compact
Querying 50000000 points took 51.101206202s (1.022024 microseconds per point)
Size: 41G
Writing 50000000 points in batches of 1000 points took 2h55m7.684545292s (210.153691 microsecond per point)
panic: exit status 1
goroutine 1 [running]:
runtime.panic(0x7cfc80, 0xc2100d3778)
/usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
main.getSize(0xc210117b80, 0x1a, 0x1a, 0x4)
/home/software/influxdb/src/tools/benchmark-storage/main.go:54 +0x130
main.benchmarkDbCommon(0x7f9114265198, 0xc21001fc00, 0x5f5e100, 0x3e8, 0x7a120, ...)
/home/software/influxdb/src/tools/benchmark-storage/main.go:97 +0x811
main.benchmark(0x7eb180, 0x7, 0x5f5e100, 0x3e8, 0x7a120, ...)
/home/software/influxdb/src/tools/benchmark-storage/main.go:47 +0x269
main.main()
/home/software/influxdb/src/tools/benchmark-storage/main.go:32 +0x454
goroutine 3 [chan receive]:
code.google.com/p/log4go.ConsoleLogWriter.run(0xc2100492c0, 0x7f9114255fe8, 0xc210000008)
/home/software/influxdb/src/code.google.com/p/log4go/termlog.go:27 +0x60
created by code.google.com/p/log4go.NewConsoleLogWriter
/home/software/influxdb/src/code.google.com/p/log4go/termlog.go:19 +0x67
goroutine 4 [syscall]:
runtime.goexit()
/usr/local/go/src/pkg/runtime/proc.c:1394
goroutine 6 [finalizer wait]:
runtime.park(0x5d8210, 0xc68380, 0xc571a8)
/usr/local/go/src/pkg/runtime/proc.c:1342 +0x66
runfinq()
/usr/local/go/src/pkg/runtime/mgc0.c:2279 +0x84
runtime.goexit()
/usr/local/go/src/pkg/runtime/proc.c:1394
Command exited with non-zero status 2
Command being timed: "./benchmark-storage -path=/home/test/db -points=100000000 -series=500000"
User time (seconds): 21330.49
System time (seconds): 8025.84
Percent of CPU this job got: 78%
Elapsed (wall clock) time (h:mm:ss or m:ss): 10:21:55
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 17911936
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 22018495
Minor (reclaiming a frame) page faults: 52586768
Voluntary context switches: 884225141
Involuntary context switches: 12885920
Swaps: 0
File system inputs: 401123272
File system outputs: 747221368
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 2
This is the LMDB result using the Sorted Duplicates patches:
https://github.com/influxdb/influxdb/pull/678
I don't have the RocksDB result yet; it will be several more hours before that finishes.
... updated: finally finished.
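Before the numbers, here is roughly what the Sorted Duplicates change means at the LMDB API level. This is a minimal C sketch of mine, not the actual patch, and the key/value layout is illustrative: instead of giving every point its own composite key, a DBI opened with MDB_DUPSORT stores all points for a series as sorted duplicate values under a single series key.

/* Minimal sketch, not the actual patch: store many points under one
 * series key. The dbi must be opened with the MDB_DUPSORT flag, e.g.
 *   mdb_dbi_open(txn, "points", MDB_DUPSORT|MDB_CREATE, &dbi);
 */
#include <string.h>
#include "lmdb.h"

int write_point(MDB_env *env, MDB_dbi dbi,
                const char *series, const void *point, size_t len)
{
    MDB_txn *txn;
    MDB_val key, data;
    int rc;

    rc = mdb_txn_begin(env, NULL, 0, &txn);
    if (rc) return rc;

    key.mv_size = strlen(series);    /* one key per series... */
    key.mv_data = (void *)series;
    data.mv_size = len;              /* ...many sorted values per key */
    data.mv_data = (void *)point;

    rc = mdb_put(txn, dbi, &key, &data, 0);
    if (rc) { mdb_txn_abort(txn); return rc; }
    return mdb_txn_commit(txn);
}

Duplicate values are kept sorted by LMDB's default byte-order comparison, so if each value starts with a big-endian timestamp, the points for a series stay in time order with no extra index.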
violino:/home/software/influxdb> ./benchmark-storage -path=/home/test/db -points=100000000 -series=500000
################ Benchmarking: lmdb
Writing 100000000 points in batches of 1000 points took 9m55.060557828s (5.950606 microsecond per point)
Querying 100000000 points took 1m26.153283998s (0.861533 microseconds per point)
Size: 4.0G
Took 1m11.705748913s to delete 50000000 points
Took 1.257us to compact
Querying 50000000 points took 43.994534804s (0.879891 microseconds per point)
Size: 4.0G
Writing 50000000 points in batches of 1000 points took 5m32.398039417s (6.647961 microsecond per point)
Size: 5.9G
################ Benchmarking: leveldb
Writing 100000000 points in batches of 1000 points took 40m8.701727125s (24.087017 microsecond per point)
Querying 100000000 points took 3m39.413232183s (2.194132 microseconds per point)
Size: 2.7G
Took 17m48.421502672s to delete 50000000 points
Took 6m13.689504673s to compact
Querying 50000000 points took 1m1.125226854s (1.222505 microseconds per point)
Size: 1.4G
Writing 50000000 points in batches of 1000 points took 16m21.570047473s (19.631401 microsecond per point)
Size: 2.6G
################ Benchmarking: rocksdb
Writing 100000000 points in batches of 1000 points took 3h10m25.346725469s (114.253467 microsecond per point)
Querying 100000000 points took 2m26.002405473s (1.460024 microseconds per point)
Size: 35G
Took 16m40.54319908s to delete 50000000 points
Took 3m3.3481798s to compact
Querying 50000000 points took 58.448312524s (1.168966 microseconds per point)
Size: 36G
Writing 50000000 points in batches of 1000 points took 2h11m27.871520367s (157.757430 microsecond per point)
Size: 59G
################ Benchmarking: hyperleveldb
Writing 100000000 points in batches of 1000 points took 9m10.276314813s (5.502763 microsecond per point)
Querying 100000000 points took 12m8.949611018s (7.289496 microseconds per point)
Size: 3.3G
Took 5m11.934801159s to delete 50000000 points
Took 10m31.038632478s to compact
Querying 50000000 points took 1m24.106956728s (1.682139 microseconds per point)
Size: 1.6G
Writing 50000000 points in batches of 1000 points took 4m30.184909667s (5.403698 microsecond per point)
Size: 3.4G
LMDB doesn't have the plethora of complex tuning APIs that other databases do, but it *does* have some worthwhile
data access features that other databases don't. Learning to use them correctly is well worth the trouble.
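For example, with points stored as sorted duplicates, a cursor can scan an entire series in sorted order after a single seek. A minimal sketch (the function name is mine; assumes a dbi opened with MDB_DUPSORT):

#include <string.h>
#include "lmdb.h"

int scan_series(MDB_txn *txn, MDB_dbi dbi, const char *series)
{
    MDB_cursor *cur;
    MDB_val key, data;
    int rc;

    rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc) return rc;

    key.mv_size = strlen(series);
    key.mv_data = (void *)series;

    /* Seek once to the first duplicate under this series key... */
    rc = mdb_cursor_get(cur, &key, &data, MDB_SET_KEY);
    /* ...then walk the rest of the duplicates in sorted order. */
    while (rc == 0) {
        /* process data.mv_data / data.mv_size here */
        rc = mdb_cursor_get(cur, &key, &data, MDB_NEXT_DUP);
    }
    mdb_cursor_close(cur);
    return rc == MDB_NOTFOUND ? 0 : rc;
}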
violino:~/OD/mdb/libraries/liblmdb> ls -l /home/test/db/
total 119
drwxr-xr-x 2 hyc hyc 18904 Jun 22 22:31 test-hyperleveldb
drwxr-xr-x 2 hyc hyc 43128 Jun 22 15:56 test-leveldb
drwxr-xr-x 2 hyc hyc 96 Jun 22 14:01 test-lmdb
drwxr-xr-x 2 hyc hyc 60112 Jun 22 21:43 test-rocksdb
violino:~/OD/mdb/libraries/liblmdb> du !$
du /home/test/db/
3568158 /home/test/db/test-hyperleveldb
6084152 /home/test/db/test-lmdb
61190722 /home/test/db/test-rocksdb
2689903 /home/test/db/test-leveldb
73532934 /home/test/db/
The data files were not touched after running the test. From the directory timestamps you can see that LevelDB didn't finish until almost 2 hours after the LMDB test ended, RocksDB ended almost 6 hours after LevelDB, and HyperLevelDB took about 45 minutes more after that.
I added a new -c (compact) option to mdb_copy, which copies the DB sequentially, omitting freed/deleted pages; a sketch of the equivalent C API call is below.
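The same compaction is available from C via mdb_env_copy2() with the MDB_CP_COMPACT flag (both added in LMDB 0.9.14 along with the mdb_copy -c option); a minimal sketch:

#include "lmdb.h"

/* Compacting copy of an environment, equivalent to `mdb_copy -c src dst`.
 * dst must be an existing, empty, writable directory. */
int compact_copy(const char *src, const char *dst)
{
    MDB_env *env;
    int rc;

    rc = mdb_env_create(&env);
    if (rc) return rc;
    rc = mdb_env_open(env, src, MDB_RDONLY, 0664);
    if (rc == 0)
        rc = mdb_env_copy2(env, dst, MDB_CP_COMPACT);  /* sequential copy, freed pages omitted */
    mdb_env_close(env);
    return rc;
}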
Starting with the usual run, interrupt it after half the records are deleted:
violino:/home/software/influxdb> /usr/bin/time -v ./benchmark-storage -path=/home/test/db -points=100000000 -series=500000
################ Benchmarking: lmdb
Writing 100000000 points in batches of 1000 points took 10m18.8538945s (6.188539 microsecond per point)
Querying 100000000 points took 1m28.581634191s (0.885816 microseconds per point)
Size: 4.0G
Took 1m13.593047399s to delete 50000000 points
Took 1.118us to compact
^CCommand exited with non-zero status 2
Command being timed: "./benchmark-storage -path=/home/test/db -points=100000000 -series=500000"
User time (seconds): 845.65
System time (seconds): 74.44
Percent of CPU this job got: 104%
Elapsed (wall clock) time (h:mm:ss or m:ss): 14:42.83
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 16497568
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 51
Minor (reclaiming a frame) page faults: 7521009
Voluntary context switches: 6036856
Involuntary context switches: 87408
Swaps: 0
File system inputs: 12152
File system outputs: 60187944
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 2
Then check how compaction behaves:
violino:~/OD/mdb/libraries/liblmdb> /usr/bin/time -v ./mdb_copy -c /home/test/db/test-lmdb/ /home/test/db/x
Command being timed: "./mdb_copy -c /home/test/db/test-lmdb/ /home/test/db/x"
User time (seconds): 1.56
System time (seconds): 6.23
Percent of CPU this job got: 10%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:15.60
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 16255568
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 1016071
Voluntary context switches: 12714
Involuntary context switches: 1924
Swaps: 0
File system inputs: 600
File system outputs: 8141848
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
violino:~/OD/mdb/libraries/liblmdb> du /home/test/db
4065106 /home/test/db/x
4128788 /home/test/db/test-lmdb
8193894 /home/test/db
This doesn't make a huge difference in space, since only ~14,000 freed pages were in the DB to begin with (14,310 free pages × 4KB/page is only ~56MB out of a ~4GB file):
violino:~/OD/mdb/libraries/liblmdb> ./mdb_stat -ef /home/test/db/test-lmdb/
Environment Info
Map address: (nil)
Map size: 10737418240
Page size: 4096
Max pages: 2621440
Number of pages used: 1029636
Last transaction ID: 600001
Max readers: 126
Number of readers used: 0
Freelist Status
Tree depth: 2
Branch pages: 1
Leaf pages: 40
Overflow pages: 0
Entries: 1395
Free pages: 14310
Status of Main DB
Tree depth: 1
Branch pages: 0
Leaf pages: 1
Overflow pages: 0
Entries: 50000000
violino:~/OD/mdb/libraries/liblmdb> ./mdb_stat -ef /home/test/db/x
Environment Info
Map address: (nil)
Map size: 10737418240
Page size: 4096
Max pages: 2621440
Number of pages used: 1015285
Last transaction ID: 1
Max readers: 126
Number of readers used: 0
Freelist Status
Tree depth: 0
Branch pages: 0
Leaf pages: 0
Overflow pages: 0
Entries: 0
Free pages: 0
Status of Main DB
Tree depth: 1
Branch pages: 0
Leaf pages: 1
Overflow pages: 0
Entries: 50000000
The test is a bit awkward here too, since it deletes the entries from the middle of the DB. If you were truly expiring records from a time-series database, you would delete from the head of the DB. Deleting in the middle like this leaves a lot of pages half full, instead of totally emptying/freeing pages.
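For comparison, expiring from the head of the DB is just a cursor loop; a minimal sketch of mine (one big txn for brevity; a real expirer would batch deletes across several txns):

#include <stddef.h>
#include "lmdb.h"

/* Delete the n oldest entries from the head of the DB. Whole leaf
 * pages empty out and return to the freelist, instead of being left
 * half full as with deletes from the middle. */
int expire_head(MDB_env *env, MDB_dbi dbi, size_t n)
{
    MDB_txn *txn;
    MDB_cursor *cur;
    MDB_val key, data;
    int rc;

    rc = mdb_txn_begin(env, NULL, 0, &txn);
    if (rc) return rc;
    rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc) { mdb_txn_abort(txn); return rc; }

    while (n-- > 0) {
        rc = mdb_cursor_get(cur, &key, &data, MDB_FIRST);  /* current head */
        if (rc) break;                      /* MDB_NOTFOUND: DB is empty */
        rc = mdb_cursor_del(cur, 0);
        if (rc) break;
    }
    mdb_cursor_close(cur);
    if (rc && rc != MDB_NOTFOUND) { mdb_txn_abort(txn); return rc; }
    return mdb_txn_commit(txn);
}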
@jvshahid

Did you run the tests on an SSD or a spinning disk?

@hyc
hyc (Author) commented Jun 23, 2014

This is on a Crucial M4 512GB SSD.

@nodtem66

Thanks for sharing these benchmarks. I'm looking into InfluxDB optimization.
