@hyc
Last active November 20, 2016 17:08
LevelDB / LMDB readwhilewriting bench: 1.25TB DB, 256GB RAM, 48 cores, 32 reader threads, 1 writer. NVMe SSD (2.7TB capacity) w/ XFS, 6-hour run
[root@localhost ~]# tail out-nvme-2000-LevelDB.txt
2015/08/19-20:47:11 ... thread 1: (200000,19200000) ops and (797.9,898.5) ops/second in (250.646291,21370.034489) seconds
2015/08/19-20:47:14 ... thread 17: (200000,19200000) ops and (802.3,898.3) ops/second in (249.287673,21372.921579) seconds
2015/08/19-20:47:15 ... thread 6: (200000,19200000) ops and (807.9,898.3) ops/second in (247.562653,21373.652192) seconds
writer : 306.279 micros/op 3264 ops/sec; (desired 0 ops/sec)
readwhilewriting : 34.868 micros/op 28679 ops/sec; (19351999 of 19351999 found)
Usr Sys % Wall RSS Major Minor Volun Invol In Out
57837.84 18798.47 353% 6:00:57 131144636 64664683 197163841 1028101967 4398131 33708875504 3820650328
2015/08/19-20:51:57
1417981924 /mnt/dbbench
1417981924 /mnt
[root@localhost ~]# tail out-nvme-2000-LMDB.txt
2015/08/20-11:48:28 ... thread 2: (200000,102400000) ops and (4661.3,4741.7) ops/second in (42.906458,21595.790945) seconds
2015/08/20-11:48:30 ... thread 1: (200000,102200000) ops and (5360.8,4732.0) ops/second in (37.307684,21597.594386) seconds
2015/08/20-11:48:30 ... thread 19: (200000,102400000) ops and (4617.6,4741.2) ops/second in (43.312105,21597.693198) seconds
writer : 265.575 micros/op 3765 ops/sec; (desired 0 ops/sec)
readwhilewriting : 6.601 micros/op 151494 ops/sec; (102212999 of 102212999 found)
Usr Sys % Wall RSS Major Minor Volun Invol In Out
15515.21 122134.06 636% 6:00:15 248947156 2722992947 110785665 4438783198 5835886 21783693672 681997632
2015/08/20-11:48:48
1304487440 /mnt/dbbench_mdb-1
1304487440 /mnt
hyc commented Aug 20, 2015

LMDB writes ~15% faster than LevelDB
LMDB reads ~5x faster than LevelDB
LMDB used ~80% more CPU than LevelDB - in an in-memory test this would be a penalty, but here it shows LMDB spends less time in I/O wait than LevelDB
LMDB uses ~1/4 as much User mode CPU as LevelDB while getting > 5x more work done - LMDB itself is much simpler code than LevelDB.
LMDB uses ~6.5x more System mode CPU than LevelDB. Ultimately every read and write must use some quantity of kernel time, so with over 5x more work done, LMDB forces the kernel to work that much harder too.
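As a quick cross-check (not part of the original run), the CPU ratios above follow directly from the Usr/Sys/Wall columns of the raw output:

```python
# Sanity-check the CPU ratios quoted above from the raw Usr/Sys/Wall columns.
leveldb_usr, leveldb_sys = 57837.84, 18798.47   # seconds of CPU time
lmdb_usr, lmdb_sys = 15515.21, 122134.06

leveldb_wall = 6 * 3600 + 57    # 6:00:57, in seconds
lmdb_wall = 6 * 3600 + 15       # 6:00:15

leveldb_pct = (leveldb_usr + leveldb_sys) / leveldb_wall   # ~3.54 -> the "353%"
lmdb_pct = (lmdb_usr + lmdb_sys) / lmdb_wall               # ~6.37 -> the "636%"

print(round(lmdb_pct / leveldb_pct, 2))    # total CPU: ~1.8x ("~80% more")
print(round(leveldb_usr / lmdb_usr, 2))    # user CPU: LMDB uses ~1/4 as much
print(round(lmdb_sys / leveldb_sys, 2))    # system CPU: ~6.5x more
```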

The test was set to run for exactly 6 hours. LevelDB took an extra 57 seconds to shut down; LMDB took an extra 15 seconds, most of which is kernel time spent unmapping the process memory.

LevelDB was configured to use 96GB cache, and the process size grew to 125GB. The rest of system memory was occupied by FS cache.
LMDB uses no cache of its own; its process size grew to 237GB, which is the same memory as the FS cache, since its mmap'd pages are the FS cache.

LMDB incurred 42x more Major page faults than LevelDB. Since all of its read accesses are via a mmap, it is expected to trigger page faults.
LevelDB incurred 77% more Minor page faults than LMDB. These are pages that weren't in the process's address space yet, but were already present in FS cache. For LMDB this would only occur during process startup, with the pages being left over in FS cache from the previous process that populated the DB. For LevelDB this may be partly from startup, but primarily shows the overlap between LevelDB's cache manager and the FS cache.
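The read path described here can be sketched in a few lines. This is an illustration of mmap-driven reads, not LMDB code; `demo.db` is a made-up stand-in for a data file:

```python
# Minimal sketch of the mmap read path that makes LMDB's reads fault-driven.
import mmap, os

path = "demo.db"
with open(path, "wb") as f:
    f.write(b"key=value" + b"\x00" * 4087)   # one 4KB "page" of data

with open(path, "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # No read() syscall here: touching m[0:9] makes the kernel fault the
    # page in. Page already in FS cache -> minor fault; page must come
    # from disk -> major fault (the 42x figure above).
    data = m[0:9]
    m.close()

os.remove(path)
print(data)
```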

LMDB had 4.3x more Voluntary context switches than LevelDB. The only thing that can trigger this in LMDB is the write() and fsync() syscalls used to commit transaction data. This reflects the fact that LMDB's writes are in more of a random access pattern than LevelDB's. (If more of the writes were in a sequential pattern, they would be bigger writes but fewer of them. Fewer syscalls would cause fewer Voluntary context switches.)
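The commit pattern behind those voluntary switches can be sketched like this; `commit.log` is a made-up file name, and `ru_nvcsw` is the counter these numbers come from:

```python
# Sketch of a write()+fsync() commit loop: blocking in fsync() yields the
# CPU, which the kernel counts as a voluntary context switch (ru_nvcsw).
import os, resource

fd = os.open("commit.log", os.O_WRONLY | os.O_CREAT, 0o644)
before = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw

for txn in range(10):                  # ten tiny "transactions"
    os.write(fd, b"txn-%d\n" % txn)    # dirty the page
    os.fsync(fd)                       # block until durable

after = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw
os.close(fd)
os.remove("commit.log")
print(after - before)   # tends to grow with the number of fsync() calls
```

Fewer, larger sequential writes would mean fewer fsync() round trips and therefore fewer voluntary switches, which is the trade-off the paragraph above describes.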

LMDB had 32% more Involuntary context switches than LevelDB. This means it was making more complete use of its CPU timeslices than LevelDB.

LevelDB performed 54.7% more filesystem input operations than LMDB, even though it actually only performed 1/5th as many DB read operations. This shows it is over 5x less efficient than LMDB at reads.

LevelDB performed 5.6x more filesystem output operations than LMDB, even though it actually only performed 86% as many DB write operations as LMDB. This shows it is over 5x less efficient than LMDB at writes, despite the fact that most of its writes are sequential.

LevelDB used 9% more disk space than LMDB by the end of the test. We don't monitor how much disk space is consumed temporarily during the test. For LMDB there is no difference but for other log-based DB engines it is usually more, due to the copious volumes of logfiles generated during a run.
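As a final cross-check (again, derived from the raw output rather than the original analysis), the filesystem I/O and disk-space ratios fall straight out of the In/Out counters and the du totals:

```python
# Cross-check the filesystem I/O and disk-space ratios from the raw output.
leveldb_in, leveldb_out = 33708875504, 3820650328
lmdb_in, lmdb_out = 21783693672, 681997632
leveldb_du, lmdb_du = 1417981924, 1304487440   # du totals for each DB dir

in_ratio = leveldb_in / lmdb_in        # ~1.547 -> "54.7% more input"
out_ratio = leveldb_out / lmdb_out     # ~5.6x more output
space_ratio = leveldb_du / lmdb_du     # ~1.09 -> "9% more disk space"

print(round(in_ratio, 3), round(out_ratio, 1), round(space_ratio, 2))
```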
