Created
April 9, 2016 11:49
DB migration test
Some results from working with the blockchain DB on a 5400rpm HDD (WD20EARX), starting with blockchain.raw from 2015-12-19 (874830 blocks).
The blockchain.raw file is on a separate drive, an SSD.
Import using v0.9.4:
2016-Apr-08 16:46:38.540175 End of file reached
2016-Apr-08 16:46:39.038292 Number of blocks imported: 874829
2016-Apr-08 16:46:39.038366 Finished at block: 874829 total blocks: 874830
2016-Apr-08 16:46:39.038984 Closing IO Service.
Command being timed: "./blockchain_import --data-dir /mnt/1/bitmo --database lmdb#nosync --verify off --input-file /home/hyc/Public/blockchain.raw"
User time (seconds): 235.51
System time (seconds): 61.95
Percent of CPU this job got: 6%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:12:21
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6910624
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 91009
Minor (reclaiming a frame) page faults: 1174101
Voluntary context switches: 98735
Involuntary context switches: 218044
Swaps: 0
File system inputs: 12429392
File system outputs: 65239264
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
violino:/home/software/bitmonero/build/release/bin094> ls -l /mnt/1/bitmo/lmdb
total 9349064
-rw-r--r-- 1 hyc hyc 9564078080 Apr 8 16:46 data.mdb
-rw-r--r-- 1 hyc hyc 8192 Apr 8 16:50 lock.mdb
Exporting back to .raw format, again storing the raw file on the SSD:
2016-Apr-08 16:55:21.023009 Using block height of source blockchain: 874829
block 874829/874829
2016-Apr-08 17:21:48.845258 Number of blocks exported: 874830
2016-Apr-08 17:21:48.850003 Largest chunk: 85111 bytes
2016-Apr-08 17:21:48.851219 Blockchain raw data exported OK
Command being timed: "./blockchain_export --data-dir /mnt/1/bitmo --output-file /home/hyc/Public/exp1.raw"
User time (seconds): 64.01
System time (seconds): 22.67
Percent of CPU this job got: 5%
Elapsed (wall clock) time (h:mm:ss or m:ss): 26:28.20
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 5821264
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 175817
Minor (reclaiming a frame) page faults: 368449
Voluntary context switches: 185151
Involuntary context switches: 29123
Swaps: 0
File system inputs: 25230016
File system outputs: 4813120
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
violino:/home/software/bitmonero/build/release/bin094> ls -l ~/Public/*.raw
-rw-r--r-- 1 hyc hyc 439221 Mar 28 00:27 /home/hyc/Public/block1.raw
-rw-r--r-- 1 hyc hyc 2463754366 Dec 19 07:04 /home/hyc/Public/blockchain.raw
-rw-r--r-- 1 hyc hyc 2463754366 Apr 8 17:21 /home/hyc/Public/exp1.raw
All of this is quite slow because the old format stores the tx indices in non-sequential (hash) order,
but the import and export procedures want access in sequential order. So there are far too many
random accesses going on.
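To make the access-pattern problem concrete, here is a small sketch (not code from the project). It assumes a toy table of N hypothetical txs whose keys are SHA-256 digests, used purely as a stand-in for the real tx hash, and measures how far apart consecutive reads land when the records are visited in chain order.

```python
import hashlib

def mean_jump(positions):
    """Average distance between consecutively visited storage positions."""
    jumps = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return sum(jumps) / len(jumps)

N = 10_000  # hypothetical tx count, far smaller than the real chain

# Hypothetical tx hashes, one per tx, listed in chain (creation) order.
# SHA-256 is only a stand-in for the real hash function.
hashes = [hashlib.sha256(str(i).encode()).digest() for i in range(N)]

# Hash-keyed table: a record's position is its rank among the sorted hashes.
rank = {h: pos for pos, h in enumerate(sorted(hashes))}
hash_layout = [rank[h] for h in hashes]  # positions visited in chain order

# Sequentially keyed table: the i-th tx simply sits at position i.
seq_layout = list(range(N))

print("sequential keys:", mean_jump(seq_layout))
print("hash keys:      ", mean_jump(hash_layout))
```

With sequential keys every step moves to the adjacent record; with hash keys each step lands at an essentially arbitrary position in the table, which on a spinning disk means a seek for nearly every record.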
Migrating the DB in-place using the current patch:
2016-Apr-08 17:26:55.977352 LMDB Mapsize increased. Old: 10061MiB, New: 11085MiB
2016-Apr-08 17:26:55.977689 Migrating blockchain from DB version 0 to 1 - this may take a while:
2016-Apr-08 17:26:55.977736 updating blocks, hf_versions, outputs, txs, and spent_keys tables...
2016-Apr-08 17:26:55.977776 Total number of blocks: 874830
2016-Apr-08 17:26:55.977813 block migration will update block_heights, block_info, and hf_versions...
2016-Apr-08 17:26:55.977841 migrating block_heights:
2016-Apr-08 17:27:05.433310 migrating block info:
2016-Apr-08 17:28:11.281047 migrating hf_versions:
2016-Apr-08 17:28:32.798102 Total number of outputs: 15664622
2016-Apr-08 17:28:32.798169 outputs migration will update output_amounts and output_txs...
2016-Apr-08 17:28:32.798210 migrating output_amounts:
2016-Apr-08 17:54:35.525729 migrating output_txs:
2016-Apr-08 18:09:21.103997 Total number of txs: 1393439
2016-Apr-08 18:09:21.104046 txs migration will update tx_indices, tx_outputs, and txs...
2016-Apr-08 18:09:21.104070 migrating tx_indices:
2016-Apr-08 18:15:44.070378 migrating txs and tx_outputs:
2016-Apr-08 19:41:59.214975 migrating spent_keys:
2016-Apr-08 20:19:38.443371 reorganizing from 864750
2016-Apr-08 20:19:44.402316 reorganization done
Migrating the block indices takes only about a minute and a half (17:26:55 to 17:28:32), because they're
in sequential order in both old and new formats. The only change is to packing efficiency, really.
The output indices are in sequential order too; migrating them takes about 41 minutes simply
because there's such a large volume (15.6 million outputs) to read and write.
The tx indices take the most time (about 93 minutes) because they were stored in hash order before. The
tx_indices table itself stays in hash order, so we can migrate that as a sequential operation. But the txs
and tx_outputs tables go from hash to sequential order, which again involves a lot of random
accesses. Worse, while txs and tx_outputs are both keyed by the hash, they didn't use
the same key comparator function, so they're in a different order from each other. So we read
the txs table sequentially in hash order, but we're still doing random reads from the tx_outputs
table. And both generate the new tables in random order.
The spent_keys table is all sequential; it just takes time (about 38 minutes) because there are 12.5 million of them.
In contrast, simply running blockchain_import from the performance branch took only about 4-1/2 minutes:
2016-Apr-08 20:32:23.495492 End of file reached
2016-Apr-08 20:32:23.960825 Number of blocks imported: 874829
2016-Apr-08 20:32:23.960894 Finished at block: 874829 total blocks: 874830
2016-Apr-08 20:32:23.962182 Closing IO Service.
Command being timed: "./blockchain_import --data-dir /mnt/1/bitmo --database lmdb#nosync --verify off --input-file /home/hyc/Public/blockchain.raw"
User time (seconds): 196.46
System time (seconds): 34.44
Percent of CPU this job got: 86%
Elapsed (wall clock) time (h:mm:ss or m:ss): 4:27.51
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6803776
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 482
Minor (reclaiming a frame) page faults: 700730
Voluntary context switches: 3633
Involuntary context switches: 202247
Swaps: 0
File system inputs: 1882160
File system outputs: 15390360
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
violino:/home/software/bitmonero/build/release/bin> ls -l /mnt/1/bitmo/lmdb
total 7073532
-rw-r--r-- 1 hyc hyc 7236210688 Apr 8 20:32 data.mdb
-rw-r--r-- 1 hyc hyc 8192 Apr 8 20:32 lock.mdb
Of course, part of this speed comes from reading the .raw file from a different drive than the one the DB is written to.
(But the v0.9.4 import was using the identical setup.)
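For a quick side-by-side of the two imports, the following sketch converts the "Elapsed (wall clock)" fields reported above into seconds and compares them, along with the two data.mdb sizes from the ls -l listings. The helper name `elapsed_seconds` is made up for this note.

```python
def elapsed_seconds(text):
    """Convert an 'h:mm:ss' or 'm:ss.ss' elapsed-time field to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

old_import = elapsed_seconds("1:12:21")  # v0.9.4 import
new_import = elapsed_seconds("4:27.51")  # performance-branch import

print(f"old import: {old_import:.0f} s, new import: {new_import:.2f} s")
print(f"speedup: {old_import / new_import:.1f}x")

# data.mdb sizes in bytes, taken from the two ls -l listings
old_size, new_size = 9564078080, 7236210688
print(f"new DB is {new_size / old_size:.1%} of the old DB size")
```

By these numbers the performance-branch import is roughly 16 times faster on the same hardware and produces a DB about a quarter smaller.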
This tells me that the current migrate function, which just attempts to read all the old indices and rewrite them
in sequential format, is the wrong approach. Instead it should just erase the old tables and then do the
equivalent of blockchain_import, regenerating all the indices from the original block and txs data.
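As a rough illustration of that idea, here is a toy in-memory model. It is not the actual migration code and does not use the LMDB API. Blocks are assumed to be lists of raw tx blobs in height order, SHA-256 stands in for the real tx hash, and the names `rebuild_indices`, `tx_indices` and `txs` are simplifications chosen for this sketch.

```python
import hashlib

def rebuild_indices(blocks):
    """Regenerate tx index tables in one sequential pass over the blocks.

    `blocks` is a list of blocks in height order; each block is a list of
    raw tx blobs (bytes). This is a simplified stand-in for the real tables.
    """
    tx_indices = {}  # tx hash -> sequential tx id
    txs = []         # position = tx id, value = (block height, tx blob)
    for height, block in enumerate(blocks):
        for blob in block:
            tx_hash = hashlib.sha256(blob).digest()  # stand-in hash function
            tx_indices[tx_hash] = len(txs)  # ids are handed out in chain order
            txs.append((height, blob))      # so this table is append-only
    return tx_indices, txs

# Hypothetical three-block chain, for illustration only.
blocks = [
    [b"coinbase-0"],
    [b"coinbase-1", b"tx-a"],
    [b"coinbase-2", b"tx-b", b"tx-c"],
]
tx_indices, txs = rebuild_indices(blocks)
print(len(txs), "txs indexed")
```

The point of the sketch is that the source data is read once, in order, and the sequentially keyed table is only ever appended to; nothing depends on reading the old hash-ordered tables.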