text xapian_weibo index performance
index_text usage: replace
[2013-03-21 22:44:49] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-21 22:44:57] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-21 22:45:05] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-21 22:45:11] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-21 22:45:17] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-21 22:45:25] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-21 22:45:34] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-21 22:45:43] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-21 22:45:51] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-21 22:45:59] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-21 22:46:07] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-21 22:46:15] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-21 22:46:24] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-21 22:46:30] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-21 22:46:35] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-21 22:46:43] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 408.67 sec
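The progress lines above carry enough information to estimate indexing throughput directly. The following is a minimal sketch, assuming the log format shown in this file; the helper names `parse_progress` and `docs_per_second` are made up for illustration and are not part of xapian_weibo.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2013-03-21 22:44:49] folder[_hehe_2011-08-21] num indexed: 350000
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\].*num indexed: (\d+)"
)


def parse_progress(lines):
    """Return (timestamp, count) pairs for every progress line found."""
    points = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            points.append((stamp, int(match.group(2))))
    return points


def docs_per_second(points):
    """Average rate between the first and the last progress point."""
    (t0, n0), (t1, n1) = points[0], points[-1]
    return (n1 - n0) / (t1 - t0).total_seconds()


# First and last progress lines of the run above.
sample = [
    "[2013-03-21 22:44:49] folder[_hehe_2011-08-21] num indexed: 350000",
    "[2013-03-21 22:46:43] folder[_hehe_2011-05-13] num indexed: 500000",
]
rate = docs_per_second(parse_progress(sample))
print(round(rate, 1))  # 1315.8
```

This only covers the stretch of the log shown here (350000 to 500000), so the rate can differ from the one implied by the total 'index_weibos' time.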
index_text usage: add
[2013-03-21 23:03:09] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-21 23:03:17] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-21 23:03:25] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-21 23:03:30] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-21 23:03:36] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-21 23:03:44] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-21 23:03:52] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-21 23:04:01] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-21 23:04:08] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-21 23:04:16] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-21 23:04:24] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-21 23:04:32] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-21 23:04:41] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-21 23:04:46] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-21 23:04:53] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-21 23:05:00] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 400.86 sec
So in this 500,000-document scenario, replace and add perform about the same.
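To put a number on "about the same", the two totals can be compared as a relative change. A small sketch using only the two timings reported above; `relative_change` is a helper name chosen here for illustration.

```python
# Total 'index_weibos' times reported above, in seconds.
timings = {"replace": 408.67, "add": 400.86}


def relative_change(baseline, other):
    """Fractional change of `other` relative to `baseline`."""
    return (other - baseline) / baseline


change = relative_change(timings["replace"], timings["add"])
print(f"{change:+.1%}")  # -1.9%
```

A difference of roughly 2% on a single run each is small enough that it may be noise rather than a real gap.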
add_term
[2013-03-21 23:13:29] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-21 23:13:37] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-21 23:13:45] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-21 23:13:51] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-21 23:13:58] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-21 23:14:05] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-21 23:14:14] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-21 23:14:23] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-21 23:14:31] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-21 23:14:40] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-21 23:14:48] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-21 23:14:57] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-21 23:15:06] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-21 23:15:12] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-21 23:15:17] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-21 23:15:25] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 424.19 sec
remove single_word_whitelist filtering
[2013-03-21 23:26:31] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-21 23:26:40] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-21 23:26:48] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-21 23:26:55] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-21 23:27:01] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-21 23:27:10] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-21 23:27:22] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-21 23:27:38] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-21 23:27:47] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-21 23:27:55] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-21 23:28:03] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-21 23:28:13] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-21 23:28:22] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-21 23:28:28] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-21 23:28:34] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-21 23:28:41] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 446.62 sec
This is actually slower, mainly because without the filter too many junk single-character terms get indexed.
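For context, a whitelist filter of this kind can be very cheap compared with indexing the terms it drops. The sketch below is an assumption about what such a filter might look like, not the actual xapian_weibo code: the whitelist contents, the sample tokens, and the function name `filter_tokens` are all hypothetical.

```python
# Hypothetical whitelist; the real contents are not given in these notes.
single_word_whitelist = {"好", "赞"}


def filter_tokens(tokens, whitelist=single_word_whitelist):
    """Keep multi-character tokens, plus single characters on the whitelist."""
    return [t for t in tokens if len(t) > 1 or t in whitelist]


# Hypothetical tokenizer output for one post.
tokens = ["今天", "的", "天气", "好", "啊"]
print(filter_tokens(tokens))  # ['今天', '天气', '好']
```

Each dropped single character is one fewer term to add to the document, which is consistent with the filtered run being faster than the unfiltered one.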
json.dumps version
[2013-03-22 21:41:49] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-22 21:41:59] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-22 21:42:08] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-22 21:42:13] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-22 21:42:19] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-22 21:42:27] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-22 21:42:35] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-22 21:42:43] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-22 21:42:51] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-22 21:42:58] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-22 21:43:06] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-22 21:43:16] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-22 21:43:26] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-22 21:43:31] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-22 21:43:37] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-22 21:43:44] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 403.72 sec
dumps_exclude: drop redundant fields
[2013-03-22 22:15:47] folder[_hehe_2011-08-21] num indexed: 350000
[2013-03-22 22:15:55] folder[_hehe_2010-12-14] num indexed: 360000
[2013-03-22 22:16:03] folder[_hehe_2010-12-14] num indexed: 370000
[2013-03-22 22:16:10] folder[_hehe_2010-04-08] num indexed: 380000
[2013-03-22 22:16:16] folder[_hehe_2010-10-25] num indexed: 390000
[2013-03-22 22:16:24] folder[_hehe_2011-07-02] num indexed: 400000
[2013-03-22 22:16:32] folder[_hehe_2011-10-10] num indexed: 410000
[2013-03-22 22:16:41] folder[_hehe_2011-02-02] num indexed: 420000
[2013-03-22 22:16:49] folder[_hehe_2011-02-02] num indexed: 430000
[2013-03-22 22:16:56] folder[_hehe_2010-10-25] num indexed: 440000
[2013-03-22 22:17:04] folder[_hehe_2010-07-17] num indexed: 450000
[2013-03-22 22:17:12] folder[_hehe_2010-02-17] num indexed: 460000
[2013-03-22 22:17:19] folder[_hehe_2009-09-20] num indexed: 470000
[2013-03-22 22:17:25] folder[_hehe_2011-07-02] num indexed: 480000
[2013-03-22 22:17:29] folder[_hehe_2010-07-17] num indexed: 490000
[2013-03-22 22:17:36] folder[_hehe_2011-05-13] num indexed: 500000
'index_weibos' 375.16 sec
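The idea of dropping fields before serializing can be sketched as follows. This is only an illustration under assumptions: the field names, the sample record, and the helper `dumps_without` are hypothetical, and the real dumps_exclude setting in xapian_weibo may be shaped differently.

```python
import json

# Hypothetical names of fields that do not need to be stored.
DUMPS_EXCLUDE = {"text", "terms"}


def dumps_without(item, exclude=DUMPS_EXCLUDE):
    """Serialize `item` to JSON, leaving out the excluded fields."""
    slim = {key: value for key, value in item.items() if key not in exclude}
    return json.dumps(slim, ensure_ascii=False, sort_keys=True)


# Hypothetical record.
weibo = {"_id": 1, "user": 42, "text": "hello", "terms": ["hello"]}
payload = dumps_without(weibo)
print(payload)  # {"_id": 1, "user": 42}
```

A smaller payload means less to serialize and less to write per document, which would fit the drop from 403.72 sec to 375.16 sec reported above.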
Running two scripts at the same time
'index_weibos' 423.49 sec
'index_weibos' 432.44 sec
This shows that a single script running alone does not hit the I/O bottleneck.
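The conclusion can be checked with a rough aggregate-throughput estimate. The sketch below rests on two assumptions that these notes do not state: that each of the two scripts indexed 500,000 documents, and that the single-script baseline is the 375.16 sec run.

```python
DOCS_PER_RUN = 500_000        # assumption: each script indexes 500,000 documents
single_run_sec = 375.16       # assumption: baseline is the dumps_exclude run
parallel_runs_sec = [423.49, 432.44]

# One script alone.
single_rate = DOCS_PER_RUN / single_run_sec

# Two scripts together: all documents are done when the slower script finishes.
parallel_rate = DOCS_PER_RUN * len(parallel_runs_sec) / max(parallel_runs_sec)

speedup = parallel_rate / single_rate
print(f"single:   {single_rate:.0f} docs/sec")
print(f"parallel: {parallel_rate:.0f} docs/sec")
print(f"speedup:  {speedup:.2f}x")
```

Under these assumptions the combined rate is well above that of one script, so the disk still had headroom during a single run; the speedup falling short of 2x suggests some contention once both scripts are active.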