Created November 7, 2008 22:57
Using better-benchmark.
#!/usr/bin/env ruby

require 'rubygems'
gem 'hpricot', '>=0.6.170'
require 'open-uri'
require 'hpricot'
require 'nokogiri'
require 'better-benchmark'

uri = URI.parse( "http://railstips.org/assets/2008/8/9/timeline.xml" )
content = uri.read

hdoc = Hpricot.XML(content)
ndoc = Nokogiri.Hpricot(content)
#ndoc = Nokogiri.XML(content)
hdoc2 = Hpricot.scan(content)

puts "\nnokogiri vs. hpricot: Parsing XML"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 150,
  :verbose => true
) {
  Nokogiri.Hpricot content
}.with {
  Hpricot.XML content
}
Benchmark.report_on result

puts "\nnokogiri vs. hpricot scan: Parsing XML"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 600,
  :verbose => true
) {
  Nokogiri.Hpricot content
}.with {
  Hpricot.scan content
}
Benchmark.report_on result

puts "\nnokogiri vs. hpricot: Searching with XPath"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.xpath('//status/text').first.inner_text
  url = ndoc.xpath('//user/name').first.inner_text
}.with {
  info = hdoc.search('//status/text').first.inner_text
  url = hdoc.search('//user/name').first.inner_text
}
Benchmark.report_on result

puts "\nnokogiri vs. hpricot (scanned): Searching with XPath"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.xpath('//status/text').first.inner_text
  url = ndoc.xpath('//user/name').first.inner_text
}.with {
  info = hdoc2.search('//status/text').first.inner_text
  url = hdoc2.search('//user/name').first.inner_text
}
Benchmark.report_on result

puts "\nnokogiri vs. hpricot: Searching with CSS"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.search('status text').first.inner_text
  url = ndoc.search('user name').first.inner_text
}.with {
  info = hdoc.search('status text').first.inner_text
  url = hdoc.search('user name').first.inner_text
}
Benchmark.report_on result

puts "\nnokogiri vs. hpricot (scanned): Searching with CSS"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.search('status text').first.inner_text
  url = ndoc.search('user name').first.inner_text
}.with {
  info = hdoc2.search('status text').first.inner_text
  url = hdoc2.search('user name').first.inner_text
}
Benchmark.report_on result
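
For readers without better-benchmark installed, the shape of each comparison in the script (a number of outer iterations, each timing a batch of inner iterations, for two competing blocks) can be roughly sketched with Ruby's standard Benchmark module. This is only an illustration under assumptions: `time_sets` and `mean` are hypothetical helpers, not part of better-benchmark, the two workloads are stand-ins for the Nokogiri and Hpricot calls, and the sketch skips the statistical significance test that better-benchmark reports.

```ruby
require 'benchmark'

# Hypothetical helper (not better-benchmark's API): time a block in sets.
# Returns one wall-clock total, in seconds, per outer iteration.
def time_sets(iterations, inner_iterations)
  Array.new(iterations) do
    Benchmark.realtime do
      inner_iterations.times { yield }
    end
  end
end

# Arithmetic mean of a list of timings.
def mean(values)
  values.sum / values.size.to_f
end

# Two stand-in workloads; in the script these would be the
# Nokogiri and Hpricot calls being compared.
set1 = time_sets(10, 150) { (1..100).map { |i| i * 2 } }
set2 = time_sets(10, 150) { (1..100).map { |i| i.to_s } }

puts format('Set 1 mean: %.3f s', mean(set1))
puts format('Set 2 mean: %.3f s', mean(set2))
```

Comparing the two means alone says nothing about significance, which is presumably why the script relies on better-benchmark's report rather than raw timings.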
I got a segfault when I tried to run this at first, and am now getting lots of
"called on terminated object" errors at random times (!). Wondering what to do
at this point...

hpricot 0.6.170
nokogiri 1.0.2
ruby 1.8.7 (2008-06-20 patchlevel 22) [i686-linux]

--------------------

UPDATE:

Okay, after upgrading to nokogiri 1.0.3 and then removing old hpricot installs
(0.6 and 0.6.164), I was able to at least avoid segfaults and other errors.

But gee, the results under Ruby 1.8.7 aren't very flattering for hpricot.
nokogiri vs. hpricot: Parsing XML
..........
Set 1 mean: 1.349 s
Set 1 std dev: 0.086
Set 2 mean: 6.680 s
Set 2 std dev: 0.136
p.value: 1.0825088224469e-05
W: 0.0
The difference (+395.1%) IS statistically significant.

nokogiri vs. hpricot scan: Parsing XML
..........
Set 1 mean: 5.426 s
Set 1 std dev: 0.020
Set 2 mean: 3.345 s
Set 2 std dev: 0.050
p.value: 1.0825088224469e-05
W: 100.0
The difference (-38.3%) IS statistically significant.

nokogiri vs. hpricot: Searching with XPath
..........
Set 1 mean: 0.079 s
Set 1 std dev: 0.014
Set 2 mean: 8.383 s
Set 2 std dev: 1.185
p.value: 1.0825088224469e-05
W: 0.0
The difference (+10473.4%) IS statistically significant.

nokogiri vs. hpricot (scanned): Searching with XPath
..........
Set 1 mean: 0.066 s
Set 1 std dev: 0.021
Set 2 mean: 2.908 s
Set 2 std dev: 0.169
p.value: 1.0825088224469e-05
W: 0.0
The difference (+4302.5%) IS statistically significant.

nokogiri vs. hpricot: Searching with CSS
..........
Set 1 mean: 0.386 s
Set 1 std dev: 0.053
Set 2 mean: 9.140 s
Set 2 std dev: 0.153
p.value: 1.0825088224469e-05
W: 0.0
The difference (+2265.3%) IS statistically significant.

nokogiri vs. hpricot (scanned): Searching with CSS
..........
Set 1 mean: 0.373 s
Set 1 std dev: 0.074
Set 2 mean: 3.060 s
Set 2 std dev: 0.296
p.value: 1.0825088224469e-05
W: 0.0
The difference (+720.4%) IS statistically significant.
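
The percentages in the output appear to be the relative difference of the Set 2 mean against the Set 1 mean (Set 1 is the Nokogiri block in every comparison). The sketch below recomputes two of them from the printed means. It rests on assumptions: `percent_difference` is a hypothetical helper, the formula is inferred from the numbers rather than taken from better-benchmark's source, and because the printed means are rounded to three decimals the recomputed values only approximately match the reported ones.

```ruby
# Relative difference of set 2's mean against set 1's mean, in percent.
# Assumption: this formula is inferred from the output, and the tool
# works from unrounded timings, so results here match only approximately.
def percent_difference(mean1, mean2)
  (mean2 - mean1) / mean1 * 100.0
end

# Means copied from the "Parsing XML" comparisons in the output.
parsing = percent_difference(1.349, 6.680) # nokogiri vs. hpricot
scan    = percent_difference(5.426, 3.345) # nokogiri vs. hpricot scan

puts format('%+.1f%%', parsing)
puts format('%+.1f%%', scan)
```

A positive value means the second block (Hpricot) took longer than the first (Nokogiri); the one negative value in the output is the `Hpricot.scan` parsing comparison.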
For the fastest searches, use Nokogiri's XPath selectors. Nokogiri's CSS-based search is also faster than Hpricot, but not as fast as its XPath search.
Also note that this benchmark only covers XML (not HTML).
This version takes the original benchmark and runs it with better-benchmark instead.