@iande
Forked from ezkl/benchpress.md
Created February 7, 2011 14:22

Nokogiri Parser Comparisons

Author: Ezekiel Templin
Date: February 06, 2011
Summary: Comparing Nokogiri's parsers

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i7 2.66 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH" is up to 16% faster over 1,000 repetitions

Nokogiri XPATH                0.13944482803344727 secs    Fastest
Nokogiri Search (w/ XPATH)    0.14812183380126953 secs    5% Slower
Nokogiri CSS                  0.16048192977905273 secs    13% Slower
Nokogiri Search (w/ CSS)      0.16649389266967773 secs    16% Slower
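
The "% Slower" figures in the table above are consistent with taking the gap to the fastest entry as a fraction of the slower timing and rounding down. That reading is inferred from the numbers, not taken from bench_press's source, and `percent_slower` is a hypothetical helper name. Under this reading, "16% Slower" means the fastest entry needs 16% less time than that entry, not that the entry takes 16% more time than the fastest.

```ruby
# Hypothetical helper (not part of bench_press): the gap to the fastest
# timing, as a whole-number percentage of the slower timing, rounded down.
def percent_slower(fastest, timing)
  (((timing - fastest) / timing) * 100).floor
end

# Timings copied from the table above.
timings = {
  'Nokogiri XPATH'             => 0.13944482803344727,
  'Nokogiri Search (w/ XPATH)' => 0.14812183380126953,
  'Nokogiri CSS'               => 0.16048192977905273,
  'Nokogiri Search (w/ CSS)'   => 0.16649389266967773
}

fastest = timings.values.min

# Print each entry with either "Fastest" or its computed percentage.
timings.each do |label, secs|
  note = secs == fastest ? 'Fastest' : "#{percent_slower(fastest, secs)}% Slower"
  puts format('%-28s %.5f secs    %s', label, secs, note)
end
```

The same calculation matches the spot-checked rows in the later tables, for example 17% for "Nokogiri Search (w/ XPATH) - /html/head/title" in the first 10,000-repetition run.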

Nokogiri Parser Comparisons

Author: Ezekiel Templin - Ian D. Eccles
Date: February 07, 2011
Summary: Comparing Nokogiri's parsers and RegEx for good measure.

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i5 2.53 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH - /html/head/title" is up to 64% faster over 10,000 repetitions

Nokogiri XPATH - /html/head/title                0.5199141502380371 secs    Fastest
Nokogiri Search (w/ XPATH) - /html/head/title    0.6333937644958496 secs    17% Slower
Nokogiri XPATH - //head/title                    1.1117651462554932 secs    53% Slower
Nokogiri XPATH - //title                         1.129045009613037  secs    53% Slower
Nokogiri Search (w/ XPATH) - //title             1.2313289642333984 secs    57% Slower
Nokogiri Search (w/ XPATH) - //head/title        1.2370507717132568 secs    57% Slower
Nokogiri Search (w/ CSS) - title                 1.3102619647979736 secs    60% Slower
Nokogiri CSS - title                             1.3113257884979248 secs    60% Slower
Nokogiri CSS - head title                        1.318915843963623  secs    60% Slower
Nokogiri CSS - title                             1.3250529766082764 secs    60% Slower
Nokogiri Search (w/ CSS) - head>title            1.3271639347076416 secs    60% Slower
Nokogiri Search (w/ CSS) - title                 1.3291771411895752 secs    60% Slower
Nokogiri CSS - head>title                        1.3403031826019287 secs    61% Slower
Nokogiri Search (w/ CSS) - head title            1.356755018234253  secs    61% Slower
Nokogiri Search (w/ CSS) - html>head>title       1.380044937133789  secs    62% Slower
Nokogiri CSS - html>head>title                   1.415450096130371  secs    63% Slower
Nokogiri CSS - html head title                   1.455049991607666  secs    64% Slower
Nokogiri Search (w/ CSS) - html head title       1.4604699611663818 secs    64% Slower

Nokogiri Parser Comparisons

Author: Ezekiel Templin - Ian D. Eccles
Date: February 07, 2011
Summary: Comparing Nokogiri's parsers and RegEx for good measure.

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i5 2.53 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH - /html/head/title" is up to 65% faster over 10,000 repetitions

Nokogiri XPATH - /html/head/title                0.549346923828125  secs    Fastest
Nokogiri Search (w/ XPATH) - /html/head/title    0.6463708877563477 secs    15% Slower
Nokogiri XPATH - //title                         1.1517019271850586 secs    52% Slower
Nokogiri XPATH - //head/title                    1.1620349884033203 secs    52% Slower
Nokogiri Search (w/ XPATH) - //head/title        1.291672945022583  secs    57% Slower
Nokogiri Search (w/ XPATH) - //title             1.3146800994873047 secs    58% Slower
Nokogiri CSS - title                             1.3669140338897705 secs    59% Slower
Nokogiri Search (w/ CSS) - head title            1.3699350357055664 secs    59% Slower
Nokogiri CSS - head title                        1.3751940727233887 secs    60% Slower
Nokogiri Search (w/ CSS) - title                 1.3816111087799072 secs    60% Slower
Nokogiri CSS - title                             1.3922057151794434 secs    60% Slower
Nokogiri Search (w/ CSS) - title                 1.3927769660949707 secs    60% Slower
Nokogiri CSS - head>title                        1.430954933166504  secs    61% Slower
Nokogiri Search (w/ CSS) - head>title            1.438581943511963  secs    61% Slower
Nokogiri Search (w/ CSS) - html>head>title       1.4586009979248047 secs    62% Slower
Nokogiri CSS - html>head>title                   1.4640250205993652 secs    62% Slower
Nokogiri CSS - html head title                   1.5587737560272217 secs    64% Slower
Nokogiri Search (w/ CSS) - html head title       1.5759928226470947 secs    65% Slower

Nokogiri Parser Comparisons

Author: Ezekiel Templin
Date: February 06, 2011
Summary: Comparing Nokogiri's parsers

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i7 2.66 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH - Specific" is up to 74% faster over 1,000 repetitions

Nokogiri XPATH - Specific                     0.04272198677062988  secs    Fastest
Nokogiri Search (w/ XPATH) - Specific         0.051782846450805664 secs    17% Slower
Nokogiri XPATH - Semi-Specific                0.1356217861175537   secs    68% Slower
Nokogiri XPATH - Nonspecific                  0.14063310623168945  secs    69% Slower
Nokogiri CSS - Nonspecific                    0.1483469009399414   secs    71% Slower
Nokogiri Search (w/ CSS) - Nonspecific        0.15012288093566895  secs    71% Slower
Nokogiri Search (w/ CSS) - Semi-Specific      0.15072989463806152  secs    71% Slower
Nokogiri CSS - Semi-Specific                  0.15105509757995605  secs    71% Slower
Nokogiri Search (w/ XPATH) - Semi-Specific    0.15113496780395508  secs    71% Slower
Nokogiri Search (w/ XPATH) - Nonspecific      0.1534569263458252   secs    72% Slower
Nokogiri Search (w/ CSS) - Specific           0.16422104835510254  secs    73% Slower
Nokogiri CSS - Specific                       0.1705338954925537   secs    74% Slower

Nokogiri Parser Comparisons

Author: Ezekiel Templin - Ian D. Eccles
Date: February 07, 2011
Summary: Comparing Nokogiri's parsers and RegEx for good measure.

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i5 2.53 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH - Specific" is up to 67% faster over 1,000 repetitions

Nokogiri XPATH - Specific                     0.05413079261779785 secs    Fastest
Nokogiri Search (w/ XPATH) - Specific         0.06922698020935059 secs    21% Slower
Nokogiri XPATH - Semi-Specific                0.11409807205200195 secs    52% Slower
Nokogiri XPATH - Nonspecific                  0.11448025703430176 secs    52% Slower
Nokogiri Search (w/ XPATH) - Semi-Specific    0.12456512451171875 secs    56% Slower
Nokogiri Search (w/ CSS) - Semi-Specific      0.1319267749786377  secs    58% Slower
Nokogiri Search (w/ CSS) - Nonspecific        0.13361787796020508 secs    59% Slower
Nokogiri CSS - More Specific                  0.1415722370147705  secs    61% Slower *
Nokogiri CSS - Nonspecific                    0.141798734664917   secs    61% Slower
Nokogiri CSS - Semi-Specific                  0.1418309211730957  secs    61% Slower
Nokogiri Search (w/ CSS) - More Specific      0.1437239646911621  secs    62% Slower *
Nokogiri Search (w/ XPATH) - Nonspecific      0.14478492736816406 secs    62% Slower
Nokogiri Search (w/ CSS) - Specific           0.15114283561706543 secs    64% Slower
Nokogiri CSS - Specific                       0.1663680076599121  secs    67% Slower

Nokogiri Parser Comparisons

Author: Ezekiel Templin - Ian D. Eccles
Date: February 07, 2011
Summary: Comparing Nokogiri's parsers and RegEx for good measure.

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i5 2.53 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH - /html/head/title" is up to 60% faster over 1,000 repetitions

Nokogiri XPATH - /html/head/title                0.05964303016662598 secs    Fastest
Nokogiri Search (w/ XPATH) - /html/head/title    0.06176280975341797 secs    3% Slower
Nokogiri XPATH - //title                         0.11733198165893555 secs    49% Slower
Nokogiri XPATH - //head/title                    0.11788702011108398 secs    49% Slower
Nokogiri Search (w/ XPATH) - //head/title        0.12118005752563477 secs    50% Slower
Nokogiri Search (w/ XPATH) - //title             0.1237180233001709  secs    51% Slower
Nokogiri CSS - title                             0.12938594818115234 secs    53% Slower
Nokogiri Search (w/ CSS) - title                 0.1294267177581787  secs    53% Slower
Nokogiri Search (w/ CSS) - head title            0.12976598739624023 secs    54% Slower
Nokogiri Search (w/ CSS) - head>title            0.1303880214691162  secs    54% Slower
Nokogiri CSS - head>title                        0.13044309616088867 secs    54% Slower
Nokogiri CSS - head title                        0.13219928741455078 secs    54% Slower
Nokogiri CSS - title                             0.13598322868347168 secs    56% Slower
Nokogiri Search (w/ CSS) - title                 0.13603901863098145 secs    56% Slower
Nokogiri Search (w/ CSS) - html>head>title       0.1388840675354004  secs    57% Slower
Nokogiri CSS - html head title                   0.14284586906433105 secs    58% Slower
Nokogiri Search (w/ CSS) - html head title       0.14297819137573242 secs    58% Slower
Nokogiri CSS - html>head>title                   0.15062904357910156 secs    60% Slower
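
require 'nokogiri'
require 'bench_press'

extend BenchPress

name 'Nokogiri Parser Comparisons'
author 'Ezekiel Templin'
date '2011-02-06'
summary "Comparing Nokogiri's parsers"

# Parse the document once, up front, so each measurement times only the query.
# File.read avoids leaving an open file handle behind.
@doc = Nokogiri::HTML.parse(File.read('test.html'))

# Explicit CSS selector.
measure "Nokogiri CSS" do
  @doc.css('title')
end

# Explicit XPath expression.
measure "Nokogiri XPATH" do
  @doc.xpath('//title')
end

# Generic search given an XPath expression.
measure "Nokogiri Search (w/ XPATH)" do
  @doc.search('//title')
end

# Generic search given a CSS selector.
measure "Nokogiri Search (w/ CSS)" do
  @doc.search('title')
end
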

require 'nokogiri'
require 'bench_press'

extend BenchPress

name 'Nokogiri Parser Comparisons'
author 'Ezekiel Templin'
date '2011-02-06'
summary "Comparing Nokogiri's parsers and RegEx for good measure."

# Parse the document once; the block form closes the file handle afterwards.
@doc = File.open('test.html', 'r') { |file| Nokogiri::HTML.parse(file) }

# XPath expressions, from least to most specific.
@xpath_non  = "//title"
@xpath_semi = "//head/title"
@xpath_spec = "/html/head/title"

# CSS selectors, from least to most specific.
@css_non       = "title"
@css_semi      = "head title"
@css_spec      = "html head title"
@css_spec_more = "html > head > title"

# XPath
measure "Nokogiri XPATH - Nonspecific" do
  @doc.xpath(@xpath_non)
end

measure "Nokogiri Search (w/ XPATH) - Nonspecific" do
  @doc.search(@xpath_non)
end

measure "Nokogiri XPATH - Semi-Specific" do
  @doc.xpath(@xpath_semi)
end

measure "Nokogiri Search (w/ XPATH) - Semi-Specific" do
  @doc.search(@xpath_semi)
end

measure "Nokogiri XPATH - Specific" do
  @doc.xpath(@xpath_spec)
end

measure "Nokogiri Search (w/ XPATH) - Specific" do
  @doc.search(@xpath_spec)
end

# CSS
measure "Nokogiri CSS - Nonspecific" do
  @doc.css(@css_non)
end

measure "Nokogiri Search (w/ CSS) - Nonspecific" do
  @doc.search(@css_non)
end

measure "Nokogiri CSS - Semi-Specific" do
  @doc.css(@css_semi)
end

measure "Nokogiri Search (w/ CSS) - Semi-Specific" do
  @doc.search(@css_semi)
end

measure "Nokogiri CSS - Specific" do
  @doc.css(@css_spec)
end

measure "Nokogiri CSS - More Specific" do
  @doc.css(@css_spec_more)
end

measure "Nokogiri Search (w/ CSS) - Specific" do
  @doc.search(@css_spec)
end

measure "Nokogiri Search (w/ CSS) - More Specific" do
  @doc.search(@css_spec_more)
end


iande commented Feb 7, 2011

What's interesting to me is how different the results can be between runs. Compare https://gist.github.com/814429#file_benchpress_more_specific_02.md and https://gist.github.com/814429#file_benchpress_more_specific.md. Same tests are being run, but suddenly the "html>head>title" selector becomes the slowest. There seems to be a lot of variability in the performance of CSS selectors that I wouldn't expect.

CSS is definitely getting smoked by XPath, there's no question of that. Maybe more than 1000 iterations need to be run to get more stable results amongst the other searches?
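
One way to tell whether that run-to-run variability is just noise would be to repeat each measurement over several trials and compare the minimum and median rather than a single total. The following is a minimal sketch using Ruby's standard `Benchmark` module instead of bench_press; `time_trials` and `summarize` are hypothetical helper names, and the string split is a stand-in workload so the snippet runs without Nokogiri or `test.html`.

```ruby
require 'benchmark'

# Hypothetical helper: run the given block `reps` times per trial and
# return the wall-clock seconds taken by each of the `trials` trials.
def time_trials(trials: 5, reps: 1_000)
  Array.new(trials) do
    Benchmark.realtime { reps.times { yield } }
  end
end

# Hypothetical helper: reduce a list of trial timings to min, median, max.
def summarize(timings)
  sorted = timings.sort
  mid = sorted.size / 2
  median = sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  { min: sorted.first, median: median, max: sorted.last }
end

# Stand-in workload. With Nokogiri loaded and @doc parsed as in the scripts
# above, the block body would instead be: @doc.css('html>head>title')
stats = summarize(time_trials(trials: 5, reps: 1_000) { 'html>head>title'.split('>') })
puts stats
```

If the min and median of two selectors overlap across trials, the ordering between them in any single run probably should not be read as meaningful.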
