@billdueber
Created January 25, 2012 18:00
Ruby-marc slow on strict parser

I upgraded my ruby 1.8 to the latest patchlevel and all of a sudden ruby-marc was super-slow. I found the same thing on 1.9 and in JRuby, so I investigated.

There's a marc.bytes.to_a call inside the loop in Reader#decode. All the fix does is move it outside the loop so it only happens once.
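To make the shape of the change concrete, here is a minimal sketch of hoisting the conversion out of the loop. It is not the actual ruby-marc code: `slice_fields_slow`, `slice_fields_fast`, `raw`, and `entries` are hypothetical names, and the loop body is a stand-in for whatever Reader#decode does per field.

```ruby
# Hypothetical stand-in for a per-field loop in a strict decoder.
# `entries` is a list of [offset, length] pairs; all names are illustrative.

# Before: the byte array is rebuilt on every iteration.
def slice_fields_slow(raw, entries)
  entries.map do |offset, length|
    bytes = raw.bytes.to_a            # full conversion repeated per field
    bytes[offset, length].pack("C*")
  end
end

# After: the conversion is hoisted out of the loop and done once.
def slice_fields_fast(raw, entries)
  bytes = raw.bytes.to_a              # one conversion per record
  entries.map do |offset, length|
    bytes[offset, length].pack("C*")
  end
end

raw     = "00045abcdefghij"
entries = [[0, 5], [5, 3], [8, 7]]

slow = slice_fields_slow(raw, entries)
fast = slice_fields_fast(raw, entries)
```

Both versions return the same slices; the only difference is how many times the record is converted to an array of bytes.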

You can see the patch in the "slowreadfix" branch at https://github.com/ruby-marc/ruby-marc/commit/beba83745ebe0848218496e967edd65d632fb01e

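As the timings below show, the fix speeds up the strict parser by roughly a factor of five (closer to 6.6 on MRI 1.8.7), while the forgiving parser is essentially unchanged.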

The test case reads a MARC21 file with about 18K records in it. Times are in seconds.

                          user     system      total        real

ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-darwin11.2.0]
  Strict (stock)    165.290000   1.480000 166.770000 (166.805960)
  Strict (fix)       25.190000   0.150000  25.340000 ( 25.333554)
  Forgiving (stock)  14.810000   0.080000  14.890000 ( 14.880064)
  Forgiving (fix)    14.540000   0.060000  14.600000 ( 14.611334)

ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0]
  Strict (stock)     50.020000   1.260000  51.280000 ( 51.284811)
  Strict (fix)        9.530000   0.080000   9.610000 (  9.608464)
  Forgiving (stock)   6.080000   0.020000   6.100000 (  6.112313)
  Forgiving (fix)     6.320000   0.020000   6.340000 (  6.345729)

jruby 1.6.5.1 (ruby-1.8.7-p330) (2011-12-27 1bf37c2) (OpenJDK 64-Bit Server VM 1.7.0-u2-b21) [darwin-amd64-java]
  Strict (stock)     26.689000   0.000000  26.689000 ( 26.690000)
  Strict (fix)        5.061000   0.000000   5.061000 (  5.061000)
  Forgiving (stock)   5.389000   0.000000   5.389000 (  5.238000)
  Forgiving (fix)     5.092000   0.000000   5.092000 (  5.054000)

jruby 1.6.5.1 (ruby-1.9.2-p136) (2011-12-27 1bf37c2) (OpenJDK 64-Bit Server VM 1.7.0-u2-b21) [darwin-amd64-java]
  Strict (stock)     26.913000   0.000000  26.913000 ( 26.913000)
  Strict (fix)        5.336000   0.000000   5.336000 (  5.337000)
  Forgiving (stock)   5.385000   0.000000   5.385000 (  5.385000)
  Forgiving (fix)     5.219000   0.000000   5.219000 (  5.219000)
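
For reference, the strict-parser speedup on each platform can be worked out from the "real" column above. This is just arithmetic on the reported numbers; the variable names and short platform labels are my own.

```ruby
# Real (wall-clock) seconds for the strict parser, copied from the table:
# [stock, fix] per platform.
real_seconds = [
  ["ruby 1.8.7",          166.805960, 25.333554],
  ["ruby 1.9.3",           51.284811,  9.608464],
  ["jruby 1.6.5.1 (1.8)",  26.690000,  5.061000],
  ["jruby 1.6.5.1 (1.9)",  26.913000,  5.337000],
]

# Speedup = stock time / fixed time, rounded to one decimal place.
speedups = real_seconds.map do |name, stock, fix|
  [name, (stock / fix).round(1)]
end

speedups.each { |name, factor| puts format("%-22s %.1fx", name, factor) }
```

That works out to about 6.6x on MRI 1.8.7 and between 5.0x and 5.3x on the other three.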