Skip to content

Instantly share code, notes, and snippets.

@mediafinger
Last active March 3, 2020 12:03
Show Gist options
  • Save mediafinger/dcc35cca1b0d4a9c7e8a53bdd331f153 to your computer and use it in GitHub Desktop.
Save mediafinger/dcc35cca1b0d4a9c7e8a53bdd331f153 to your computer and use it in GitHub Desktop.
(Micro) Benchmarks of different implementations of a Hamming distance calculation in Crystal and Ruby

(Micro) Benchmarking Crystal vs Ruby

ips = iterations per second

Hamming distance, multiple implementations

See code examples below.

Crystal

   acoustep_1     2.76M ips  (362.26ns each)   2.59× slower
   acoustep_2     6.87M ips  (145.60ns each)   1.04× slower
          bew     2.33M ips  (430.06ns each)   3.07× slower
     btihen_1   943.81k ips  (  1.06µs each)   7.57× slower
     btihen_2   908.41k ips  (  1.10µs each)   7.86× slower
         nuin     1.01M ips  (990.84ns each)   7.08× slower
      trejkaz   924.19k ips  (  1.08µs each)   7.73× slower
mediafinger_1   293.21k ips  (  3.41µs each)  24.36× slower
mediafinger_2     4.22M ips  (236.97ns each)   1.69× slower
mediafinger_3     7.14M ips  (139.98ns each)        fastest
mediafinger_4     2.41M ips  (414.31ns each)   2.96× slower

Ruby

   acoustep_1:    57.875k ips  ( 17.28µs each)   2.15x  slower
   acoustep_2:   118.800k ips  (  8.42µs each)        same-ish
          bew:    23.923k ips  ( 41.80µs each)   5.20x  slower
     btihen_1:     8.970k ips  (111.48µs each)  13.87x  slower
     btihen_2:     8.999k ips  (111.12µs each)  13.82x  slower
         nuin:    47.034k ips  ( 21.26µs each)   2.64x  slower
      trejkaz:     9.189k ips  (108.83µs each)  13.54x  slower
mediafinger_1:    31.202k ips  ( 32.05µs each)   3.99x  slower
mediafinger_2:   117.624k ips  (  8.50µs each)        same-ish
mediafinger_3:   124.391k ips  (  8.04µs each)         fastest
mediafinger_4:    83.771k ips  ( 11.94µs each)   1.48x  slower

Comparison

Ruby is significantly slower than Crystal. The implementation where both are closest is one of the fastest Ruby implementations and obviously a non-idiomatic way for Crystal. The fastest implementation in both languages runs almost 60x faster in Crystal.

   acoustep_1     2.76M  vs   57.875k   47.7x slower
   acoustep_2     6.87M  vs  118.800k   57.8x slower
          bew     2.33M  vs   23.923k   97.4x slower
     btihen_1     0.94M  vs    8.970k  105.2x slower
     btihen_2     0.91M  vs    8.999k  100.9x slower
         nuin     1.01M  vs   47.034k   21.5x slower
      trejkaz     0.92M  vs    9.189k  100.6x slower
mediafinger_1     0.29M  vs   31.202k    9.4x slower
mediafinger_2     4.22M  vs  117.624k   35.9x slower
mediafinger_3     7.14M  vs  124.391k   57.4x slower
mediafinger_4     2.41M  vs   83.771k   28.8x slower

Implementations

Crystal

require "benchmark"

macro for(expr)
  {{raise "invalid syntax. use for e in c." unless expr.args.first.name.stringify == "in"}}

  {{expr.args.first.args.first}}.each do |{{expr.name.id}}|
    {{expr.args.first.block.body}}
  end
end

class Hamming
  def self.distance_acoustep_1(a, b)
    raise ArgumentError.new if a.size != b.size

    (0...a.size).count { |i| a[i] != b[i] }
	end

  def self.distance_acoustep_2(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    (0...a.size).count { |i| a[i] != b[i] }
	end

  def self.distance_bew(a : String, b : String)
    return 0 if a == b
    raise ArgumentError.new unless a.size == b.size

    a.each_char.zip(b.each_char).count { |(char1, char2)| char1 != char2 }
  end

  def self.distance_btihen_1(a : String, b : String) : Int32
    raise ArgumentError.new unless a.size == b.size

    a_as_array = a.each_char
    b_as_array = b.each_char

    bases_paired = a_as_array.zip(b_as_array)
    bases_paired.count { |char1, char2| char1 != char2 }
  end

  def self.distance_btihen_2(a : String, b : String) : Int32
    raise ArgumentError.new unless a.size == b.size

    b_as_array = b.each_char

    a.each_char.zip(b_as_array).count { |char1, char2| char1 != char2 }
  end

  def self.distance_nuin(a : String, b : String) : Int
    raise ArgumentError.new unless a.size == b.size

    a.each_char.with_index.count do |char, index|
      char != b[index]
    end
  end

  def self.distance_trejkaz(a : String, b : String)
    raise ArgumentError.new unless a.size == b.size

    a.each_char.zip(b.each_char).count { |char1, char2| char1 != char2 }
  end

  def self.distance_mediafinger_1(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    a_as_array = a.split("")
    b_as_array = b.split("")

    result = 0

    a_as_array.each.with_index do |char, index|
      result += 1 if char != b_as_array[index]
    end

    result
  end

  def self.distance_mediafinger_2(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    result = 0

    a.each_char.with_index do |char, index|
      result += 1 if char != b[index]
    end

    result
  end

  def self.distance_mediafinger_3(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    result = 0

    for index in (0...a.size) do
      result += 1 unless a[index] == b[index]
    end

    result
  end

  def self.distance_mediafinger_4(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    a.each_char.with_index.reduce(0) do | count, (char, index) |
      count += 1 unless char == b[index]
      count
    end
  end
end

Benchmark.ips(warmup: 2, calculation: 5) do |x|
  x.report("acoustep_1") { Hamming.distance_acoustep_1("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_acoustep_1("A", "A"); Hamming.distance_acoustep_1("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_acoustep_1("", "")}
  x.report("acoustep_2") { Hamming.distance_acoustep_2("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_acoustep_2("A", "A"); Hamming.distance_acoustep_2("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_acoustep_2("", "")}
  x.report("bew") { Hamming.distance_bew("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_bew("A", "A"); Hamming.distance_bew("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_bew("", "")}
  x.report("btihen_1") { Hamming.distance_btihen_1("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_btihen_1("A", "A"); Hamming.distance_btihen_1("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_btihen_1("", "")}
  x.report("btihen_2") { Hamming.distance_btihen_2("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_btihen_2("A", "A"); Hamming.distance_btihen_2("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_btihen_2("", "")}
  x.report("nuin") { Hamming.distance_nuin("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_nuin("A", "A"); Hamming.distance_nuin("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_nuin("", "")}
  x.report("trejkaz") { Hamming.distance_trejkaz("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_trejkaz("A", "A"); Hamming.distance_trejkaz("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_trejkaz("", "")}
  x.report("mediafinger_1") { Hamming.distance_mediafinger_1("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_1("A", "A"); Hamming.distance_mediafinger_1("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_1("", "")}
  x.report("mediafinger_2") { Hamming.distance_mediafinger_2("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_2("A", "A"); Hamming.distance_mediafinger_2("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_2("", "")}
  x.report("mediafinger_3") { Hamming.distance_mediafinger_3("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_3("A", "A"); Hamming.distance_mediafinger_3("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_3("", "")}
  x.report("mediafinger_4") { Hamming.distance_mediafinger_4("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_4("A", "A"); Hamming.distance_mediafinger_4("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_4("", "")}
end

Ruby

require "benchmark/ips"

class Hamming
  def self.distance_acoustep(a, b)
    raise ArgumentError if a.size != b.size

    (0...a.size).count { |i| a[i] != b[i] }
  end

  def self.distance_bew(a, b)
    return 0 if a == b
    raise ArgumentError unless a.size == b.size

    a.each_char.zip(b.each_char).count { |(char1, char2)| char1 != char2 }
  end

  def self.distance_btihen_1(a, b)
    raise ArgumentError unless a.size == b.size

    a_as_array = a.each_char
    b_as_array = b.each_char

    bases_paired = a_as_array.zip(b_as_array)
    bases_paired.count { |char1, char2| char1 != char2 }
  end

  def self.distance_btihen_2(a, b)
    raise ArgumentError unless a.size == b.size

    b_as_array = b.each_char

    a.each_char.zip(b_as_array).count { |char1, char2| char1 != char2 }
  end

  def self.distance_nuin(a, b)
    raise ArgumentError unless a.size == b.size

    a.each_char.with_index.count do |char, index|
      char != b[index]
    end
  end

  def self.distance_trejkaz(a, b)
    raise ArgumentError unless a.size == b.size

    a.each_char.zip(b.each_char).count { |char1, char2| char1 != char2 }
  end

  def self.distance_mediafinger_1(a, b)
    return 0 if a == b

    raise ArgumentError if a.size != b.size

    a_as_array = a.split("")
    b_as_array = b.split("")

    result = 0

    a_as_array.each.with_index do |char, index|
      result += 1 if char != b_as_array[index]
    end

    result
  end

  def self.distance_mediafinger_2(a, b)
    return 0 if a == b

    raise ArgumentError if a.size != b.size

    result = 0

    a.each_char.with_index do |char, index|
      result += 1 if char != b[index]
    end

    result
  end

  def self.distance_mediafinger_3(a, b)
    return 0 if a == b

    raise ArgumentError if a.size != b.size

    result = 0

    for index in (0...a.size) do
      result += 1 unless a[index] == b[index]
    end

    result
  end

  def self.distance_mediafinger_4(a, b)
    return 0 if a == b

    raise ArgumentError.new if a.size != b.size

    a.each_char.with_index.reduce(0) do | count, (char, index) |
      count += 1 unless char == b[index]
      count
    end
  end
end

Benchmark.ips do |x|
  x.report("acoustep") { Hamming.distance_acoustep("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_acoustep("A", "A"); Hamming.distance_acoustep("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_acoustep("", "") }
  x.report("bew") { Hamming.distance_bew("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_bew("A", "A"); Hamming.distance_bew("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_bew("", "") }
  x.report("btihen_1") { Hamming.distance_btihen_1("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_btihen_1("A", "A"); Hamming.distance_btihen_1("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_btihen_1("", "") }
  x.report("btihen_2") { Hamming.distance_btihen_2("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_btihen_2("A", "A"); Hamming.distance_btihen_2("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_btihen_2("", "") }
  x.report("nuin") { Hamming.distance_nuin("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_nuin("A", "A"); Hamming.distance_nuin("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_nuin("", "") }
  x.report("trejkaz") { Hamming.distance_trejkaz("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_trejkaz("A", "A"); Hamming.distance_trejkaz("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_trejkaz("", "") }
  x.report("mediafinger_1") { Hamming.distance_mediafinger_1("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_1("A", "A"); Hamming.distance_mediafinger_1("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_1("", "") }
  x.report("mediafinger_2") { Hamming.distance_mediafinger_2("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_2("A", "A"); Hamming.distance_mediafinger_2("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_2("", "") }
  x.report("mediafinger_3") { Hamming.distance_mediafinger_3("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_3("A", "A"); Hamming.distance_mediafinger_3("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_3("", "") }
  x.report("mediafinger_4") { Hamming.distance_mediafinger_4("GGACGGATTCTGACGTAC", "AGGACGGATTCTAGGACC"); Hamming.distance_mediafinger_4("A", "A"); Hamming.distance_mediafinger_4("GGACTGAAATCTGACGTA", "GGACTGAAATCTGACGTA"); Hamming.distance_mediafinger_4("", "") }

  x.compare!
end

Bonus

I used gobench to made a quick round of benchmarks with a Rails 5 app and an endpoint that lists some models from a database table and compared this with a scaffolded Amber app that does the same.

This test was quite hacky and I want to repeat this under more controlled circumstances. But under the given parameters (with 100 concurrent requests) the Crystal + Amber app handled around 4x as many requests per second as the Ruby on Rails app.

This does not look that impressive, but I bet there is a lot of potential to widen the gap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment