@pusewicz
Created August 28, 2025 14:59
# frozen_string_literal: true

require "benchmark/ips"

array = Array.new(1_000_000) { rand(100) }.freeze

# Baseline: one element per block call via Array#each.
def sum_with_loop(array)
  sum = 0
  array.each do |num|
    sum += num
  end
  sum
end
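
# Unrolled by 4 using values_at, which allocates a temporary
# four-element array on every iteration.
def sum_with_unrolled_loop_4_alt(array)
  sum = 0
  i = 0
  len = array.length
  while i <= len - 4
    sum += array.values_at(i, i + 1, i + 2, i + 3).sum
    i += 4
  end
  # Remainder: the elements left over when len is not a multiple of 4.
  while i < len
    sum += array[i]
    i += 1
  end
  sum
end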
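
# Unrolled by 4 with direct index access.
def sum_with_unrolled_loop_4(array)
  sum = 0
  i = 0
  len = array.length
  while i <= len - 4
    sum += array[i] + array[i + 1] + array[i + 2] + array[i + 3]
    i += 4
  end
  # Remainder: the elements left over when len is not a multiple of 4.
  while i < len
    sum += array[i]
    i += 1
  end
  sum
end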

# Unrolled by 8 with direct index access.
def sum_with_unrolled_loop_8(array)
  sum = 0
  i = 0
  len = array.length
  while i <= len - 8
    sum += array[i] + array[i + 1] + array[i + 2] + array[i + 3] +
           array[i + 4] + array[i + 5] + array[i + 6] + array[i + 7]
    i += 8
  end
  # Remainder: the elements left over when len is not a multiple of 8.
  while i < len
    sum += array[i]
    i += 1
  end
  sum
end

# Unrolled by 16 with direct index access.
def sum_with_unrolled_loop_16(array)
  sum = 0
  i = 0
  len = array.length
  while i <= len - 16
    sum += array[i] + array[i + 1] + array[i + 2] + array[i + 3] +
           array[i + 4] + array[i + 5] + array[i + 6] + array[i + 7] +
           array[i + 8] + array[i + 9] + array[i + 10] + array[i + 11] +
           array[i + 12] + array[i + 13] + array[i + 14] + array[i + 15]
    i += 16
  end
  # Remainder: the elements left over when len is not a multiple of 16.
  while i < len
    sum += array[i]
    i += 1
  end
  sum
end

Benchmark.ips do |x|
  x.report("Standard Loop") { sum_with_loop(array) }
  x.report("Unrolled Loop x4") { sum_with_unrolled_loop_4(array) }
  # x.report("Unrolled Loop x4-alt") { sum_with_unrolled_loop_4_alt(array) }
  x.report("Unrolled Loop x8") { sum_with_unrolled_loop_8(array) }
  x.report("Unrolled Loop x16") { sum_with_unrolled_loop_16(array) }
  x.compare!
end
🍔 ruby loop_unroll.rb
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +PRISM [arm64-darwin24]
Warming up --------------------------------------
       Standard Loop     3.000 i/100ms
    Unrolled Loop x4     6.000 i/100ms
    Unrolled Loop x8     7.000 i/100ms
   Unrolled Loop x16     8.000 i/100ms
Calculating -------------------------------------
       Standard Loop     36.917 (± 2.7%) i/s   (27.09 ms/i) -    186.000 in   5.040707s
    Unrolled Loop x4     67.065 (± 1.5%) i/s   (14.91 ms/i) -    336.000 in   5.012591s
    Unrolled Loop x8     73.777 (± 2.7%) i/s   (13.55 ms/i) -    371.000 in   5.031966s
   Unrolled Loop x16     81.351 (± 2.5%) i/s   (12.29 ms/i) -    408.000 in   5.018836s

Comparison:
   Unrolled Loop x16:       81.4 i/s
    Unrolled Loop x8:       73.8 i/s - 1.10x slower
    Unrolled Loop x4:       67.1 i/s - 1.21x slower
       Standard Loop:       36.9 i/s - 2.20x slower
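
The benchmark times each variant but never checks that they return the same sum. A minimal sanity-check sketch follows, assuming a hypothetical helper `sum_unrolled_4` that mirrors `sum_with_unrolled_loop_4` from the script (the helper name and the test lengths are illustrative, not part of the original gist). It compares the unrolled result against Ruby's built-in `Array#sum` for lengths that are and are not multiples of the unroll factor, so the remainder loop is exercised as well:

```ruby
# frozen_string_literal: true

# Hypothetical stand-in mirroring sum_with_unrolled_loop_4 from the script.
def sum_unrolled_4(array)
  sum = 0
  i = 0
  len = array.length
  # Main loop: four elements per iteration.
  while i <= len - 4
    sum += array[i] + array[i + 1] + array[i + 2] + array[i + 3]
    i += 4
  end
  # Remainder loop: the 0 to 3 elements left over.
  while i < len
    sum += array[i]
    i += 1
  end
  sum
end

# Lengths 0..9 cover empty input, inputs shorter than the unroll factor,
# exact multiples of 4, and every possible remainder.
mismatches = (0..9).reject do |n|
  data = Array.new(n) { |k| k + 1 }
  sum_unrolled_4(data) == data.sum
end

puts mismatches.empty? ? "all lengths agree" : "mismatch at lengths: #{mismatches.inspect}"
```

The same check could be repeated for the x8 and x16 variants by widening the range of lengths past the unroll factor.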