Queens Performance.md

From a checkout of https://github.com/rahulmutt/nofib, run the benchmark with:

nofib-runner imaginary/queens --run --jmh="-wi 10 -i 10" --way="-O2"

Queens Solution:

-- !!! count the number of solutions to the "n queens" problem.
-- (grabbed from LML dist)

import System.Environment


main = do
    [arg] <- getArgs
    print $ nsoln $ read arg

nsoln nq = length (gen nq)
 where
    safe :: Int -> Int -> [Int] -> Bool
    safe x d []    = True
    safe x d (q:l) = x /= q && x /= q+d && x /= q-d && safe x (d+1) l

    gen :: Int -> [[Int]]
    gen 0 = [[]]
    gen n = [ (q:b) | b <- gen (n-1), q <- [1..nq], safe q 1 b]

gives the following result:

# JMH 1.15 (released 115 days ago)
# VM version: JDK 1.8.0_102, VM 25.102-b14
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 10 iterations, single-shot each
# Measurement: 10 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: com.typelead.TestBenchmark.benchmark
# Parameters: (args = 10 +RTS)

# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration   1: 2192.577 ms/op
# Warmup Iteration   2: 491.270 ms/op
# Warmup Iteration   3: 439.554 ms/op
# Warmup Iteration   4: 402.247 ms/op
# Warmup Iteration   5: 407.231 ms/op
# Warmup Iteration   6: 150.034 ms/op
# Warmup Iteration   7: 169.938 ms/op
# Warmup Iteration   8: 156.885 ms/op
# Warmup Iteration   9: 152.766 ms/op
# Warmup Iteration  10: 146.529 ms/op
Iteration   1: 160.485 ms/op
Iteration   2: 154.438 ms/op
Iteration   3: 151.378 ms/op
Iteration   4: 144.522 ms/op
Iteration   5: 149.795 ms/op
Iteration   6: 149.698 ms/op
Iteration   7: 152.700 ms/op
Iteration   8: 143.453 ms/op
Iteration   9: 144.717 ms/op
Iteration  10: 153.089 ms/op


Result "benchmark":
  N = 10
  mean =    150.427 ±(99.9%) 7.927 ms/op

  Histogram, ms/op:
    [140.000, 142.500) = 0 
    [142.500, 145.000) = 3 
    [145.000, 147.500) = 0 
    [147.500, 150.000) = 2 
    [150.000, 152.500) = 1 
    [152.500, 155.000) = 3 
    [155.000, 157.500) = 0 
    [157.500, 160.000) = 0 
    [160.000, 162.500) = 1 
    [162.500, 165.000) = 0 
    [165.000, 167.500) = 0 

  Percentiles, ms/op:
      p(0.0000) =    143.453 ms/op
     p(50.0000) =    150.586 ms/op
     p(90.0000) =    159.880 ms/op
     p(95.0000) =    160.485 ms/op
     p(99.0000) =    160.485 ms/op
     p(99.9000) =    160.485 ms/op
     p(99.9900) =    160.485 ms/op
     p(99.9990) =    160.485 ms/op
     p(99.9999) =    160.485 ms/op
    p(100.0000) =    160.485 ms/op


# Run complete. Total time: 00:00:06

Benchmark                 (args)  Mode  Cnt    Score   Error  Units
TestBenchmark.benchmark  10 +RTS    ss   10  150.427 ± 7.927  ms/op
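
The log reports timings only, not the program's answer. As a quick correctness check, here is a minimal sketch that runs the same counting function for a few board sizes without reading command-line arguments. The `main` driver and the chosen sizes are illustrative assumptions, not part of the nofib harness.

```haskell
-- Same counting function as the benchmarked program.
nsoln :: Int -> Int
nsoln nq = length (gen nq)
  where
    -- safe x d qs: column x does not clash with any already placed queen,
    -- where d is the row distance to the first queen in qs.
    safe :: Int -> Int -> [Int] -> Bool
    safe _ _ []    = True
    safe x d (q:l) = x /= q && x /= q+d && x /= q-d && safe x (d+1) l

    -- gen n: all safe placements of queens on the first n rows.
    gen :: Int -> [[Int]]
    gen 0 = [[]]
    gen n = [ q:b | b <- gen (n-1), q <- [1..nq], safe q 1 b ]

-- Illustrative driver (an assumption for this sketch).
main :: IO ()
main = mapM_ report [4, 6, 8, 10]
  where
    report n = putStrLn (show n ++ " queens: " ++ show (nsoln n))
```

The known solution counts for these sizes are 2, 4, 92 and 724, so the benchmarked run with argument 10 should print 724.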

The first warmup iteration (2192.577 ms/op) takes roughly 14.6 times the steady-state mean (150.427 ms/op). It spends a lot of time on classloading, which we should look into at some point.
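
To put a number on the first-iteration overhead, the sketch below recomputes the steady-state mean from the ten measurement iterations in the log and compares it with the first warmup iteration. The timings are copied from the output above; the names `firstWarmup`, `measured` and `mean` are made up for this example. The gap is only an upper bound on classloading cost, since the first iteration likely also includes interpreter and JIT warmup.

```haskell
-- Timings copied from the JMH log above, in ms/op.
firstWarmup :: Double
firstWarmup = 2192.577

measured :: [Double]
measured =
  [ 160.485, 154.438, 151.378, 144.522, 149.795
  , 149.698, 152.700, 143.453, 144.717, 153.089 ]

-- Arithmetic mean of a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = do
  let m = mean measured
  putStrLn ("steady-state mean (ms/op):       " ++ show m)
  putStrLn ("first warmup / mean:             " ++ show (firstWarmup / m))
  putStrLn ("extra time in first warmup (ms): " ++ show (firstWarmup - m))
```

This gives a mean of about 150.43 ms/op, a ratio of about 14.6, and roughly 2042 ms of one-off startup cost in the first iteration.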
