I personally just implement the features when I need them, as I go (as far as I’m able anyways). I think with a few people pushing we could get most of the important functionality of numpy and scipy built on top of arraymancer in a reasonable time.
In principle using Python where needed (but still accelerating it with
  Nim!) is pretty straight forward nowadays thanks to the advances in
  nimpy. I’ll show an example below. One major thing for this is still
  missing in nimpy, which makes it slightly more complicated than needed.
Assuming we want to integrate a 2D gaussian within some radius (I took this from a calculation of an optical fiber, which is slighly offset from a gaussian spot to calculate the amount of flux lost by the offset) using scipy, we might write the following Python code:
import numpy as np
from scipy.integrate import dblquad
def integrand(r, theta, delta_y, sigma):
    # integrand is the integrand of the function which will be 2D integrated
    r_prime_2      = r**2 + delta_y**2 - 2 * r * delta_y * np.cos(3.0/2.0 * np.pi - theta)
    value          = 1.0/(2.0 * np.pi * sigma**2) * np.exp(-r_prime_2/(2.0 * sigma** 2)) * r
    return value
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
res = dblquad(integrand, # definition of our gaussian integrand
              0.0,       # outer integral lower
              2.0*np.pi, # outer integral upper 
              lambda theta: 0.0, # inner integral lower
              lambda theta: a,   # inner integral upper
              args=(delta_y, sigma) # additional arguments
)Which is nice and all, but thanks to being pure Python, not very fast (in this case I needed to integrate a huge number of those).
So first we might just want to port it all over to Nim, but since we
  don’t have a numerical integration function, we want to call
  scipy.dblquad for it. The code equivalent to the above might look
  like the following:
let
  a = 1.0
  delta_y = 0.5
  sigma = 2.0
proc integrand(r, theta, delta_y, sigma: float): float {.exportpy.} =
  let r_prime_2 = pow(r, 2.0) + pow(delta_y, 2.0) - 2.0 * r * delta_y * cos(3.0 / 2.0 * Pi - theta)
  result = 1.0/(2.0 * Pi * pow(sigma, 2) ) * exp(-r_prime_2/(2.0 * pow(sigma, 2.0))) * r
let scipy = pyImport("scipy.integrate")
let res = scipy.dblquad(integrand,
                        0.0,
                        2.0 * Pi,
                        proc(theta: float): float = 0.0,
                        proc(theta: float): float = a,
                        args = (delta_y, sigma)
)The problem is, that this is currently not possible with nimpy
  (unless I’m missing something), since we can’t hand our Nim proc
  directly to Python, despite using the exportpy pragma.
  Our next best solution is what we might do in Python to accelerate our
  calculation, e.g. use ctypes, numba etc. to calculate the
  integrand. Let’s use Nim for that instead! So we define our integrand
  in Nim:
# integrate.nim
import nimpy, math
proc integrand(r, theta, delta_y, sigma: float): float {.exportpy.} =
  let r_prime_2 = pow(r, 2.0) + pow(delta_y, 2.0) - 2.0 * r * delta_y * cos(3.0 / 2.0 * Pi - theta)
  result = 1.0/(2.0 * Pi * pow(sigma, 2) ) * exp(-r_prime_2/(2.0 * pow(sigma, 2.0))) * rThen just compile this file with:
nim c -d:release --app:lib -o integrate.so integrate.nimNote, it’s important that the shared object file is called exactly like the Nim source file!
And then we just rewrite our Python code as follows:
import numpy as np
import time
from scipy.integrate import dblquad
import integrate
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
res = dblquad(integrate.integrand, # our Nim compiled function
              0.0,       # outer integral lower
              2.0*np.pi, # outer integral upper 
              lambda theta: 0.0, # inner integral lower
              lambda theta: a,   # inner integral upper
              args=(delta_y, sigma) # additional arguments
)
print(res)And that is all! Our Python code accelerated by Nim in a trivial way! :)
Just as a proof that it’s actually a lot faster:
import numpy as np
import time
from scipy.integrate import dblquad
import integrate
def integrand(r, theta, delta_y, sigma):
    # integrand is the integrand of the function which will be 2D integrated
    r_prime_2      = r**2 + delta_y**2 - 2*r*delta_y*np.cos(3.0/2.0*np.pi - theta)
    value          = 1/(2*np.pi*sigma**2)*np.exp(-r_prime_2/(2*sigma**2))*r
    return value
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
t0 = time.clock()
for i in xrange(1000):
    res = dblquad(integrand, # some python function
                  0.0,       # outer integral lower
                  2.0*np.pi, # outer integral upper 
                  lambda theta: 0.0, # inner integral lower
                  lambda theta: a,   # inner integral upper
                  args=(delta_y, sigma) # additional arguments
    )
t1 = time.clock()
print(res)
# instead call the nim proc!
t2 = time.clock()
for i in xrange(1000):
    res = dblquad(integrate.integrand, # some python function
                  0.0,       # outer integral lower
                  2.0*np.pi, # outer integral upper 
                  lambda theta: 0.0, # inner integral lower
                  lambda theta: a,   # inner integral upper
                  args=(delta_y, sigma) # additional arguments
    )
t3 = time.clock()
print(res)
print("Duration for pure python: ", (t1 - t0))
print("Duration for nimified python: ", (t3 - t2))
# result on my machine:
# (0.11410585193904128, 9.342312648695034e-12)
# (0.11410585193904128, 9.342312648695034e-12)
# ('Duration for pure python: ', 1.1764960000000002)
# ('Duration for nimified python: ', 0.13569299999999984)So that’s a ~8.7x speedup!
just add
numba.jiton the top line ofdef integrand,I get
but
@numba.jit(numba.float32(numba.float32[:], numba.float32[:], numba.float32, numba.float32))takes too longI failed to understand https://stackoverflow.com/questions/49683653/how-to-pass-additional-parameters-to-numba-cfunc-passed-as-lowlevelcallable-to-s to write the numba.cfunc version of the
integrandas I mentioned in https://stackoverflow.com/questions/52529526/how-to-use-numba-cfunc-handle-scipy-integrate-dblquad