@flcong
Last active June 16, 2024 01:35
Global Variable in Julia: How to avoid using it?

Recently, I tried to rewrite a MATLAB program in Julia. The program solves a PDE derived from a continuous-time economic model. I got the same result as the MATLAB program, but it was much slower. Then, I reviewed the Performance Tips section of the Julia manual and realized that the problem lay in using global variables. An economic model typically has many parameters, and they are often defined directly as global variables. For convenience, I wrote several functions to calculate formulae using these parameters. Since those functions were called frequently in a long loop, the performance was poor.

To guide future programming practice, here I experiment with several ways to avoid this problem.

Performance with/without global variables

Before digging into various ways to avoid this problem, let's first check how slow using global variables can be. To compare the computational time of different approaches, I use the BenchmarkTools package:

using BenchmarkTools

Consider the following code

μ = 1.0
σ = 0.8
a = 0.7
f(x) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)

function repeval()
    for i in 1:10000
        res = f(0.5)
    end
end

Here, I define three parameters μ, σ, and a, as well as a function of the argument x that uses the three variables. Then, I evaluate the function in a large loop. In order to use BenchmarkTools to measure the time spent, I wrap the loop in a function. Then, I use the @btime macro to measure the time:

@btime repeval()

The result is

  22.630 ms (390000 allocations: 9.61 MiB)

Next, consider the following code without using global variables, where the parameters are explicitly passed as arguments to the function:

μ = 1.0
σ = 0.8
a = 0.7
f(x, a, μ, σ) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)

function repeval()
    for i in 1:10000
        res = f(0.5, 0.7, 1.0, 0.8)
    end
end
@btime repeval()

The benchmark result is

  38.799 μs (0 allocations: 0 bytes)

By avoiding global variables and passing them directly as arguments, the code becomes over 500 times faster. However, when our economic model has many parameters, it is troublesome to pass all of them to each function in the code. How can we achieve high performance while avoiding directly passing parameters to each function?

Comparison of different solutions

There are several candidate approaches:

  • Define global constants.
  • Wrap the code in a function.
  • Use NamedTuple or the Parameters package.
  • Use Closure.

Define global constants

Instead of defining global variables, we can define global constants by prefixing the definition with the keyword const:

const μ = 1.0
const σ = 0.8
const a = 0.7
f(x) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)

function repeval()
    for i in 1:10000
        res = f(0.5)
    end
end
@btime repeval()

The result is

  38.799 μs (0 allocations: 0 bytes)

The performance is similar to passing all parameters as arguments. However, this solution has a drawback: global constants cannot be redefined. Suppose I am unhappy with the value μ=1.0 and want to try μ=0.9. If I directly change the definition to const μ = 0.9, I receive a warning

WARNING: redefinition of constant μ. This may fail, cause incorrect answers, or produce other errors.

and the value of μ is still 1.0. In order to change the value, I have to restart the Julia session and load all packages again, which may take quite some time.
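If restarting the session is too costly while experimenting, one workaround (my addition, not benchmarked above) is to make the constant binding a Ref cell: the binding itself stays constant and concretely typed, so functions reading it remain type-stable, while the value inside can be replaced at any time. A minimal sketch:

```julia
# The binding `μref` is constant and its type (Ref{Float64}) is concrete,
# so functions reading μref[] stay type-stable.
const μref = Ref(1.0)

getμ() = μref[]   # reads the current value

# Try a different parameter value without restarting the session:
μref[] = 0.9
```

The trade-off is the extra `[]` at every read site, so this fits best for a handful of frequently tweaked parameters.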

Wrap the code in a function

The second solution is to wrap the code in a function, including the parameter definitions. This is like a C/C++ program, where the entry point of the code is the main() function:

function main()
    μ = 1.0
    σ = 0.8
    a = 0.7
    f(x) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
    for i in 1:10000
        res = f(0.5)
    end
end
# Check time to evaluate the function 
@btime main()

The result is also similar to the previous solution:

  38.799 μs (0 allocations: 0 bytes)

It seems that, from the perspective of the f function, the parameters are still "global". However, from Julia's perspective, as long as they are wrapped in a function, they are not global variables. The drawback of this solution is that debugging can be difficult in a Jupyter Notebook, since you cannot easily run the code inside the function block by block. This is not a problem in Juno, where you can debug by stepping into the function and/or setting breakpoints.

Use NamedTuple or Parameters package

Another slightly more laborious way is to wrap all parameters in a NamedTuple or a struct and pass it to the function. There is also a package available called Parameters that facilitates the process.

Using NamedTuple, we have the following code

params = (μ = 1.0, σ = 0.8, a = 0.7)
function f(x, params)
    μ, σ, a = params   # NamedTuple values unpack in field order: μ, σ, a
    return (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
end
function repeval()
    for i in 1:10000
        res = f(0.5, params)
    end
end
@btime repeval()

The output is

  316.599 μs (10000 allocations: 156.25 KiB)

Although it is about 10 times slower than passing the parameters directly as arguments, it is still a big improvement over global variables.

We have a similar result using the Parameters package:

using Parameters

@with_kw struct Params
    μ::Float64
    σ::Float64
    a::Float64
end
params = Params(μ = 1.0, σ = 0.8, a = 0.7)
function f(x, params)
    @unpack a, μ, σ = params
    return (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
end
end
function repeval()
    for i in 1:10000
        res = f(0.5, params)
    end
end
@btime repeval()

The output is

  309.800 μs (10000 allocations: 156.25 KiB)

which is slightly faster compared with using the NamedTuple.

However, we can see the complication. Previously, I only needed one line to define f, but now I need an extra line to unpack the parameters, and there is an extra argument to pass to the function.
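On Julia 1.7 and newer, the unpacking line can also be written with property destructuring, which matches fields by name rather than by position, so mixing up the order of a, μ, and σ cannot silently assign wrong values. A sketch using the same parameters:

```julia
params = (μ = 1.0, σ = 0.8, a = 0.7)

function f(x, p)
    (; a, μ, σ) = p   # destructure by field name; the order written here does not matter
    return (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
end
```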

Use closure

The last (but not least) solution I consider is to use a closure. We can define a function getfunc that receives the parameters and returns a function with the parameters enclosed. If we need multiple functions like f here, we can also give getfunc another argument to select which function to return and use an if statement inside to return the correct one. Essentially, getfunc becomes a "closure generator" that returns the desired function with the parameters enclosed (a closure).

Consider the following code:

params = (μ = 1.0, σ = 0.8, a = 0.7)

function getfunc(params)
    μ, σ, a = params   # NamedTuple values unpack in field order: μ, σ, a
    f(x) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
    return f
end

function repeval()
    f = getfunc(params)
    for i in 1:10000
        res = f(0.5)
    end
end
@btime repeval()

The output is

  295.699 μs (10001 allocations: 156.28 KiB)

which is slightly better than using a NamedTuple. The drawback is that we have to define an additional function that returns functions. It can also be a benefit: collecting the functions used in the code inside a single generator makes them easier to manage.
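The "closure generator with a selector" idea mentioned above could look like the following sketch. The selector symbols :f and :g and the formulas enclosed are hypothetical, just to illustrate the pattern:

```julia
params = (μ = 1.0, σ = 0.8, a = 0.7)

# Hypothetical closure generator: `which` selects the formula to enclose.
function getfunc(params, which::Symbol)
    μ, σ, a = params   # NamedTuple values unpack in field order: μ, σ, a
    if which == :f
        return x -> (x + a) / (μ + σ^2)       # illustrative formula
    elseif which == :g
        return x -> log(x^2 + μ/σ*(1-a)/a)    # another illustrative formula
    else
        error("unknown function: $which")
    end
end

f = getfunc(params, :f)
g = getfunc(params, :g)
```

Each returned closure captures the parameter values it was built with, so f and g can be called with only x.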

Conclusion

The experiments above can be summarized as follows:

  • Slowest: use global variables.
  • ~75x faster: use NamedTuple, Parameters package, or closure.
  • ~500x faster: use global constants, pass parameters to functions, or wrap the code in a function.

The cons of these solutions are summarized as follows:

  • Use global constants:
    • Cannot change parameters without restarting the session.
  • Pass parameters to functions:
    • Troublesome if there are a lot of functions and a lot of parameters.
  • Wrap the code in a function:
    • Possibly difficult to debug in Notebook.
  • Use NamedTuple or Parameters package
    • Still need to pass an additional argument to the function and need extra code to unpack parameters.
  • Use closure
    • Need an extra function to return closures.

Which one would you choose?

@isaacgeng

Thanks for your Gist! It's really helpful!
I think for prototyping I would go with consts, but later I have to translate it into functions or closures to make the logic clear and independent, it's also good for testing and reuse the function out of the project scope.

@flcong
Author

flcong commented Apr 30, 2021

Thanks for your Gist! It's really helpful!
I think for prototyping I would go with consts, but later I have to translate it into functions or closures to make the logic clear and independent, it's also good for testing and reuse the function out of the project scope.

No problem! Yeah. I agree.

@alfaromartino

alfaromartino commented Feb 16, 2023

I came across this post. Just in case someone bumps into it, I wanted to add some remarks. There are two issues:

  1. The cases of NamedTuple and closure are type-unstable, which explains why they are slower
  2. main() as a benchmark should be compared differently

ABOUT 1)
Compare the following code

#YOUR PROPOSAL
params = (μ = 1.0, σ = 0.8, a = 0.7)
function f(x, params)
    μ, σ, a = params
    return (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
end

function repeval()
    for i in 1:100_000_000
        res = f(0.5, params)
    end
end
@btime repeval() = 135.400 μs (10000 allocations: 156.25 KiB)

But in repeval you are not passing params as an argument to the function. Consequently, it is as if params were a global variable. So, this is how it should be written:

##### type stable
params = (μ = 1.0, σ = 0.8, a = 0.7)
function f(x, params)
    μ, σ, a = params
    return (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
end

function repeval(params) ### only difference relative to previous code
    for i in 1:100_000_000
        res = f(0.5, params)
    end
end

### TIME
@btime repeval(params) = 54.200 μs (0 allocations: 0 bytes)

which reduces the time. Let's compare it with main now.

ABOUT 2)
The function main() you use as a benchmark already includes x=0.5 within the function. Look at the times when x is an argument of the function:

function main(x)
    μ = 1.0
    σ = 0.8
    a = 0.7
    f(x) = (x .+ a) ./ (μ + σ^2) .* (1-a)/a + (μ+0.5*σ^2) ./ (x.+μ.+0.5*σ^2) - log.(x.^2 .+ μ/σ*(1-a)/a)
    for i in 1:10000
        res = f(x)
    end
end

### TIME
@btime main(0.5) =  54.700 μs (0 allocations: 0 bytes)

which is practically the same as the time using NamedTuples.

MISCELLANEOUS
When you have large broadcasting operations, consider two improvements. The first is purely notational, a tip for writing the function in a cleaner way:

# the macro `@.` adds the `.`s automatically, so it's neater;
# the two lines below are equivalent
@. (x + 2) * 3
(x .+ 2) .* 3

The second is the use of the macro @turbo from LoopVectorization.jl, which speeds up broadcasting operations.

using LoopVectorization

params = (μ = 1.0, σ = 0.8, a = 0.7)

function repeval(x, p)
    μ, σ, a = p
    for i in 1:10000
        res = @turbo @. (x + a) / (μ + σ^2) * (1-a)/a + (μ+0.5*σ^2) / (x+μ+0.5*σ^2) - log(x^2 + μ/σ*(1-a)/a)
    end
end

#### TIME
@btime repeval(0.5, params) =  25.100 μs (0 allocations: 0 bytes)

Also, if you have a computer with multiple cores, you can use Threads.@threads:

using LoopVectorization

params = (μ = 1.0, σ = 0.8, a = 0.7)
function repeval(x, p)
    μ, σ, a = p
    Threads.@threads for i in 1:10000
        res = @turbo @. (x + a) / (μ + σ^2) * (1-a)/a + (μ+0.5*σ^2) / (x+μ+0.5*σ^2) - log(x^2 + μ/σ*(1-a)/a)
    end
end

@btime repeval(0.5, params) = 8.300 μs (95 allocations: 10.61 KiB)

Hope it helps!!!

@flcong
Author

flcong commented Feb 17, 2023

@alfaromartino

Thanks a lot for the detailed comment! Very useful, though I will probably not use Julia any more in the foreseeable future...

@VivaldoMendes

Hi @alfaromartino and @flcong ,

I came across this nice exchange. On my humble laptop, a Dell Latitude 5540 (not with top specifications), running Windows 11, Julia 1.10.4, and the latest versions of the three packages used, I ran the versions proposed by @alfaromartino and got the following results (in the order above):

  • About 1, @flcong proposal: 179.500 μs (10000 allocations: 156.25 KiB)
  • About 1 ... type stable code: 73.100 μs (0 allocations: 0 bytes)
  • About ... the main function: 1.100 ns (0 allocations: 0 bytes)
  • About 2 ... macro @turbo: 34.400 μs (0 allocations: 0 bytes)
  • About 2 ... with threads : 35.400 μs (6 allocations: 704 bytes)

One observation and one doubt: (i) Julia is pretty fast with plain vanilla code in main; (ii) why is the code that uses LoopVectorization so much slower than the plain vanilla version in main, even in the case of 0 allocations?

Thanks
