@KristofferC
Last active May 25, 2020 07:08
Regarding this PR that kind of came out of the blue: https://github.com/JuliaLang/Pkg.jl/pull/1835
I've been thinking more and more about it, and I really like the idea of storing preferences in depot-wide preference files by default, while still having an option to set preferences within the top-level Project. What this would look like: if you call @save_preferences(prefs), it overwrites whatever is in ~/.julia/prefs/$uuid.toml. However, if you write @save_preferences(prefs, target=:project), it saves the preferences into the Project.toml file:
name = "MyPackage"
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
version = "1.0.0"
[deps]
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
[preferences.1a6f20f1-ba5f-a4e8-2e79-39224701f35d]
foo = "true"
7:32
When you call load_preferences(), it would first load the depot-wide preferences, then recursively call merge() on that dict with the appropriate subkey of preferences of the current project. This would allow for easy, as-reproducible-as-possible recording of preferences, while also allowing for machine-wide configuration.
7:32
What do y'all think?
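The depot-then-project merge described above could look roughly like the following. This is a minimal sketch, not the PR's implementation: `recursive_merge` is a hypothetical helper, and the two dictionaries are made-up stand-ins for the parsed depot file and the project's preferences table.

```julia
# Sketch only: `recursive_merge` is a hypothetical helper, not part of Pkg.
# Nested tables are merged key by key; on conflicts the `override` value wins.
function recursive_merge(base::AbstractDict, override::AbstractDict)
    merged = Dict{String,Any}(base)
    for (key, value) in override
        if value isa AbstractDict && get(merged, key, nothing) isa AbstractDict
            merged[key] = recursive_merge(merged[key], value)
        else
            merged[key] = value
        end
    end
    return merged
end

# Stand-in for the parsed contents of ~/.julia/prefs/$uuid.toml (made-up values).
depot_prefs = Dict{String,Any}(
    "backend" => "libzoomzoom",
    "threads" => Dict{String,Any}("count" => 4, "pin" => false),
)

# Stand-in for the [preferences.$uuid] table of the active Project.toml.
project_prefs = Dict{String,Any}(
    "foo" => "true",
    "threads" => Dict{String,Any}("count" => 8),
)

# Depot-wide values act as defaults; project values override them.
prefs = recursive_merge(depot_prefs, project_prefs)
```

Here the project overrides `threads.count` while still inheriting `backend` and `threads.pin` from the depot.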
Kristoffer Carlsson 7:34 PM
- why is that a macro?
- what's the namespace of preferences?
Elliot Saba:horse: 7:36 PM
The macro versions are just shorthand to auto-detect the current module's root UUID
Kristoffer Carlsson 7:37 PM
It's not set by a user at "toplevel"?
Elliot Saba:horse: 7:37 PM
In the current PR, preferences just get saved to depot/prefs/uuid.toml, so it's all depot-wide.
7:38
A user certainly could call save_preferences(prefs, uuid) at top-level
7:38
But I figure most packages will have something like configure() that gets called if need be, which uses TerminalMenus to walk a new user through the configuration process, or such
7:39
Mostly I just wanted to provide packages with a super simple datastore, but then I realized that we probably want there to be not only depot-wide settings, but also per-toplevel-project settings.
Dilum Aluthge 7:40 PM
It might be helpful to clarify for package authors what stuff goes in preferences and what stuff goes in scratch spaces.
7:41
I.e. as a package author, maybe a cheat sheet of which one I use in which situation
Elliot Saba:horse: 7:44 PM
Yep, I'll be writing the docs today.
Dilum Aluthge 7:45 PM
What happens if you have three depots in your depot path, and each depot has preferences? Are the preferences merged from the three depots? Or do you only take the preferences from the first depot in the depot path?
Elliot Saba:horse: 7:49 PM
That's a good question; if depot -> project gets merged, then my gut instinct says that it should all be merged in the order of farthest depot -> nearest depot -> project
Kristoffer Carlsson 7:52 PM
Generally, stacking depots just gets you access to new stuff, but it isn't really an "overwriting" behavior. Might be good to think about. Perhaps only the first depot should be considered?
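To make the two options concrete, here is a small sketch comparing "merge every depot" with "only the first depot". The preference tables and their values are invented for illustration, and flat (non-nested) preferences are assumed so that Base's `merge` is enough.

```julia
# Hypothetical per-depot preference tables, listed in depot path order
# (nearest depot first, e.g. the user depot, then system depots).
depot_prefs = [
    Dict("backend" => "user-mpi"),                         # first (nearest) depot
    Dict("backend" => "system-mpi", "flavor" => "mpich"),  # second depot
    Dict("flavor" => "openmpi", "debug" => "false"),       # third (farthest) depot
]
project_prefs = Dict("debug" => "true")

# Policy 1: farthest depot -> nearest depot -> project.
# With `merge`, later arguments win, so the depot list is reversed first.
merge_all = merge(reverse(depot_prefs)..., project_prefs)

# Policy 2: only the first depot is consulted, then the project.
first_only = merge(first(depot_prefs), project_prefs)
```

Under the first policy the result picks up `flavor` from the second depot; under the second policy that key never appears, because the later depots are ignored.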
Kristoffer Carlsson 7:53 PM
Also, why depot and not LOAD_PATH ? Thx slack
Elliot Saba:horse: 7:56 PM
Partly, I chose depot because that's what certain packages in the ecosystem are doing already; e.g. IJulia, FFTW, and PyCall are all putting stuff into ~/.julia/prefs already.
7:56
I'm not sure what putting stuff onto the LOAD_PATH would look like; could you explain that more?
Kristoffer Carlsson 7:57 PM
I meant: if they can be entries in your Project, shouldn't they always be entries in a Project file? And the load path determines the merge order.
7:58
So if you have the global environment activated, prefs go to that one. If you activate another project, prefs go to that one, just like Pkg.add
7:58
And, just like `using`, it looks in your active project first and then falls back to the next entry in the load path.
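A rough sketch of that load-path fallback, under the assumption that each entry on the load path has already been parsed into a dictionary. `lookup_preference` is a hypothetical helper and the project contents are made up; only the UUID is taken from the example earlier in the thread.

```julia
# Sketch: `projects` stands in for the parsed Project.toml files along the
# load path, active project first. `lookup_preference` is hypothetical.
function lookup_preference(projects, uuid::AbstractString, key::AbstractString)
    for project in projects
        prefs = get(get(project, "preferences", Dict()), uuid, Dict())
        # The first project on the load path that defines the key wins.
        haskey(prefs, key) && return prefs[key]
    end
    return nothing  # no project on the load path sets this preference
end

uuid = "1a6f20f1-ba5f-a4e8-2e79-39224701f35d"

# The active project sets only `foo`.
active_project = Dict(
    "name" => "Foo",
    "preferences" => Dict(uuid => Dict("foo" => "true")),
)

# The global environment sets `foo` too, plus a `backend` default.
global_project = Dict(
    "preferences" => Dict(uuid => Dict("foo" => "false", "backend" => "libzoomzoom")),
)

load_path_projects = [active_project, global_project]
foo_value     = lookup_preference(load_path_projects, uuid, "foo")
backend_value = lookup_preference(load_path_projects, uuid, "backend")
missing_value = lookup_preference(load_path_projects, uuid, "unset")
```

The active project answers for `foo`, while `backend` falls through to the global environment, which is how the global project would serve as the place for defaults.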
Elliot Saba:horse: 8:05 PM
Hmmm, interesting. So instead of writing them out to ~/.julia/prefs, I would write them out to ~/.julia/environments/v1.4/Project.toml. And if I'm running in julia --project=~/src/Foo, then it looks in ~/src/Foo/Project.toml.
8:06
That certainly has a benefit for reproducibility, so that's a big plus
8:07
The only downside that I can think of is that for every time I use a package that requires configuration in a new project, I have to repeat my configuration.
8:07
There's no way to set a "machine-wide default configuration"
Kristoffer Carlsson 8:07 PM
You would still have the "global" preferences in the global project (by default)
8:07
But yeah, machine defaults true
Elliot Saba:horse: 8:08 PM
I'm thinking of things like this: we have an MPI package that can be set to use a certain flavor of MPI. It would be really useful for the system administrator to be able to set up preferences within a system depot, then have those be used as defaults.
8:09
Perhaps we can structure it such that saving things into the current project is the default, but we provide an API for sysadmins to drop preferences into depots
8:09
So by default, 99% of packages and users are saving stuff into the currently-active project
8:09
but we at least have the capability to provide "machine-wide-defaults" that will get overridden by settings stored within Project.toml
Dilum Aluthge 8:34 PM
That would be good
8:34
In my case, the user depot is first in the depot list, and the system depot is second. I'd like to write preferences to the system depot and have them used as defaults.
8:39
Anyway this is all incredible
8:40
We keep getting closer to immutable package directories
Mosè Giordano:house_with_garden: 8:40 PM
not really
Dilum Aluthge 8:40 PM
:(
Mosè Giordano:house_with_garden: 8:41 PM
lots of packages still write a build/deps.jl file, even without using BP
8:41
FFTW, MPI and HDF5 are the first examples that come to my mind
Dilum Aluthge 8:42 PM
I guess the question really is, can we eliminate the build step?
Specifically, what tasks do people do in the build step that cannot be accomplished by a combination of artifacts, preferences, and scratch spaces?
Mosè Giordano:house_with_garden: 8:43 PM
all the packages I mentioned above write a file to specify which library they actually want to use. can this be accomplished with a scratch space?
Dilum Aluthge 8:45 PM
Hmm. "Which library they want to use" actually sounds more like a preference, right?
8:45
but either way
Mosè Giordano:house_with_garden: 8:45 PM
that's not enough, they actually already use preferences
Dilum Aluthge 8:45 PM
I see
8:45
@staticfloat could you write that kind of info to a scratch space?
Mosè Giordano:house_with_garden: 8:45 PM
they need a julia file with something like
const libname = "libname.so"
to be ccall'ed
Dilum Aluthge 8:46 PM
I see
Mosè Giordano:house_with_garden: 8:46 PM
or just do using Libfoo_jll when the preference is for the library provided via a JLL package
Elliot Saba:horse: 8:50 PM
My plan for things like FFTW is that they:
* Use preferences to store which library they're using
* Use scratch spaces to store auxiliary files such as the julia file that contains the const libname="libfftw3f.so", which can be include()'ed at compile time.
I think this should even allow one project to have BB-provided FFTW and another project to have MKL-provided FFTW from the same source FFTW path. They can read the preference, and from that preference, decide which files to include().
8:51
Ah, no, that won't work, because the precompile file will cache the results of the if statement. Never mind
8:51
You'd still only be able to have one configuration of such a package per depot. That's still fine and an improvement.
Mosè Giordano:house_with_garden: 8:53 PM
ok, maybe I misunderstood what a "scratch space" means. to me it sounds like temporary, if it's persistent I guess it's fine, it'd be like what people now do in build/
Elliot Saba:horse: 8:55 PM
Hmmm
8:55
I have to be careful about doing things at compile-time
8:55
because the thing that stops a scratch space from being deleted is calling get_scratch!(), but if that doesn't get called, and the result is just always cached in a .ji file
8:56
then yes, it could disappear out from under FFTW
8:56
I guess FFTW would just need to call get_scratch!() within its __init__() method, even if it doesn't use the value, just to make sure it doesn't get GC'ed.
8:56
that's not too bad
Mosè Giordano:house_with_garden: 8:57 PM
just to increase __init__ time a little bit more :troll: :stuck_out_tongue:
Elliot Saba:horse: 9:01 PM
That's my unofficial job title
Elliot Saba:horse: 9:15 PM
Alright, last design question for Preferences: If the sysadmin has set preferences in a depot, do we want those preferences to be included within the Project.toml preferences? I think I'd rather err on the side of "yes", in that if you say:
@modify_preferences() do prefs
    prefs["foo"] = "bar"
end
And prefs has inherited a backend = "libzoomzoom" mapping, then the preferences that get saved to Project.toml will include that backend mapping.
9:15
The alternative is to "subtract out" the things saved in the depots higher than yourself, such that only the "diff" is saved into the Project.
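The difference between the two saving strategies can be sketched as follows. `preference_diff` is a hypothetical helper, flat preferences are assumed, and the values reuse the `foo` / `backend` example above; this is not how the PR does it.

```julia
# Hypothetical helper: keep only the entries of `prefs` that are new or that
# differ from what was inherited from the depots.
function preference_diff(prefs::AbstractDict, inherited::AbstractDict)
    return Dict(k => v for (k, v) in prefs
                if !haskey(inherited, k) || inherited[k] != v)
end

# Set by the sysadmin in a depot.
inherited = Dict("backend" => "libzoomzoom")

# What the user ends up with after setting prefs["foo"] = "bar".
prefs = merge(inherited, Dict("foo" => "bar"))

# Option 1: save everything, including the inherited backend mapping.
save_whole = prefs

# Option 2: "subtract out" the inherited entries and save only the diff.
save_diff = preference_diff(prefs, inherited)
```

Saving the whole table makes Project.toml self-describing, whereas saving only the diff lets later changes to the depot defaults keep flowing through.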
Mosè Giordano:house_with_garden: 9:19 PM
probably in #hpc you can find several people dealing with sysadmins with varying preferences :slightly_smiling_face:
Dilum Aluthge 9:23 PM
I think it makes more sense to only save the diff
9:23
But yeah, worth asking for more opinions. Every cluster is different
9:28
The diff thing sounds complicated
9:28
Easier to just save all the stuff into the Project. Easier to debug too.
Elliot Saba:horse: 9:30 PM
It is certainly easier to save the whole thing
Dilum Aluthge 10:05 PM
Also, it means you can just look at one file (the Project.toml file) and quickly figure out what all the preferences are
Simon Byrne:juliaspinner: 10:09 PM
I'm still not exactly sure how a system-wide depot would work in the case of MPI
10:09
e.g. our local cluster has ~10 different MPI modules
Simon Byrne:juliaspinner: 10:10 PM
I guess you would need a separate depot for each of them?
Simon Byrne:juliaspinner: 10:11 PM
and they would need to be somehow kept up-to-date, so that e.g. a new MPI.jl release doesn't result in the user installing the latest one in their own depot and using that
Elliot Saba:horse: 12:02 AM
Why would a user installing a newer version be bad?
Valentin Churavy:juliaspinner: 2:12 AM
I think Simon's concern is what happens to your preference file when you upgrade. E.g. is it per package, or per package + version?
2:13
The issue with MPI is fairly complicated, since we need to discover the ABI
2:14
And we need to use the system version, but the user may change the system version at any time and thus get ABI mismatches
Simon Byrne:juliaspinner: 8:02 AM
I guess I still don't quite understand what is being proposed, but the main challenge when I attempted to do something similar with MPI.jl was figuring out when to invalidate the precompile cache. Basically, we need to invalidate the cache whenever the user preferences change (e.g. use /foo/libmpi instead of /bar/libmpi), or when the underlying library changes (e.g. the user module loads a different implementation, which changes LD_LIBRARY_PATH and dlopen(:libmpi) will now resolve to a different file).
8:06
This proved difficult since you need to
(a) repeat your preference logic inside the module __init__(),
(b) somehow check if the library is the same (at the moment we call a specific function which returns info on the implementation, and check against a cached value), and
(c) have a way to invalidate the precompile cache (I never came up with a good way to do this, so I reverted to a deps/deps.jl file)
Elliot Saba:horse: 9:44 AM
Yeah, this will not help with needing to invalidate the precompile cache
Kristoffer Carlsson 11:44 AM
There's a fairly fundamental difference between build flags (stuff you would set in e.g. CMake) and runtime preferences (stuff that would be e.g. command line flags). Might be worth separating those two concepts?
AFAIU, the precompilation system quite fundamentally assumes that the only things that invalidate the cache are modifications to files. And the way you add a dependency on a file, so that the cache invalidates, is by using include_dependency. You don't really want to add a dependency on the whole Project file, but perhaps we could write out a file that contains only the "build flags"; the package then does an include_dependency on that file, and when we change the build flags in the project file we also update that file, causing the cache to invalidate?
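One way the "separate build-flags file" idea might be sketched, assuming Julia 1.6 or newer for the TOML standard library. `write_build_flags`, the file name, and the `libmpi` flag are all hypothetical; only `include_dependency` and `TOML.print` are existing functions.

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical helper: write only the "build flags" to their own file, and
# rewrite the file only when its contents would actually change, so that
# unrelated edits to the project never disturb it.
function write_build_flags(path::AbstractString, flags::AbstractDict)
    contents = sprint(io -> TOML.print(io, flags; sorted=true))
    if isfile(path) && read(path, String) == contents
        return false   # unchanged: leave the file alone
    end
    write(path, contents)
    return true        # file was created or updated
end

# Inside the package, the file would then be registered as a dependency of
# the precompile cache (this call has no effect outside of precompilation):
#     include_dependency(flags_path)

# Demonstration in a temporary directory.
flags_path   = joinpath(mktempdir(), "build_flags.toml")
first_write  = write_build_flags(flags_path, Dict("libmpi" => "/foo/libmpi"))
second_write = write_build_flags(flags_path, Dict("libmpi" => "/foo/libmpi"))
third_write  = write_build_flags(flags_path, Dict("libmpi" => "/bar/libmpi"))
```

Only the first and third calls touch the file, so a cache that depends on it would be invalidated when a build flag changes but not when runtime preferences do.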