App configuration in environment variables: for and against

For (some of these as per the 12 factor principles):

1) they are easy to change between deploys without changing any code

2) unlike config files, there is little chance of them being checked
into the code repo accidentally

3) "unlike custom config files, or other config mechanisms such as Java
System Properties, they are a language- and OS-agnostic standard."
http://12factor.net/config

4) because the key and value both have to be plain text, it discourages
adding more complicated things as config settings when they really ought
not to need to be. Look at any mongoid.yml for example. Multi-level
config hashes are a code smell (my opinion).

Against:

1) Environment variables are 'exported by default', making it easy to
do silly things like sending database passwords to Airbrake. Sure, we could
introduce code to filter them out, but it's another thing we need to
remember to update every time we add one - not robust in the face of
code changes. Better not to put them there in the first place.

2) It provides the "illusion of security": env vars are really no more
secure than files, in that if you can read someone's files you can also
(quite easily in Linux) read the environment variables of their running
processes. This is not to say that files are better, just that they
don't pretend to be.

3) In some respects it's just deferring the problem: in order to start
your production instance those config variables still need to be read
from some source so they can be added to the environment, and 98% of the
time that source will be a local file.

4) If you restart an app by sending it a signal (e.g. SIGHUP) from an
unrelated shell that causes it to re-exec itself, it will still have the
environment of the original process. So, for example, you can't update
config in environment variables and do a Unicorn "zero downtime" restart.
This can cause confusion.

5) There is no single place in which to look to find out what settings are
accepted/required: even successfully starting the app doesn't mean that some code
path somewhere won't dereference an unset env var sometime later. We don't pass
parameters into modules using arbitrarily-named and undeclared globals, so
why is it OK to pass params into the main program that way?
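
One way to blunt point 5 is to declare every setting in one place and check them all at startup, so an unset variable fails immediately rather than on some later code path. A minimal sketch in Python; the setting names and the `load_settings` helper are hypothetical, not taken from any particular app:

```python
import os

# Hypothetical setting names, declared once so there is a single place to look.
REQUIRED = ("DATABASE_URL", "SECRET_KEY_BASE")
OPTIONAL = {"LOG_LEVEL": "info"}  # name -> default value


class MissingSettings(Exception):
    """Raised at startup when one or more required settings are absent."""


def load_settings(environ=os.environ):
    """Return every declared setting, or raise naming all the missing ones."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise MissingSettings("missing required settings: " + ", ".join(missing))
    settings = {name: environ[name] for name in REQUIRED}
    for name, default in OPTIONAL.items():
        settings[name] = environ.get(name, default)
    return settings
```

The rest of the program would then read from the returned dict instead of touching `os.environ` directly, which keeps the list of accepted settings honest.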

My argument is that what we're really asking for is a configuration
source that

a) lives outside the project. This requirement could be met by
environment variables or a file in /etc or even a request to a web
server - see e.g. the AWS instance metadata
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html
but in any case it should be difficult to accidentally merge production
config values into the project version control.

b) can be easily and reliably read using any of a variety of languages
(including shell scripts and the like) without complicated parsing code
or library dependencies

c) has limits on its expressivity, so that people aren't trying to add
code that wants hashes and dates and lists and stuff like that as config
values

d) ideally, makes it hard to accidentally send the configuration values
to our collaborators and external services

Is that a fair representation of the arguments for/against, or am I
missing something?

One of the organizational problems I've seen with ENVs is when there are too many of them. When someone starts up a new environment for staging or production, they just copy all the ENVs from another environment and don't bother going through each one to see if it is valid or safe to copy. So you end up with a testing environment that can send real money. 🤑 What fun! Been there!
This can happen with configuration in code too but at least it gets checked in and there is usually a better review process.
So I usually go for a hybrid approach. Have most configuration committed and tracked in code in different files, one for development, one for production, etc. This makes it really clear which configurations are more important. For any configuration that can't be committed to code, use ENVs, or a secure key store if you can handle the extra process complexity.
This approach can still have holes but at least the surface area is reduced.
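
A rough sketch of that hybrid in Python. Everything named here is an assumption for illustration: `APP_ENV`, the setting names, and the URLs are made up, and the committed per-environment values are shown as an in-code dict rather than separate files to keep the example self-contained.

```python
import os

# Committed, reviewable values per environment (hypothetical names and URLs).
COMMITTED = {
    "development": {"PAYMENTS_URL": "https://sandbox.payments.example", "SEND_REAL_MONEY": "false"},
    "staging": {"PAYMENTS_URL": "https://sandbox.payments.example", "SEND_REAL_MONEY": "false"},
    "production": {"PAYMENTS_URL": "https://payments.example", "SEND_REAL_MONEY": "true"},
}

# Values that must never be committed; they come from the environment only.
SECRET_NAMES = ("PAYMENTS_API_KEY",)


def build_config(environ=os.environ):
    """Combine the committed values for APP_ENV with secrets from the environment."""
    env_name = environ.get("APP_ENV", "development")
    if env_name not in COMMITTED:
        raise ValueError(f"unknown APP_ENV: {env_name!r}")
    config = dict(COMMITTED[env_name])
    for name in SECRET_NAMES:
        if name not in environ:
            raise KeyError(f"{name} must be set in the environment")
        config[name] = environ[name]
    return config
```

With this shape, copying the ENVs from production to staging copies only the secrets; whether the installation can send real money is decided by the committed, reviewed values.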
We created Config, a SaaS for managing configuration files. We use an environment variable just to tell us what environment we are in, and use that information to pull the correct configuration file. This is all done at deployment time. The environment can come from the system itself, or be taken from an Ansible variable. You can achieve a similar effect with Git, but Config specializes in configuration files, and seamlessly handles commonality and differences between environments.
lol, a SaaS for managing configuration.
The hybrid approach seems to work quite well -- an env var that determines the configuration file source. Storing everything in env vars just moves the goalposts back further (where do the env vars come from? A file or a deploy service, correct?). It also doesn't offer any more secure method of token storage; public/private keys for accessing encrypted config from a deploy service take care of that.
Moving to almost exclusive use of containerised apps has made me completely convert to settings being set via environment variables (with defaults should they not exist...usually stored in a local config file).
-- I love zombie threads like this that continue to be relevant. :)
re: I love zombie threads like this that continue to be relevant -- they continue to be relevant when the basic question STILL does not have a single obviously correct answer.
This is still a very relevant topic today, indeed.
I haven't found a satisfactory solution to it yet.
BTW, this gist is referenced from https://github.com/juxt/aero/tree/743e9bc495425b4a4a7c780f5e4b09f6680b4e7a#use-environment-variables-sparingly
This Aero library is worth considering, because its authors have really thought this problem domain through.
While it's been made for Clojure programs primarily, it uses the EDN format for declaring a configuration.
There are libraries for processing EDN written in many languages: https://github.com/edn-format/edn/wiki/Implementations
The ideas behind Aero are language independent though, so it's highly recommended to check it out at least.
What I've realized, though, is that the problem of creating and maintaining state in programs that rely on configuration is still not very well solved.
The Clojure community is thinking about such state-management issues and there are numerous approaches to tackle it.
The same company that made Aero attempts to solve these issues with their https://github.com/juxt/clip library.
re: What iolloyd said (in 2016): "values that change depending on the environment should only be available in that environment."
Perhaps it is grammatically persnickety of me, but I'd like to point out that in this context, environment refers to things like:
- production, QA, staging, development, bug_fix_1, Wednesday, ....
- Asia, Europe, North America, Australia, ....
not to environment variables specifically.
Each of those run-environments may have a custom copy of the configuration file, and that meets the needs of separating configuration state from application code. As well as environment variables and command line parameters, one could have cascading config files. E.g.:
myApp --config=default.cfg --config=devel.cfg --config=test-bug-fix.cfg
myApp --config=default.cfg --config=production.cfg --config=NorthAmerica.cfg --config=UnitedStates.cfg --config=Colorado.cfg
Each file is read in order, overriding whatever configuration has been built so far.
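
A sketch of how such a cascade might be read, in Python. The file format is an assumption (flat `KEY=VALUE` lines with `#` comments), as are the `parse_flat`, `cascade` and `main` helpers; the only thing taken from the example above is the repeated `--config` flag.

```python
import argparse


def parse_flat(text):
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {line!r}")
        values[key.strip()] = value.strip()
    return values


def cascade(paths):
    """Read each file in order; keys in later files override earlier ones."""
    merged = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            merged.update(parse_flat(handle.read()))
    return merged


def main(argv=None):
    parser = argparse.ArgumentParser()
    # action="append" collects every --config occurrence, in command-line order.
    parser.add_argument("--config", action="append", default=[])
    args = parser.parse_args(argv)
    return cascade(args.config)
```

Keeping the values as flat strings also keeps the format easy to read from a shell script or any other language.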
I agree with Jesse here, and I would (attempt to) disambiguate it by using the word "installation" (or perhaps "instance") in preference to "environment".
I haven't done more than skim the Aero documentation so far, but it's from Juxt so I assume it's good :-) There's another older and probably better-known precedent for a reasonably systematic way to configure from cascading files/environment variables, which is Ruby's bundle tool.
Ok adding here in support of envars:
What I've proposed to our dev team which was well received:
Still use a config file, but that config file solely defines WHAT environment variables the application is expecting. The config file is the same, regardless of the environment the code is being deployed to. This config file is committed to source control and is used as a self-documenting way of defining the expected config.
DO NOT export envars into the actual shell environment. Instead, only pre-load them at runtime and spawn a child process which is the app itself. Therefore the envars are locked into the application process. The key here is using envdir from daemontools, which can run on any platform without any special config (not platform specific). The envdir values come from a node-specific, locked-down directory holding the values needed. This directory is the same on every node across our company: /etc/mycorpname/envdir/
This can be further locked down by gpg encrypting the envdir source by using gpgenv as well. However we have not explored this yet.
The above methodology actually alleviates some of your arguments against (#1 and #5 are directly slashed). However there is always the chicken-egg scenario. In order to unlock your secrets, you need a secret. That originating secret in my scenario is the envdir directory itself... which is carefully locked down and lives outside of the source. It is unified across all nodes though due to the same envdir path being used, regardless of the node, environment, or application.
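
For anyone who hasn't used envdir, here is a simplified Python sketch of the idea, not a reimplementation of the daemontools tool: each file in the directory names a variable, its first line is the value, and only the spawned child process sees the result. The `read_envdir` and `run_with_envdir` helpers are hypothetical names, and details such as how the real envdir treats empty files are left out.

```python
import os
import subprocess


def read_envdir(directory):
    """Map each regular file's name to the first line of its contents."""
    values = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as handle:
            values[entry] = handle.readline().rstrip(" \t\r\n")
    return values


def run_with_envdir(directory, command):
    """Run command with the parent's environment plus the directory's values."""
    child_env = dict(os.environ)             # copy; the parent is left untouched
    child_env.update(read_envdir(directory))
    return subprocess.run(command, env=child_env, check=True,
                          capture_output=True, text=True)
```

Because the values are only ever placed in the copy handed to the child, nothing is exported into the launching shell.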