- Don't install additional packages into the base environment
- Give meaningful names to environments, e.g., PROJECT_NAME-env
- Always specify package version numbers
- Install all required packages in one command to reduce later conflicts
- Install pip in each environment to avoid using the system default version
- Always version control the environment.yml file
- Always export environments with the --no-builds flag
- Creating a new environment is better than updating one, even with --prune
- Use the --force option to overwrite an existing environment
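The first practice in the list can be enforced with a small guard before any install command. The following is a minimal sketch, assuming a POSIX shell and assuming that conda activate exports the CONDA_DEFAULT_ENV variable with the name of the active environment; the function name require_non_base_env is hypothetical.

```shell
# Hypothetical helper: refuse to continue while the base environment
# (or no environment at all) is active.
# Assumption: `conda activate` exports CONDA_DEFAULT_ENV with the name
# (or path) of the currently active environment.
require_non_base_env() {
    case "${CONDA_DEFAULT_ENV:-}" in
        "")
            echo "No conda environment is active." >&2
            return 1
            ;;
        base)
            echo "Refusing to install into the base environment." >&2
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}
```

A possible use is to chain it in front of an install, e.g., require_non_base_env && conda install numpy=1.18, so that the install only runs inside a project environment.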
Environments are created using the conda create command.
conda create --name basic-ml-env python pip
Package versions can be pinned using major.minor version numbers:
conda create --name basic-ml-env python=3.6 pip=20.0
By default, environments are created in the default environment directory, but
an environment can also be created in a specific directory.
This is useful when the environment should live in a sub-directory called env
inside the main project directory.
conda create --prefix ./env python pip
A named environment is activated using:
conda activate basic-ml-env
If the environment is in the env directory of the project, then use:
conda activate ./env
An environment is deactivated using:
conda deactivate
Packages can be searched using:
conda search scikit-learn
Most packages are available in conda repositories and can be installed as:
conda activate basic-ml-env
conda install numpy=1.18 scikit-learn=0.22
If a package is not available in the conda repositories, it can be installed
via pip.
conda activate basic-ml-env
pip install "combo==0.1.*"
When installing new packages into an existing environment, the
--freeze-installed option of the conda install command keeps the previously
installed packages at their current versions.
As a result, conda may install an older version of the requested package for
compatibility instead of updating the pre-installed packages.
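The option described above can be wrapped in a small helper so that it is not forgotten. This is a minimal sketch, assuming a POSIX shell; the function name install_frozen is hypothetical, and the package passed to it is only an example.

```shell
# Hypothetical helper: install extra packages into an existing environment
# while keeping every already-installed package at its current version.
# Usage: install_frozen ENV_NAME PACKAGE...
install_frozen() {
    if [ "$#" -lt 2 ]; then
        echo "usage: install_frozen ENV_NAME PACKAGE..." >&2
        return 2
    fi
    env_name="$1"
    shift
    # The remaining arguments are passed through as package specifications.
    conda install --name "$env_name" --freeze-installed "$@"
}
```

For example, install_frozen basic-ml-env scipy would add scipy to basic-ml-env without changing the versions of the packages already present there.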
The list of available environments is obtained as follows:
conda env list
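The output of this command can also be used in scripts, for instance to check whether an environment already exists before creating it. The following is a minimal sketch; the function name env_exists is hypothetical, and it assumes that each named environment is printed on its own line with the name in the first column and that header lines start with #.

```shell
# Hypothetical helper: succeed if a named environment appears in the
# output of `conda env list`.
# Assumption: the environment name is the first whitespace-separated
# field of its line, and header lines start with '#'.
env_exists() {
    conda env list | awk -v name="$1" '
        $1 == name { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

A typical use would be env_exists basic-ml-env || conda create --name basic-ml-env python pip. Environments created with --prefix have no name, so this check only covers named environments.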
The list of packages for the current environment is obtained as follows:
conda list
while the list of packages for an arbitrary environment is obtained as:
conda list --name basic-ml-env
or
conda list --prefix /path/to/env
An environment and all of its packages are removed as follows:
conda remove --name name-of-env --all
or
conda remove --prefix /path/to/env --all
The scaffolding of a conda environment can be defined as a YAML text file.
The default file name is environment.yml.
The command conda env create looks for an environment.yml file in the
current directory to create an associated environment.
If the environment file is saved with a different name than the default, the
following command can be used instead.
conda env create --file alt-environment-file-name.yml
The basic structure of an environment.yml file is as follows:
name: machine-learning-env
dependencies:
- ipython
- matplotlib
- pandas
- python
- scikit-learn
- pip
- tensorflow=1.13
The above environment.yml file would create an environment named
machine-learning-env.
If the environment is intended to be created in the ./env sub-directory, then
the name property should be set to null.
The following file snippet shows such an example and also includes version
numbers for the packages.
name: null
dependencies:
- ipython=7.13
- matplotlib=3.1
- pandas=1.0
- python=3.6
- scikit-learn=0.22
- pip=20.0
- tensorflow=1.13
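Because the name property is null in such a file, the target directory has to be supplied when the environment is created. The following is a minimal sketch, assuming that conda env create is given both --prefix and --file; the function name create_project_env and its default arguments are hypothetical.

```shell
# Hypothetical helper: create the environment described by an environment
# file inside a project-local directory instead of the default location.
# Usage: create_project_env [ENV_FILE] [ENV_DIR]
create_project_env() {
    env_file="${1:-environment.yml}"
    env_dir="${2:-./env}"
    if [ ! -f "$env_file" ]; then
        echo "environment file not found: $env_file" >&2
        return 1
    fi
    conda env create --prefix "$env_dir" --file "$env_file"
}
```

Called without arguments from the project directory, it would read environment.yml and create the environment in ./env, which can then be activated with conda activate ./env.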
The conda env export command exports the environment details, e.g., the
environment name and the list of packages with version and build information.
The --no-builds flag omits the build strings, so that only the version numbers
are specified.
Build strings are often platform-specific, so leaving them out makes the
exported file easier to reproduce on other machines.
conda env export --name basic-ml-env --no-builds
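The export is printed to standard output, so it is usually redirected into a file that can be version controlled. The following is a minimal sketch; the function name export_env is hypothetical, and it assumes that the export contains a machine-specific prefix: line, which is dropped here.

```shell
# Hypothetical helper: write a shareable environment file.
# Assumption: the export contains a `prefix:` line with the absolute path
# of the environment on the exporting machine; it is filtered out because
# it is not meaningful elsewhere.
export_env() {
    env_name="$1"
    out_file="${2:-environment.yml}"
    conda env export --name "$env_name" --no-builds \
        | grep -v '^prefix:' > "$out_file"
}
```

For example, export_env basic-ml-env would write environment.yml to the current directory, ready to be committed alongside the project.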
An environment can be updated from an environment.yml file as:
conda env update --name basic-ml-env --file environment.yml --prune
The --prune flag removes packages that are no longer required.
If a fresh environment is required while another environment with the same name already exists, the --force flag overwrites the existing environment.
conda env create --name basic-ml-env --file environment.yml --force
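The same result can be spelled out in two explicit steps using the removal and creation commands shown earlier. This is a minimal sketch rather than a replacement for --force; the function name recreate_env is hypothetical, and it assumes that the environment already exists and that conda remove accepts --yes to skip the confirmation prompt.

```shell
# Hypothetical helper: rebuild an environment from scratch instead of
# updating it in place.
# Usage: recreate_env ENV_NAME [ENV_FILE]
recreate_env() {
    env_name="$1"
    env_file="${2:-environment.yml}"
    # Remove the old environment together with all of its packages;
    # stop here if the removal fails (e.g., the environment is missing).
    conda remove --name "$env_name" --all --yes || return 1
    # Create it again from the environment file.
    conda env create --name "$env_name" --file "$env_file"
}
```

For example, recreate_env basic-ml-env would discard the current basic-ml-env and rebuild it from environment.yml in the current directory.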
If JupyterHub or JupyterLab is installed in the base environment and Jupyter
should run code in a particular environment rather than in the base
environment, a kernel spec file can be created to enable this.
This may require the installation of the nb_conda_kernels package in the
base environment.
conda install jupyterlab nb_conda_kernels
Before creating the kernel spec file, the conda environment should have
the ipykernel package installed.
The following environment.yml file can be used as a starting point for this
purpose.
name: xgboost-env
dependencies:
- ipykernel=5.3
- ipython=7.13
- matplotlib=3.1
- pandas=1.0
- python=3.6
- scikit-learn=0.22
- xgboost=1.0
- pip=20.0
Next, the specific conda environment is created as:
conda env create --file environment.yml --force
Now, the environment is activated and the kernel spec file is created as:
conda activate xgboost-env
python -m ipykernel install --user --name xgboost-env --display-name "XGBoost"
The kernel spec entries can be listed and removed using:
jupyter kernelspec list
jupyter kernelspec uninstall my-env # equivalent: jupyter kernelspec remove my-env
Channels are used for distributing packages.
The Anaconda-managed channels are referred to as defaults and, as the name
suggests, packages are searched for and installed from this channel by default.
conda-forge is a popular, community-managed channel that gets updated
frequently, while the packages in the defaults channel get updated after
extensive quality control and/or once a release is deemed stable enough.
Sometimes, conda-forge also hosts packages that do not make their way into
the defaults channel.
Installing a package from a specific channel requires the following:
conda install --channel conda-forge --name basic-ml-env scipy=1.3
Multiple channels can be specified in a single command, where the channel
specified first has higher priority than the channel specified later on.
For example, in the following command, conda-forge has higher priority than
bioconda.
conda install --channel conda-forge --channel bioconda scipy=1.3
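A channel order can also be stored in the conda configuration so that it does not have to be repeated on every command. The following is a minimal sketch; the function name set_channel_order is hypothetical, and it assumes that conda config --append channels NAME adds NAME at the lowest-priority end of the configured list. Note that it modifies the user's conda configuration.

```shell
# Hypothetical helper: persist a channel order in the conda configuration.
# Channels are given from highest to lowest priority.
# Assumption: `conda config --append channels NAME` places NAME at the
# lowest-priority end of the list, so appending in order preserves it.
set_channel_order() {
    for channel in "$@"; do
        conda config --append channels "$channel" || return 1
    done
    # Print the resulting list so the order can be checked.
    conda config --show channels
}
```

For example, set_channel_order conda-forge bioconda is intended to leave conda-forge ahead of bioconda, matching the command above; channels that were already configured keep their earlier positions.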
Searching for and installing a package from a specific channel is carried out as follows:
conda search conda-forge::kaggle
conda install --name basic-ml-env conda-forge::kaggle
The environment.yml file can also provide a list of channels to be used for
searching and installing packages, including the channel priority.
The order of packages in an environment.yml file does not imply priority, but
the order of channels does.
name: deeplearning-env
channels:
- intel
- conda-forge
- defaults
dependencies:
- ipykernel=5.3
- ipython=7.13
- matplotlib=3.1
- pandas=1.0
- python=3.6
- scikit-learn=0.22
- xgboost=1.0
- tensorflow-intel=1.13
- pip=20.0
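Packages that are only available via pip, such as combo from the earlier example, can be recorded in the same file under a nested pip: key. The following is a minimal sketch that writes such a file from the shell; the file name pip-example-environment.yml and the environment name pip-example-env are placeholders, not names used elsewhere in this document.

```shell
# Write a hypothetical environment file that lists conda packages and,
# under a nested `pip:` key, a package that is installed via pip.
# The file name and the environment name are placeholders.
cat > pip-example-environment.yml <<'EOF'
name: pip-example-env
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.6
  - scikit-learn=0.22
  - pip=20.0
  - pip:
    - combo==0.1.*
EOF
```

The resulting file can then be passed to conda env create --file pip-example-environment.yml. Listing pip itself among the conda dependencies follows the practice of installing pip in each environment.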