Let's say we would like to have a Python project laid out like this:
:~/projects/myproject$ tree .
.
├── notebooks
│   ├── notebook1.ipynb
│   └── notebook2.ipynb
└── src
    ├── package1
    │   ├── __init__.py
    │   └── module1.py
    └── package2
        ├── __init__.py
        └── module2.py
So, we want src and notebooks to live in different directories. The problem arises when we want to import a module from a notebook: ipython will not be able to find it.
# This code in notebooks/notebook1.ipynb fails with an ImportError
from package1 import module1
Why does this happen? When we run any notebook in the notebooks directory, the current working directory becomes ~/projects/myproject/notebooks, so any import that looks for a package in the src directory fails: src is a sibling directory, and nothing tells the interpreter to search it.
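A quick way to confirm this from inside a notebook is to print the working directory and ask the import system whether it can locate the package. This is only a minimal sketch: the helper name describe_import_context is hypothetical, and package1 is the package from the layout above.

```python
import importlib.util
import os


def describe_import_context(name):
    """Report the working directory and whether top-level module `name` can be located."""
    # find_spec returns None when no top-level module or package with this name is on sys.path
    spec = importlib.util.find_spec(name)
    return {"cwd": os.getcwd(), "found": spec is not None}


# Run from notebooks/, "found" is expected to be False until src is added to the search path
print(describe_import_context("package1"))
```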
What can we do to fix this? Well, we want python to look for packages in the src directory. Python decides where to look (as many other languages do) partly by consulting the environment variable PYTHONPATH, which is a string of the form path1:path2:... listing the directories in which python must look for packages. [1]
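To make the path1:path2:... format concrete, the sketch below splits such a string into its entries. The helper name split_pythonpath is hypothetical; os.pathsep is the platform's separator (":" on Linux and macOS, ";" on Windows).

```python
import os


def split_pythonpath(value):
    """Split a PYTHONPATH-style string into its non-empty directory entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


# PYTHONPATH may well be unset, in which case this prints an empty list
print(split_pythonpath(os.environ.get("PYTHONPATH", "")))
```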
As [2] says, this list of paths can be manipulated programmatically through sys.path. A workaround, then, is to put a small block of code at the beginning of the notebook like this:
import os
import sys
# this should be ~/projects/myproject/src
src_path = os.path.abspath("../src")
sys.path.append(src_path)
This code adds ~/projects/myproject/src to sys.path (the PYTHONPATH environment variable itself is left unchanged), so the interpreter can find later imports of src's packages.
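One caveat with the snippet above is that re-running the cell appends the same directory again each time. A possible variant, sketched here under the assumption that the notebook is started from the notebooks directory, guards against duplicates; the helper name add_to_sys_path is hypothetical.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Append `directory` to sys.path once and return its absolute path."""
    resolved = str(Path(directory).resolve())
    # Skip the append if an earlier run of this cell already added the directory
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Assumes the current working directory is ~/projects/myproject/notebooks
src_path = add_to_sys_path("../src")
# After this, an import such as `from package1 import module1` can be resolved
```

Running the cell several times leaves sys.path with a single entry for src.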
[1] Take care if you use something like pyenv or virtualenvwrapper. The environment variable PYTHONPATH may then be unset, or may not reflect the directories the interpreter actually searches; inspect sys.path instead.
[2] https://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH