This post describes setting up Astropy in a Django project that runs under Apache with mod_wsgi. I ran into enough issues that I decided to write them down for my future self.
For the GOTO project, I'm running the website as a Django project on an Apache server through mod_wsgi. This is overkill for the few, rather static, pages currently in use, but it is done somewhat in anticipation of more complex pages where database functionality is needed. Or, as happened recently, a page with a form.
Django handles forms pretty well and there are plenty of examples around of how to do this; the tricky part turned out to be getting Astropy to behave when run under Apache. Astropy is needed for the processing done after the form is submitted. Since that took some time to figure out, I decided to write this down.
I have not been the only one struggling with this: there are some questions on the astropy mailing lists related to this. Instead of having this in separate posts on a mailing list, I thought it'd be good to have it together. Also, the logging issue, described below, appears to be new, or at least not something I could find (it is also something that is unlikely for people to run into).
-
Fedora release 22
-
Apache 2.4
Apache 2.4 has some notable differences with regard to permissions; not really necessary here, but something to be aware of.
Install using
$ dnf install httpd-devel
-
Python
Fedora 22 has Python 3.4 in its packages, but I went ahead and also installed Python 3.5.0, which is a straightforward installation into /usr/local.
We'll want a few packages installed first, such as sqlite-devel and zlib-devel.
I have assumed compilers and pkgconfig are already installed.
$ ./configure --prefix=/usr/local --enable-shared
$ make
$ make test
$ make install
The shared library is enabled for mod_wsgi to compile against.
-
mod_wsgi
mod_wsgi is one of the popular choices to run Python projects under Apache (having replaced mod_python quite a while ago).
Fedora 22 has python3-mod_wsgi, which is compiled for Python 3.4, and does a nice job replacing the mod_wsgi for Python 2 in the Apache conf directory: you can't have mod_wsgi for Python 2 and one for Python 3 at the same time.
mod_wsgi can nowadays be installed through pip, which also installs mod_wsgi-express:
$ pip3.5 install mod_wsgi
The mod_wsgi PyPI webpage has more details about mod_wsgi-express.
I cheated, and simply symlinked the resulting *.so library into /etc/httpd/modules, instead of using mod_wsgi-express.
Since I had previously installed python3-mod_wsgi, I did some renaming and then could re-use the Fedora setup for the Apache modules. Here are the relevant listings:
$ ls -l /etc/httpd/modules/mod_wsgi*
-rwxr-xr-x. 1 root root 218224 Feb 13 2015 mod_wsgi.so
-rwxr-xr-x. 1 root root 218800 Feb 13 2015 mod_wsgi_python3.4.so
lrwxrwxrwx. 1 root root 100 Oct 28 16:50 mod_wsgi_python3.5.so -> /usr/local/lib/python3.5/site-packages/mod_wsgi/server/mod_wsgi-py35.cpython-35m-x86_64-linux-gnu.so
lrwxrwxrwx. 1 root root 21 Oct 29 13:33 mod_wsgi_python3.so -> mod_wsgi_python3.5.so
$ cat /etc/httpd/conf.modules.d/10-wsgi.conf-inactive
# NOTE: mod_wsgi can not coexist in the same apache process as
# mod_wsgi_python3. Only load if mod_wsgi_python3 is not
# already loaded.
<IfModule !wsgi_module>
LoadModule wsgi_module modules/mod_wsgi.so
</IfModule>
$ cat /etc/httpd/conf.modules.d/10-wsgi-python3.conf
# NOTE: mod_wsgi_python3 can not coexist in the same apache process as
# mod_wsgi (python2). Only load if mod_wsgi is not already loaded.
<IfModule !wsgi_module>
LoadModule wsgi_module modules/mod_wsgi_python3.so
</IfModule>
Apache is set up to load only *.conf files, so the Python 2 mod_wsgi module will be skipped.
-
Django
I'm using version 1.8.5, the most recent one at the time of writing.
$ pip3.5 install django
-
Astropy
This is also the most recent one at the time of writing: 1.0.6.
$ pip3.5 install astropy
Astropy will also install its dependency numpy.
I'm running Django in the default setup, so the basic wsgi file looks like:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()
This file will be changed below to accommodate Astropy.
I'm running Django as a separate user, gotoweb. But Apache runs as the apache user (other OSes may use www-data or nobody). So, at least for database operations, I need to give write access to the apache user:
$ whoami
gotoweb
$ setfacl -m user:apache:rwx /home/gotoweb/sites/gotoweb/gotoweb/databases/db.sqlite3
$ setfacl -m user:apache:rwx /home/gotoweb/sites/gotoweb/gotoweb/databases
(As far as I know, the directory containing a file you want to change also requires write permission.)
You could set the permissions for the media directory similarly, though I would simply make the apache user the owner of that directory.
Astropy requires a configuration directory. Usually that is
$HOME/.astropy, but the apache user doesn't have a home directory.
This is where the environment variables XDG_CONFIG_HOME and
XDG_CACHE_HOME come into play. Inside the /var/www/ directory (the
Apache document root directory), there are two directories,
astropyconfig and astropycache. Both contain a subdirectory astropy,
and both directories (and subdirectories) are owned by the apache
user:
$ mkdir -p /var/www/astropyconfig/astropy
$ mkdir -p /var/www/astropycache/astropy
$ chown -R apache:apache /var/www/astropyconfig
$ chown -R apache:apache /var/www/astropycache
I'm not sure if this is the best location for these directories, but
I found them already present on our server (probably from a previous
setup), so I went with it. Somewhere in /etc/httpd or /etc/apache2
might be better, since the configuration files for Apache tend to live
there.
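Before pointing Astropy at these directories, it can help to confirm that they exist and are writable by the user the web server runs as. The following is a minimal sketch, not part of my actual setup: the function name check_xdg_dirs is hypothetical, and it assumes the layout described above (an astropy subdirectory under each of the two directories). Run it as the apache user for the result to mean anything.

```python
import os


def check_xdg_dirs(envvars, subdir="astropy"):
    """Return a list of problems found for the given XDG directories.

    `envvars` maps variable names (e.g. XDG_CONFIG_HOME) to directory
    paths. The `subdir` default reflects the layout assumed in this post.
    """
    problems = []
    for name, base in sorted(envvars.items()):
        path = os.path.join(base, subdir)
        if not os.path.isdir(path):
            problems.append("{}: {} is not a directory".format(name, path))
        elif not os.access(path, os.W_OK | os.X_OK):
            # The current user cannot create files inside this directory.
            problems.append("{}: {} is not writable".format(name, path))
    return problems


if __name__ == "__main__":
    # Paths taken from this post; adjust them for your own server.
    for problem in check_xdg_dirs({
        "XDG_CONFIG_HOME": "/var/www/astropyconfig",
        "XDG_CACHE_HOME": "/var/www/astropycache",
    }):
        print(problem)
```

An empty list means both directories look usable for the user running the script.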
Now you need to set the environment variables. Don't try this in the
Apache configuration file with the use of mod_env and the SetEnv
directive; that will set the environment variables for a wider range
than you need. It's better to set the environment inside the Python
WSGI application. (For more details, see Graham Dumpleton's post on
this.)
I've chosen to adapt the wsgi.py file. Since these are settings local
to the system, I put them separately in an envvars.py file that is
imported into wsgi.py, with a template file that is checked into the
repository.
$ cat envvars_template.py
envvars = {
}
$ cat envvars.py
envvars = {
    'XDG_CONFIG_HOME': '/var/www/astropyconfig',
    'XDG_CACHE_HOME': '/var/www/astropycache'
}
and the wsgi.py file:
import os
from django.core.wsgi import get_wsgi_application
try:
    from gotoweb.server.envvars import envvars
    os.environ.update(envvars)
except ImportError:
    pass
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()
Alternatively, you can set these environment variables in your Django settings; that is probably a nicer place. In my case, however, there was another reason to import astropy earlier, as described below.
It turns out there was one more issue: I had set up the Django logging
settings to silence the Astropy logging (it's good to have, but
sometimes Astropy becomes annoyingly noisy). That seems to collide
with Astropy itself: Astropy sets up a bunch of things when first
imported, including a logger. That logger is of class AstropyLogger,
and I think that's where things went wrong.
I got errors in my Apache log like:
import astropy
[Fri Feb 26 04:22:04.879780 2016] [wsgi:error] [pid 18118] File "/home/gotoweb/.virtualenvs/gotoweb351/lib/python3.5/site-packages/astropy/__init__.py", line 286, in <module>
[Fri Feb 26 04:22:04.879784 2016] [wsgi:error] [pid 18118] log = _init_log()
[Fri Feb 26 04:22:04.879788 2016] [wsgi:error] [pid 18118] File "/home/gotoweb/.virtualenvs/gotoweb351/lib/python3.5/site-packages/astropy/logger.py", line 111, in _init_log
[Fri Feb 26 04:22:04.879792 2016] [wsgi:error] [pid 18118] log._set_defaults()
[Fri Feb 26 04:22:04.879798 2016] [wsgi:error] [pid 18118] AttributeError: 'Logger' object has no attribute '_set_defaults'
If you first let Django set up the 'astropy' logger, you haven't told it that this logger has its own class. Thus, Django creates and, importantly, initialises the 'astropy' logger with the standard logging.Logger class.
Next, astropy gets imported and fetches the 'astropy' logger with
logging.getLogger('astropy'). Normally, that call creates the logger,
and astropy has just told the logging module to use the AstropyLogger
class, with the line logging.setLoggerClass(AstropyLogger). But here
the logger doesn't get created; it is simply retrieved, and thus is
not of the correct class. Any AstropyLogger-specific attributes are
missing, since it was previously created as a logging.Logger instance.
Thus, the next line, log._set_defaults(), crashes, since
_set_defaults() is a method specific to AstropyLogger.
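The ordering problem can be reproduced with the standard logging module alone. This is a small sketch under my own naming: DemoLogger and the logger names "demo_early" and "demo_fresh" are hypothetical stand-ins for AstropyLogger and 'astropy', and nothing here calls Astropy itself.

```python
import logging


class DemoLogger(logging.Logger):
    """Stand-in for a library-specific logger class with an extra method."""

    def _set_defaults(self):
        self.setLevel(logging.INFO)


def demonstrate():
    original_class = logging.getLoggerClass()
    # Step 1: something (Django's logging config, say) creates the logger
    # first, so it is instantiated with the class active at that moment.
    early = logging.getLogger("demo_early")
    # Step 2: the library registers its own logger class afterwards.
    logging.setLoggerClass(DemoLogger)
    try:
        # Retrieval returns the existing object; its class is unchanged.
        same = logging.getLogger("demo_early")
        # A logger that does not exist yet is created with the new class.
        fresh = logging.getLogger("demo_fresh")
    finally:
        # Put the global logger class back so nothing else is affected.
        logging.setLoggerClass(original_class)
    return {
        "same_object": early is same,
        "early_has_method": hasattr(same, "_set_defaults"),
        "fresh_has_method": hasattr(fresh, "_set_defaults"),
    }


if __name__ == "__main__":
    print(demonstrate())
```

The early logger lacks _set_defaults() while the fresh one has it, which matches the AttributeError in the Apache log.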
My solution so far (other than removing the 'astropy' logger from the
Django settings module) is to import astropy before running the
actual WSGI application. The final wsgi.py file, minus comments and
blank lines, is now:
import os
from django.core.wsgi import get_wsgi_application
try:
    from gotoweb.server.envvars import envvars
    os.environ.update(envvars)
except ImportError:
    pass
try:
    import astropy
except ImportError:
    pass
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()
I have wrapped the astropy import in a try-except ImportError as
well, so that if astropy is not around or cannot be imported, the
pages that don't require astropy will still be up and visible. (This
also means that in the Django views.py, I need to import astropy
inside a specific view class/function/method, and not at the top of
the views.py file, so that again, it only attempts to (re)load
astropy when it is needed.)
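A deferred import along these lines could look as follows. It is a sketch rather than my actual views.py: load_optional and process_submission are hypothetical names, and the dictionary returned here stands in for whatever response a real Django view would build.

```python
import importlib


def load_optional(module_name):
    """Import a module on demand; return None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def process_submission(form_data):
    """Hypothetical stand-in for a view: astropy is only touched when called."""
    astropy = load_optional("astropy")
    if astropy is None:
        # Pages that do not need astropy are unaffected by this failure.
        return {"ok": False, "error": "astropy is not available"}
    return {"ok": True, "astropy_version": astropy.__version__, "data": form_data}
```

Python caches a successfully imported module in sys.modules, so repeated calls do not reload it.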
Importing astropy before running the WSGI application also requires setting the environment variables in wsgi.py, not in settings.py.
Finally, here is my shortened Apache configuration file for this
setup. The most important part here is the WSGIApplicationGroup
%{GLOBAL} line. This is a finicky thing to do with Numpy, which is an
Astropy dependency. Numpy is a C extension module that uses Python's
simplified GIL state API, and that API only works reliably in the
main interpreter of a process. By default, mod_wsgi runs each
application in its own sub-interpreter, not in the main one, which
can lead to deadlocks and other issues. Setting the
WSGIApplicationGroup to the %{GLOBAL} server variable forces the
application to run in the main interpreter and avoids this issue. See
the [mod_wsgi wiki](https://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API)
for a better explanation.
<VirtualHost *:80>
ServerName goto-observatory.org
ServerAlias www.goto-observatory.org
ServerAdmin [email protected]
Alias /static/ /home/gotoweb/sites/gotoweb/gotoweb/static/
Alias /media/ /home/gotoweb/sites/gotoweb/gotoweb/media/
<Directory /home/gotoweb/sites/gotoweb/gotoweb/static>
Require all granted
</Directory>
<Directory /home/gotoweb/sites/gotoweb/gotoweb/media>
Require all granted
</Directory>
WSGIDaemonProcess gotoweb python-path=/home/gotoweb/.virtualenvs/gotoweb35/lib/python3.5/site-packages:/home/gotoweb/sites/gotoweb
WSGIProcessGroup gotoweb
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /home/gotoweb/sites/gotoweb/gotoweb/server/wsgi.py
<Directory /home/gotoweb/sites/gotoweb/gotoweb>
Require all granted
<Files wsgi.py>
Require all granted
</Files>
</Directory>
CustomLog /var/log/httpd/gotoweb-access.log combined
ErrorLog /var/log/httpd/gotoweb-error.log
</VirtualHost>
Most of this follows the standard Django documentation on setting up the Apache configuration. Some notable differences and notes:
-
I have lazily put the static and media directories inside the project directory. That is generally not advised, so I'll be moving those away in due time.
-
The WSGIDaemonProcess has a python-path option, which is set to the site-packages for the Python executable that is installed in the virtual environment created by gotoweb. This way, it uses the Python packages that are installed in the virtual environment. It also includes the path to the Django project.
Note that the actual Python executable, the built-in modules and the Python shared library still reside in /usr/local, where they were installed earlier. The mod_wsgi shared object is linked against this library:
$ ldd mod_wsgi_python3.5.so
linux-vdso.so.1 (0x00007ffe09bde000)
libpython3.5m.so.1.0 => /usr/local/lib/libpython3.5m.so.1.0 (0x00007f6fdcefc000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6fdccc3000)
libc.so.6 => /lib64/libc.so.6 (0x00007f6fdc903000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f6fdc6ff000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f6fdc4fb000)
libm.so.6 => /lib64/libm.so.6 (0x00007f6fdc1f3000)
/lib64/ld-linux-x86-64.so.2 (0x0000003a40400000)
I recently ran into another, cryptic, error:
...
import numpy
...
SystemError: initialization of multiarray raised unreported exception
That last line gives very few Google hits. The top one dealt with 32 versus 64 bit, but I checked that this is not the case here: it's 64 bit all the way.
The 32/64-bit issue did provide a hint, though. I decided to output the Apache/WSGI running environment to an HTML page:
import sys, os
context['python'] = str(sys.executable) + "\n" + str(sys.version)
context['env'] = "\n".join(["{}: {}".format(key, value) for key, value in os.environ.items()])
The environment didn't reveal too much, but my Python version was 3.5.0.
Which is odd: /usr/local/bin/python3.5, which my virtualenv uses, is 3.5.1.
What is going on?
I figured the mod_wsgi so file still used the older 3.5.0 Python so
library file, so I recompiled mod_wsgi. Or actually, I did a pip uninstall mod_wsgi
and pip install mod_wsgi (in my virtualenv) and symlinked the mod_wsgi_python3.5.so
file to the newly created mod_wsgi so file (/home/webproject/.virtualenvs/webproject/lib/python3.5/site-packages/mod_wsgi/server/mod_wsgi-py35.cpython-35m-x86_64-linux-gnu.so).
Good thing path and file names aren't limited to, say, 8.3 characters.
Alas: no luck. Still the same error.
I compiled and reinstalled Python 3.5.1 from source: no luck. I then wanted to
compare the md5sum of /usr/local/lib/libpython3.5.so with that in the Python
source directory, where I had just compiled things. Behold: no libpython*.so
file in that directory.
Ah, of course: ./configure --prefix=/usr/local --enable-shared.
And now Python 3.5.1 builds with a shared library.
make altinstall, verify the md5sums (and see that it is indeed a different one
now for /usr/local/lib/libpython3.5.so), reinstall mod_wsgi to have it compiled
against the proper shared library, and everything works again as before.
The moral of this addendum is that a standard Python build will create a new (static) Python executable, but the old Python so library may remain. And mod_wsgi is very picky about minor versions, or rather: mod_wsgi still ran 3.5.0 through the old library, but all my numpy/astropy/healpy etc. libraries were compiled against the static Python 3.5.1 executable.
The same, by the way, holds true for numpy (which is the cause of the actual error
shown above), and probably lots of other compiled Python libraries: minor
version changes in Python and its shared object library can cause SystemErrors.
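A mismatch like this is easier to spot if the running application reports its own interpreter details. The sketch below uses only the standard library; the function names runtime_report and version_mismatch are my own for illustration, and the Py_ENABLE_SHARED and LDLIBRARY config variables may be missing (None) on some platforms.

```python
import platform
import sys
import sysconfig


def runtime_report():
    """Collect details about the interpreter that is actually running."""
    return {
        "version": platform.python_version(),
        "executable": sys.executable,
        "prefix": sys.prefix,
        # 1 if Python was configured with --enable-shared, 0 or None otherwise.
        "shared_build": sysconfig.get_config_var("Py_ENABLE_SHARED"),
        # File name of the Python library recorded at build time.
        "library": sysconfig.get_config_var("LDLIBRARY"),
    }


def version_mismatch(expected_version):
    """Return a message if the running Python differs from expected_version."""
    running = platform.python_version()
    if running != expected_version:
        return "running Python {} but expected {}".format(running, expected_version)
    return None


if __name__ == "__main__":
    for key, value in sorted(runtime_report().items()):
        print("{}: {}".format(key, value))
```

Calling version_mismatch("3.5.1") from inside the WSGI application would have flagged the stale 3.5.0 library right away, since under mod_wsgi the reported version comes from the shared library that the module loaded.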