I recently set up an internal PyPI server at work for hosting some of our internal Python packages. I used djangopypi, which was pretty easy to set up and get working in the Django runserver. Most of the difficulty came in getting it to run in Apache with mod_wsgi. I always find mod_wsgi a little difficult to set up, so I thought I’d jot down a few notes on things I ran into and how I solved them, for future readers (including my future self!).
Prerequisite: Which Python are you going to use?
I’m going to cover this only briefly, because getting Python installed on your OS of choice could be a blog post in itself. Are you going to use the system Python? This is easiest if your OS ships a recent enough version. Mine, CentOS 5.5 in this case, comes with Python 2.4.3, which is pretty long in the tooth, so I wouldn’t bother with it. Even if it were a modern Python, I tend to shy away from doing much with the system Python, because it’s the system Python and I don’t want to break my system. You might be OK with using a virtualenv of your system Python to get some isolation, or you might do as I did in this case and simply install a fresh new version of Python with no connection whatsoever to the system Python. In fact, I installed a new Python and created a virtualenv in it specifically for the Django stuff I was doing.
Depending on your OS and package repositories, mod_wsgi might be just an apt-get install or yum install away. For me it wasn’t, so I downloaded it and built it from source. Make sure that mod_wsgi is compiled against the Python version that you actually want to use. It’s pretty easy for it to link against your system Python or some other Python you don’t intend to use, and that will cause problems. I built mine like this:
PATH=/usr/local/bin:/usr/bin:/bin ./configure --with-python=/usr/local/bin/python
sudo make install
I needed the LD_RUN_PATH stuff so that mod_wsgi.so could link with /usr/local/lib/libpython2.7.so.1.0 at runtime. This installed /usr/lib64/httpd/modules/mod_wsgi.so (which is also available at /etc/httpd/modules/mod_wsgi.so because I already had a symlink from
Turn on mod_wsgi in the Apache config
For me, I had to do:
$ cat /etc/httpd/conf.d/wsgi.conf
LoadModule wsgi_module modules/mod_wsgi.so
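Once the module is loaded, one way to confirm that mod_wsgi really linked against the Python you intended is to serve a tiny diagnostic WSGI app. What follows is a minimal sketch, not something from the original setup: the file would be pointed to by a WSGIScriptAlias of your choosing, and the `mod_wsgi.version` key is read defensively in case it is absent from the environment.

```python
import sys


def application(environ, start_response):
    """Report which Python interpreter is running this WSGI app."""
    lines = [
        "sys.version: %s" % sys.version.replace("\n", " "),
        "sys.prefix: %s" % sys.prefix,
        # Set by mod_wsgi when running under Apache; None elsewhere.
        "mod_wsgi.version: %r" % (environ.get("mod_wsgi.version"),),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the reported version or prefix belongs to the system Python rather than the one under /usr/local, the module was most likely built or linked against the wrong interpreter.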
Important: Turn off mod_python!
This is sort of commonly-known wisdom for people who are in the know about mod_wsgi, but I don’t set up mod_wsgi very often and so I forgot about it. And it caused great pain, because it caused a very mysterious failure. Basically, I had a state where a “Hello World” WSGI app worked just fine, but when I tried to use the wsgi.py of my Django site, it would crap out silently. No errors in the Apache error log even with LogLevel set to debug. Through some ugly hacking of the wsgi.py and some files in Django itself, I could see that it was silently dying inside Django when it got to the statement:
from threading import Lock
I did a bunch of searching and found posts of people having similar troubles but I didn’t find an answer.
And then I remembered to try turning off mod_python and that worked! Which was good, but it burned a lot of my time and psychic energy. All I had to do was comment out the following line in
# LoadModule python_module modules/mod_python.so
I wonder if mod_wsgi could be modified to detect the presence of mod_python and if it finds it, blast the logs with warnings? Frankly, I don’t know enough about Apache modules to know if this is possible.
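Short of mod_wsgi doing the detection itself, a check can be run outside Apache. The sketch below scans Apache config text for active LoadModule lines and flags the case where both modules are loaded. The function names are made up for illustration, and it only looks at the text you hand it: it does not follow Include directives, so you would have to feed it the contents of each config file yourself.

```python
import re

# Matches an active (uncommented) LoadModule directive, e.g.
#   LoadModule wsgi_module modules/mod_wsgi.so
LOADMODULE_RE = re.compile(r"^\s*LoadModule\s+(\S+)\s+(\S+)", re.IGNORECASE)


def loaded_modules(config_text):
    """Return the set of module names in active LoadModule lines."""
    modules = set()
    for line in config_text.splitlines():
        match = LOADMODULE_RE.match(line)
        if match:
            modules.add(match.group(1))
    return modules


def has_mod_python_conflict(config_text):
    """True if both mod_python and mod_wsgi are loaded by this config text."""
    modules = loaded_modules(config_text)
    return "wsgi_module" in modules and "python_module" in modules
```

Lines that start with `#` are skipped because the pattern requires `LoadModule` to be the first non-blank token on the line.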
Set up an Apache config that points to the wsgi.py
Mine looks roughly like this:
WSGIScriptAlias /djangopypi /www/python/django/djangopypisite/wsgi.py
This of course will have to be heavily customized depending on whether you’re using a virtualenv and where it is and where your app is located, etc.
The WSGIPassAuthorization On line is something that I didn’t have at first; I only added it later, after running into problems…
Test it out…
At this point, I had the Django app working more or less the same as it was working in the Django runserver except…
Make sure directories are writable by Apache, etc.
If you set up everything initially as your own user and then moved it over to mod_wsgi, then you might have files that belong to your user and are not writable by Apache. Check log directories, SQLite databases, directories for uploaded files (typically called “media” in Django parlance), etc.
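A quick way to find the offenders is to test write access from the account that Apache runs as. This is a minimal sketch with a made-up function name; it only tells you something useful when the script is executed as the Apache user (often `apache` on CentOS, which is an assumption about your system), since `os.access` checks permissions for whoever is running it.

```python
import os


def unwritable_paths(paths):
    """Return the paths the current user cannot write to.

    Paths that do not exist are reported as well, because
    os.access() returns False for them.
    """
    return [path for path in paths if not os.access(path, os.W_OK)]
```

You would pass it the directories and files your app writes to (log directories, the SQLite file and its parent directory, the media directory) and fix ownership or permissions on whatever comes back.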
Setting up static serving
The Django runserver makes things easy by serving static files for you. Django when deployed via mod_wsgi won’t serve static files by default. There are probably hacks to make it do that, but if you’re already running Apache it makes more sense to just have Apache serve those static files. That is certainly what the Django guys encourage. I set the MEDIA_URL settings in settings.py and then did some symlinking and Apache configuration so that Apache could serve the files out of /www/python/media. And I created a directory /www/python/static for stuff like CSS and JS files and collected them from the apps into this directory using python manage.py collectstatic.
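For reference, the relevant part of settings.py might look something like the fragment below. The two directory paths come from the description above, but mapping them directly onto MEDIA_ROOT and STATIC_ROOT is a guess (symlinks were involved in the real setup), and the two URL prefixes are assumptions that must match whatever Apache is configured to serve.

```python
# settings.py (fragment) -- a sketch, not the actual file from this setup.

# Where uploaded files live on disk; Apache serves this directory.
MEDIA_ROOT = "/www/python/media"
# Assumed URL prefix for uploaded files.
MEDIA_URL = "/media/"

# Where `python manage.py collectstatic` gathers CSS/JS from the apps.
STATIC_ROOT = "/www/python/static"
# Assumed URL prefix for static files.
STATIC_URL = "/static/"
```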
WSGIPassAuthorization On: The last piece of the puzzle
At this point, I had done a lot of hacking and got it mostly working. The one missing piece was that I could not upload packages to my custom PyPI server (i.e., with python setup.py register -r chegg sdist upload -r chegg). Actually it worked when I had my ~/.pypirc pointed to the Django runserver; it just didn’t work with the mod_wsgi version, where it failed with Upload failed (401): UNAUTHORIZED. This took a while to figure out, but I eventually found the reason and answer here. Basically this is because djangopypi checks the HTTP Authorization header to see if you’re authorized, and mod_wsgi by default filters out this header as a security precaution, on the assumption that you would let Apache do the authorization check and not involve the WSGI apps in this. I imagine that this is something useful for providers of shared web hosting, but it obviously will not work if you have WSGI apps doing their own authorization, as I had in this case with djangopypi. The answer was to turn on WSGIPassAuthorization by adding WSGIPassAuthorization On to my Apache vhost config.
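To see the effect in isolation, here is a small WSGI app that does its own HTTP Basic check. It is an illustrative sketch, not djangopypi’s actual code, and the function names are made up. The point is that the app can only find credentials in `HTTP_AUTHORIZATION`; when the server strips that key from the environment, every request looks anonymous and gets a 401, which matches the symptom above.

```python
import base64


def parse_basic_auth(environ):
    """Return (username, password) from HTTP_AUTHORIZATION, or None."""
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip()).decode("utf-8")
    except (ValueError, TypeError, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


def application(environ, start_response):
    """Greet authenticated users; answer 401 when no credentials arrive."""
    credentials = parse_basic_auth(environ)
    if credentials is None:
        body = b"UNAUTHORIZED"
        start_response("401 Unauthorized", [
            ("Content-Type", "text/plain"),
            ("WWW-Authenticate", 'Basic realm="example"'),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    body = ("Hello, %s" % credentials[0]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Served without WSGIPassAuthorization On, an app like this would return 401 even for requests that carry valid credentials, because the header never reaches it.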
That’s all folks!
So now it’s working although it took annoyingly long to set up.
The temptation was to move on as quickly as possible to something else now since I spent more time than I wanted to on this. Surely I’m now a smarter, wiser person who learned and won’t run into these problems next time… On second thought, realistically I may not have to do this again for months…or years…and I know myself well enough to know that I probably won’t remember much of this in a year. So I decided to invest just a little more time in writing down some of the high-level points so that I (or someone else) can refer to them later. And now because I blogged it, a few months from now, I will only have to remember to search my blog to find this post…