Plone and Zope time

Doing my yearly ritual of looking at Plone & Zope and making sure I still don’t grok them. Check.

In this year’s installment, I set up a little Plone site and actually got as far as creating a custom content type. I started down the road of thinking I would create the content type manually with code in Archetypes. Then I looked at tutorials and saw that it looked like it involved a lot of repetitive code and configuration voodoo. I ended up using ZopeSkel to create a really, really simple Archetypes content type, which actually didn’t take too long to figure out (to my pleasant surprise).

It still feels like too much work, and it’s difficult to find good documentation on the current recommended way to do things, because the recommended way seems to change every couple of years and there is a lot of outdated documentation out there. A search on Archetypes, for instance, yielded all kinds of pages referring to practices that seem to be outdated now, like manually creating the directory structure for an Archetypes product (i.e., no ZopeSkel) and copying the product into the Zope 2 “Products” directory (which seems to be deprecated in favor of buildout).

In comparison, creating a model in Django is very simple, Pythonic, and well-documented.

Continuation of mod_wsgi setup

My previous setup worked until I wanted to add another Django instance and then it didn’t work, because I had two WSGIScriptAlias entries under different URLs but it would launch whichever app it wanted to regardless of the URL. A little searching suggested that it had to do with mod_wsgi process pools. I eventually ended up with something like this:

WSGIPythonHome /www/python/pypi.venv
WSGIPythonPath /www/python/django
WSGIPassAuthorization On
WSGISocketPrefix /var/run/wsgi

<VirtualHost *:80>

    ...

    DocumentRoot /www/python

    WSGIDaemonProcess djangoplayground_wsgi user=apache python-path=/www/python/django
    WSGIScriptAlias /djangoplayground /www/python/django/djangoplayground/wsgi.py
    <Location /djangoplayground>
        WSGIProcessGroup djangoplayground_wsgi
    </Location>

    WSGIDaemonProcess djangopypi_wsgi user=apache python-path=/www/python/django
    WSGIScriptAlias /djangopypi /www/python/django/djangopypisite/wsgi.py
    <Location /djangopypi>
        WSGIProcessGroup djangopypi_wsgi
    </Location>
</VirtualHost>
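
To confirm that each URL really lands in its own daemon process group, a throwaway WSGI script can echo what mod_wsgi reports. This is only a sketch: it relies on the `mod_wsgi.process_group` key that mod_wsgi adds to the WSGI environ, and it falls back to "unknown" under any other server. The script itself is hypothetical and would need its own WSGIScriptAlias entry.

```python
def application(environ, start_response):
    """Report which mod_wsgi daemon process group served this request."""
    # mod_wsgi adds this key to the environ; default to "unknown" so the
    # script also runs under other WSGI servers.
    process_group = environ.get("mod_wsgi.process_group", "unknown")
    script_name = environ.get("SCRIPT_NAME", "")
    body = "process_group=%s script_name=%s\n" % (process_group, script_name)
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

If both URLs report the same group, or an empty one, the WSGIProcessGroup lines are probably not being applied to the location you expect.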

djangotestxmlrpc

I had been working on DjangoPyPI and trying to fix some tests related to XML-RPC that were failing in versions of Python < 2.7, because the testing was relying on specifics of the implementation of xmlrpclib in Python 2.7. I found some code at this blog post from Forest Bond that showed a better way to test XML-RPC views in Django, namely, by creating a special XML-RPC transport object that uses the Django test client. I extracted this out and tweaked it slightly to add tests and better compatibility across Python versions and published it on PyPI as…


djangotestxmlrpc

Test Django XML-RPC views using the Django test client. Because you’re using the Django test client, you’re not actually sending HTTP requests and don’t need a server running.

Example usage:

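try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3

import django.test
from djangotestxmlrpc import DjangoTestClientXMLRPCTransport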

class TestXmlRpc(django.test.TestCase):
    ...

    def test_list_package(self):
        pypi = xmlrpclib.ServerProxy(
            "http://localhost/pypi/",
            transport=DjangoTestClientXMLRPCTransport(self.client))
        pypi_hits = pypi.list_packages()
        expected = ['foo']
        self.assertEqual(pypi_hits, expected)
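
The general idea of swapping the transport can be illustrated with nothing but the standard library. The sketch below is not how djangotestxmlrpc is implemented; it is a Python 3 stand-in that routes calls to an in-process `SimpleXMLRPCDispatcher` instead of the Django test client. `InProcessTransport` and the toy `list_packages` function are made up for illustration, and `_marshaled_dispatch` is an underscore-prefixed method of the dispatcher, so treat its use here as an assumption about stdlib internals.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


class InProcessTransport(xmlrpc.client.Transport):
    """Hand XML-RPC requests straight to a dispatcher, with no HTTP involved."""

    def __init__(self, dispatcher):
        super().__init__()
        self.dispatcher = dispatcher

    def request(self, host, handler, request_body, verbose=False):
        # request_body is the marshalled XML-RPC call; the dispatcher
        # returns the marshalled response, which we parse like a real reply.
        response_xml = self.dispatcher._marshaled_dispatch(request_body)
        parser, unmarshaller = self.getparser()
        parser.feed(response_xml)
        parser.close()
        return unmarshaller.close()


def list_packages():
    """Toy stand-in for a server-side XML-RPC method."""
    return ["foo"]


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(list_packages)

# The URL is never contacted; it only satisfies ServerProxy's constructor.
proxy = xmlrpc.client.ServerProxy(
    "http://localhost/pypi/", transport=InProcessTransport(dispatcher))
print(proxy.list_packages())
```

The client code stays identical to what would run against a live server; only the transport object changes.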

Supported Python versions

  • Python 2.5
  • Python 2.6
  • Python 2.7
  • PyPy 1.9
  • Python 3.1
  • Python 3.2

Or, as tox says:

~/dev/git-repos/djangotestxmlrpc$ tox
...
  py25: commands succeeded
  py26: commands succeeded
  py27: commands succeeded
  pypy: commands succeeded
  py31: commands succeeded
  py32: commands succeeded
  congratulations :)

You can also check the latest Travis CI results, but Travis doesn’t build all of the above platforms.

Issues

Send your bug reports and feature requests to https://github.com/msabramo/djangotestxmlrpc/issues

Some tips for setting up Apache and mod_wsgi

I recently set up an internal PyPI server at work for hosting some of our internal Python packages. I used djangopypi and that was pretty easy to set up and get it working in the Django runserver. Most of the difficulty came in getting it to run in Apache with mod_wsgi. I always find mod_wsgi a little difficult to set up, so I thought I’d jot down a few notes on things I ran into and how I solved them, for future readers (including my future self! :-))

Prerequisite: Which Python are you going to use?

I’m going to briefly cover this, because getting Python installed on your OS of choice could be a blog post in itself. Are you going to use the system Python? This is easiest if your OS has a recent enough version that you want to use. Mine in this case, CentOS 5.5, comes with Python 2.4.3, which is pretty long in the tooth, so I wouldn’t bother with it. Even if it were a modern Python, I tend to shy away from doing much with the system Python, because it’s the system Python, and I don’t want to do things and break my system. You might be OK with using a virtualenv of your system Python to get some isolation, or you might do as I did in this case and simply install a fresh new version of Python with no connection whatsoever to the system Python. In fact, I installed a new Python and created a virtualenv in it specifically for the Django stuff I was doing.

Install mod_wsgi

Depending on your OS and package repositories, it might be an apt-get install or yum install away. For me it wasn’t and I downloaded and built it from source. Make sure that mod_wsgi is compiled against the Python version that you actually want to use. It’s pretty easy for it to link against your system Python or some other Python you don’t intend to use and that will cause problems. I built mine like this:

PATH=/usr/local/bin:/usr/bin:/bin ./configure --with-python=/usr/local/bin/python
LD_RUN_PATH=/usr/local/lib make
sudo make install

I needed the LD_RUN_PATH stuff so that mod_wsgi.so could link with /usr/local/lib/libpython2.7.so.1.0 at runtime. This installed mod_wsgi.so as /usr/lib64/httpd/modules/mod_wsgi.so (which is also available at /etc/httpd/modules/mod_wsgi.so because I already had a symlink from /etc/httpd/modules to /usr/lib64/httpd/modules).

Turn on mod_wsgi in the Apache config

For me, I had to do:

$ cat /etc/httpd/conf.d/wsgi.conf 
LoadModule wsgi_module modules/mod_wsgi.so
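
Once the module loads, a quick sanity check for the "which Python did it link against?" question is a tiny WSGI script that reports the interpreter it is running in. This is a sketch only; the script is hypothetical and would need its own WSGIScriptAlias entry.

```python
import sys


def application(environ, start_response):
    """Show which Python interpreter mod_wsgi is actually embedding."""
    lines = [
        "version: %s" % sys.version.replace("\n", " "),
        "prefix: %s" % sys.prefix,
    ]
    payload = ("\n".join(lines) + "\n").encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

If the version or prefix shown is not the one you built against, mod_wsgi probably picked up a different Python at configure time.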

Important: Turn off mod_python!

This is sort of commonly-known wisdom for people who are in the know about mod_wsgi, but I don’t set up mod_wsgi very often and so I forgot about it. And it caused great pain, because it caused a very mysterious failure. Basically, I had a state where a “Hello World” WSGI app worked just fine, but when I tried to use the wsgi.py of my Django site, it would crap out silently. No errors in the Apache error log even with LogLevel set to debug. Through some ugly hacking of the wsgi.py and some files in Django itself, I could see that it was silently dying inside Django when it got to the statement:

from threading import Lock

I did a bunch of searching and found posts of people having similar troubles but I didn’t find an answer.

And then I remembered to try turning off mod_python and that worked! Which was good, but it burned a lot of my time and psychic energy. All I had to do was comment out the following line in /etc/httpd/conf.d/python.conf:

# LoadModule python_module modules/mod_python.so

I wonder if mod_wsgi could be modified to detect the presence of mod_python and if it finds it, blast the logs with warnings? Frankly, I don’t know enough about Apache modules to know if this is possible.
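
Short of changing mod_wsgi itself, the check is easy to sketch from the outside: scan the Apache config files for an uncommented LoadModule line for mod_python. The function name is made up, and the /etc/httpd/conf.d location is simply the one from this setup; other systems keep their config elsewhere.

```python
import glob
import re

# Matches an active (uncommented) LoadModule line for mod_python.
_MOD_PYTHON = re.compile(r"^\s*LoadModule\s+python_module\b", re.IGNORECASE)


def active_mod_python_lines(conf_text):
    """Return (line_number, line) pairs where mod_python is still loaded."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        if _MOD_PYTHON.match(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumed config location; adjust for your distribution.
    for path in glob.glob("/etc/httpd/conf.d/*.conf"):
        with open(path) as handle:
            for number, line in active_mod_python_lines(handle.read()):
                print("%s:%d: %s" % (path, number, line))
```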

Set up an Apache config that points to the wsgi.py

Mine looks roughly like this:

WSGIPythonHome /www/python/pypi.venv
WSGIPythonPath /www/python/django
WSGIPassAuthorization On

<VirtualHost *:80>
...
    DocumentRoot /www/python
    WSGIScriptAlias /djangopypi /www/python/django/djangopypisite/wsgi.py
...
</VirtualHost>

This of course will have to be heavily customized depending on whether you’re using a virtualenv and where it is and where your app is located, etc.

The WSGIPassAuthorization On line is something that I didn’t have at first and only later ran into problems and ended up adding it…

Test it out…

At this point, I had the Django app working more or less the same as it was working in the Django runserver except…

Make sure directories are writable by Apache, etc.

If you set up everything initially as your own user and then moved it over to mod_wsgi, then you might have files that belong to your user and which are not writable by Apache. Log directories, SQLite databases, directories for uploaded files (typically called “media” in Django parlance), etc.
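
A small script makes this easier to check than eyeballing `ls -l` output. The sketch below reports paths the current user cannot write to, so it is meant to be run as the Apache user (for example via sudo). Apart from /www/python/media, the paths in the list are hypothetical placeholders.

```python
import os


def unwritable_paths(paths):
    """Return the paths the current user cannot write to (or that are missing)."""
    return [path for path in paths if not os.access(path, os.W_OK)]


if __name__ == "__main__":
    # Placeholder list: substitute your own log dirs, databases, and media dirs.
    candidates = [
        "/www/python/media",
        "/www/python/django/site.db",  # hypothetical SQLite file
        "/www/python/logs",            # hypothetical log directory
    ]
    for path in unwritable_paths(candidates):
        print("not writable: %s" % path)
```

Remember that SQLite also needs write access to the directory containing the database file, not just the file itself.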

Setting up static serving

The Django runserver makes things easy by serving static files for you. Django when deployed via mod_wsgi won’t serve static files by default. There are probably hacks to make it do that, but if you’re already running Apache it makes more sense to just have Apache serve those static files. That is certainly what the Django guys encourage. I set the MEDIA_ROOT and MEDIA_URL settings in settings.py and then did some symlinking and Apache configuration so that Apache could serve the files out of /www/python/media. And I created a directory /www/python/static for stuff like CSS and JS files and collected them from the apps into this directory using python manage.py collectstatic.
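
For reference, the relevant part of settings.py might look something like this. It is a sketch: the two directories are the ones mentioned above, and pointing MEDIA_ROOT directly at /www/python/media is a simplification of the symlinking described. The two URL prefixes are assumptions and have to match whatever Alias directives you give Apache.

```python
# settings.py fragment (sketch, not the exact settings from this setup).
MEDIA_ROOT = "/www/python/media"    # uploaded files, served by Apache
MEDIA_URL = "/media/"               # assumed URL prefix
STATIC_ROOT = "/www/python/static"  # target of "python manage.py collectstatic"
STATIC_URL = "/static/"             # assumed URL prefix
```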

WSGIPassAuthorization On: The last piece of the puzzle

At this point, I had done a lot of hacking and got it mostly working. The one missing piece was that I could not upload packages to my custom PyPI server (i.e.: with python setup.py register -r chegg sdist upload -r chegg). Actually it worked when I had my ~/.pypirc pointed to the Django runserver; it just didn’t work with the mod_wsgi version — it failed with Upload failed (401): UNAUTHORIZED. This took a while to figure out, but I eventually found the reason and answer here. Basically this is because djangopypi does checking of the HTTP Authorization header in order to see if you’re authorized, and mod_wsgi by default filters out this header as a security precaution, on the assumption that you would let Apache do the authorization check and not involve the WSGI apps in this. I imagine that this is something useful for providers of shared web hosting, but it obviously will not work if you have WSGI apps doing their own authorization, as I had in this case with djangopypi. The answer was to turn on WSGIPassAuthorization by adding WSGIPassAuthorization On to my Apache vhost config.
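
To see why the header matters, here is roughly what an app doing its own HTTP Basic authentication has to do. This is an illustrative sketch, not djangopypi's actual code, and the function name is made up. Without WSGIPassAuthorization On, the `HTTP_AUTHORIZATION` key never reaches the environ, so the function returns None and the app can only answer 401.

```python
import base64


def parse_basic_auth(environ):
    """Return (username, password) from a Basic Authorization header, or None."""
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip()).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Malformed base64 or non-UTF-8 credentials: treat as unauthenticated.
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password
```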

That’s all folks!

So now it’s working although it took annoyingly long to set up.

The temptation was to move on as quickly as possible to something else now since I spent more time than I wanted to on this. Surely I’m now a smarter, wiser person who learned and won’t run into these problems next time… On second thought, realistically I may not have to do this again for months…or years…and I know myself well enough to know that I probably won’t remember much of this in a year. So I decided to invest just a little more time in writing down some of the high-level points so that I (or someone else) can refer to them later. And now because I blogged it, a few months from now, I will only have to remember to search my blog to find this post… :-)

Have a conversation once or have it over and over forever

When a team is collaborating to create software, there are inevitably issues that come up in “big picture” areas like architecture and process. There are two ways to deal with these:

  1. Take some time out from the daily grind and discuss the issue until consensus is reached and a decision is made (and hopefully documented!) for how to handle stuff going forward.
  2. Don’t take the time to think (“We don’t have the luxury of time for thinking!”) and continually have mini-discussions about it (sometimes with other people; sometimes internally within people’s minds) every time it comes up again forever and ever…

I know which one I prefer. How about you…?

“There is no reason ever to have the same thought twice,
unless you like having that thought” – David Allen

Carbonite woes

I’ve been using Carbonite for a while and it’s worked well most of this time (though how do you know if a backup tool is working?), but recently I’ve started to notice a few issues with it and I may end up switching to another cloud backup service (I’m currently evaluating CrashPlan and another part of me is seriously considering going old-school and using good ol’ rsync or maybe just cloning to a bunch of spare hard drives and then leaving them with friends and family). Some of these issues might not be serious, but when it comes to a backup tool, it makes me nervous when I see weird unexplainable behavior. I like my backup software to just work. And that is why I’m detailing everything weird I saw, even if some of it is the inevitable stuff that you get when you peek too closely into the sausage factory.

Issue 1: Backups can stop progressing; no alerts to tell me this

A couple of weeks ago, I noticed that Carbonite had been “stuck” – there were about 1500 files that were pending backup and that number was not shrinking.

I happened to notice this because we had taken a lot of photos and I decided to check whether they were getting backed up. If I had not checked and our hard drive crashed, that could’ve been really bad.

This stuck backup led to me spending quite a few hours measuring, diagnosing, disk repairing, and tweaking, and it looks like things are working better now (I have only 42 files pending backup now). It also led to the discovery of other issues that are detailed below. But the main point I want to make here is that I should not have to notice that backups have stalled. Carbonite should detect that it is not progressing and send me an alert email and/or pop up a warning on my screen.

Issue 2: Carbonite not backing up because CPU Idle is too low (and Carbonite is using the bulk of the CPU)

This is possibly one of the big reasons for Issue 1.

When my backup wasn’t progressing, I looked for clues by looking at /Library/Application Support/Carbonite/Data/Carbonite.log and noticed that CPU Idle was almost never at the CPU Idle target of 75%. I’m wondering if this is causing Carbonite to not back up? I tried quitting every program on my computer (including all sorts of menu bar apps and background apps such as Cloud.app, Dropbox, Evernote, Jenkins, etc). The only thing that is using a lot of CPU, according to Activity Monitor, is Carbonite itself.

At this point, I started looking at what Carbonite was looking at using the fs_usage command.

Ideally, I shouldn’t have to quit every program on my computer in order to get the CPU usage low enough for Carbonite to be able to do backups. The beauty of a cloud backup app like Carbonite is that it should just do its thing in the background without me having to change my workflow. When it’s working well, this is how it works. I’m not sure why it has been stumbling lately.

Issue 3: Internal program errors in the logs

I also noticed some internal errors in the log and I wonder if Carbonite is crashing when it tries to back up certain files and thus is unable to make progress – e.g.:

1346811842  = 19:24:02
1346811842  @ 2954915840: Backup progress: completed 38%, file counts last 2 22:10:34 All: 3617 (1081M bytes) Unique: 3225 (731M bytes), Compressed: 848M bytes.
1346811842  @ 2954915840: Backup progress: remaining time unknown, file counts All: 563085 (217G bytes), Pending: 1396 (1767M bytes).
1346811842  @ 2954915840: Backup progress: task seconds last minute, Active: 0 (0%) Response Waiting: 61 (100%) Send Waiting: 0 (0%) Sending: 61 (100%) 
1346811873  ! 2957045760: Internal program error: ASSERT "LogFile.SetFilePointer(dwBufferPosition)" (/Library/Bamboo/home/xml-data/build-dir/MAC-CLI-JOB1/Client/Mac/daemon/../../../Shared/Common/LogMsg.cpp:735)
1346811873  @ 2957045760: DumpStack: stack trace not available
1346811873  ! 2957045760: Internal program error: ASSERT "LogFile.SetFilePointer(dwBufferPosition)" (/Library/Bamboo/home/xml-data/build-dir/MAC-CLI-JOB1/Client/Mac/daemon/../../../Shared/Common/LogMsg.cpp:1255)
1346811903  = 19:25:03

This does not inspire confidence.
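
The “Backup progress” lines in the excerpt above include a pending-file count, so the stalled-backup alert from Issue 1 could be approximated from the outside. The sketch below assumes the first column is a Unix timestamp in seconds and that the line layout matches the excerpt. The function names and the six-hour threshold are arbitrary choices for illustration.

```python
import re

# Matches the "Pending: 1396 (1767M bytes)" part of a "Backup progress" line.
_PENDING = re.compile(r"^(\d+)\s.*Backup progress:.*Pending: (\d+) \(")


def pending_counts(log_lines):
    """Return (epoch_seconds, pending_file_count) pairs found in the log."""
    samples = []
    for line in log_lines:
        match = _PENDING.search(line)
        if match:
            samples.append((int(match.group(1)), int(match.group(2))))
    return samples


def looks_stalled(samples, min_seconds=6 * 60 * 60):
    """True if the pending count has not dropped over at least min_seconds."""
    if len(samples) < 2:
        return False
    first_time, first_pending = samples[0]
    last_time, last_pending = samples[-1]
    return (last_time - first_time >= min_seconds
            and last_pending >= first_pending
            and last_pending > 0)
```

Feeding it the lines of Carbonite.log from a cron job and emailing yourself when `looks_stalled` returns True would be one way to get the alert that Carbonite itself doesn’t send.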

Issue 4: Delaying backup due to zero length file modified within the last minute

Carbonite says that it’s “delaying backup due to zero length file modified within the last minute” – the issue is that these files are zero-length but they have not been modified within the last minute. For example:

1347897240  # 2956513280: Delaying backup due to zero length file modified within the last minute: "/Volumes/Momentus/Users/marc/Library/Preferences/MaxBack .mxbk".
1347897240  # 2956513280: Delaying backup due to zero length file modified within the last minute: "/Volumes/Momentus/Users/marc/Dropbox/Camera Uploads/Icon ".
1347897240  # 2956513280: Delaying backup due to zero length file modified within the last minute: "/Volumes/Momentus/Users/marc/Library/Application Support/Evernote/accounts/Evernote/msabramo/note.index/write.lock".
1347897240  # 2956513280: Delaying backup due to zero length file modified within the last minute: "/Volumes/Momentus/Users/marc/Library/Application Support/Google/Chrome/Default/Extension State/LOCK".

But the apps that write these files are not running and these files have not been modified recently at all.

$ date
Mon Sep 17 09:19:25 PDT 2012

$ ls -l "/Volumes/Momentus/Users/marc/Library/Preferences/MaxBack .mxbk"
-rw-r--r--  1 marc  marc  0 Sep 11 02:40 /Volumes/Momentus/Users/marc/Library/Preferences/MaxBack .mxbk

$ ls -l "/Volumes/Momentus/Users/marc/Dropbox/Camera Uploads/Icon^M" 
-rw-r--r--@ 1 marc  marc  0 Sep 10 23:28 /Volumes/Momentus/Users/marc/Dropbox/Camera Uploads/Icon?

$ ls -l "/Volumes/Momentus/Users/marc/Library/Application Support/Evernote/accounts/Evernote/msabramo/note.index/write.lock"
-rw-------  1 marc  marc  0 Sep 16 22:23 /Volumes/Momentus/Users/marc/Library/Application Support/Evernote/accounts/Evernote/msabramo/note.index/write.lock

$ ls -l "/Volumes/Momentus/Users/marc/Library/Application Support/Google/Chrome/Default/Extension State/LOCK"
-rw-------@ 1 marc  marc  0 Sep 10 06:07 /Volumes/Momentus/Users/marc/Library/Application Support/Google/Chrome/Default/Extension State/LOCK
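
The same check can be scripted for a whole list of paths rather than running ls by hand. This is a sketch with a made-up function name; it reports how long ago a zero-length file was last modified, which can then be compared against the one-minute window Carbonite claims.

```python
import os
import time


def zero_length_age_seconds(path, now=None):
    """Return seconds since last modification if path is empty, else None."""
    info = os.stat(path)
    if info.st_size != 0:
        return None
    if now is None:
        now = time.time()
    return now - info.st_mtime
```

Any result well above 60 for a file that shows up in the “Delaying backup” messages contradicts the log.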

This may or may not be affecting my backup appreciably, but the issue here is that I’ve already had some difficulty getting Carbonite to do its backups, so when I look through the logs and notice other things that seem incorrect, it further erodes my confidence in the product.

Issue 5: Carbonite wasting its time repeatedly checking files and directories that have long been deleted

Carbonite doesn’t log every single file it looks at (Thank goodness! That would be extremely inefficient.), so most folks wouldn’t even notice this one. I only did because my backup was not progressing and I kept seeing messages in the logs saying that Carbonite was waiting because CPU usage was too high, and Carbonite was the only app using a lot of CPU, so I went poking around to see what it was so busy doing.

$ sudo fs_usage | grep '/.*CarboniteDae'

08:33:34  getattrlist       //Users/marc/etsylister/parts>>>>>>>>>>>>>>>                                     0.000038   CarboniteDae
08:33:34  getattrlist       //Users/marc/etsylister/parts>>>>>>>>>>>>>>>                                     0.000007   CarboniteDae
08:33:34  getattrlist       //Users/marc/etsylister/parts>>>>>>>>>>>>>>>                                     0.000008   CarboniteDae
...

I cut this off at 3 entries, but in reality, I saw this 44 times in one second (and it happened for more than one second, but I stopped counting). This also isn’t the only directory that it’s continuing to check – there are several others.
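
Counting by hand gets old quickly, so here is a sketch that tallies repeated accesses per second from captured fs_usage output. It assumes the simplified column layout of the excerpt above (timestamp, call, path, elapsed time, process) and will miscount paths that contain spaces. The function name is made up.

```python
from collections import Counter


def count_repeats(fs_usage_lines):
    """Count how often each (timestamp, call, path) triple appears."""
    counts = Counter()
    for line in fs_usage_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank lines, "..." markers, and short lines
        timestamp, call, path = fields[0], fields[1], fields[2]
        # Drop the trailing ">" characters seen after the path in the excerpt.
        counts[(timestamp, call, path.rstrip(">"))] += 1
    return counts
```

Calling `count_repeats(lines).most_common(10)` on a few seconds of output shows which paths are being hit hardest.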

The fact that an app is repeatedly accessing the same file over and over 44 times per second is concerning in itself. To add fuel to the fire, this directory no longer exists! See, I noticed this message many times in the past and I thought that I could give Carbonite a hand and get it past its compulsive checking by deleting this directory and a bunch of other ones that it was compulsively checking. So maybe a week ago, I went on a rampage and blew away a bunch of directories that Carbonite was obsessing over that weren’t important to me. Problem is, Carbonite is still checking them! And this is even after I uninstalled Carbonite, blew away all its preferences and support files, rebooted, and reinstalled Carbonite. Either I missed deleting the file on my local filesystem that carries this state, or, more likely, the knowledge of this file is in Carbonite’s servers. Perhaps after some amount of time or some number of failed attempts to access the file (how many?), Carbonite will forget about these files. But in the meantime, I worry that it’s wasting its time checking stuff that’s gone and it could be better spending its time backing up the files that I do have.

Update 2012-09-27

I’ve accumulated the info in this post over the course of the last several weeks. At one point, I did a bunch of fiddling including a disk repair on my startup disk and managed to get Carbonite to make a bunch of progress so that only 70 or so files were pending backup (it’s quite possible that disk errors were confusing Carbonite). Still I wanted them all to be backed up or at least know which files those were that were not being backed up and why. The latest is that I talked to Carbonite support and had a fairly nice and helpful guy connect to my machine remotely and make Carbonite exclude some files that are supposedly problematic (browser caches and the like) and mess around with some Carbonite caches. This didn’t really make Carbonite catch up and back up the remaining 70 files. In fact, over the course of the last few days that number has grown to a little over 300. This suggests to me that Carbonite is having trouble with some of these files and is “stuck” and will likely fall further and further behind but I don’t know which files are the problematic ones and Carbonite support hasn’t yet been able to tell me which files these are.

Most likely, I will let my Carbonite subscription lapse when it comes up for renewal. I might go with CrashPlan, but I think more likely is that I will forego cloud backup services entirely and instead just clone my stuff to hard drives and store a couple in some offsite locations like the homes of friends and family. The knocks against cloud backup:

  • Hard to know whether it’s working – how do I really know which files the provider has and which they don’t?
  • Cloud backup uses a ton of my computer CPU and Internet and network bandwidth for months and months. When I do do backups myself, I kick it off and after a few hours it’s done.
  • When the time comes to restore files, I will have to download a tremendous amount of data or pay over a hundred dollars for them to ship me a hard drive. Even then, the hard drive or files downloaded will just be files; not a bootable image. Conversely if I go to a friend’s house and pick up a hard drive with a clone of my system, I have all of my files fast and I have something I can boot off of.
  • I could save the money on the subscription fee and use it towards my hard drive budget.

What is up with Mail.app and Exchange?

Fired up Mail.app on my work laptop today for the first time in a while.

Outlook 2011, OWA (the Outlook Web interface) and my iPhone all show the same state of my inbox, with a single message in it.

Mail.app shows two messages – the one that the other guys show and one that I filed many hours ago.

Things I tried (all of which failed):

  1. Telling Mail.app to resynchronize
  2. Telling Mail.app to rebuild the mailbox
  3. Restarting Mail.app
  4. Deleting my Exchange account and adding it back (and then rebuilding the mailbox for good measure)
  5. Quitting Mail.app, moving my ~/Library/Mail folder somewhere else, starting Mail.app, and going through the wizard to set up my Exchange account all over again.

Nope. Still shows those blasted two messages. The last one is rather amazing to me. If I moved the folder out of the way, that seems to me like it would be pretty much starting from scratch. If that didn’t work, I can only guess that either the OS is doing some caching that I don’t know about or the message is still on the server and Mail.app is just interpreting the server state differently from all the others. Strange.

It still amazes me how much trouble we all have with computers with pretty basic stuff like email, calendars, and address books, and getting it all to sync. I hate crap like this, because it breaks the trust that I have in computers to do the simple stuff for me. But then if I can’t trust it, I don’t know, I might as well go back to pen and paper. Imagine how much time we could free up as a culture if we didn’t have to deal with Exchange, IMAP, ActiveSync, SyncServices, and all of that other crap that sounds nice but then fails in strange ways.