Battle of the Bands: Stars vs. Megadeth

Tonight before dinner, I happened to put on the sublimely beautiful album Set Yourself on Fire (2004), by Stars. The opening notes have always struck me as a little ... off. A gravelly voice intones, "When you have nothing left to burn, you have to set yourself on fire." Wow. That's pretty intense. What kind of full-frontal musical assault is the answer to that?

Well, in Stars' world, full-frontal musical assaults just aren't the done thing. Think strings, piano, soft horns. Watch the video to hear for yourself.

Then, after dinner, I happened to put on Megadeth's relentless 1988 album So Far, So Good... So What! And then it hit me. All of a sudden, I realized why the string/piano/soft horn intro always felt wrong. It's because Megadeth did a similar shtick 16 years earlier, and they did it right.

Folks, I don't care if you're writing about setting yourself on fire or setting the world on fire, but get it right.

Don't get me wrong: I think Stars are a fine band, and Set Yourself on Fire is a brilliant, beautiful, wonderful album. It's a perfect example of mid-2000s art pop. If you like thinking person's pop music, if you like melody and lyrics and talented musicians making it all look effortless, you should just go buy the album now. This is far and away the best album by Stars. You can't go wrong.

On the other hand, if you're more into full-frontal musical assaults, it's awfully hard to beat So Far, So Good... So What!. I'm only familiar with Megadeth's first couple of albums (I lost interest after Rust in Peace), but this is the best of the early stuff. From start to finish, it does not let up. And it certainly doesn't drag: the whole album clocks in at just under 35 minutes. They don't waste a note. If you've ever wished a talented metal band would channel the raw energy and attitude of punk: it's been done, right here, and Megadeth did it better than anyone I can think of. They cover the Sex Pistols, they put you right in the shoes of a suicidal loser ("oh, how I lived my life for you / now, as I die, my flesh still crawls as I breathe your name"), and they utterly excoriate the risible censorship of the PMRC (in case you've never heard of them, it was an eighties thing).

That said, So Far is undeniably topical, political, and of its time. Maybe I like it so much because I was 16 or 17 when I first bought it, and it just stuck its hook in me. Maybe the kids today would find it dated. Set Yourself on Fire is timeless pop, and only the style and instrumentation give away its time and place (Montreal, 2004). The lyrics will be as relevant, and as poignant, in 20 years as they are today.

I love that two albums can be so utterly different, and yet both so great.

Author: Greg Ward
Published on: Sep 23, 2015, 10:16:34 PM
Permalink - Source code

Performance Penalty of Python Exceptions

The other day at work, a colleague and I were discussing the relative merits of indicating an error in Python by raising an exception or by returning a null/empty/false value. It boiled down to this implementation:

def get_something(url, what):
    response = requests.get(url, params={'id': what})
    if not response.content:
        return None
    return _parse_body(response.content)

versus this:

def get_something(url, what):
    response = requests.get(url, params={'id': what})
    if not response.content:
        raise SomeError('empty response from %s' % url)
    return _parse_body(response.content)

(Yes, I'm ignoring network and HTTP errors.) Assume the service behind url is not supposed to return an empty response, and we want to notice if it does. In both cases, it's easy for the caller to detect this condition. But which way is better?

My colleague argued for returning None (or an empty dict) on the grounds that “if is faster than try”. Hmmm. I've always assumed that raising an exception has a considerable performance penalty, but have never considered that there might be a penalty to catching an exception that isn't there. In other words: exceptions are supposed to be rare, so it's pointless to worry about the performance hit from raising one. But is there a performance hit from simply enclosing code in a try/except?

Let's try it and see.

No errors

First, let's implement a query function that never fails, just to establish a baseline. This serves two purposes: it lets us see how much faster it is to completely ignore errors, and it shows the overhead of an if or try where we never take the error path.

# v1: cannot fail
def get_something_1():
    if random.random() < 0:
        raise AssertionError()
    return 42

Now we'll call this cannot-fail query three ways (in the timing code, get_something is bound to whichever get_something_* version is under test):

# v1: ignore errors
def try_something_1():
    return get_something()

# v2: detect errors with "if"
def try_something_2():
    something = get_something()
    if something is None:
        pass
    return something

# v3: detect errors with "try/except"
def try_something_3():
    try:
        return get_something()
    except SomeError:
        pass

Note that I'm deliberately doing nothing on the error path. I'm not interested in the overhead of handling the error, just in the overhead of try/except vs. if.

I ran this code using the timeit module from Python's standard library, using 5 repetitions of 5 million calls each:

get_something_1, try_something_1: best of 5: 1.742 s     # no errors, ignore errors
get_something_1, try_something_2: best of 5: 1.813 s     # no errors, detect with if
get_something_1, try_something_3: best of 5: 1.709 s     # no errors, detect with try/except

Divide by 5 million and you can see we're looking at around 0.35 µs per call. These particular numbers are from Python 2.7.9 under Ubuntu 15.04 with a 3.33 GHz Intel Core i5 CPU. Python 3 was a bit faster, but the overall pattern is the same.

So, using if is slightly slower than try/except when we never hit the error path. Ignoring errors entirely is about the same as try/except.

Conclusion so far: if is not faster than try/except. If anything, it's a bit slower. That might change if we reordered the code so the happy path comes first (I didn't try it).
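
For concreteness, that reordering would look something like this (shown only to illustrate the idea; it isn't one of the timed variants):

# v2b: detect errors with "if", but with the happy path first (not timed)
def try_something_2b():
    something = get_something()
    if something is not None:
        return something
    # error path: deliberately do nothing, as in v2
    return something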

Rare errors

A more realistic scenario is when your query occasionally fails. Here are two versions of the same query function, both rigged to fail 0.1% of the time:

# v2: rare failure (0.1% probability) by returning None
def get_something_2():
    if random.random() < 0.001:
        return None
    return 42

# v3: rare failure by raising SomeError
def get_something_3():
    if random.random() < 0.001:
        raise SomeError()
    return 42

It no longer makes sense to exercise these versions with try_something_1(), which ignores errors. In fact, we can only exercise get_something_2() with try_something_2(), and similarly for get_something_3(). So now we're measuring the overhead of both reporting an error and detecting it:

get_something_2, try_something_2: best of 5: 1.771 s     # rare errors, detect with if
get_something_3, try_something_3: best of 5: 1.646 s     # rare errors, detect with try/except

The system runs at about the same speed when 1 in 1000 queries follows a different code path. Surprisingly, try/except is still winning by a hair. If there is an overhead to raise, it's not noticeable yet.

Conclusion: when errors are infrequent, if is not faster than try/except. Again, reordering the code might matter.

Frequent errors

Finally, let's modify the query function so that errors are frequent:

# v4: frequent failure (30% probability) by returning None
def get_something_4():
    if random.random() < 0.3:
        return None
    return 42

# v5: frequent failure by raising SomeError
def get_something_5():
    if random.random() < 0.3:
        raise SomeError()
    return 42

Since the error-reporting semantics are the same, we can stick with try_something_2() and try_something_3() to exercise these two. Results:

get_something_4, try_something_2: best of 5: 1.806 s     # frequent errors, detect with if
get_something_5, try_something_3: best of 5: 3.195 s     # frequent errors, detect with try/except

Ah-ha! Now we're on to something. The version using if stayed about the same, but raising an exception on 30% of queries caused a dramatic slowdown. So raise does have noticeable overhead if it happens often enough.

Source code

See https://gerg.ca/blog/attachments/try-except-speed.py.
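
For the curious, the combinations above were wired up roughly like this. It's only a simplified sketch of what such a harness might look like, not the script itself; it assumes the get_something_* and try_something_* functions shown earlier (and a SomeError exception class) live in the same module:

# simplified shape of a timing harness for the combinations above;
# get_something is rebound to whichever version is being timed
import timeit

def time_combo(get_func, try_func, number=5000000, repeat=5):
    global get_something
    get_something = get_func    # the try_something_* functions call this name
    return min(timeit.repeat(try_func, number=number, repeat=repeat))

# example: no errors, detect with try/except
best = time_combo(get_something_1, try_something_3)
print('get_something_1, try_something_3: best of 5: %.3f s' % best)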

Conclusion

It's safe to say that if is definitely not faster than try. The original claim ("if is faster than try") is clearly a false assumption, which is what I was expecting to find.

In fact, if might be slower than try, although I suspect minor tweaks to the code (keep the happy path adjacent in memory) might make a difference. It's worth trying.

And raise does have a noticeable overhead, but only if errors are frequent. Another interesting experiment to do with this code is to find out how frequent errors have to be before the overhead of raise is noticeable.
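
A sketch of that experiment might look like this (a standalone toy, not part of the script above): sweep the failure probability and time the try/except style at each point.

# how frequent do errors have to be before raise shows up in the timings?
import random
import timeit

class SomeError(Exception):
    pass

def make_query(p):
    def query():
        if random.random() < p:
            raise SomeError()
        return 42
    return query

for p in (0.0, 0.001, 0.01, 0.03, 0.1, 0.3):
    query = make_query(p)
    def attempt():
        try:
            return query()
        except SomeError:
            pass
    best = min(timeit.repeat(attempt, number=1000000, repeat=5))
    print('p = %.3f: best of 5: %.3f s' % (p, best))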

That said, if 30% of your calls to some operation result in an error, you probably have bigger problems than the overhead of raising an exception. You might want to look into more reliable infrastructure. Just sayin'.

Author: Greg Ward
Published on: Sep 18, 2015, 12:06:42 PM - Modified on: Sep 19, 2015, 4:40:36 PM
Permalink - Source code

Installing Calibre on Linux without root

Calibre is a great piece of software for managing e-books. Not only is it great, it's open source. What's not to like?

OK, there is one thing not to like about Calibre: packaging and installation. The project download page actually advises against using OS-provided packages on the grounds that they are "often buggy/outdated". And instead of providing a normal download-and-install process, Calibre expects you to download an unknown Python script which you then run as root.

Sorry, Calibre, but no. I'm not going to give you complete access to my computer to do anything you want with it. You are free to copy files into a directory of my choosing, but beyond that, I'm in charge. It's my computer, not yours.

So here's how I got calibre's installation script to behave according to my wishes.

  1. Download the installation script:

    $ wget -nv https://raw.githubusercontent.com/kovidgoyal/calibre/master/setup/linux-installer.py
    
  2. Run the installation script such that it installs to /tmp/calibre:

    $ python -c 'execfile("linux-installer.py"); main(install_dir="/tmp")'
    
  3. Move the installation to its permanent home:

    $ sudo mv /tmp/calibre /usr/local/calibre-$VERSION
    
  4. Make sure the shell can find "calibre" (first time only):

    $ cd /usr/local
    $ sudo ln -s calibre-$VERSION calibre
    $ sudo ln -s ../calibre/calibre bin/.
    

For future upgrades, repeat steps 1-3. Then you just have to replace the /usr/local/calibre symlink:

$ cd /usr/local
$ sudo rm calibre
$ sudo ln -s calibre-$VERSION calibre

and the existing /usr/local/bin/calibre symlink will now point to the new version. If the upgrade is a failure, rolling back to the previous version is trivial.

Author: Greg Ward
Published on: Aug 30, 2014, 3:38:12 PM
Permalink - Source code

ZeroMQ: poll() and wait for child process

At work, I've been hacking on a distributed system based on ZeroMQ. It's a nice library that hides away a lot of the fuss and bother of network programming, but still exposes enough detail that you have a lot of control over what's going on. However, the main mechanism for multiplexing I/O events is zmq_poll(), which only accepts 0MQ sockets or regular file descriptors. But if you're doing network I/O while running some work in child processes, you might want to block until some socket is ready or some child process has terminated. How to do this with zmq_poll() is not immediately apparent.

As it turns out, there are several nice ways to solve this problem, in both C and Python.

Parent and child

First, here's the setup: a parent process with one child at a time, where the child can terminate at any time. I'll show the Python version, since it's less verbose than the equivalent C:

#!/usr/bin/python

# combine 0MQ poll with child process

from __future__ import print_function

import sys
import os
import signal
import time
import random
import errno

import zmq

def main():
    # open a 0MQ socket that nobody ever connects to: the point
    # is not to do network I/O, but to poll with a child process
    # running in the background
    context = zmq.Context()
    resp = context.socket(zmq.REP)
    resp.bind("tcp://*:5433")
    poller = zmq.Poller()
    poller.register(resp)

    # start a child process that runs in the background and terminates
    # after a little while
    spawn_child()

    # main loop: this is where we would do network I/O and -- if
    # only we could figure out how -- respond to the termination of
    # the child process
    while True:
        print('parent: poll(5000)')
        poller.poll(5000)
        print('parent: poll() returned')

def spawn_child():
    pid = os.fork()
    if pid > 0:           # in the parent
        return

    # in the child: sleep for a random interval (2 .. 9 sec)
    stime = random.randint(2, 9)
    print('child: sleep(%d)' % stime)
    time.sleep(stime)
    print('child: exiting')
    os._exit(0)

main()

The child process here doesn't actually do anything (like execute a command), as that would distract from the purpose of the exercise.

This version illustrates the problem, but makes no attempt to solve it: if you run the program, you can clearly see when the child starts and exits. But since the child almost certainly terminates while the parent is in poll(), the parent doesn't know what happened. The parent could alternate calls to poll() with os.waitpid(), but then network activity could arrive while we're in waitpid(), and it would not respond to that immediately. One naive temptation is to alternate calls to poll() and waitpid() with as short a timeout as possible: the shorter the timeout, the better the response time—but that ends up as a CPU-bound busy loop rather than a low-overhead event-driven program. Alternating poll() with waitpid() is not the answer. There has to be a better way.
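
To make that trade-off concrete, the naive alternation would look roughly like this (illustration only, reusing the poller and spawn_child() from the program above):

# naive alternation (don't do this): a short poll() timeout reacts quickly
# to a dead child but wakes up constantly; a long timeout leaves the child
# unnoticed for seconds
while True:
    poller.poll(50)                            # timeout in milliseconds
    try:
        pid, status = os.waitpid(-1, os.WNOHANG)
    except OSError:                            # no child processes at all
        pid = 0
    if pid:
        print('child %d terminated' % pid)
        spawn_child()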

The classic: SIGCHLD with an interrupted system call

The classic Unix answer is SIGCHLD, the signal that is delivered to a parent process when one of its children terminates. You normally don't have to worry about SIGCHLD, since it's one of the few Unix signals that are ignored by default. But you're free to install a signal handler for it so you can do stuff as soon as a child terminates.

The nifty thing about signals is that they interrupt system calls, and ZeroMQ's poll() boils down to system calls. If ZeroMQ were trying to be clever (too clever by half), it might catch those errors and retry the poll() system call. Good news: ZeroMQ does not try to be clever. It does the right thing and exposes the interrupted system call to application code. (Pro tip: don't try to be clever. Just expose system errors to your caller with as little manipulation as possible. Your callers will thank you in the end. Thank you, ZeroMQ!)

So here's how it looks. First, the signal handler:

child_terminated = False

def handle_signal(sig, frame):
    global child_terminated
    child_terminated = True

There's very little you can do safely inside a signal handler, especially in Python. Assigning a constant to a global variable is about the only guaranteed safe action. The reason is that just about anything else might allocate memory with malloc(), and the signal might arrive while another malloc() call is already running; you cannot assume that the memory allocation system is re-entrant, so you must avoid anything that might call malloc(), which in Python is just about anything. The other thing to avoid is anything that might block, like reading from a socket or even writing to a local disk file. (“Local” disk files always turn out to be on failing disks or flaky NFS servers at just the wrong time.)

Next, just inside main(), we install the signal handler:

def main():
    signal.signal(signal.SIGCHLD, handle_signal)
    [...as before...]

As a first attempt, let's modify the main loop to check that child_terminated flag when poll() returns. After all, we expect poll() to block for 5000 ms or be interrupted by SIGCHLD, so we should get a quick reaction to the child process terminating:

while True:
    print('parent: poll(5000)')
    poller.poll(5000)
    if child_terminated:
        print('child terminated')
        child_terminated = False
        spawn_child()
    #print('parent: poll() returned')

Here's what happens with this version:

parent: poll(5000)
child: sleep(2)
child: exiting
Traceback (most recent call last):
  File "pollchild.py", line 58, in <module>
    main()
  File "pollchild.py", line 51, in main
    poller.poll(5000)
  File "/usr/lib64/python2.7/site-packages/zmq/sugar/poll.py", line 97, in poll
    return zmq_poll(list(self.sockets.items()), timeout=timeout)
  File "_poll.pyx", line 116, in zmq.core._poll.zmq_poll (zmq/core/_poll.c:1598)
  File "checkrc.pxd", line 21, in zmq.core.checkrc._check_rc (zmq/core/_poll.c:1965)
zmq.error.ZMQError: Interrupted system call

Ooops! The interrupted system call surfaces as an error in our application code! Looks like we need to catch that error. Here is the revised main loop:

while True:
    print('parent: poll(5000)')
    try:
        poller.poll(5000)
    except zmq.ZMQError as err:
        if err.errno != errno.EINTR:
            raise
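    # note: this loop runs inside main(), which is assumed to declare
    # "global child_terminated" so that clearing the flag below updates
    # the module-level variable set by the signal handler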
    if child_terminated:
        print('child terminated')
        child_terminated = False
        spawn_child()

This one works just fine:

parent: poll(5000)
child: sleep(3)
child: exiting
child terminated

There's no visible delay between "child: exiting" and "child terminated". The parent responds immediately to child termination, just as it would if any network activity arrived on the ZeroMQ socket(s) that it's polling.

You may have noticed that I took advantage of the parent's newfound knowledge to do something new: start another child process. This guarantees that there is pretty much always one child running, except during the brief interval between one exiting and the next starting. There will definitely never be more than one child, which is why we can get away with just a single child_terminated flag. Real life is never that simple, of course.
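
With more than one child, the usual refinement is to reap everything that has exited each time the flag trips, because several SIGCHLDs arriving close together can collapse into a single delivery. A sketch of that (not part of the program above):

# reap every child that has exited so far; useful when multiple SIGCHLDs
# may have coalesced into one delivery
def reap_children():
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:            # ECHILD: no children left at all
            break
        if pid == 0:               # children exist, but none have exited yet
            break
        reaped.append((pid, status))
    return reaped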

The classic, in C

If you appreciate classic Unix tricks like SIGCHLD and interrupted system calls, then surely you will appreciate seeing the same thing again in C. Here it is:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <zmq.h>

// volatile sig_atomic_t is the portable type for a flag that a signal
// handler writes and the main loop reads
volatile sig_atomic_t child_terminated = 0;

void handle_signal(int sig) {
    child_terminated = 1;
}

void spawn_child(void) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        exit(1);
    }
    if (pid > 0) {
        return;
    }

    // sleep from 2 .. 9 sec
    srandom(getpid());
    int stime = (random() % 8) + 2;
    printf("child: sleep(%d) ...\n", stime);
    sleep(stime);
    printf("child: exiting\n");
    exit(0);
}

int main(void) {
    // install SIGCHLD handler
    struct sigaction action;
    action.sa_handler = handle_signal;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    if (sigaction(SIGCHLD, &action, NULL) < 0) {
        perror("error installing signal handler");
        return 1;
    }

    // setup a 0MQ socket waiting for incoming TCP connections
    void *context = zmq_ctx_new();
    void *resp = zmq_socket(context, ZMQ_REP);
    int rc = zmq_bind(resp, "tcp://*:4522");
    if (rc != 0) {
        perror("zmq_bind (tcp://*:4522) failed: ");
        return 1;
    }

    // build list of things that we want to poll on
    int nitems = 1;
    zmq_pollitem_t items[] = {{resp, -1, ZMQ_POLLIN, 0}};

    spawn_child();

    while (1) {
        printf("poll() for 5 s ...\n");
        if (zmq_poll(items, nitems, 5*1000) < 0) {
            if (errno != EINTR) {
                perror("zmq_poll");
                exit(1);
            }
        }
        if (child_terminated) {
            printf("child terminated\n");
            child_terminated = 0;
            spawn_child();
        }
    }
}

A curious phenomenon: even though Python is typically much more concise than equivalent C code, that's not the case here: 70-odd lines of C versus 60-odd lines of Python, despite the need for explicit error checking in C. I find this often happens with low-level system programming. The closer you get to the OS, the smaller the benefit of using a high-level language.

The modern twist: signalfd() (Linux only)

It turns out that the classic Unix signal API is a bit awkward to use (read the sigaction(2) man page if you don't believe me). Perhaps you would rather deal with one abstraction than two, and file descriptors are a more general abstraction—and they're what poll() works with. Likewise zmq_poll() works with both 0MQ sockets and file descriptors.

It turns out that recent versions of Linux (2.6.22 and up) have a non-standard system call, signalfd(), that exposes a file-like interface to signal handling. Instead of installing a signal handler with sigaction(), you create a signal file descriptor with signalfd(). The setup is roughly as awkward as installing a signal handler, but once you have that file descriptor, things are a little neater: no more worrying about interrupted system calls.

There's one new header to include:

#include <sys/signalfd.h>

The implementation of spawn_child() doesn't change, so I'll skip over that. But pretty much everything in main() changed a bit, so here's the signalfd()-based version of main():

int main(void) {
    // build the list of signals that we're interested in (just SIGCHLD)
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    // block SIGCHLD from being handled in the normal way
    // (otherwise, the signalfd does not work)
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) {
        perror("sigprocmask");
        return 1;
    }

    // create the file descriptor that will be readable when
    // SIGCHLD happens, i.e. when a child process terminates
    int sigfd = signalfd(-1, &mask, 0);
    if (sigfd == -1) {
        perror("signalfd");
        return 1;
    }

    // setup a 0MQ socket waiting for incoming TCP connections
    void *context = zmq_ctx_new();
    void *resp = zmq_socket(context, ZMQ_REP);
    int rc = zmq_bind(resp, "tcp://*:4522");
    if (rc != 0) {
        perror("zmq_bind (tcp://*:4522) failed: ");
        return 1;
    }

    // build list of things that we want to poll on
    int nitems = 2;
    zmq_pollitem_t items[] = {
        {resp, -1, ZMQ_POLLIN, 0},
        {NULL, sigfd, ZMQ_POLLIN, 0},
    };

    struct signalfd_siginfo siginfo;

    spawn_child();

    while (1) {
        printf("poll() for 5 s ...\n");
        int rc = zmq_poll(items, nitems, 5*1000);
        if (rc < 0) {
            perror("zmq_poll");
            exit(1);
        }
        if (items[0].revents & ZMQ_POLLIN) {
            printf("0MQ messages received\n");
        }
        if (items[1].revents & ZMQ_POLLIN) {
            printf("child process terminated\n");
            ssize_t nbytes = read(sigfd, &siginfo, sizeof siginfo);
            if (nbytes != sizeof siginfo) {
                perror("read(sigfd)");
                return 1;
            }
            spawn_child();
        }
    }
}

As you can see, the initial overhead to create one file descriptor that exposes one signal is a bit more than installing a traditional signal handler. The 0MQ stuff is the same, except that now we're passing a list of two items to zmq_poll()—and one of them is a system file descriptor rather than a 0MQ socket. That of course is the key change. Finally, interpreting the outcome of zmq_poll() is totally different. Errors are errors, period: we make no exceptions for EINTR. Instead, socket activity and SIGCHLD both appear as readable things: a 0MQ socket that can recv() a message, or a file descriptor that we can read() from. (Note that it's essential to actually call read() on the signalfd() file descriptor: until you do that, the signal remains pending, the file descriptor remains readable, and zmq_poll() returns immediately.)

The disadvantage of this approach is portability: it only works with Linux 2.6.22 and later. More subtly, it only works with programming languages that expose signalfd().

signalfd() in Python

Unfortunately, current versions of Python (2.7.5 and 3.3.2 as I write this) do not expose signalfd() in the standard library. Luckily, Jean-Paul Calderone has written a wrapper, which you'll find in PyPI (https://pypi.python.org/pypi/python-signalfd). The documentation is a bit lacking and the API not quite complete, but I got it to work.

I installed it with

pip install --user python-signalfd

(You'll want to leave out the --user option if you're using a virtualenv.)

As with the C version, most of the changes are in main(). There's no more signal handler and no more child_terminated global variable.

Here's the code:

def main():
    # list of signals that we're interested in (just SIGCHLD)
    mask = [signal.SIGCHLD]

    # block SIGCHLD from being handled in the normal way
    # (otherwise, the signalfd does not work)
    signalfd.sigprocmask(signalfd.SIG_BLOCK, mask)

    # create the file descriptor that will be readable when
    # SIGCHLD happens, i.e. when a child process terminates
    sigfd = signalfd.signalfd(-1, mask, 0)

    # setup a 0MQ socket waiting for incoming TCP connections
    context = zmq.Context()
    resp = context.socket(zmq.REP)
    resp.bind("tcp://*:5433")

    # things we want to poll() on
    poller = zmq.Poller()
    poller.register(resp)
    poller.register(sigfd)

    spawn_child()

    while True:
        print('parent: poll(5000)')
        ready = poller.poll(5000)
        for (thing, flags) in ready:
            if flags & zmq.POLLIN == 0:
                continue
            if thing is resp:
                print('0MQ messages received')
            elif thing is sigfd:
                print('child process terminated')

                # YUCK: 128 is a magic number, sizeof(struct signalfd_siginfo)
                # on the Linux box where I wrote this code (kernel 3.11.0,
                # x86_64). You'll need to write a small C program to determine
                # its value on your machine.
                data = os.read(sigfd, 128)
                assert len(data) == 128
                spawn_child()

That magic number—128 for sizeof(struct signalfd_siginfo)—is what's missing from the signalfd module. Someone really should submit a patch for that. Even better would be code to read() the struct and unpack it to a Python namedtuple.
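
Something along these lines might do it. This is only a sketch, based on the field layout documented in signalfd(2): it unpacks the leading fields and treats the rest of the 128-byte record as padding.

import os
import struct
from collections import namedtuple

# leading fields of struct signalfd_siginfo, per signalfd(2); the rest
# of the 128-byte record is padding reserved for future fields
_SIGINFO_FORMAT = '=IiiIIiIIIIiiQQQQ'
_SIGINFO_SIZE = 128

SigInfo = namedtuple('SigInfo',
                     'signo errno code pid uid fd tid band overrun '
                     'trapno status int_ ptr utime stime addr')

def read_siginfo(sigfd):
    data = os.read(sigfd, _SIGINFO_SIZE)
    assert len(data) == _SIGINFO_SIZE
    used = struct.calcsize(_SIGINFO_FORMAT)
    return SigInfo(*struct.unpack(_SIGINFO_FORMAT, data[:used]))

For a SIGCHLD event, the pid and status fields would tell the parent which child exited and how, which is also what you'd need in order to manage more than one child at a time.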

Author: Greg Ward
Published on: Dec 18, 2013, 9:30:06 AM - Modified on: Aug 30, 2014, 3:38:01 PM
Permalink - Source code

Greg's Essential Doctor Who

The other day, I was chatting with a friend who mentioned that he was planning to use his upcoming paternity leave to catch up on his Doctor Who viewing ... and that he had seen none of the classic (1963-89) show!

This is not like catching up on Mad Men or Game of Thrones, folks. We're talking 26 years of episodic TV. Furthermore, unlike programs being made today with high production values, good acting, and excellent writing, the quality of Doctor Who over the years has been spotty. And, oh yeah, it's a children's show which for the first 10 years or so had no idea there might be adults watching. Finally, many episodes from those first 10 years are just gone: the BBC needed to free up shelf space or precious videotape or something, so wiped the tapes to reuse them. (No, they're not on YouTube.)

So unless you're able to devote hundreds and hundreds of hours to watching cardboard robots clamber about granite quarries while spaceships with visibly attached strings fly overhead, you need to be selective. And if you've never seen any of those old episodes, where will you start? Well, you could do worse than ask a friend who's been watching Doctor Who since he was a teenager, has seen most episodes that are still available (many of them twice) and has pretty strong opinions on most of them. That'd be me.

Credentials

I'm not British, nor have I ever lived in Britain. I watched Doctor Who on public TV in North America as a teenager, not as a child. I never hid behind the couch, and I often cringed at the godawful special effects. But I mostly loved the writing and the acting, even (especially!) at its most over-the-top. That's part of the fun.

So I don't have the same perspective on the show as millions of Brits do. I have the North American geek perspective. You have been warned.

Organization

The organization of Doctor Who has always been a bit complex. The basic unit is an episode, typically 25 minutes, broadcast for decades on Saturdays around teatime. (Hey, it's a British show, so we get to use British units of time.) In the eighties they messed around with slightly longer episodes shown on different days, but one thing never changed: every episode was part of a longer serial. Most serials were four episodes long; some were six, a few were two, and one epic from the early days was twelve episodes long.

Each season (aka "series" in British TV-speak) consisted of around 5–8 serials. A couple of seasons (16, 23) shoehorned all their serials into a connected story arc.

Finally, the key to the show's longevity and inventiveness: every couple of years the main character "dies" and regenerates with a new actor in the role, so the obvious way to group seasons is by Doctor: first, second, third, etc.

(Incidentally, the episode/serial thing forced the writers to insert an arbitrary cliffhanger every 25 minutes or so; this rarely improved the narrative. It also means the first little bit of each episode recaps the cliffhanger from the previous episode, which is kind of annoying when you're watching episodes back-to-back, rather than a week apart as originally broadcast.)

Anyways, I'm grouping things by Doctor, and recommending serials (in boldface). There's little point in watching disconnected episodes, and watching whole seasons is rarely necessary (or even desirable). (There are a few negative recommendations snuck in there. They're also in boldface, so you actually have to read the text.)

First Doctor (William Hartnell) (1963–66)

In 1963, nobody had any idea that there were such things as Time Lords and TARDISes. The audience had to learn about them from scratch, in the very first story: An Unearthly Child. Unavoidably cheesy, in that the writing and acting were aimed squarely at children under 10—which is pretty much the case up until the Third Doctor. But essential viewing, if only for historical reasons.

The other thing nobody knew about in 1963 was Daleks, which is why the very second serial, The Daleks, is also important. In fact, I'd argue there are really only two Dalek stories you need to see, and this is one of them.

One feature of early Doctor Who, sadly abandoned many decades ago, is the "historical" story—i.e. one set in earth's past with no sf elements. The first of these was Marco Polo, but all that remains of that is audio and still photos. (Apparently the costumes and sets were quite impressive: the BBC always did have a flair for historical drama.) As second prize, perhaps try The Aztecs, the only historical serial from the first season to survive in its entirety.

Unfortunately I have nothing to say on the rest of the First Doctor; I've only seen a couple episodes that I remember, and they were pretty lame. I should specifically point out The Gunfighters as probably the worst of the early serials. Before this, you might have thought that the BBC was not meant to make Westerns; if you watch the whole thing, you will realize that instead the BBC was meant specifically to do anything, ANYTHING, other than make Westerns.

I'd like to say The Tenth Planet is essential on the grounds that it introduces the Cybermen (as long-running bad guys, second only to the Daleks) and that it features the first ever regeneration. But I've never seen it, and the fourth episode (with the regeneration scene) is lost. Boo.

Second Doctor (Patrick Troughton) (1966–69)

Sadly, I have even skimpier knowledge of Second Doctor stories than of the First Doctor era. Looking at the episode list on Wikipedia, that shouldn't be a surprise, as only a handful of Troughton serials have survived the ravages of time.

I do vaguely recall ploughing through the final Second Doctor serial, The War Games. It goes on and on for an interminable 10 episodes. Not very interesting, except that it's the first time we learn much about the Doctor's backstory, and it explains why the Third Doctor spends so much time on Earth.

Third Doctor (Jon Pertwee) (1970–74)

A number of good things happened when Doctor Who entered the seventies. Most obviously, the BBC splashed out and switched from black-and-white to colour. There are no more lost episodes (see Wikipedia for the tedious details of how various episodes were recovered). Finally, the writing and acting got a bit better. Not great, mind you, but watching Pertwee episodes feels more like entertainment and less like fanboy duty.

The Third Doctor's debut serial, Spearhead from Space, is worth watching because it introduces UNIT, a military outfit that defends the earth from alien invasion. (Why doesn't the real world have one of these?) We also meet UNIT's commander, Brigadier Lethbridge-Stewart, a longstanding and much-loved recurring character. This serial also introduces the Autons, baddies who were revived along with the show in 2005.

More importantly, Season 7 features the first serial that is great in its own right, not just as important background material: Inferno. It's a genuine nail-biter, with a mad scientist drilling STRAIGHT INTO THE EARTH's MANTLE OMG. No Balrogs were harmed in the filming of this serial, but it's a rip snorter all the same.

Another long-running character is introduced in Terror of the Autons, the first serial of Season 8. I speak, of course, of the Master. I don't remember this story terribly well, but the Master has been a foil to the Doctor for so long, in so many serials, and was played so deliciously well by Roger Delgado, that it's worth revisiting his debut.

Finishing up Season 8 is The Daemons, which exemplifies a minor trend in Doctor Who: giving a vaguely rational/scientific explanation for apparently supernatural phenomena. I've always liked this kind of story: they tiptoe close to the extreme suspension-of-disbelief required to enjoy horror, but end up making it sf in the end.

There's nothing memorable from Season 9, but the Season 10 finale features giant green glowing maggots. How can you go wrong with a serial called The Green Death?

Season 11 opens with The Time Warrior, a fun story that introduces a new monster (the Sontarans) and one of the show's best sidekicks (Sarah Jane Smith). I've always had a soft spot for serials that actually use the TARDIS to cut between different times and places, and this is one of them.

Fourth Doctor (Tom Baker) (1974–81)

Having skipped the Third Doctor's swansong, I suppose I should recommend the Fourth Doctor's debut, Robot. It's moderately entertaining, but hardly essential. But it's always fun to meet a new Doctor. This serial also introduces the dapper and genial Harry Sullivan, a companion for most of the next two seasons.

The one essential serial from Season 12 is also the second essential Dalek story: Genesis of the Daleks. This is Doctor Who at its best: trench warfare, genetic engineering, (ab)using time travel to meddle with history, a mad scientist who's also an evil dictator, a brutal ethical dilemma for the Doctor, and snapshots of real people caught up in horrific events. Come to think of it, all televised sf should strive for this.

Season 13 features another classic: Pyramids of Mars. Like The Daemons, this one offers a rational explanation for an apparently supernatural phenomenon, and the phenomenon in this case is a memorably venomous bad guy. Most take-over-the-universe stories are silly, but you can almost believe this particular bad guy pulling it off—and you would not want to live in the resulting universe. This serial also takes advantage of the TARDIS as part of the story, rather than simply a device for dumping the characters somewhere new at the start.

I have a soft spot for The Hand of Fear from Season 14. Although it's a typical rubber monster story, it's pretty entertaining. The last five minutes are important though: every so often we get a reminder that the Doctor is not like us, and this is one of them. If you find the story boring, that's OK, but at least watch the last five minutes of episode 4.

Those last five minutes lead directly into The Deadly Assassin, which is mandatory viewing. It's one of the rare stories set on Gallifrey, the Time Lords' planet. It features political intrigue rather than rubber monsters. And it continues to cast the Doctor in an interesting new light.

And from there it's straight into The Face of Evil (wow, I guess Season 14 was on a roll). Again, this is just plain good sf in its own right, entertaining and thought-provoking in equal measures. Plus it introduces a new companion, Leela, who 1) kicks ass, 2) rarely (if ever) screams, and 3) isn't half-bad looking.

Season 15 opened with Horror of Fang Rock, which eschewed both rubber monsters and take-over-the-universe plots, substituting psychodrama in a tight, confined historical setting. Oh yeah, there's an exploding spaceship too, but it's a minor element. And this serial demonstrates why Leela kicks ass.

The Invasion of Time closes Season 15. It's another Gallifrey story, so another building block in the Doctor's (back)story. I don't recall it being particularly great viewing, though.

Season 16 was Doctor Who's first attempt at a season-long story arc. The first serial, The Ribos Operation, introduces both the arc and the mysterious White and Black Guardians—Manichean figures locked in an eternal struggle for the fate of the universe, sort of Doctor Who's answer to the two sides of The Force. Oh yeah, it's also a fine episode in its own right: no mad scientists, nobody conquering the universe, just schemers trying to make a buck while the Doctor interferes.

The rest of Season 16 was pretty lame. The second serial, The Pirate Planet, is mildly interesting since it was written by Douglas Adams. It's not actually all that good, however.

From Season 17, City of Death is great viewing: it incorporates time travel, explains a hitherto-unknown historical mystery, and has some good acting to boot. There may have been a rubber monster threatening to take over the universe, but such things are OK in occasional doses.

I also quite liked Nightmare of Eden: imagine an Agatha Christie mystery set on a spaceliner in the far future, then add a Time Lord and TARDIS, some sort of weird transdimensional spaceship collision, and hallucinogenic drugs. Oddly enough, it all hangs together.

Fifth Doctor (Peter Davison) (1981–84)

Sixth Doctor (Colin Baker) (1984–86)

Seventh Doctor (Sylvester McCoy) (1987–89)

Author: Greg Ward
Published on: Nov 29, 2013, 7:45:05 PM - Modified on: Nov 29, 2013, 10:09:00 PM
Permalink - Source code

Archiving historical PyCon web sites

In preparation for PyCon 2014, the organizers wanted to make static archives of the sites for past years. The 2011 site was already suffering from bitrot (the stylesheets had disappeared), and we wanted to grab the 2012 and 2013 sites before they too started to rot. Noah Kantrowitz was the instigator, and he suggested using httrack. I volunteered to help out, and settled on httrack by default. I used httrack 3.43.9, the version available in Debian 6.0 (squeeze), since that's what I'm running on my personal web server.

The initial mirror was easy:

mkdir /tmp/pycon && cd /tmp/pycon
httrack -w -o0 -K4 -c20 https://us.pycon.org/{2011,2012,2013}

where:

In order to get my web server to serve the mirrored content statically, I put them in a simple directory structure:

mkdir -p /var/www/pycon.gerg.ca
mv us.pycon.org/{2011,2012,2013} /var/www/pycon.gerg.ca/.
cd /var/www/pycon.gerg.ca

Of course, I also had to create a DNS record and configure my web server to serve that directory as pycon.gerg.ca.

In order to keep track of my changes, I turned each year into its own Mercurial repository:

cd 2011
hg init
hg add -q
hg commit -m"mirror of http://us.pycon.org/2011/, grabbed by httrack 3.43-9, ending 2013-06-19 12:47"

(and similar for 2012, 2013). (I could have put all three years into one big repository, but doing it this way seems more future-proof. At some point, we're going to want to archive 2013 and 2014 similarly.)

Now I can start finding and fixing problems. If a fix step goes horribly wrong, I can just hg revert the result and try again.

Unnecessary revision history

The 2011 and 2012 sites had remnants of revision history -- presumably a feature of Pinax? The static archive only needs to show the final revision of each page, so I nuked the revision history:

cd 2011
hg rm -I 're:.*/rev[0-9]+/' -I '**/history/*' .
hg commit -m"remove old revision history"

The interface for editing pages is useless, since it just redirects to a Django login page, which of course won't work in the static archive. Get rid of it:

hg rm -I '**/edit/*' .
hg ci -m"remove edit pages (they just redirect to the login page)"

Mystery login pages

All three sites had a bunch of mystery pages with paths like 2011/account/login/index0000.html. I'm guessing there were links from old revisions to those pages, which is why httrack captured them. Now that the old revisions are gone, make sure nothing left in the static site references them:

hg locate -0 | xargs -0 grep 'index[0-9a-f][0-9a-f][0-9a-f][0-9a-f]'

That found nothing, so remove them:

hg rm account/login/index????.html account/login/index????-?.html account/signup/index????.html
hg ci -m"remove mystery index????.html pages (unreferenced)"

Weird stylesheet names

Several .css files had weird URLs like /2012/site_media/static/css/pycon.css?10. That's OK in a dynamic site, but doesn't work so well with static filenames. (In fact, httrack mangled those URLs: references to them in HTML remained unchanged, but the files themselves turned out like pycond3d9.css -- apparently some sort of failed URL escaping going on there.) Regardless: it's broken, so I fixed it:

hg locate -0 -I '**.html' | xargs -0 perl -pi~ -e 's|(/201\d/site_media/static/css/.*.css)\?\d+|$1|g'
hg ci -m"fix stylesheet naming oddity"

Conclusion

Naturally, the precise sequence of fixups was slightly different for each of the PyCon sites that I captured (2011, 2012, and 2013). This blog post is a guideline and aide-memoire, not a tested, debugged, production-ready script. ;-)

Author: Greg Ward
Published on: Jun 21, 2013, 11:14:49 AM
Permalink - Source code

Unit-test your mail server with eximunit

Many years ago, I was one of the volunteer admins for starship.python.net and mail.python.org. I also ran my own personal email server for several years before giving in and switching to Gmail. In both capacities, I regularly tinkered with the configuration of Exim, the MTA (message transfer agent, aka email server) used on all of those machines. Every time I did so, I was a little bit nervous that my change might break the existing configuration. So I did a flurry of manual testing each time.

Now I'm trying to wean myself off Gmail and go back to running my own email server. I still like Exim, so figured I'd stick with what I know best. But there's still that little problem: how do I know my email server configuration is correct, and how do I keep it correct while changing it?

Of course the answer is obvious to a programmer: automated tests! It turns out that I'm not the only one who has faced this problem. Unlike me, David North actually did something about it and wrote eximunit. The idea is simple: it runs exim -bhc (fake SMTP conversation) in a subprocess and tests that the fake SMTP server behaves as expected. Bogus recipients are rejected, good recipients are accepted, relaying is denied, etc.

Here's an excerpt from the test script for my personal email server, which accepts email to domains gerg.ca and lists.gerg.ca, but nothing else:

class ExternalTests(eximunit.EximTestCase):
    def setUp(self):
        self.setDefaultFromIP("192.168.0.1")

    def test_no_relay(self):
        session = self.newSession()
        session.mailFrom("spammer@example.com")
        session.assertRcptToRejected("victim@example.net", "relay not permitted")

    def test_known_recipients(self):
        session = self.newSession()
        session.mailFrom("random@example.com")
        session.rcptTo("*CENSORED*@gerg.ca")
        session.rcptTo("*CENSORED*@gerg.ca")
        session.rcptTo("*CENSORED*@gerg.ca")

In that last test, I'm obviously censoring three different and valid email addresses -- don't want to make things too easy for the bad guys.

Anyways, the idea is fairly simple: you create a session (which is just a wrapper for exim -bhc in a child process) and call methods like mailFrom() or rcptTo(). Those methods assert that the command works, returning the expected 250 response code. For negative tests, you use methods like assertRcptToRejected(), which tests that the response code is a failure (550 in this case) and that the rejection message is what you expect.

Of course, this is not a full end-to-end test for an email server. It says so right there in the name: eximunit. If you want functional testing or integration testing, perhaps you want something like Swaks? (I have not tried it.)

Author: Greg Ward
Published on: Apr 24, 2013, 2:51:53 PM
Permalink - Source code

A brief history of Python Distutils

When I started using Python in September 1998, I pretty quickly noticed there was a problem in the ecosystem: every library that included an extension module had its own little Makefile that cribbed from Python's own Makefile, which was (and still is) installed alongside the standard library. Libraries that did not include extensions generally had a README file that said, "copy foo.py to a directory on sys.path" and left it at that. The audience was pretty clearly fellow Python developers who wanted to use the library and knew exactly what sys.path was. Worst of all, anyone wanting to build extension modules on Windows was on their own. This was no secret; everyone in the community at the time knew it was a problem, but everybody was too busy with their own stuff to tackle it. (Or they had the sense to stay well away from it.)

So, in a nutshell, I started the distutils project. Well, OK, really I scheduled a session at the 1998 International Python Conference (the precursor to PyCon) called "Building Extensions Considered Painful", with the goal of figuring out what we would do about it. A number of people much smarter and more experienced than me were there, but I only remember Greg Stein and Barry Warsaw. My recollection of what happened in that conference session was:

What I don't recall is how I got roped into writing most of the code. It's entirely possible that I volunteered to do it and nobody stopped me.

About those unit tests

I gather that the lack of unit tests is a frequent complaint for anyone who tries hacking on the distutils. I thought the reason for this was perfectly obvious until I had to explain it to Nick Coghlan in person at PyCon 2013. It's quite simple: distutils predates unit testing. Or at least, unit testing in Python. I'm sure Kent Beck had something working in Smalltalk by then, but I was quite unaware of it.

Anyways, it's pretty clear by looking at history:

However, that does not excuse the lack of automated functional tests. For that, I have no excuse. I vaguely recall either 1) I didn't know how to do it, or 2) I didn't want to write a dedicated testing framework just for distutils. If I were doing it today, I could probably figure something out. But I now know more about programming in general, and about automated testing in particular, than I did then.

Author: Greg Ward
Published on: Mar 28, 2013, 6:08:42 PM
Permalink - Source code

PyCon 2013: What a rush!

I like to tell people that my first PyCon was so long ago, it wasn't even PyCon. Back then, the community's big gathering was called the International Python Conference, and my first one was Houston in November 1998. That's surprisingly relevant to a big theme of PyCon 2013 -- more on that later. Despite a good string of attendance at IPCs from 1998 to 2001, this year was only my second PyCon (after Atlanta in 2011). Nutshell version: holy cow what an intense/exhilarating/crazy/overdrive experience.

Volunteering

For some reason it never occurred to me to volunteer at a conference before. Especially considering that PyCon is almost entirely run by volunteers, I now have to wonder: what was I thinking? This year I noticed some tweets and emails encouraging people to volunteer, so I did it. Turns out that volunteering at PyCon is how you peek behind the curtain and nudge closer to the inner circle without having to do all that much. It's a brilliant scam, but it seems to work because there are a lot of people willing to do a bit, and some people willing to do a lot.

The fun part was helping at the registration desk. I literally got off the plane, took a taxi to the conference centre (with a gaggle of other Montrealers who happened to be on the same plane), registered and got my badge, and went right behind the desk to register other people. Oh yeah, there was a 5-minute training session in there somewhere, so I had a vague idea what I was doing.

The work part was being a session runner. This involves being in the Green Room well before a given talk starts, making sure the speaker is there, finding them if not, making sure their slides are saved, making sure they can hook their laptop up to the projector, getting them to the right room at the right time, and then leaving their interesting talk halfway through to start the process again for the next talk. I managed to be late for my first session because I just plain forgot, which was a minor panic. Then, because irony exists as an actual physical force in the universe, I was late for my second session because I was enjoying a leisurely lunchtime chat with Mathieu Leduc-Hamel and Yannick Gingras about how great it is to volunteer at PyCon. Argghh. Ah well, it all came together in the end.

Good talks

As usual, there was way too much interesting stuff on hand to see everything I wanted to see. Really, something must be done about that next year. Or we'll just have to stick with recording everything so you can catch what you missed later on at home.

Here are the talks that I saw in person and recommend:

There are a bunch more on my list of talks to watch on my laptop when too tired for programming but not tired enough to completely collapse:

... and I just got tired of copying and pasting, despite being only halfway through the list. You get the picture: there were way more interesting-sounding talks than any one person could attend in person. (And if you view a talk from home that turns out to be boring, you can walk away without hurting anyone's feelings!)

Packaging

I mentioned that there was a secret hidden connection between my first International Python Conference in 1998, and PyCon 2013: packaging. Well, really, I should say build and packaging tools, since the problem back in 1998 was building, but the problem today is packaging.

The whole story is a bit long, so I'll write it up in a separate post. TL;DR: the basic design of distutils came out of the 1998 conference, and I wrote most of the code from late 1998 to late 2000. Standard build tool for Python libraries: problem solved!

But we punted on a couple of key issues, specifically dependencies and packaging. My opinion at the time (1999) was that packaging was a solved problem: it's called Debian, or Red Hat if you prefer. Both have excellent packaging systems that already worked fine, and I felt it was silly to reinvent that wheel for one particular programming language. If you need library X in your production environment, then you build a package of it using distutils. When Harry Gebel contributed the bdist_rpm command to build a simple RPM from information in your setup script, that was a key step. We just made sure it was possible to build only the .spec file, because of course you would often need to tweak that manually. Then I waited for someone to contribute bdist_deb, but it didn't happen.

It turns out, unfortunately, that people persist in using deficient operating systems with no built-in packaging system. More importantly, programmers like to use the latest and greatest version of library X, which is incompatible with OS vendors wanting to stick with stable known versions. And building OS packages is a pain, especially when you're in development and playing around with libraries.

Unfortunately, I burned out and lost the energy to keep working on distutils before any of this became apparent. So over the years, various people have tried to address these problems in various incompatible ways. I've pretty much ignored things, because after all packaging was a solved problem for me (use OS packages, or build your own).

Anyways, it appears that 2013 is the year the Python community has decided that enough is enough, and we need One True Packaging Solution. As a result, there were a couple of packaging-related sessions at PyCon this year. Probably the most important development is that Nick Coghlan has volunteered (was volunteered?) to be the czar of all things packaging. I wish him well. Heck, maybe I'll even contribute!

Hallway track

I chatted with a number of people, some long-time acquaintances and some total strangers. I mentioned to several of them that I had been using Go for a couple of months, and every single one of them was quite curious about it. So I definitely need to write up this Python hacker's view of Go. (Nutshell version: largely positive, but I miss exceptions.)

I was also fairly shameless about looking for a job, which I've been in the midst of for the past week or two. One piece of advice: PyCon is an excellent place for programmers to look for work; the job fair reminded me of a two-way feeding frenzy. Another piece of advice: don't fire off a bunch of job application emails right before going to a conference. You'll spend way too much time hunched over your smartphone trying to type professional-sounding but still short email messages explaining that you can't really talk right now, but next week would be great. Maybe I should have bought one of those spiffy ultra-light laptops so that whipping a real computer out for processing email isn't so painful. Oh well.

Open spaces & sprints

I scheduled two open spaces: a Mercurial clinic and a place to debate the wisdom of extending vs. embedding.

I was late for the Mercurial clinic because of another really interesting open space; many thanks to Augie Fackler for showing up on time and fielding most of the questions. Oooops. We fielded some basic questions like, "do I really have to merge this often?", explaining why it works that way, what the alternatives are, and so forth. The usual stuff for someone new to DVCS.

Augie also explained changeset evolution to a couple of people. Evolution is a very cool feature Mercurial has been growing slowly over the past couple of releases to allow safe mutable history. As of Mercurial 2.5, it's good enough that Mercurial's developers are eating their own dogfood and using it to develop Mercurial. I've also been using it heavily on personal projects. Hopefully it will be solid enough by 2.6 that enthusiastic power users can start to use it instead of MQ.

After the Mercurial Clinic, Augie and I popped into the Python 3 Porting Clinic organized by Barry Warsaw. That's when the real fun began. Augie has been plugging bravely away at porting Mercurial to Python 3, and all is not well. Barry opened issue 17445 as a result of Augie's first showstopper bug, and I've been working on a patch. No doubt more showstoppers will follow. I stayed for the first day of sprinting, and spent the day alternating between fixing that bug and whining to people about the various unpleasant ways of fixing it. I got to meet several thoroughly decent people in the process: Richard Jones, Nick Coghlan, Toshio Kuratomi, and Buck ... umm, sorry Buck, I didn't catch your last name.

My other open space was intended to spark a little debate about extending vs. embedding. I've been working on a build tool that revolves around embedding "real" languages like Python, and suffered a minor crisis of confidence when I stumbled across Glyph Lefkowitz's anti-embedding rant from 2003. So I was hoping this session would clear the air a bit. It didn't. I guess I should have invited Glyph personally. My personal suspicion is that there do exist valid use cases for embedding, but you had better make damn sure it's the right choice. "When in doubt, choose extending" is not as strong as Glyph's stance, but there you go. It's my opinion today.

Wrap up

Everybody knows that the language is great, and that the library is (mostly) great too. Likewise, the collection of modules on PyPI is a good thing (if a bit overwhelming).

But honestly, it's the people that make hacking in/on/with Python so fun and rewarding. I know there is a stereotype of programmers as anti-social, maladjusted, unpleasant people, but that stereotype just vanishes at PyCon. Whether chatting with total strangers who I'll probably never meet again, or reconnecting with former colleagues, or meeting people who I've only "known" online before, PyCon is just about the friendliest and most welcoming environment I've ever been in. That's just as true with 2,500 people in 2013 as it was with 250 in 1998.

Author: Greg Ward
Published on: Mar 28, 2013, 6:08:42 PM
Permalink - Source code

Command-Line Parsing Libraries for Go

I'm working on a project written in Go, and I've gotten to the point where I need to pick a command-line parsing library. I'm not fond of the standard flag package for a couple of reasons:

So I thought I'd shop around and see what's out there. What follows are my impressions of all the command-line parsing libraries listed in the Go Project Dashboard.

Disclaimer: I'm the original author of the Python standard library module optparse, so I'm biased in favour of convenient, automated conversion of command-line options to programmer-accessible variables. I also like automatic generation of nice-looking help. And I prefer GNU style:

Executive summary: I evaluated 8 of the libraries listed in the Go Project Dashboard. Three of them look promising: gnuflag, go-options, and pflag.

The remainder appear to be incomplete and/or abandoned (e.g. they don't build, are undocumented, or have no unit tests).

All evaluations were done with Go 1.0.3 using the latest version of each package as of December 18, 2012.

launchpad.net/gnuflag

Claims to be compatible with the standard flag package, but supports GNU/POSIX syntax instead.

Pros:

  • mostly works as advertised: implements some of GNU/POSIX syntax
  • documented
  • tested
  • compiles and works

Cons:

  • does not conform to current "go build" rules (but the fix is trivial)
  • no support for aliases: -q and --quiet are entirely different options, and will get separate entries in help output
  • does not support option clustering (-ab equivalent to -a -b)
  • similarly, does not treat -abfoo as equivalent to -a -b foo
  • does not allow abbreviation of long options
  • default help output is ugly (but gnuflag provides an easy hook to override its help output)

code.google.com/p/goargcfg

Undocumented, so I did not evaluate it.

github.com/droundy/goopt

Like gnuflag, it claims to be compatible with flag but to support GNU/POSIX syntax. I could not verify this for myself, since it does not build.

Pros:

  • documented
  • comes with an example program

Cons:

  • does not build with "go build" or with the supplied Makefile (appears to predate "go build")
  • no unit tests (there is a test program, but it doesn't build either)

github.com/gaal/go-options

Yet another implementation of GNU/POSIX syntax, but using a totally different API from flag. You write a big string that describes all of your options, and the library parses it at runtime and then parses the command line. Panics if your specification is malformed.

Pros:

  • mostly works as advertised: implements some of GNU/POSIX syntax
  • nicely documented
  • tested
  • easy-to-use API
  • generates decent-looking help
  • compiles and works (modulo limitations below)

Cons:

  • does not support option clustering (-ab equivalent to -a -b)
  • similarly, does not treat -abfoo as equivalent to -a -b foo
  • sloppy syntax: treats -verbose as equivalent to --verbose
  • some people might dislike the dynamic API and prefer compile-time checking (doesn't bother me)

github.com/fd/options

Similar idea to go-options, but the documentation is very thin. As a result, I was unable to get it to work.

Cons:

  • insufficient documentation
  • does not work (at least not for me)
  • generated help is not very helpful

code.google.com/p/optparse-go

Appears to be based on my Python module optparse, so I have an inherent bias in favour of this one. Unfortunately, it doesn't build.

Pros:

  • should implement all of GNU/POSIX syntax, since optparse does
  • should support option aliases, since optparse does
  • should support abbreviation of long options, since optparse does

(the above are presumptions, not verified by reading the code or trying it)

Cons:

  • not compatible with "go build" (mixed packages in same directory)
  • uses lots of obsolete library packages (needs "go fix")
  • even then, it still doesn't compile
  • no documentation

code.google.com/p/opts-go

Pros:

  • some documentation

Cons:

  • documentation doesn't give high-level overview; you just have to figure it out
  • no unit tests (there are two *_test.go files, but they test almost nothing)
  • does not treat --file foo as equivalent to --file=foo
  • doesn't support abbreviation of long options
  • help generation code is incomplete -- doesn't actually generate help

github.com/ogier/pflag

Yet another implementation of GNU/POSIX syntax using the same API as flag.

Pros:

  • documented
  • unit tests
  • supports long and short options
  • supports option clustering
  • treats -abfoo as equivalent to -a -bfoo

Cons:

  • not compatible with "go build" / "go test" (but the fix is trivial)
  • no support for abbreviating long options
  • does not treat --file foo as equivalent to --file=foo

Author: Greg Ward
Published on: Dec 23, 2012, 1:20:13 AM
Permalink - Source code