Random notes from mg

a blog by Marius Gedminas

Marius is a Python hacker. He works for Programmers of Vilnius, a small Python/Zope 3 startup. He has a personal home page at http://gedmin.as. His email is marius@gedmin.as. He does not like spam, but is not afraid of it.

Sat, 13 Mar 2010

Review: Python Testing: Beginner's Guide

I've been testing (as well as writing) Python code for the last eight years, so a book with the words Beginner's Guide prominently displayed on the cover isn't something I'd've decided to buy for myself. Nevertheless I jumped at the offer of receiving a free e-copy for reviewing it.

Python Testing: Beginner's Guide by Daniel Arbuckle

Short summary: it's a good book. I learned a thing or two from it. I don't know how well it would work as an introductory text for someone new to unit testing (or Python). Some of the bits seemed overcomplicated and underexplained, and parts of the example code/tests seemed to contain design decisions received from mysterious sources.

Incidentally, Packt uses a simple yet effective method for watermarking e-books: my name and street address are displayed in the footer of every page. What's funny is that the two non-ASCII characters in the street name are replaced with question marks. It's not a data entry problem: the website that let me download those books shows my address correctly, so it must be happening somewhere in the PDF production process. I didn't expect this kind of Unicode bugginess from a publisher. Then again there were occasional strange little typographical errors in the text, like not leaving a space in front of an opening parenthesis in an English sentence, or using a never-seen-before +q= operator in Python code. I was also left wondering how the following sentence (page 225) could slip past the editing process:

doctest ignores everything between the Traceback (most recent last call).

Thankfully those small mistakes did not detract from the overall message of the book.

I liked the author's technique of showing subtly incorrect code, letting the reader look at it and miss all the bugs, and then showing how unit or integration tests catch the bugs the reader missed. I'm pretty sure there's at least one remaining bug that the author missed in the example package (storing a schedule doesn't erase old data), which could serve for a new chapter on regression testing if there's a second edition.

Summary of topics covered:

I found the TDD cycle a bit larger than I generally like, but I believe it's a matter of taste, and perhaps a shorter cycle wouldn't work as well in a written medium.

I found it a bit jarring how the Twill chapter intrudes between the two chapters showing unit testing and integration testing of the same sample package. I think it would've been better to swap the order of chapters 8 and 9.

I liked the technique presented for picking subsets of the code for integration tests, although I wonder how well it would work on a larger project.

Topics not covered:

As you can see these holes are all rather small.

Probably the biggest weakness of the book is the complexity of some things shown:

Seeing the repetitive and redundant mock code in the first few doctest examples I started asking what's the point?, but the book failed to provide a compelling answer (the answer provided—it's easier to locate bugs—works just as well for integration tests that focus on individual classes). And there are good answers for that question, like instant feedback from your unit test suite. Are they worth the additional development effort? Maybe that depends on the developer. I don't think they would help me, so I tend to stick with low-level integration tests I call "unit tests" (as well as system tests; it's always a mistake to keep all your tests in a single level). I'm slightly worried that this book might give the wrong impression (testing is hard) and turn away beginning Python programmers from writing tests altogether.

Overall I do not feel that I have wasted my time reading Python Testing. I look forward to reading the other reviews that showed up on Planet Python. I gathered that not all reviewers were happy with the book, but avoided reading their reviews in order not to influence my own.

posted at 21:54 | tags: , | permanent link to this entry | 2 comments

Sat, 06 Mar 2010

You've got to love profiling

Yesterday I slashed 50% of the run time from our application's functional test suite by modifying a single function. I had no idea that function was responsible for 50% of the run time until I started profiling.

Profiling a Python program is getting easier and easier:

$ python -m cProfile -o prof.data bin/test -f

runs our test runner (which is a Python script) under the profiler and stores the results in prof.data.

$ runsnake prof.data

launches the RunSnakeRun profile viewer, which displays the results visually:

The square map display of RunSnakeRun, with the 'render_restructured_text' function highlighted.
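
If a graphical viewer isn't at hand, the standard library's pstats module can read the same kind of data file. A minimal sketch: the file name prof.data matches the command above, while slow_render and run_workload are made-up stand-ins for whatever is actually being profiled.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_render(text):
    """Hypothetical stand-in for an expensive function."""
    return "".join(reversed(sorted(text * 200)))


def run_workload():
    # Call the expensive function repeatedly, as a test suite might.
    for _ in range(50):
        slow_render("some text to render")


def top_functions(stats_path, limit=5):
    """Return pstats' report of the `limit` most expensive functions."""
    out = io.StringIO()
    stats = pstats.Stats(stats_path, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


# Profile the workload and save the raw data, like `-o prof.data` does.
stats_path = os.path.join(tempfile.mkdtemp(), "prof.data")
profiler = cProfile.Profile()
profiler.enable()
run_workload()
profiler.disable()
profiler.dump_stats(stats_path)

report = top_functions(stats_path)
print(report)
```

Sorting by cumulative time is a reasonable first choice when hunting for a single function that dominates the run.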

Who knew that ReStructuredText rendering could be such a time waster? A short caching decorator and the test suite is twice as fast. The whole exercise took me less than an hour. I should've done it sooner.
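
The decorator itself isn't shown in this post. A minimal sketch of the idea, assuming the function's result depends only on its (hashable) positional arguments; the name cached and the body of render_restructured_text below are illustrative, not the actual application code:

```python
import functools


def cached(func):
    """Remember results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed so tests can inspect or clear it
    return wrapper


call_count = 0


@cached
def render_restructured_text(source):
    """Hypothetical stand-in for an expensive rendering function."""
    global call_count
    call_count += 1
    return "<p>%s</p>" % source


first = render_restructured_text("hello")
second = render_restructured_text("hello")  # served from the cache
```

A cache like this never expires, which is fine for a test run but may not be for a long-lived process. On modern Python, functools.lru_cache covers the same ground.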

Other neat tools:

posted at 20:49 | tags: | permanent link to this entry | 6 comments

Fri, 05 Mar 2010

Bye, bye, free time!

Things I've taken up to do in the nearest future:

I really ought to read Getting Things Done. Reading it has been on my todo-list for years.

posted at 23:02 | tags: | permanent link to this entry | 4 comments

Wed, 03 Mar 2010

Weekly Zope developer IRC meetings

On Tuesday we started what will hopefully become a tradition: weekly IRC meetings for Zope developers. Topics covered include buildbot organization and maintenance, open issues with the ZTK development process, and the fate of Zope 3.5 (= BlueBream 1.0).

There are IRC logs of the meeting, and Christian Theune posted a summary to the mailing list.

My take on this can be summed up as: Zope ain't dead yet! The project has fragmented a bit (Zope 2, Zope Toolkit, Grok, BlueBream, Repoze), but we all share a set of core packages and we want to keep them healthy.

The next meeting is also happening on a Tuesday, at 15:00 UTC in #zope on FreeNode.

posted at 13:09 | tags: , | permanent link to this entry | 0 comments

Thu, 07 Jan 2010

Latin-1 or Windows-1252?

Michael Foord wrote about some Latin-1 control character fun in a blog that's hard to read (the RSS feed syndicated on Planet Python is truncated, grr!) and hard to reply to (I thought there were no comments on the blog, but it was my Chromium's AdBlock+ hiding the comment link so I couldn't find it), but never mind that.

Unfortunately the data from the customers included some \x85 characters, which were breaking the CSV parsing.

0x85 is a control character (NEXT LINE or NEL) in Latin-1, but it's a printable character (HORIZONTAL ELLIPSIS) in Microsoft's code page 1252, which is often mistaken for Latin-1. I would venture a suggestion that the encoding of the customer data was not latin-1 but rather cp1252.

>>> '\x85'.decode('cp1252')
u'\u2026'
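
The session above is Python 2. A sketch of the same check in Python 3, where the input has to be a bytes object, also showing what the Latin-1 interpretation of the same byte gives:

```python
import unicodedata

raw = b"\x85"

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

# cp1252 maps 0x85 to a printable character.
cp1252_name = unicodedata.name(as_cp1252)

# Latin-1 maps it to U+0085, a control character (category "Cc").
# Control characters have no Unicode name, hence the default value.
latin1_name = unicodedata.name(as_latin1, "<control>")
latin1_category = unicodedata.category(as_latin1)

print(repr(as_cp1252), cp1252_name)
print(repr(as_latin1), latin1_name, latin1_category)
```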
posted at 23:29 | tags: | permanent link to this entry | 3 comments

Fri, 18 Dec 2009

GTimeLog: not dead yet!

Back in 2004 I wrote a small Gtk+ app to help me keep track of my time, and called it GTimeLog. I shared it with my coworkers, put it on the web (on the general "release early, release often" principles), and it got sort-of popular before I found the time to polish it into a state where I wouldn't be ashamed to show it to other people.

Fast-forward to 2008: there are actual users out there (much to my surprise), I still haven't added the originally-envisioned spit and polish, haven't done anything to foster a development community, am wracked by guilt of not doing my maintainerly duties properly, which leads to depression and burnout. So I do the only thing I can think of: run away from the project and basically ignore its existence for a year. Unreviewed patches accumulate in my inbox.

It seems that the sabbatical helped: yesterday, triggered by a new Debian bug report, I sat down, fixed the bug, implemented a feature, applied a couple of patches languishing in the bug tracker, and released version 0.3 (which was totally broken thanks to setuptools magic that suddenly stopped working; so released 0.3.1 just now). Then went through my old unread email, created bugs in Launchpad and sent replies to everyone. Except Pierre-Luc Beaudoin, since his @collabora.co.uk email address bounced. If anyone knows how to contact him, I'd appreciate a note.

version is now shown in the about dialog

There are also some older changes that I made before I emerged out of the funk and so hadn't widely announced:

posted at 01:22 | tags: , | permanent link to this entry | 5 comments

Wed, 09 Dec 2009

Unix is an IDE, or my Vim plugins

Unix is an IDE. I do my development (Python web apps mostly) with Vim with a bunch of custom plugins, shell (in GNOME Terminal: tabs rule!), GNU make, ctags, find + grep, svn/bzr/hg/git.

The current working directory is my project configuration/state. I run tests here (bin/test), I search for code here (vim -t TagName, find + grep), I run applications here (make run or bin/appname). I can multitask freely: for example, if I'm in the middle of typing an SVN commit message, I can hit Ctrl+Shift+T, get a new terminal tab in the same working directory, and look something up. No aliases/environment variables/symlinks/scripts making changes to config files. I can work on multiple projects at the same time. I can work remotely (over ssh).

Gary Bernhardt's screencasts on Vimeo show how productive you can get if you learn Vim and tailor it to your needs. I have Vim scripts that let me

Some of these come from www.vim.org, some I've written myself, some I've taken and modified a little bit to avoid an irritating quirk or add a missing feature. Some things I don't have (and envy Emacs or IDE users for having -- like an integrated debugger for Python apps, and, generally, integration with other tools, running in the background).

It's been my plan for a long time to polish my plugins, release them somewhere (github? bitbucket? launchpad?) and upload to vim.org, but as it doesn't seem to be happening, I thought I'd at least put an svn export of my ~/.vim on the web.

posted at 01:23 | tags: , , | permanent link to this entry | 5 comments

Tue, 01 Dec 2009

Displaying multiline text in Zope 3

zope.schema has Text and TextLine. The former is for multiline text, the latter is for a single line, as the name suggests. Zope 3 forms will use a text area for Text fields and an input box for TextLine fields. Display widgets, however, apply no special formatting (other than HTML-quoting of characters like <, > and &), and since newlines are treated the same way as spaces in HTML, your multiline text gets collapsed into a single paragraph.

Here's a pattern I've been using in Zope 3 to display multiline user-entered text as several paragraphs:

import cgi

from zope.component import adapts
from zope.publisher.browser import BrowserView
from zope.publisher.interfaces import IRequest


class SplitToParagraphsView(BrowserView):
    """Splits a string into paragraphs via newlines."""

    adapts(None, IRequest)

    def paragraphs(self):
        if self.context is None:
            return []
        return filter(None, [s.strip() for s in self.context.splitlines()])

    def __call__(self):
        return "".join('<p>%s</p>\n' % cgi.escape(p)
                        for p in self.paragraphs())

View registration

<configure
    xmlns="http://namespaces.zope.org/zope">

  <view
      for="*"
      name="paragraphs"
      type="zope.publisher.interfaces.browser.IBrowserRequest"
      factory=".views.SplitToParagraphsView"
      permission="zope.Public"
      />

</configure>

and usage

<p tal:replace="structure object/attribute/@@paragraphs" />

Update: The view really ought to be registered twice: once for basestring and once for NoneType. I was too lazy to figure out the dotted names for those (or check if zope.interface has external interface declarations for them), so I registered it for "*". You should know that this makes the view available for arbitrary objects (but won't work for most of them, since they don't have a splitlines method), and that it is, sadly, accessible to users who may try to hack your system by typing things like @@paragraphs in the browser's address bar. Ignas Mikalajūnas offers an alternative solution using TALES path adapters.
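
To see what the view produces without setting up Zope, here is a framework-free sketch of the same splitting logic. It assumes Python 3, where html.escape takes the place of cgi.escape; the function names are made up for illustration.

```python
import html


def split_paragraphs(text):
    """Return the non-empty, stripped lines of text (or [] for None)."""
    if text is None:
        return []
    return [line.strip() for line in text.splitlines() if line.strip()]


def render_paragraphs(text):
    """Wrap each non-empty line in <p>...</p>, HTML-quoting its content."""
    return "".join(
        "<p>%s</p>\n" % html.escape(p, quote=False)
        for p in split_paragraphs(text)
    )


rendered = render_paragraphs("First line\n\n  Second <b>line</b> & more  \n")
print(rendered)
```

Blank lines are dropped and each remaining line becomes its own paragraph, so user-entered markup ends up quoted rather than interpreted.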

posted at 20:52 | tags: , | permanent link to this entry | 1 comments

Mon, 21 Sep 2009

Pylons and SQL schema migration

I'm at the point in my hobby project where I'd like to be able to change my models without losing all my test data. And I'm too lazy to do manual dumps and edit the SQL in place before reimporting it.

I want a system

I've been glancing at SQLAlchemy-Migrate, since I've been brought up to believe NIHing is Bad. But Migrate is scary. I have to admit that the longer I stare at its documentation, the less I can describe why I think so. All those shell commands—but there's an API for invoking them from Python, so maybe I can achieve my goals. I'll have to try and see.

posted at 19:44 | tags: , , | permanent link to this entry | 9 comments

Tue, 15 Sep 2009

Pylons with zc.buildout, continued

Last time I mentioned that running bin/buildout with the -N flag makes it run faster (since it skips looking for newer versions to upgrade). You can tell buildout to do this by default by putting 'newest = false' into the [buildout] section of buildout.cfg. We'll be running bin/buildout a lot now, since we'll be making changes to the project environment, so this will save wear and tear on the '-', 'N' and Shift keys. (And, by the way, I'm not trying to soak up Google juice by repeating the word 'buildout' a lot, honest!)

I will omit bzr commits from this narrative as it's getting long; you can assume that every self-contained change was committed separately.

tests

First, I want a bin/test script to run the test suite. Pylons uses nose, so we need to tell buildout to install the nosetests script (under a different name, since I'm used to typing bin/test no matter what test runner a project happens to use):

$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-15 19:49:11 +0000
+++ buildout.cfg	2009-09-15 19:49:18 +0000
@@ -8,5 +8,8 @@
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       nose
        asharing
 interpreter = python
+scripts = paster
+          nosetests=test

$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/paster'.
Generated script '/tmp/AlliterationSharing/bin/test'.
...
$ bin/test

----------------------------------------------------------------------
Ran 0 tests in 0.276s

OK

ctags

Documentation is good, but sometimes you want to look at the source code of the framework. There's a tool called ctags that builds a database of identifiers. The popular text editors Vim and Emacs can then use the tags database to jump to a definition of any name with a single keystroke (Ctrl-] in vim, M-. in emacs).

Building the tags database is complicated by each Python package being installed into a separate directory. There's a buildout recipe called z3c.recipe.tag that finds those directories and lets you build a unified tags file. We'll also ask buildout to make sure it unzips any packages distributed as .egg files, since ctags doesn't process those:

$ bzr diff
@@ -1,8 +1,9 @@
 [buildout]
 develop = .
-parts = pylons
+parts = pylons ctags
 
 newest = false
+unzip = true
 
 [pylons]
 recipe = zc.recipe.egg
@@ -13,3 +14,7 @@
 interpreter = python
 scripts = paster
           nosetests=test
+
+[ctags]
+recipe = z3c.recipe.tag:tags
+eggs = ${pylons:eggs}

$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/ctags'.
...
$ bin/ctags

omelette

ctags lets you find classes and functions by name; it doesn't let you find packages or modules. There's another recipe, collective.recipe.omelette, that creates a tree of symlinks mirroring the Python package structure (here 'unzip = true' also comes in handy):

$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-15 20:04:42 +0000
+++ buildout.cfg	2009-09-15 20:05:30 +0000
@@ -1,6 +1,6 @@
 [buildout]
 develop = .
-parts = pylons ctags
+parts = pylons ctags omelette
 
 newest = false
 unzip = true
@@ -18,3 +18,7 @@
 [ctags]
 recipe = z3c.recipe.tag:tags
 eggs = ${pylons:eggs}
+
+[omelette]
+recipe = collective.recipe.omelette
+eggs = ${pylons:eggs}

$ bin/buildout 
...
$ ls -l parts/omelette
...

The symlink tree is created under parts/omelette/. For example, if you want to see what webhelpers tags are available, you can open parts/omelette/webhelpers/html/builder.py in your editor and see.

Makefile

This is getting long (and not everyone may be interested [1]), but one long post is easier to skip than five medium ones in a row, so I'll continue.

[1] Sorry, Planet Maemo! There's an RSS feed of posts tagged 'maemo', if you can figure out the URL, which is very well hidden by PyBlosxom, *sigh*.

Wouldn't it be nice if new developers could check out your project and start it up with just a couple of commands? Make is a time-tested tool that works well for this:

$ cat Makefile
# Just remember that you need to use real tabs, not spaces, in a Makefile

PYTHON = python

.PHONY: all
all: bin/paster

.PHONY: run
run: bin/paster
        bin/paster serve development.ini --reload

.PHONY: test check
test check: bin/test
        bin/test

.PHONY: tags
tags: bin/ctags
        bin/ctags

bin/paster bin/test bin/python bin/ctags: bin/buildout
        bin/buildout

bin/buildout: bootstrap.py
        $(PYTHON) bootstrap.py

Now all you need to do after checking out is run 'make' to set up a working development environment. 'make run' or 'make test' will also do that, if necessary, so this one-liner is sufficient to get a working Hello World application on port 5000:

$ bzr branch lp:~mgedmin/+junk/AlliterationSharing && cd AlliterationSharing && make run

Try it! You'll get a Bazaar branch with all the history of this little blog project.

posted at 23:31 | tags: , , | permanent link to this entry | 7 comments

Sun, 13 Sep 2009

Starting a Pylons project with zc.buildout

For software development I prefer buildout to virtualenv. This is because buildout has a text file describing the state of your working environment, which can be versioned and used later to recreate it, as well as during development to modify the environment slightly.

To start a new Pylons project, first create an empty directory. Let's call our new project AlliterationSharing [1], because everybody is sick of 'foo' and 'bar'.

[1] Generated by randomly picking two words from /usr/share/dict/words, then chosen from among 120 other variants that weren't as good.

$ mkdir -p ~/src/AlliterationSharing
$ cd ~/src/AlliterationSharing

Now create a file called buildout.cfg with the following content:

$ cat buildout.cfg
[buildout]
parts = pylons

[pylons]
recipe = zc.recipe.egg
eggs = Pylons
       PasteScript
interpreter = python

Download bootstrap.py into it and run it to get bin/buildout. Note: you can choose which Python version you want to use by running bootstrap.py with it. All other scripts under bin/ will be generated by buildout and will use the same Python interpreter.

$ wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py
$ python bootstrap.py
Creating directory '.../AlliterationSharing/bin'.
Creating directory '.../AlliterationSharing/parts'.
Creating directory '.../AlliterationSharing/eggs'.
Creating directory '.../AlliterationSharing/develop-eggs'.
Generated script '.../AlliterationSharing/bin/buildout'.

Run bin/buildout to install Pylons into your sandbox.

$ bin/buildout
Installing pylons.
Generated script '.../AlliterationSharing/bin/paster'.
Generated interpreter '.../AlliterationSharing/bin/python'.

Aside: buildout has this very nice feature where it can share Python packages between projects. This will save you enormous amounts of time that would otherwise be spent downloading and unpacking eggs. To make use of this facility, create a file ~/.buildout/default.cfg with

$ cat ~/.buildout/default.cfg 
[buildout]
eggs-directory = /home/mg/tmp/buildout-eggs
# XXX replace /home/mg with the full path of *your* home directory
# it would be much nicer if buildout let me use ~ or $HOME
# see https://bugs.launchpad.net/zc.buildout/+bug/190260

Another useful trick is to pass the -N flag to bin/buildout, which will tell it not to bother looking for newer versions of packages on the Internet when there's already an existing version installed in your eggs directory.

Back to business: now you've got two new scripts: bin/python and bin/paster. You can use the first one to play with the interactive Python console where you can now import pylons and all the dependencies; it has no other value.

Now is a good point to add the files you've created into a version control system. I'll arbitrarily use Bazaar.

$ bzr init .
$ bzr add bootstrap.py buildout.cfg
$ bzr ignore bin parts eggs develop-eggs .installed.cfg
$ bzr commit -m "Create AlliterationSharing project"

Run bin/paster create -t pylons to create a skeleton project.

$ bin/paster create -t pylons asharing
$ bzr ignore *.egg-info
$ bzr add asharing
$ bzr commit -m "Generated project files with paster create"

Now paster creates a directory structure that I don't like:

AlliterationSharing/
  buildout.cfg
  bin/
  asharing/
    setup.py
    README.txt
    MANIFEST.in
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/

I'd like the README and setup.py to be in the top level, and I dislike repeating 'asharing' twice in directory names. I'll move some files around

$ cd asharing/
$ bzr mv development.ini docs MANIFEST.in README.txt setup.* test.ini ../
$ bzr rm ez_setup.*
$ cd ..
$ bzr mv asharing src
$ bzr ci -m "Moved some files around"

Now the tree looks like this:

AlliterationSharing/
  buildout.cfg
  setup.py
  README.txt
  MANIFEST.in
  bin/
  src/
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/

We have to tell setup.py where to find the source tree

$ bzr diff
=== modified file 'MANIFEST.in'
--- MANIFEST.in	2009-09-13 13:04:00 +0000
+++ MANIFEST.in	2009-09-13 13:05:59 +0000
@@ -1,3 +1,3 @@
-include asharing/config/deployment.ini_tmpl
-recursive-include asharing/public *
-recursive-include asharing/templates *
+include src/asharing/config/deployment.ini_tmpl
+recursive-include src/asharing/public *
+recursive-include src/asharing/templates *

=== modified file 'setup.py'
--- setup.py	2009-09-13 13:04:00 +0000
+++ setup.py	2009-09-13 13:04:40 +0000
@@ -17,7 +17,8 @@
         "SQLAlchemy>=0.5",
     ],
     setup_requires=["PasteScript>=1.6.3"],
-    packages=find_packages(exclude=['ez_setup']),
+    packages=find_packages('src', exclude=['ez_setup']),
+    package_dir={'': 'src'},
     include_package_data=True,
     test_suite='nose.collector',
     package_data={'asharing': ['i18n/*/LC_MESSAGES/*.mo']},

(I'm not sure if you also need to change package_data and/or setup.cfg; it's possible that I left i18n in a broken state. Can somebody comment on this?)

And we have to tell buildout that we've got a new Python package to enable in the project environment

$ bzr diff buildout.cfg 
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-13 12:57:21 +0000
+++ buildout.cfg	2009-09-13 13:08:05 +0000
@@ -1,8 +1,10 @@
 [buildout]
+develop = .
 parts = pylons
 
 [pylons]
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       asharing
 interpreter = python

Now you can re-run bin/buildout and start your hello-world project

$ bzr commit -m "Include the new package in the build"
$ bin/buildout -N
$ bin/paster serve --reload development.ini

Happy hacking!

To be continued: telling buildout to create bin/test; using ctags and omelette.

posted at 16:13 | tags: , , | permanent link to this entry | 17 comments

Mon, 03 Aug 2009

Local changes to buildout.cfg

Most Python packages in the Zope world use Buildout:

svn co svn+ssh://svn.zope.org/repos/main/plone.z3cform/trunk plone.z3cform
cd plone.z3cform
python2.4 bootstrap.py
bin/buildout
bin/test -pvc

Now suppose you want to change the buildout environment somehow, e.g. use the current development version of zope.testing instead of whatever is specified in buildout.cfg. Don't edit the existing buildout.cfg (you might accidentally commit your local debug changes), instead create a new cfg file, e.g. test.cfg:

[buildout]
extends = buildout.cfg
develop += ../zope.testing

[versions]
# override any existing version pins
zope.testing =

Now re-run buildout

bin/buildout -c test.cfg
bin/test -pvc

And the tests should be run with the newest zope.testing code.

Only this does not work with plone.z3cform, and I have no clue why. It generally works with other packages (at least those that use the zc.recipe.testrunner rather than collective.recipe.z2testrunner). Buildout is like that sometimes :(

posted at 20:26 | tags: | permanent link to this entry | 0 comments

Sat, 25 Jul 2009

Python-related updates for the last couple of months

Went to EuroPython, met new people, had a great time.

Updated gtkeggdeps, the interactive Python package dependency browser. Collaborated with Thomas Lotze, who maintains the engine (tl.eggdeps) that gtkeggdeps wraps, to resolve API mismatches. Moved the sources to launchpad.net, added a test suite, made it use zc.buildout for convenient development.

Moved the source repository of gtimelog, the simple desktop time tracker, to launchpad.net. Failed to do anything else with it. :-(

Tried to work on xdot, wrestled with git-svn merges, failed abysmally. Asked upstream to upload xdot to PyPI.

Released ZODB Browser, but this deserves a separate post.

Sent a bunch of pyflakes patches from my old branch upstream, created trac tickets for the rest. Wrestled with bzr-svn merges, failed abysmally.

posted at 02:14 | tags: | permanent link to this entry | 0 comments

Thu, 21 May 2009

Surprising old-style class behaviour

Christian Wyglendowski (I first took him for an anonymous Planet Python poster, since I couldn't find the author's name on the blog) asks about a surprising difference between old-style and new-style classes. Since the comments on their blog are closed (which you find out only after pressing Submit), I'll answer here.

The question, slightly paraphrased: given a class

class LameContainerOld:
    def __init__(self):
        self._items = {'bar':'test'}
 
    def __getitem__(self, name):
        return self._items[name]
 
    def __getattr__(self, attr):
        return getattr(self._items, attr)

why does the 'in' operator work

>>> container = LameContainerOld()
>>> 'foo' in container
False
>>> 'bar' in container
True

when the equivalent new-style class raises a KeyError: 0 exception? Also, why does __getattr__ appear to be called to get the bound __getitem__ method of the dict?

>>> container.__getitem__
<bound method LameContainerOld.__getitem__ of {'bar': 'test'}>

What actually happens here is that LameContainerOld.__getattr__ gets called for special methods such as __contains__ and __repr__. This is why (1) the 'in' check works, and (2) it appears, at first glance, that you get the wrong __getitem__ bound method. If you pay close attention to the output, you'll see that it's the __getitem__ of LameContainerOld; it's just that repr(LameContainerOld()) gets proxied through to dict.__repr__ when you don't expect it:

>>> container
{'bar': 'test'}

Special methods never go through __getattr__ for new-style classes, therefore neither __contains__ nor __repr__ is proxied if you make the container inherit from object. If there's no __contains__ method, Python falls back to the sequence protocol and starts calling __getitem__ for numbers 0 through infinity, or until it gets an IndexError exception. Here the very first call, __getitem__(0), raises KeyError instead, and that exception propagates to the caller.
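
The new-style behaviour can be reproduced in Python 3, where every class is new-style. A sketch, reusing the class from the question under the name LameContainerNew and adding a hypothetical FixedContainer subclass that defines __contains__ explicitly:

```python
class LameContainerNew(object):
    def __init__(self):
        self._items = {'bar': 'test'}

    def __getitem__(self, name):
        return self._items[name]

    def __getattr__(self, attr):
        return getattr(self._items, attr)


class FixedContainer(LameContainerNew):
    # Defining the special method on the class makes 'in' work again.
    def __contains__(self, name):
        return name in self._items


container = LameContainerNew()
try:
    'bar' in container
except KeyError as e:
    # No __contains__ on the class, so Python tried container[0].
    error = e
else:
    error = None

# Ordinary (non-special) attributes are still proxied to the dict.
keys = list(container.keys())

fixed = FixedContainer()
```

With FixedContainer, 'bar' in fixed is answered by the dict directly and the sequence-protocol fallback never runs.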

posted at 22:12 | tags: | permanent link to this entry | 2 comments

Fri, 15 May 2009

Buildbot issues on Ubuntu Hardy

Update: The story continues, but a solution is not in sight yet.

I upgraded a buildbot slave to Ubuntu 8.04 (Hardy) recently and now I'm getting a strange intermittent failure: sometimes cp -r /local/dir /nfs/mounted/dir fails ("process killed by signal 1", i.e. SIGHUP).

I wonder if NFS is relevant or incidental to the issue?

Google finds an old thread from 2005, with a workaround (usepty=False), but I'd like to understand the problem before applying random fixes.

So far three different build steps doing cp -r have failed over 10 days. I've now changed them all to cp -rv, so that if one fails again I can at least see whether the failure is in the middle of the copy or at the end.

Update: so far 4 build steps have failed on 6 separate occasions:

May  5 02:31: cp -r local-dir1 nfs-mounted-dir1  
May  6 02:31: cp -r local-dir1 nfs-mounted-dir1  
May  6 04:33: cp -r local-dir2 nfs-mounted-dir2  
May 15 02:00: cp -r local-dir3 nfs-mounted-dir3  
May 17 04:32: rm -rf nfs-mounted-dir4            
May 20 04:31: rm -rf nfs-mounted-dir4            

I see no particular correlation between step duration and results, e.g. the rm -rf step usually takes between 2.2 and 4.6 seconds. The two SIGHUPs happened after 2.4 seconds.

They all make no output. When I changed the cp steps and added a -v, they stopped failing, but that could be just a coincidence.

We're having an email conversation with Jean-Paul Calderone ("exarkun") about the possibility of this being PTY-related, with no clear resolution so far.

And, hey, now this blog supports comments ;)

posted at 15:33 | tags: , , | permanent link to this entry | 1 comments

Fri, 08 May 2009

Expert Python Programming

It's been a while since the last Expert Python Programming review on Planet Python. Y'all might've forgotten about this book by now. Time for a reminder? (Actually, I'm just lazy, er, busy, and this is why this review hasn't appeared sooner.)

I received a free PDF copy of this book from Packt Publishing, with the understanding that I'll post a review on my blog. This is it. Short summary: it is a good book marred by a lot of mostly inconsequential little mistakes. I'd give it four stars out of five.

Aside: the PDF that I could download was personalized and had my name and address in the footer of every page. A very nice form of DRM that did not restrict my software choices for reading the book (Evince and also PDF Reader on Nokia Internet Tablets).

I bring it up here because it seems that Packt could've also applied fixes for the known errata to the personalized version, yet missed that opportunity. Perhaps it's technically more difficult than slapping a footer on every page. Or maybe it's better if everyone buying the book, whether in paper or in PDF, gets to see the same text.

The author (Tarek Ziadé) covers a wide range of topics in the book, ranging from syntax (probably useful for those who've been programming in Python for quite a few years, and didn't have the time to keep up with the language changes before picking up this book) to style, source code organization, project infrastructure, software life cycle, documentation, testing and optimization, and finally ending with a review of some of the popular design patterns. The middle parts were the most interesting for me personally. I learned a thing or two, disagreed with the author on a few minor points (which are mostly a matter of preference), and managed to finish the book despite the constant irritating little pricks I feel when I notice an error (I confess I'm a pedant. A missing space after a colon drives me up the wall).

As an example of the disagreement: I have an aversion to code-generating tools where you have to edit the generated code by hand. I could say more, but this is a topic for another time. Next, I strongly dislike sudo easy_install since it scribbles onto the part of the filesystem exclusively reserved for your OS's package management tools. And I don't think porting the original 23 design patterns to other programming languages is a good way to describe what those languages are about. (Also, set tabstop=4 in your .vimrc? Heresy! The Right Thing To Do is set softtabstop=4, as all right-thinking Vim users will doubtlessly agree. All hail the one true text editor! Oh dear, now I'm glad I don't have comments on this blog...)

The goodies: Chapter 1 (the bits about PYTHONSTARTUP on page 19) gave me persistent history for my interactive Python prompt, nicely complementing the coloured prompt and tab-completion I already had snarfed from somewhere else on the net (probably Peter Norvig's Python IAQ). Chapter 12 provided good examples of how to do profiling for time (page 281) and memory (page 291). I like Tarek's @profile decorator (measure time, pystones and memory at the same time). My profilehooks module was not mentioned, *sniff* ;-). Chapter 13 told me about Queue.join and task_done that snuck into the stdlib with Python 2.5 without me noticing.
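For the curious, here is a minimal sketch of what Queue.join and task_done are good for. It is my own illustration, not an example from the book: the process_all helper and the "double every item" work are made up, and it is written for Python 3, where the module is spelled queue instead of Queue.

```python
import queue
import threading


def process_all(items, worker_count=3):
    """Double every item using worker threads; return the sorted results.

    process_all is a hypothetical helper, used only to show the
    put / task_done / join handshake.
    """
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            try:
                with results_lock:
                    results.append(item * 2)
            finally:
                # Tell the queue that this item has been fully processed.
                tasks.task_done()

    # Daemon threads do not keep the interpreter alive once we are done.
    for _ in range(worker_count):
        threading.Thread(target=worker, daemon=True).start()

    for item in items:
        tasks.put(item)

    # Blocks until task_done() has been called once for every put().
    tasks.join()
    return sorted(results)


if __name__ == "__main__":
    print(process_all([1, 2, 3, 4]))
```

The point of the pair is that join waits for the work to be finished, not merely for the queue to be empty: a worker may still be busy with the last item after it has taken it off the queue.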

I haven't mentioned topics covered in the book that I was already familiar with, such as setuptools, virtualenv, zc.buildout, Sphinx, Nose, Buildbot, or Mercurial. Yet, in my opinion, those are the most useful parts of the book. The breadth of the topics is amazing: I could hardly think of something that every serious Python programmer should know that wasn't mentioned. I believe the depth was exactly right: mention solutions that are available, show how they feel when used and what they can do, point to the relevant web page and then stop. And it's not only tools: the descriptions of workflows (how to organize your source trees, how to develop software consisting of multiple packages, how to make releases), while hardly universal, are invaluable.

One thing prevents this from being a perfect book: errata. At around page 95, according to my notes, I invented a new metric of book quality: WTFs per page. It's closely related to WTFs per minute, but independent of your reading speed. At around page 165 I got tired of making a note of every little thing that I noticed and started just reading. This was considerably more enjoyable. I hope there's a second edition with all the bugs shaken out. To that end, I should go through my notes again and submit them via the online errata form. Yay, more work...

posted at 06:09

Mon, 06 Apr 2009

Submitting patches the Launchpad way

Today I happened to read about lazr.enum on a mailing list. I went to the PyPI page and saw raw reStructuredText markup instead of a nicely formatted page. Now I know from prior experience that this happens when the package's description has an error in the markup. I thought I'd report a bug and provide a patch.

Leap of knowledge: since I know lazr.enum was created by the Launchpad.net team I could safely assume they were keeping the sources in Launchpad. Therefore I was pretty sure I could get them with

$ bzr branch lp:lazr.enum

so I ran that command and it worked.

Next I looked at setup.py to see how it produces the long_description field. It was reading the contents of a couple of text files, one of them being src/lazr/enum/README.txt. I looked at that and saw a .. toctree:: directive that does not exist in plain docutils (it's a Sphinx extension).

I added a couple of lines to setup.py to strip that out, tested it (with setup.py --long-description > test.rst; restview test.rst), committed to my local branch, and created a bug report in Launchpad. Then I was a bit lost, since I didn't know how to make my fix available. Attach a patch? Maybe, but I wanted to see if this distributed version control thing is good for anything else.
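The fix amounts to filtering the README text before handing it to setup(). What follows is not the patch I submitted, just a minimal sketch of the idea: strip_directive is a hypothetical helper, and I'm assuming the offending block is a Sphinx toctree directive followed by indented (or blank) lines.

```python
def strip_directive(text, name="toctree"):
    """Return text without the named directive and its indented body.

    Hypothetical helper for illustration; it only understands a
    directive that starts in the first column of a line.
    """
    kept = []
    skipping = False
    for line in text.splitlines():
        if line.startswith(".. %s::" % name):
            # Drop the directive line and start skipping its body.
            skipping = True
            continue
        if skipping:
            # The body consists of blank or indented lines.
            if not line.strip() or line[0] in " \t":
                continue
            skipping = False
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    readme = (
        "Title\n"
        "=====\n"
        "\n"
        ".. toctree::\n"
        "   :maxdepth: 2\n"
        "\n"
        "   NEWS\n"
        "\n"
        "Usage\n"
        "-----\n"
    )
    print(strip_directive(readme))
```

In a setup.py you would then pass the filtered text as long_description, and check the result the same way as before, by rendering the output of setup.py --long-description.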

I thought that first I'd make that branch public, and then see if there was a way to link it to the bug report. I ran

$ bzr push lp:~mgedmin/lazr.enum/pypi-fix

which took a few seconds to create a new public branch on Launchpad with my fix in it (it would be nice if I didn't have to explicitly specify my Launchpad username and the project name—both of which bzr already knows—and just specify the name of the branch). Then I went back to my bug report and saw an option to link it to a branch. There was a search field in the popup that found my "~mgedmin/lazr.enum/pypi-fix" easily enough when I pasted it into the search box.

After clicking on the branch, I saw a "propose a merge" option. I did that and Launchpad sent an email to the developers asking them to merge my fix.

I made one mistake, I think: I should've created the bug report first, and then mentioned the bug number in my commit message (with bzr commit --fixes=NNN, although here I'm suddenly not sure if the bug number should be left bare, or prefixed with something like "lp" to indicate it was a Launchpad bug number?).

Other than that it was a pretty smooth experience. When will I be able to do that for Ubuntu packages?

posted at 00:45

Thu, 12 Feb 2009

Keep the buildbot green!

This post started as a comment to Michael Rooney's question: Failing tests: When are they okay?, and then it became a bit too long for a comment.

picture of a green traffic light
Green light by morberg, cc:by-nc

For me the most important aspect of a build is to accurately represent my knowledge about the health of the product. New problems must be noticed as soon as possible. This won't happen if the developers are used to seeing (and ignoring) broken builds.

For this reason you want to distinguish known failures from unknown failures. For example, it's okay to commit a test that reproduces a bug even if you don't have a fix for that bug, but do it in a way that keeps the buildbot green. (Two common ways of doing that are marking the test in a special way so the test runner knows it's expected to fail, or disabling the test so that it doesn't even run.) The worst thing ever is fragile tests that fail only sometimes, especially if everyone grows accustomed to them. I speak from experience. I still have nightmares...
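Both ways of keeping a known failure from turning the build red can be sketched with the standard unittest module, assuming a Python new enough to have the unittest.expectedFailure and unittest.skip decorators. The parse_price function and its bugs are invented for the example.

```python
import io
import unittest


def parse_price(text):
    """Toy function with known bugs: no thousands separators, no empty input."""
    return float(text)


class PriceTests(unittest.TestCase):

    def test_plain_number(self):
        self.assertEqual(parse_price("12.50"), 12.5)

    # Known bug, reproduced by a test that still runs on every build.
    # The runner reports it as an expected failure, not as breakage.
    @unittest.expectedFailure
    def test_thousands_separator(self):
        self.assertEqual(parse_price("1,250.00"), 1250.0)

    # Known bug, with the test disabled so that it does not run at all.
    @unittest.skip("known bug: empty input is not handled yet")
    def test_empty_string(self):
        self.assertEqual(parse_price(""), 0.0)


def run_suite():
    """Run PriceTests quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceTests)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("green" if result.wasSuccessful() else "red")
```

The first form has one advantage over the second: once somebody fixes the bug, the test is reported as an unexpected success, which is a reminder to remove the marker.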

Collaboration is not reason enough to break the trunk. You can use branches or send patches via email, whichever works best. Patches are often simpler when you're taking over someone's unfinished work when that someone gets stuck and asks for help, or if you decide to switch machines when pair-programming. Sometimes I use shell one-liners like 'ssh othermachine svn diff /path/to/source/tree | patch -p42' to get the changes into my checkout. Branches are more appropriate for longer-term collaboration. It's perfectly fine to have a broken test suite on a branch -- you can always discard it; that's what you do to prototypes. Reimplementing something you've already done, in a cleaner fashion, is often a simple and rather pleasant way of merging.

If the tools you have aren't polished enough and you don't feel comfortable creating new branches even when they're necessary, invest a day every now and then improving your tools (shameless plug: eazysvn, because eazysvn switch -c newbranch does not require you to lose your train of thought remembering how to type long subversion URLs for svn cp).

That's all theory; in practice IMHO it's acceptable to take shortcuts. Small self-contained checkins are best (and this topic deserves a blog post of its own), but if you're forced to wait 20 minutes for the full test suite before every one of them, you won't use small checkins. It's fine to run just a subset of tests covering the code you've changed before every checkin, even if that means you sometimes will break the build by accident. However it's your responsibility to clean up any breakage if it occurs before you leave at the end of the day (or at least to feel guilty when you don't).

Back to the original question: I can imagine only one set of circumstances where the right thing to do is to knowingly commit a broken test to trunk. Imagine that you discovered a show-stopper bug, but the fix is elusive. By committing a failing test you force the whole team to notice it, drop everything else and work on the problem. And you also prevent somebody from accidentally releasing a broken version of the product. (Your release process includes a step ensuring that all the tests pass, right?)

posted at 00:40

Fri, 06 Feb 2009

Playing with Pylons

I've had an opportunity to get to know Pylons. Here's an unsorted list of first (and second) impressions:

These aren't directly related to Pylons:

Update: About the naive notion that using easy_install instead of zc.buildout would help me avoid headaches? Muahahahahaha. Ha. Haha. Muahhaaaaaa. Wrong.

Also, TMTOWTDI is maybe too strong a word for SQLAlchemy's plethora of choices. And you really want to be using 0.5. And Pylons is even more awesome than I first thought. Obligatory grain of salt (*thud*): I haven't finished writing my first page yet. Integrating new stuff into existing elaborate functional test suites takes time.

posted at 22:54

Tue, 27 Jan 2009

pydoc SyntaxError

I once needed to know about SyntaxError's attributes. Here's what pydoc SyntaxError from Python 2.5 says:

 |  Data descriptors defined here:
 |  
 |  filename
 |      exception filename
 |  
 |  lineno
 |      exception lineno
 |  
 |  message
 |      exception message

So far so good

 |  
 |  msg
 |      exception msg

Hmm?

 |  
 |  offset
 |      exception offset
 |  
 |  print_file_and_line
 |      exception print_file_and_line

Doh!

 |  
 |  text
 |      exception text

My, that was useful. Maybe the online documentation will be better?

exception SyntaxError

...

Instances of this class have attributes filename, lineno, offset and text for easier access to the details. str() of the exception instance returns only the message.

Um. Well, at least now I know I can ignore both 'msg' and 'message'. I think. Still, it would be nice to warn that sometimes the exception text can be multi-line.
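When the documentation is this terse, the quickest way to see what the attributes hold is to provoke a SyntaxError and look. A minimal sketch for Python 3 follows; describe_syntax_error is a made-up helper and the broken source string is just an example.

```python
def describe_syntax_error(source, filename="<example>"):
    """Compile source and return SyntaxError details, or None if it is valid.

    Hypothetical helper; the keys are simply the attribute names, plus
    "str" for what str() of the exception returns.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return {
            "filename": err.filename,
            "lineno": err.lineno,
            "offset": err.offset,
            "text": err.text,
            "msg": err.msg,
            "str": str(err),
        }
    return None


if __name__ == "__main__":
    details = describe_syntax_error("def f(:\n    pass\n")
    for name in sorted(details):
        print("%-8s %r" % (name, details[name]))
```

The exact wording of msg and the value of offset differ between Python versions, so I wouldn't rely on either in code that has to run on more than one of them.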

posted at 19:23