Random notes from mg
Sat, 13 Mar 2010
I've been testing (as well as writing) Python code for the last eight years,
so a book with the words Beginner's Guide prominently displayed on
the cover isn't something I'd've decided to buy for myself. Nevertheless
I jumped at the offer of receiving a free e-copy for reviewing it.
Short summary: it's a good book. I learned a thing or two
from it. I don't know how well it would work as an introductory text for
someone new to unit testing (or Python). Some of the bits seemed
overcomplicated and underexplained, parts of the example code/tests seemed to
contain design decisions received from mysterious sources.
Incidentally, Packt uses a simple yet
effective method for watermarking e-books: my name and street address are
displayed in the footer of every page. What's funny is that the two non-ASCII
characters in the street name are replaced with question marks. It's not a
data entry problem: the website that let me download those books shows my
address correctly, so it must be happening somewhere in the PDF production
process. I didn't expect this kind of Unicode bugginess from a publisher.
Then again there were occasional strange little typographical errors in the
text, like not leaving a space in front of an opening parenthesis in an English
sentence, or using a never-seen-before +q= operator in Python code. I
was also left wondering how the following sentence (page 225) could slip past
the editing process:
doctest ignores everything between the Traceback (most recent last call).
Thankfully those small mistakes did not detract from the overall message of
the book.
I liked the author's technique of showing subtly incorrect code, letting the
reader look at it and miss all the bugs, and then showing how unit or
integration tests catch the bugs the reader missed. I'm pretty sure there's at
least one remaining bug that the author missed in the example package (storing
a schedule doesn't erase old data), which could serve for a new chapter on
regression testing if there's a second edition.
Summary of topics covered:
- Terms: unit testing, integration testing, system testing.
- Basics of doctest and unittest, their strengths and weaknesses.
- Using mocks (with Mocker).
- Using Nose.
- Test-Driven Development with lots of example code.
- Using Twill.
- Integration testing with lots of example code.
- Using coverage
- Post-commit hooks to run tests with Bazaar, Mercurial, Git, Darcs,
Subversion.
- Continuous integration with Buildbot
I found the TDD cycle a bit larger than I generally like, but I believe it's
a matter of taste, and perhaps a shorter cycle wouldn't work as well in a
written medium.
I found it a bit jarring how the Twill chapter intrudes between the two
chapters showing unit testing and integration testing of the same sample
package. I think it would've been better to swap the order of chapters 8 and
9.
I liked the technique presented for picking subsets of the code for
integration tests, although I wonder how well it would work on a larger
project.
Topics not covered:
- Functional testing (which is very close to, but not exactly the same as,
system testing).
- Regression testing (page 46 contains advice about this without mentioning
the term regression testing).
- Continuous integration with Hudson (simpler to set up than buildbot,
easily covers 80% of cases).
As you can see these holes are all rather small.
Probably the biggest weakness of the book is the complexity of some
things shown:
- writing mocks for pure unit tests
- mocking other instances of the same class under test
- even occasionally mocking self, which needs tricks like
calling a method's im_func directly
- mocking __reduce_ex__ so you can pickle mocks in an
integration test, instead of using real classes or simple
stubs.
- testing the same code multiple times: unit tests, several sets of
integration tests that test ever-increasing subsets of classes
- Buildbot instead of Hudson
Seeing the repetitive and redundant mock code in the first few doctest
examples I started asking "what's the point?", but the book failed to
provide a compelling answer (the answer provided—it's easier to locate
bugs—works just as well for integration tests that focus on individual
classes). And there are good answers for that question, like instant feedback
from your unit test suite. Are they worth the additional development effort?
Maybe that depends on the developer. I don't think they would help me, so I
tend to stick with low-level integration tests I call "unit tests" (as well as
system tests; it's always a mistake to keep all your tests in a single level).
I'm slightly worried that this book might give the wrong impression (testing is
hard) and turn away beginning Python programmers from writing tests
altogether.
Overall I do not feel that I have wasted my time reading Python
Testing. I look forward to reading the other reviews
that showed up on Planet Python. I gathered that not all reviewers were happy
with the book, but avoided reading their reviews in order not to influence my
own.
Sat, 06 Mar 2010
Yesterday I slashed 50% of run time from our application's functional test
suite by modifying a single function. I had no idea that function was
responsible for 50% of the run time until I started profiling.
Profiling a Python program is getting easier and easier:
$ python -m cProfile -o prof.data bin/test -f
runs our test runner (which is a Python script) under the profiler and stores
the results in prof.data.
$ runsnake prof.data
launches the RunSnakeRun
profile viewer, which displays the results visually:
[Screenshot: the square map display of RunSnakeRun, with the
'render_restructured_text' function highlighted.]
Who knew that ReStructuredText rendering could be such a time waster? A
short caching decorator and the test suite is twice as fast. The whole
exercise took me less than an hour. I should've done it sooner.
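The post doesn't show the decorator itself, so here is a minimal sketch of what a "short caching decorator" might look like. The function body below is a hypothetical stand-in for the real rendering code, and the cache is keyed on positional arguments only, which assumes they are hashable. Written for Python 3.

```python
import functools


def cached(fn):
    """Remember the results of fn, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    return wrapper


calls = []  # records every real (uncached) invocation


@cached
def render_restructured_text(text):
    # Hypothetical stand-in for the expensive rendering step.
    calls.append(text)
    return "<p>%s</p>" % text


render_restructured_text("hello")
render_restructured_text("hello")
print(len(calls))  # the second call is served from the cache
```

A cache like this never expires, which is fine for a test run but may need more thought in a long-running process.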
Other neat tools:
- pstats
from the standard library lets you load and display profiler results from the
command line (try python -m pstats prof.data).
- pyprof2calltree
converts Python profiler data files to a format that the popular profiler
visualization tool kcachegrind
can understand. It's somewhat less useful now that RunSnakeRun exists.
- profilehooks by
yours truly has decorators for easily profiling individual functions instead
of entire scripts.
- keas.profile and repoze.profile hook
up the profiler as WSGI middleware for easy profiling of web apps.
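The same cProfile and pstats modules can also be driven from Python code rather than the command line. A small sketch, assuming Python 3; slow_function is a made-up workload, not anything from our test suite:

```python
import cProfile
import io
import pstats


def slow_function():
    # Made-up workload, just so the profiler has something to measure.
    return sum(i * i for i in range(100000))


profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Collect the report in a string instead of printing straight to stdout.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time is usually the quickest way to spot a function that dominates the run, like the one above did for our test suite.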
Fri, 05 Mar 2010
Things I've taken up to do in the near future:
- Read and review Python Testing: Beginner's Guide and Grok 1.0 Web
Development for Packt. (The links are trackable to my blog,
but I'm not getting anything out of it. Other than free copies of the
e-books, which I already received, in exchange for a promise to review them
on this blog.)
- Help Reportlab folks set up continuous integration (most likely Hudson,
since Buildbot, while powerful, has a steep learning curve).
- Think about becoming the buildbotmaster for Zope. Originally I intended
to volunteer to set up a few buildbots for various Zopeish projects
(ZTK, BlueBream, Grok, Zope 2) since half of the existing
ones were down or broken. Then various people fixed some of the
broken ones and other people chimed in mentioning existing buildbots that
nobody else knew about. There is a need for somebody to coordinate all
this activity: make sure we have up-to-date test results for all kinds of
projects, aggregate them in one place, chase up build slaves for exotic
OSes (i.e. Windows)... I don't think I'm well suited for this kind of
organisational activity.
- Push along the various scratch-my-itch open source projects (GTimeLog,
irclog2html, zodbbrowser).
- No idea what, but I've been wanting to do something for Maemo. Something
small, given the copious amounts of free time I have.
- Then there's the paying work. On the plus side, there are opportunities
for fun there (today I slashed functional test run time by a half, by
adding a small caching decorator in front of a single function.
RunSnakeRun and cProfile rule!)
- You know what, scratch the Zope buildbotmaster idea. Maybe I can do
something technical there, e.g. a cron script to ping the various buildbots,
scrape HTML/parse emails and aggregate build results. Maybe.
- I hope I don't get burnout again. Because that would suck. Again. Been
there, done that, didn't even get a T-shirt.
I really ought to read Getting Things Done. Reading it has been on my
todo-list for years.
Wed, 03 Mar 2010
On Tuesday we started what will hopefully become a tradition: weekly IRC
meetings for Zope developers. Topics covered include buildbot organization and
maintenance, open issues with the ZTK development process, and the fate of Zope
3.5 (= BlueBream 1.0).
There are IRC logs of the meeting, and Christian Theune posted a summary
to the mailing list.
My take on this can be summed up as: Zope ain't dead yet! The project has
fragmented a bit (Zope 2, Zope Toolkit, Grok, BlueBream, Repoze), but we all
share a set of core packages and we want to keep them healthy.
Next meeting is also happening on a Tuesday, at 15:00 UTC on #zope in
FreeNode.
Thu, 07 Jan 2010
Michael Foord wrote about some
Latin-1 control character fun in a blog that's hard to read (the RSS feed
syndicated on Planet Python is truncated, grr!) and hard to reply to (my
Chromium's AdBlock+ hid the comment link, so at first I thought there were no
comments on the blog), but never mind that.
Unfortunately the data from the customers included some \x85 characters,
which were breaking the CSV parsing.
0x85 is a control character (NEXT LINE or NEL) in Latin-1, but it's a
printable character (HORIZONTAL ELLIPSIS) in Microsoft's code page 1252, which
is often mistaken for Latin-1. I would venture a suggestion that the encoding
of the customer data was not latin-1 but rather cp1252.
>>> '\x85'.decode('cp1252')
u'\u2026'
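The session above is Python 2. For comparison, here is a Python 3 sketch that decodes the same byte both ways and asks unicodedata what each result is:

```python
import unicodedata

raw = b"\x85"

as_latin1 = raw.decode("latin-1")  # stays U+0085, a C1 control character
as_cp1252 = raw.decode("cp1252")   # becomes U+2026

# U+0085 has no formal name in the Unicode database, so show its category.
print(repr(as_latin1), unicodedata.category(as_latin1))
print(repr(as_cp1252), unicodedata.name(as_cp1252))
```

Category Cc means "control", which is why a parser that treats NEL as a line break can choke on data that was really cp1252 text containing an ellipsis.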
Fri, 18 Dec 2009
Back in 2004 I wrote a small Gtk+ app to help me keep track of my time, and
called it GTimeLog. I shared it with
my coworkers, put it on the web (on the general "release early, release often"
principles), and it got sort-of popular before I found the time to polish it
into a state where I wouldn't be ashamed to show it to other people.
Fast-forward to 2008: there are actual users out there (much to my
surprise), I still haven't added the originally-envisioned spit and polish,
haven't done anything to foster a development community, am wracked by guilt of
not doing my maintainerly duties properly, which leads to depression and
burnout. So I do the only thing I can think of: run away from the project and
basically ignore its existence for a year. Unreviewed patches accumulate in my
inbox.
It seems that the sabbatical helped: yesterday, triggered by a new Debian bug
report, I sat down, fixed the bug, implemented a feature, applied a couple of
patches languishing in the bug tracker, and released version 0.3 (which
was totally broken thanks to setuptools magic that suddenly stopped
working; so released 0.3.1 just now). Then went through my old unread email,
created bugs in Launchpad and sent replies to everyone. Except Pierre-Luc
Beaudoin, since his @collabora.co.uk email address bounced. If anyone
knows how to contact him, I'd appreciate a note.

There are also some older changes that I made before I emerged out of the
funk and so hadn't widely announced:
- There's a mailing list for user and developer discussions (if there still
are any ;).
- GTimeLog's source code now lives on Launchpad (actually, I mentioned this
on my blog once).
Wed, 09 Dec 2009
Unix is an IDE. I do my
development (Python web apps mostly) with Vim
with a bunch of custom plugins, shell
(in GNOME Terminal: tabs rule!), GNU make, ctags, find + grep,
svn/bzr/hg/git.
The current working directory is my project configuration/state. I run
tests here (bin/test), I search for code here (vim -t TagName, find + grep), I
run applications here (make run or bin/appname). I can multitask
freely, for example, if I'm in the middle of typing an SVN commit message, I
can hit Ctrl+Shift+T, get a new terminal tab in the same working directory, and
look something up. No aliases/environment variables/symlinks/scripts
making changes to config files. I can work on multiple projects at the
same time. I can work remotely (over ssh).
Gary Bernhardt's screencasts on
Vimeo show how productive you can get if you learn Vim and tailor it
to your needs. I have Vim scripts that let me
- See the name of the class and function that I'm editing in the statusbar,
even if the class/function definition is offscreen:
pythonhelper.vim.
- See all pyflakes warnings and errors in a list as soon as I press F2 to
save the file: python_check_syntax.vim.
- Add a "from foo.bar import Something" line at the top of the file if I
press F5 when my cursor is on Something, looking up the package and module
from ctags: python-imports.vim.
- Switch between production code and unit tests with a single key if the
project uses one of several conventions for tests (e.g. ./foo.py
<-> ./tests/test_foo.py):
py-test-switcher.vim.
- Generate a command line for running one particular unit test (the one
my cursor is inside) and copy it into the system clipboard, so I can
run that test by Alt-Tabbing into my terminal window and pasting:
py-test-runner.vim.
- Open the right file and move the cursor to the right line if I
triple-click a line of traceback in a shell (or an email) then press F7 in
my gvim window:
py-test-locator.vim.
- Compare my version of the code with the pristine version in source control
in an interactive side-by-side diff that lets me revert bits I no longer
want:
vcscommand.vim.
- Highlight which lines of the source are covered by my tests, if I have
coverage information in trace.py format:
py-coverage-highlight.vim.
- Show the signature of a function/class's __init__ when I type the name
of that class/function and an open parenthesis (looked up from tags):
py-function-signature.vim.
- Fold code into an outline so I only see names of methods or classes
instead of their full bodies:
vimrc, function PythonFoldLevel.
- Fold diff files so I can see whole hunks/files and can delete those with
a single key (well, two keys -- dd). Useful for reviewing large
diffs (tens of thousands of lines):
vimrc, function DiffFoldLevel.
Some of these come from www.vim.org, some
I've written myself, some I've taken and modified a little bit to avoid an
irritating quirk or add a missing feature. Some things I don't have (and envy
Emacs or IDE users for having -- like an integrated debugger for Python apps,
and, generally, integration with other tools, running in the background).
It's been my plan for a long time to polish my plugins, release them
somewhere (github? bitbucket? launchpad?) and upload to vim.org, but as it
doesn't seem to be happening, I thought I'd at least put an svn
export of my ~/.vim on the web.
Tue, 01 Dec 2009
zope.schema has Text and TextLine. The former is for multiline text, the
latter is for a single line, as the name suggests. Zope 3 forms will use a
text area for Text fields and an input box for TextLine fields. Display
widgets, however, apply no special formatting (other than HTML-quoting of
characters like <, > and &), and since newlines are treated the same
way as spaces in HTML, your multiline text gets collapsed into a single
paragraph.
Here's a pattern I've been using in Zope 3 to display multiline user-entered
text as several paragraphs:
import cgi

from zope.component import adapts
from zope.publisher.browser import BrowserView
from zope.publisher.interfaces import IRequest


class SplitToParagraphsView(BrowserView):
    """Splits a string into paragraphs via newlines."""

    adapts(None, IRequest)

    def paragraphs(self):
        if self.context is None:
            return []
        return filter(None, [s.strip() for s in self.context.splitlines()])

    def __call__(self):
        return "".join('<p>%s</p>\n' % cgi.escape(p)
                       for p in self.paragraphs())
View registration
<configure
    xmlns="http://namespaces.zope.org/zope">
  <view
      for="*"
      name="paragraphs"
      type="zope.publisher.interfaces.browser.IBrowserRequest"
      factory=".views.SplitToParagraphsView"
      permission="zope.Public"
      />
</configure>
and usage
<p tal:replace="structure object/attribute/@@paragraphs" />
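The splitting and escaping logic of the view can be tried outside Zope. The sketch below is a standalone approximation for Python 3, not the view itself: the function names are made up for illustration, and html.escape with quote=False stands in for the Python 2 cgi.escape used above.

```python
import html


def split_to_paragraphs(text):
    """Return the non-empty, stripped lines of text (or [] for None)."""
    if text is None:
        return []
    return [s.strip() for s in text.splitlines() if s.strip()]


def render_paragraphs(text):
    """Wrap each paragraph in <p> tags, HTML-quoting <, > and &."""
    return "".join("<p>%s</p>\n" % html.escape(p, quote=False)
                   for p in split_to_paragraphs(text))


print(render_paragraphs("a < b\n\n  second line  \n"))
```

Blank lines and surrounding whitespace are dropped, so every non-empty input line ends up as its own paragraph.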
Update: The view really ought to be registered twice: once
for basestring and once for NoneType. I was too lazy to figure out the dotted
names for those (or check if zope.interface has external interface declarations
for them), so I registered it for "*". You should know that this makes the
view available for arbitrary objects (but won't work for most of them, since
they don't have a splitlines method), and that it is, sadly, accessible to
users who may try to hack your system by typing things like @@paragraphs in the
browser's address bar. Ignas Mikalajūnas offers an alternative
solution using TALES path adapters.
Mon, 21 Sep 2009
I'm at the point in my hobby project where I'd like to be able to change my models
without losing all my test data. And I'm too lazy to do manual dumps and edit
the SQL in place before reimporting it.
I want a system
- that is transparent to the user: if my database is at schema
version 1, and my code is at version 3, I want it to be automatically
upgraded to version 3 on server startup.
- that is not too hard on the programmer: dropping a numbered Python or SQL
script in a directory ought to be sufficient to define a transition from
schema version X to schema version X+1.
- that handles errors gracefully: makes a backup of the database
with the old schema version; runs my script in a transaction and aborts
that transaction if the conversion fails (while showing me enough
information to debug the problem).
- allows prototyping without having to increment the schema number for every
little change I make to the models; I should be the one who decides that a new
schema is ready to go out to the world.
I've been glancing at SQLAlchemy-Migrate, since I've
been brought up to believe NIHing is
Bad. But Migrate is scary. I have to admit that the longer I stare
at its documentation, the less I can describe why I think so. All
those shell commands—but there's an API for invoking them from Python, so maybe I can
achieve my goals. I'll have to try and see.
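To make the wish list a bit more concrete, here is a rough sketch of the "numbered upgrade steps, each in its own transaction" idea. It is not SQLAlchemy-Migrate and makes several assumptions: it uses the standard library's sqlite3 module rather than SQLAlchemy, it keeps the schema version in SQLite's PRAGMA user_version, and the migrations are a hypothetical in-memory list rather than numbered scripts in a directory. Backups are left out.

```python
import sqlite3

# Hypothetical migrations: entry i upgrades schema version i to version i + 1.
# In the scheme described above these would be numbered script files instead.
MIGRATIONS = [
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE person ADD COLUMN email TEXT",
]


def get_version(conn):
    """Return the schema version stored in the database itself."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def upgrade(conn, migrations=MIGRATIONS):
    """Apply every pending migration, each inside its own transaction."""
    for target in range(get_version(conn) + 1, len(migrations) + 1):
        conn.execute("BEGIN")
        try:
            conn.execute(migrations[target - 1])
            # PRAGMA does not accept bound parameters; target is an int.
            conn.execute("PRAGMA user_version = %d" % target)
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")  # leave the old schema untouched
            raise
    return get_version(conn)


# isolation_level=None stops the sqlite3 module from opening implicit
# transactions, so the explicit BEGIN/COMMIT/ROLLBACK above are in control.
conn = sqlite3.connect(":memory:", isolation_level=None)
print(upgrade(conn))
```

Calling upgrade() at server startup would give the "transparent to the user" behaviour; running it again when nothing is pending is a no-op.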
Tue, 15 Sep 2009
Last time I
mentioned that running bin/buildout with the -N flag makes it run faster
(since it skips looking for newer versions to upgrade). You can tell
buildout to do this by default by putting 'newest = false' into the [buildout]
section of buildout.cfg. We'll be running bin/buildout a lot now, since we'll
be making changes to the project environment, so this will save wear and tear
on the '-', 'N' and Shift keys. (And, by the way, I'm not trying to soak up
Google juice by repeating the word 'buildout' a lot, honest!)
I will omit bzr commits from this narrative as it's getting long; you can
assume that every self-contained change was committed separately.
tests
First, I want a bin/test script to run the test
suite. Pylons uses nose, so we need to tell buildout to install the nosetests
script (under a different name, since I'm used to typing bin/test no matter
what test runner a project happens to use):
$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-15 19:49:11 +0000
+++ buildout.cfg 2009-09-15 19:49:18 +0000
@@ -8,5 +8,8 @@
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       nose
        asharing
 interpreter = python
+scripts = paster
+          nosetests=test
$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/paster'.
Generated script '/tmp/AlliterationSharing/bin/test'.
...
$ bin/test
----------------------------------------------------------------------
Ran 0 tests in 0.276s
OK
ctags
Documentation is good, but sometimes you want to look at the source code of
the framework. There's a tool called ctags that builds a database of
identifiers. The popular text editors Vim
and Emacs can then use the
tags database to jump to a definition of any name with a single keystroke
(Ctrl-] in vim, M-. in emacs).
Building the tags database is complicated by each Python package being
installed into a separate directory. There's a buildout recipe called
z3c.recipe.tag that finds those directories and lets you build a unified tags
file. We'll also ask buildout to make sure it unzips any packages
distributed as .egg files, since ctags doesn't process those:
$ bzr diff
@@ -1,8 +1,9 @@
[buildout]
develop = .
-parts = pylons
+parts = pylons ctags
newest = false
+unzip = true
[pylons]
recipe = zc.recipe.egg
@@ -13,3 +14,7 @@
interpreter = python
scripts = paster
nosetests=test
+
+[ctags]
+recipe = z3c.recipe.tag:tags
+eggs = ${pylons:eggs}
$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/ctags'.
...
$ bin/ctags
omelette
ctags lets you find classes and functions by name; it doesn't let you find
packages or modules. There's another recipe, collective.recipe.omelette that
creates a tree of symlinks mirroring the Python package structure (here
'unzip = true' also comes in handy):
$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-15 20:04:42 +0000
+++ buildout.cfg 2009-09-15 20:05:30 +0000
@@ -1,6 +1,6 @@
[buildout]
develop = .
-parts = pylons ctags
+parts = pylons ctags omelette
newest = false
unzip = true
@@ -18,3 +18,7 @@
[ctags]
recipe = z3c.recipe.tag:tags
eggs = ${pylons:eggs}
+
+[omelette]
+recipe = collective.recipe.omelette
+eggs = ${pylons:eggs}
$ bin/buildout
...
$ ls -l parts/omelette
...
The symlink tree is created under parts/omelette/. For example, if you want
to see what webhelpers tags are available, you can open
parts/omelette/webhelpers/html/builder.py in your editor and see.
Makefile
This is getting long (and not everyone may be interested), but one long post is easier to skip than five
medium ones in a row, so I'll continue.
Wouldn't it be nice if new developers could check out your project and start
it up with just a couple of commands? Make is a time-tested tool that works
well for this:
$ cat Makefile
# Just remember that you need to use real tabs, not spaces, in a Makefile
PYTHON = python
.PHONY: all
all: bin/paster
.PHONY: run
run: bin/paster
	bin/paster serve development.ini --reload
.PHONY: test check
test check: bin/test
	bin/test
.PHONY: tags
tags: bin/ctags
	bin/ctags
bin/paster bin/test bin/python bin/ctags: bin/buildout
	bin/buildout
bin/buildout: bootstrap.py
	$(PYTHON) bootstrap.py
Now all you need to do after checking out is run 'make' to set up a working
development environment. 'make run' or 'make test' will also do that, if
necessary, so this one-liner is sufficient to get a working Hello World
application on port 5000:
$ bzr branch lp:~mgedmin/+junk/AlliterationSharing && cd AlliterationSharing && make run
Try it! You'll get a Bazaar branch with all the history of this little
blog project.
Sun, 13 Sep 2009
For software development I prefer buildout to virtualenv. This is
because buildout has a text file describing the state of your working
environment, which can be versioned and used later to recreate it, as well
as during development to modify the environment slightly.
To start a new Pylons project, first create an empty directory. Let's
call our new project AlliterationSharing, because everybody is sick of 'foo'
and 'bar'.
$ mkdir -p ~/src/AlliterationSharing
$ cd ~/src/AlliterationSharing
Now create a file called buildout.cfg with the following content:
$ cat buildout.cfg
[buildout]
parts = pylons

[pylons]
recipe = zc.recipe.egg
eggs = Pylons
       PasteScript
interpreter = python
Download bootstrap.py into
it and run it to get bin/buildout. Note: you can choose which Python version you
want to use by running bootstrap.py with it. All other scripts under bin/
will be generated by buildout and will use the same Python interpreter.
$ wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py
$ python bootstrap.py
Creating directory '.../AlliterationSharing/bin'.
Creating directory '.../AlliterationSharing/parts'.
Creating directory '.../AlliterationSharing/eggs'.
Creating directory '.../AlliterationSharing/develop-eggs'.
Generated script '.../AlliterationSharing/bin/buildout'.
Run bin/buildout to install Pylons into your sandbox.
$ bin/buildout
Installing pylons.
Generated script '.../AlliterationSharing/bin/paster'.
Generated interpreter '.../AlliterationSharing/bin/python'.
Aside: buildout has this very nice feature where it can share Python
packages between projects. This will save you enormous amounts of time that
would otherwise be spent downloading and unpacking eggs. To make use of this
facility, create a file ~/.buildout/default.cfg with
$ cat ~/.buildout/default.cfg
[buildout]
eggs-directory = /home/mg/tmp/buildout-eggs
# XXX replace /home/mg with the full path of *your* home directory
# it would be much nicer if buildout let me use ~ or $HOME
# see https://bugs.launchpad.net/zc.buildout/+bug/190260
Another useful trick is to pass the -N flag to bin/buildout, which will tell
it not to bother looking for newer versions of packages on the Internet when
there's already an existing version installed in your eggs directory.
Back to business: now you've got two new scripts: bin/python and bin/paster.
You can use the first one to play with the interactive Python console where you
can now import pylons and all the dependencies; it has no other value.
Now is a good point to add the files you've created into a version control
system. I'll arbitrarily use Bazaar.
$ bzr init .
$ bzr add bootstrap.py buildout.cfg
$ bzr ignore bin parts eggs develop-eggs .installed.cfg
$ bzr commit -m "Create AlliterationSharing project"
Run bin/paster create -t pylons to create a skeleton project.
$ bin/paster create -t pylons asharing
$ bzr ignore *.egg-info
$ bzr add asharing
$ bzr commit -m "Generated project files with paster create"
Now paster creates a directory structure that I don't like:
AlliterationSharing/
  buildout.cfg
  bin/
  asharing/
    setup.py
    README.txt
    MANIFEST.in
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/
I'd like the README and setup.py to be in the top level, and I dislike
repeating 'asharing' twice in directory names. I'll move some files around
$ cd asharing/
$ bzr mv development.ini docs MANIFEST.in README.txt setup.* test.ini ../
$ bzr rm ez_setup.*
$ cd ..
$ bzr mv asharing src
$ bzr ci -m "Moved some files around"
Now the tree looks like this:
AlliterationSharing/
  buildout.cfg
  setup.py
  README.txt
  MANIFEST.in
  bin/
  src/
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/
We have to tell setup.py where to find the source tree
$ bzr diff
=== modified file 'MANIFEST.in'
--- MANIFEST.in 2009-09-13 13:04:00 +0000
+++ MANIFEST.in 2009-09-13 13:05:59 +0000
@@ -1,3 +1,3 @@
-include asharing/config/deployment.ini_tmpl
-recursive-include asharing/public *
-recursive-include asharing/templates *
+include src/asharing/config/deployment.ini_tmpl
+recursive-include src/asharing/public *
+recursive-include src/asharing/templates *
=== modified file 'setup.py'
--- setup.py 2009-09-13 13:04:00 +0000
+++ setup.py 2009-09-13 13:04:40 +0000
@@ -17,7 +17,8 @@
         "SQLAlchemy>=0.5",
     ],
     setup_requires=["PasteScript>=1.6.3"],
-    packages=find_packages(exclude=['ez_setup']),
+    packages=find_packages('src', exclude=['ez_setup']),
+    package_dir={'': 'src'},
     include_package_data=True,
     test_suite='nose.collector',
     package_data={'asharing': ['i18n/*/LC_MESSAGES/*.mo']},
(I'm not sure if you also need to change package_data and/or setup.cfg; it's
possible that I left i18n in a broken state. Can somebody comment on
this?)
And we have to tell buildout that we've got a new Python package to enable
in the project environment
$ bzr diff buildout.cfg
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-13 12:57:21 +0000
+++ buildout.cfg 2009-09-13 13:08:05 +0000
@@ -1,8 +1,10 @@
 [buildout]
+develop = .
 parts = pylons

 [pylons]
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       asharing
 interpreter = python
Now you can re-run bin/buildout and start your hello-world project
$ bzr commit -m "Include the new package in the build"
$ bin/buildout -N
$ bin/paster serve --reload development.ini
Happy hacking!
To be continued: telling buildout to create bin/test; using ctags and omelette.
Mon, 03 Aug 2009
Most Python packages in the Zope world use Buildout:
svn co svn+ssh://svn.zope.org/repos/main/plone.z3cform/trunk plone.z3cform
cd plone.z3cform
python2.4 bootstrap.py
bin/buildout
bin/test -pvc
Now suppose you want to change the buildout environment somehow, e.g.
use the current development version of zope.testing instead of whatever is
specified in buildout.cfg. Don't edit the existing buildout.cfg (you might
accidentally commit your local debug changes), instead create a new cfg file,
e.g. test.cfg:
[buildout]
extends = buildout.cfg
develop += ../zope.testing
[versions]
# override any existing version pins
zope.testing =
Now re-run buildout
bin/buildout -c test.cfg
bin/test -pvc
And the tests should be run with the newest zope.testing code.
Only this does not work with plone.z3cform, and I have no clue why.
It generally works with other packages (at least those that use the
zc.recipe.testrunner rather than collective.recipe.z2testrunner).
Buildout is like that sometimes :(
Sat, 25 Jul 2009
Went to EuroPython, met new people,
had a great time.
Updated gtkeggdeps, the
interactive Python package dependency browser. Collaborated with Thomas Lotze, who maintains the engine
(tl.eggdeps) that
gtkeggdeps wraps, to resolve API mismatches. Moved the sources to launchpad.net, added a test
suite, made it use zc.buildout for convenient
development.
Moved the source repository of gtimelog, the simple desktop time
tracker, to launchpad.net.
Failed to do anything else with it. :-(
Tried to work on xdot, wrestled with
git-svn merges, failed abysmally. Asked
upstream to upload xdot to PyPI.
Released ZODB Browser, but
this deserves a separate post.
Sent a bunch of pyflakes patches from
my old
branch upstream, created trac
tickets for the rest. Wrestled with bzr-svn merges, failed abysmally.
Thu, 21 May 2009
Christian Wyglendowski (at first I couldn't find the author's name on the
blog and took him for some anonymous Planet Python poster) asks
about a surprising difference between old-style and new-style classes.
Since the comments on their blog are closed (which you find out only after
pressing Submit), I'll answer here.
The question, slightly paraphrased: given a class
class LameContainerOld:
    def __init__(self):
        self._items = {'bar':'test'}
    def __getitem__(self, name):
        return self._items[name]
    def __getattr__(self, attr):
        return getattr(self._items, attr)
why does the 'in' operator work
>>> container = LameContainerOld()
>>> 'foo' in container
False
>>> 'bar' in container
True
when the equivalent new-style class raises a KeyError: 0 exception? Also, why
does __getattr__ appear to be called to get the bound __getitem__ method of the
dict?
>>> container.__getitem__
<bound method LameContainerOld.__getitem__ of {'bar': 'test'}>
What actually happens here is that LameContainerOld.__getattr__ gets called
for special methods such as __contains__ and __repr__. This is why (1) the
'in' check works, and (2) it appears, at first glance, that you get the wrong
__getitem__ bound method. If you pay close attention to the output, you'll
see that it's the __getitem__ of LameContainerOld; it's just that
repr(LameContainerOld()) gets proxied through to the dict.__repr__ when you
don't expect it:
>>> container
{'bar': 'test'}
Special methods never go through __getattr__ for new-style classes,
therefore neither __contains__ nor __repr__ is proxied if you make the
container inherit from object. If there's no __contains__ method, Python falls
back to the sequence protocol and starts calling __getitem__ for numbers 0
through infinity, or until it gets an IndexError exception.
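Here is a sketch of the new-style behaviour, written for Python 3, where every class is new-style. LameContainerNew mirrors the class from the question; FixedContainer is a name made up here to show one possible fix, namely defining __contains__ explicitly.

```python
class LameContainerNew(object):
    def __init__(self):
        self._items = {'bar': 'test'}

    def __getitem__(self, name):
        return self._items[name]

    def __getattr__(self, attr):
        # Only consulted for ordinary attribute access, never for the
        # implicit special-method lookup done by operators such as "in".
        return getattr(self._items, attr)


class FixedContainer(LameContainerNew):
    def __contains__(self, name):
        return name in self._items


try:
    'bar' in LameContainerNew()
except KeyError as exc:
    # The sequence-protocol fallback asked for item 0, which the dict lacks.
    print("KeyError:", exc)

print('bar' in FixedContainer(), 'foo' in FixedContainer())
```

The KeyError escapes because the fallback loop only stops on IndexError.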
Fri, 15 May 2009
Update: The story continues, but a solution is not in sight
yet.
I upgraded a buildbot slave to Ubuntu 8.04 (Hardy) recently and now I'm
getting a strange intermittent failure: sometimes
cp -r /local/dir /nfs/mounted/dir fails
("process killed by signal 1", i.e. SIGHUP).
I wonder if NFS is relevant or incidental to the issue?
Google finds an old
thread from 2005, with a workaround (usepty=False), but I'd like to
understand the problem before applying random fixes.
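For readers unsure what "killed by signal 1" looks like from the parent's side, here is a small sketch. It does not reproduce the buildbot problem, and it assumes a POSIX system; the child program is a stand-in invented for the example. It only shows how a death by SIGHUP is reported to the process that started the command.

```python
import signal
import subprocess
import sys

# Stand-in child: restore the default SIGHUP action (in case the parent
# ignores it), then send SIGHUP to itself.
child_code = (
    "import os, signal\n"
    "signal.signal(signal.SIGHUP, signal.SIG_DFL)\n"
    "os.kill(os.getpid(), signal.SIGHUP)\n"
)

returncode = subprocess.call([sys.executable, "-c", child_code])

# On POSIX, a negative return code -N means "killed by signal N".
killed_by = -returncode if returncode < 0 else None
```

A process that exits on its own, even with an error, gets a non-negative return code, which is how the two cases can be told apart.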
So far three different build steps doing cp -r have failed over the course
of 10 days. I've now changed them all to cp -rv, so that if one fails again I
can at least see whether the failure is in the middle of the copy or at the
end.
Update: so far 4 build steps have failed on 6 separate
occasions:
May 5 02:31: cp -r local-dir1 nfs-mounted-dir1
May 6 02:31: cp -r local-dir1 nfs-mounted-dir1
May 6 04:33: cp -r local-dir2 nfs-mounted-dir2
May 15 02:00: cp -r local-dir3 nfs-mounted-dir3
May 17 04:32: rm -rf nfs-mounted-dir4
May 20 04:31: rm -rf nfs-mounted-dir4
I see no particular correlation between step duration and results, e.g.
the rm -rf step usually takes between 2.2 and 4.6 seconds. The two SIGHUPs
happened after 2.4 seconds.
None of these commands produce any output. When I changed the cp steps and
added a -v, they stopped failing, but that could be just a coincidence.
We're having an email conversation with Jean-Paul Calderone ("exarkun")
about the possibility of this being PTY-related, with no clear resolution
so far.
And, hey, now this blog supports comments ;)
Fri, 08 May 2009
It's been a while since the last Expert Python
Programming review on Planet
Python. Y'all might've forgotten about this book by now. Time for a
reminder? (Actually, I'm just busy, and this is why this review hasn't appeared
sooner.)
I received a free PDF copy of this book from Packt Publishing, with the
understanding that I'll post a review on my blog. This is it. Short summary:
it is a good book marred by a lot of mostly inconsequential little mistakes.
I'd give it four stars out of five.
Aside: the PDF that I could download was personalized and had my
name and address in the footer of every page. A very nice form of DRM that
did not restrict my software choices for reading the book (Evince and also
PDF Reader on Nokia Internet Tablets).
I bring it up here because it seems that Packt could've also applied fixes
for the
known errata to the personalized version, yet missed that opportunity.
Perhaps it's technically more difficult than slapping a footer on every page.
Or maybe it's better if everyone buying the book, whether in paper or in
PDF, gets to see the same text.
The author (Tarek Ziade) covers a wide range of topics in the book, ranging
from syntax (probably useful for those who've been programming in Python for
quite a few years, and didn't have the time to keep up with the language
changes before picking up this book) to style, source code organization,
project infrastructure, software life cycle, documentation, testing and
optimization, and finally ending with a review of some of the popular design
patterns. The middle parts were the most interesting for me personally. I
learned a thing or two, disagreed with the author on a few minor points (which
are mostly a matter of preference), and managed to finish the book despite
the constant irritating little pricks I feel whenever I notice an error (I
confess I'm a pedant. A missing space after a colon drives me up the wall).
As an example of the disagreement: I have an aversion to code-generating
tools where you have to edit the generated code by hand. I could say more, but
this is a topic for another time. Next, I strongly dislike sudo
easy_install since it scribbles onto the part of the filesystem
exclusively reserved for your OS's package management tools. And I don't think
porting the original 23 design patterns to other programming languages is a
good way to describe what those languages are about. (Also, set
tabstop=4 in your .vimrc? Heresy! The Right Thing To Do is
set softtabstop=4, as all right-thinking Vim users will
doubtlessly agree. All hail the one true text editor! Oh dear, now I'm
glad I don't have comments on this blog...)
The goodies: Chapter 1 (the bits about PYTHONSTARTUP
on page 19) gave me persistent history for my interactive Python prompt, nicely
complementing the coloured prompt and tab-completion I already had snarfed from
somewhere else on the net (probably Peter Norvig's Python IAQ). Chapter 12
provided good examples of how to do profiling for time (page 281) and memory
(page 291). I like Tarek's @profile decorator (measure time, pystones
and memory at the same time). My profilehooks module was
not mentioned, *sniff* ;-). Chapter 13 told me about Queue.join
and task_done
that snuck into the stdlib with Python 2.5 without me noticing.
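For anyone else who missed them, here is a minimal sketch of how Queue.join and task_done fit together. It is not taken from the book. It assumes Python 3, where the module is spelled queue rather than Queue, and the worker function and the doubling "work" are invented for the example.

```python
import queue
import threading

work = queue.Queue()
results = []


def worker():
    while True:
        item = work.get()
        try:
            results.append(item * 2)   # stand-in for real work
        finally:
            # Tell the queue this item is finished; join() counts these.
            work.task_done()


# Daemon thread, so it does not keep the interpreter alive at exit.
threading.Thread(target=worker, daemon=True).start()

for n in range(5):
    work.put(n)

# Blocks until task_done() has been called once for every item put().
work.join()
```

The point of the pair is that join() waits for the work to be finished, not merely for the queue to be empty.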
I haven't mentioned topics covered in the book that I was already familiar
with, such as setuptools, virtualenv, zc.buildout, Sphinx, Nose, Buildbot, or
Mercurial. Yet, in my opinion, those are the most useful parts of the book.
The breadth of the topics is amazing: I could hardly think of something that
every serious Python programmer should know that wasn't mentioned. I
believe the depth was exactly right: mention solutions that are available, show
how they feel when used and what they can do, point to the relevant web page
and then stop. And not only tools, the descriptions of workflows (how to
organize your source trees, how to develop software consisting of multiple
packages, how to make releases), while hardly universal, are invaluable.
One thing prevents this from being a perfect book: errata. At around page
95, according to my notes, I invented a new metric of book quality: WTFs per
page. It's closely related to WTFs per minute, but
independent of your reading speed. At around page 165 I got tired of making a
note of every little thing that I noticed and started just reading. This was
considerably more enjoyable. I hope there's a second edition with all the bugs
shaken out. To that end, I should go through my notes again and submit them
via the online errata form. Yay, more work...
Mon, 06 Apr 2009
Today I happened to read about lazr.enum in a mailing list.
I went to the PyPI page and
saw raw reStructuredText
markup instead of a nicely formatted page. Now I know from prior experience
that this happens when the package's description has an error in the markup.
I thought I'd report a bug and provide a patch.
Leap of knowledge: since I know lazr.enum was created by the Launchpad.net
team I could safely assume they were keeping the sources in Launchpad. Therefore
I was pretty sure I could get them with
$ bzr branch lp:lazr.enum
so I ran that command and it worked.
Next I looked at setup.py to see how it produces the long_description field.
It was reading the contents of a couple of text files, one of them being
src/lazr/enum/README.txt. I looked at that and saw a
.. toc-tree: directive that does not exist in plain docutils
(it's a Sphinx extension).
I added a couple of lines to setup.py to strip that out, tested it
(with setup.py --long-description > test.rst; restview test.rst),
committed to my local branch, and created a bug report in Launchpad. Then I
was a bit lost, since I didn't know how to make my fix available. Attach a
patch? Maybe, but I wanted to see if this distributed version control thing is
good for anything else.
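The couple of lines added to setup.py aren't shown here, so the following is only a sketch of one way such a filter could work, not the actual patch. The function name strip_directive and the sample text are made up; the directive name is spelled the way it appears above.

```python
def strip_directive(text, name="toc-tree"):
    """Drop '.. name:' directive blocks: the directive line, its indented
    body, and any blank lines in between or directly after."""
    kept = []
    skipping = False
    for line in text.splitlines():
        if line.startswith(".. %s:" % name):
            skipping = True
            continue
        if skipping:
            if not line.strip() or line[0] in " \t":
                continue  # blank or indented: still part of the directive
            skipping = False
        kept.append(line)
    return "\n".join(kept) + "\n"


sample = (
    "Title\n"
    "=====\n"
    "\n"
    ".. toc-tree::\n"
    "   :maxdepth: 2\n"
    "\n"
    "Text here.\n"
)
cleaned = strip_directive(sample)
```

In a setup.py the result would then be passed as long_description, so that PyPI only ever sees markup that plain docutils understands.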
I thought that first I'd make that branch public, and then see if there
was a way to link it to the bug report. I ran
$ bzr push lp:~mgedmin/lazr.enum/pypi-fix
which took a few seconds to create a new public branch on Launchpad with my
fix in it (it would be nice if I didn't have to explicitly specify my Launchpad
username and the project name—both of which bzr already knows—and
just specify the name of the branch). Then I went back to my bug report and
saw an option to link it to a branch. There was a search field in the popup
that found my "~mgedmin/lazr.enum/pypi-fix" easily enough when I pasted it
into the search box.
After clicking on the branch, I saw a "propose a merge" option. I did that
and Launchpad sent an email to the developers asking them to merge my fix.
I made one mistake, I think: I should've created the bug report
first, and then mentioned the bug number in my commit message (with
bzr commit --fixes=NNN, although here I'm suddenly not sure if the bug
number should be left bare, or prefixed with something like "lp" to indicate it
was a Launchpad bug number?).
Other than that it was a pretty smooth experience. When will I be able to
do that for Ubuntu packages?
Thu, 12 Feb 2009
This post started as a comment to Michael Rooney's question: Failing
tests: When are they okay?, and then it became a bit too long for a
comment.
For me the most important aspect of a build is to accurately represent my
knowledge about the health of the product. New problems must be noticed as
soon as possible. This won't happen if the developers are used to seeing (and
ignoring) broken builds.
For this reason you want to distinguish known failures from unknown
failures. For example, it's okay to commit a test that reproduces a bug even
if you don't have a fix for that bug, but do it in a way that keeps the
buildbot green. (Two common ways of doing that are marking the test in a
special way so the test runner knows it's expected to fail, or disabling the
test so that it doesn't even run.) The worst thing ever is fragile
tests that fail only sometimes, especially if everyone grows accustomed to
them. I speak from experience. I still have nightmares...
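Both approaches can be sketched with the decorators in the standard unittest module (available since Python 2.7). This is just an illustration under assumptions: buggy_add and the test names are hypothetical, and other test runners have their own spellings for the same idea.

```python
import io
import unittest


def buggy_add(a, b):
    # Hypothetical function with a known, not-yet-fixed bug.
    return a + b + 1


class KnownBugTests(unittest.TestCase):

    @unittest.expectedFailure
    def test_add_reproduces_bug(self):
        # Fails today; the runner records it as an expected failure.
        self.assertEqual(buggy_add(2, 2), 4)

    @unittest.skip("disabled until the bug is fixed")
    def test_add_disabled(self):
        # Never runs; it only shows up in the count of skipped tests.
        self.assertEqual(buggy_add(1, 1), 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(KnownBugTests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
build_is_green = result.wasSuccessful()
```

Marking the test as an expected failure has one advantage over disabling it: the test still runs, so the runner can tell you when it unexpectedly starts passing.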
Collaboration is not reason enough to break the trunk. You can use branches
or send patches via email, whichever works best. Patches are often simpler
when you're taking over someone's unfinished work after that someone gets stuck
and asks for help, or if you decide to switch machines when pair-programming.
Sometimes I use shell one-liners like 'ssh othermachine svn diff
/path/to/source/tree | patch -p42' to get the changes into my checkout.
Branches are more appropriate for longer-term collaboration. It's perfectly
fine to have a broken test suite on a branch -- you can always discard it;
that's what you do to prototypes. Reimplementing something you've already
done, in a cleaner fashion, is often a simple and rather pleasant way of
merging.
If the tools you have aren't polished enough and you don't feel comfortable
creating new branches even when they're necessary, invest a day every now and
then improving your tools (shameless plug: eazysvn, because eazysvn switch -c
newbranch does not require you to lose your train of thought remembering
how to type long subversion URLs for svn cp).
That's all theory; in practice, IMHO, it's acceptable to take shortcuts. Small self-contained
checkins are best (and this topic deserves a blog post of its own), but if
you're forced to wait 20 minutes for the full test suite before every one of
them, you won't use small checkins. It's fine to run just a subset of tests
covering the code you've changed before every checkin, even if that means you
will sometimes break the build by accident. However, it's your responsibility
to clean up any breakage if it occurs before you leave at the end of the day
(or at least to feel guilty when you don't).
Back to the original question: I can imagine only one set of circumstances
where the right thing to do is to knowingly commit a broken test to trunk.
Imagine that you discovered a show-stopper bug, but the fix is elusive. By
committing a failing test you force the whole team to notice it, drop
everything else and work on the problem. And you also prevent somebody from
accidentally releasing a broken version of the product. (Your release process
includes a step ensuring that all the tests pass, right?)
Fri, 06 Feb 2009
I've an opportunity to get to know Pylons. Here's an unsorted list of first (and
second) impressions:
- Pylons has great documentation, though I did
stumble upon a few broken links
- Pylons has a great development environment (instant and automatic server
restarts; interactive Python console in your web browser on errors)
- It seems that nobody using Paste is interested in logging the startup and
shutdown time of the web server
- SQLAlchemy overwhelms with TMTOWTDI
- zc.buildout can be replaced by a 4-line shell script using virtualenv and
easy_install; this will save you headaches
- setuptools is made of pure craziness, but we can't live without it
These aren't directly related to Pylons:
- distributed version control systems are great for throwaway prototypes
(especially when you want to compare several ways to do it)
- non-distributed version control systems aren't
- py.test is weird and takes some getting used to, but has some nice
properties as a test runner; shame about breaking compatibility with
unittest
- automated functional tests for system deployment in a freshly cloned Xen
virtual machine are cool, albeit slow-ish
Update: About the naive notion that using easy_install
instead of zc.buildout would help me avoid headaches? Muahahahahaha. Ha.
Haha. Muahhaaaaaa. Wrong.
Also, TMTOWTDI is maybe too strong a word for SQLAlchemy's plethora of
choices. And you really want to be using 0.5. And Pylons is even more awesome
than I first thought. Obligatory grain of salt (*thud*): I haven't finished
writing my first page yet. Integrating new stuff into existing elaborate
functional test suites takes time.
Tue, 27 Jan 2009
I once needed to know about SyntaxError's attributes. Here's what pydoc
SyntaxError from Python 2.5 says:
 |  Data descriptors defined here:
 |
 |  filename
 |      exception filename
 |
 |  lineno
 |      exception lineno
 |
 |  message
 |      exception message
So far so good
 |
 |  msg
 |      exception msg
Hmm?
 |
 |  offset
 |      exception offset
 |
 |  print_file_and_line
 |      exception print_file_and_line
Doh!
 |
 |  text
 |      exception text
My, that was useful. Maybe the online documentation
will be better?
exception SyntaxError
    ...
    Instances of this class have attributes filename, lineno, offset and text for
    easier access to the details. str() of the exception instance returns only
    the message.
Um. Well, at least now I know I can ignore both 'msg' and 'message'. I
think. Still, it would be nice to warn that sometimes the exception text can be
multi-line.
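When the documentation is this terse, poking at a live exception is often quicker. Here is a minimal sketch, written for Python 3 rather than the 2.5 discussed above; the broken source string and the "<example>" file name are made up, and the exact offset, text and wording of msg vary between Python versions.

```python
try:
    compile("x = = 1\n", "<example>", "exec")
    err = None
except SyntaxError as exc:
    # Keep a reference; the name 'exc' is cleared when the except block ends.
    err = exc

details = {
    "filename": err.filename,   # the name passed to compile()
    "lineno": err.lineno,       # 1-based line number of the error
    "offset": err.offset,       # column within that line
    "text": err.text,           # the offending source text
    "msg": err.msg,             # short description, without the location
}
```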