Random notes from mg

a blog by Marius Gedminas

Marius is a Python hacker. He works for Programmers of Vilnius, a small Python/Zope 3 startup. He has a personal home page at http://gedmin.as. His email is marius@gedmin.as. He does not like spam, but is not afraid of it.

Sat, 07 Aug 2010

Profiling with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more. I've talked about log capturing, now it's time for

Profiling

This WSGI middleware profiles every request with the cProfile module. To see the profiles, visit a hidden URL /_profiler/showall:

List of profiles

What you see here is heavily tweaked in my fork of Dozer; upstream version had no Cost column and didn't vary the background of Time by age (that last bit helps me see clumps of requests).

Here's what an individual profile looks like:

One profile

The call tree nodes can be expanded and collapsed by clicking on the function name. There's a hardcoded limit of 20 nesting levels (upstream had a limit of 15), sadly that appears not to be enough for practical purposes, especially if you start profiling Zope 3 applications...

You can also take a look at the WSGI environment:

WSGI environment expanded

Sadly, nothing about the response is captured by Dozer. I'd've liked to show the Content-Type and perhaps Content-Length in the profile list.

The incantation in development.ini is

[filter-app:profile]
use = egg:Dozer#profile
profile_path = /tmp/profiles
next = main

Create an empty directory /tmp/profiles and make sure other users cannot write to it. Dozer stores captured profiles as Python pickles, which are insecure and allow arbitrary command execution.

To enable the profiler, run paster like this:

$ paster serve development.ini -n profile

Bonus feature: call graphs

Dozer also writes a call graph in Graphviz "dot" format in the profile directory. Here's the graph corresponding to the profile you saw earlier, as displayed by the excellent XDot:

Call graph

See the fork where the "hot" red path splits into two?

Call graph, zoomed in

On the left we have Routes deciding to spend 120 ms (70% total time) recompiling its route maps. On the right we have the actual request dispatch. The actual controller action is called a bit further down:

Call graph, zoomed in

Here it is, highlighted. 42 ms (24% total time), almost all of which is spent in SQLAlchemy, loading the model object (a 2515 byte image stored as a blob) from SQLite.

A mystery: pickle errors

When I first tried to play with the Dozer profiler, I was attacked by innumerable exceptions. Some of those were due to a lack of configuration (profile_path) or invalid configuration (directory not existing), or now knowing the right URL (going to /_profiler raised TypeError). I tried to make Dozer's profiler more forgiving or at least produce clearer error messages in my fork, e.g. going to /_profiler now displays the profile list.

However some errors were very mysterious: some pickles, written by Dozer itself, could not be unpickled. I added a try/except that put those at the end of the list, so you can see and delete them.

Pickle errors

Does anybody have any clues as to why profile.py might be writing out broken pickles?

posted at 06:06 | tags: | permanent link to this entry | 4 comments

Capturing logs with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more:

Log capturing

This WSGI middleware intercepts logging calls for every request. Here we see a toy Pylons application I've been working on in my spare time. Dozer added an info bar at the top:

Dozer's infobar

When you click on it, you get to see all the log messages produced for this request. I've set SQLAlchemy's loglevel to INFO in my development.ini, which produces:

Dozer's log viewer

(Why on Earth does SQLAlchemy think I want to see the memory address of the Engine object in my log files, I don't know. The parentheses contain argument values for parametrized queries, of which there are none on this page.)

Upstream version displays absolute timestamps (of the YYYY-MM-DD HH:MM:SS.ssssss variety) in the first column; my fork shows deltas in milliseconds. The incantation in development.ini is

[filter-app:logview]
use = egg:Dozer#logview
next = main

which makes it disabled by default. To enable, you run paster like this:

$ paster serve development.ini -n logview

(Upstream version lacks the paste entry point for logview; it's in my fork, for which I submitted a pull request weeks ago like a good open-source citizen. Incidentally, patches for stuff I maintain have been known to languish for years in my inbox, so I'm not one to throw stones.)

Coming soon: profiling with Dozer.

posted at 05:27 | tags: | permanent link to this entry | 5 comments

Fri, 06 Aug 2010

irclog2html is now on PyPI

irclog2html, the IRC log to HTML converter, is now (finally!) available from the Python Package Index.

In other news, logs2html now copies irclog.css to the destination directory (if it doesn't exist there already). I've been noticing logs produced with irclog2html on random places, and sometimes they were unstyled; hopefully this will become rare now.

posted at 04:32 | tags: | permanent link to this entry | 0 comments

Wed, 07 Jul 2010

ImportError: No module named _md5

If you're using virtualenv, and after a system upgrade you get errors like

...
  File "...", line ...
    from hashlib import md5
  File "/usr/lib/python2.6/hashlib.py", line 63, in __get_builtin_constructor
     import _md5
ImportError: No module named _md5

this means that the copy of the python executable in your virtualenv/bin directory is outdated and you should update it:

$ cp /usr/bin/python2.6 /path/to/venv/bin/python

or, better yet, recreate the virtualenv.

posted at 01:20 | tags: , | permanent link to this entry | 1 comments

Sun, 18 Apr 2010

Re: Web frameworks considered useful.

Martijn Faassen defends web frameworks in a rather longish post (you can tell it's 5 AM in the morning and I've nearly defeated the unread post queue in Google Reader). I'd like to propose a condensed version. Consider this slogan:

Simple things should be easy; complicated things should be possible.

Frameworks make simple things easy. Good frameworks keep the complicated thing possible; poorly-designed frameworks make the complicated thing more difficult than necessary; bad frameworks make even simple things complicated.

Doing everything from scratch merely makes things possible, but rarely easy.

posted at 04:57 | tags: | permanent link to this entry | 3 comments

Sun, 04 Apr 2010

Review: Grok 1.0 Web Development

Disclaimer: I received a free review copy of this book. The book links are affiliate links; I get a small amount from any purchase you make through them.

Grok is a Python web framework, built on top of the Zope Toolkit, which is the core of what used to be called Zope 3 and is now rebranded as BlueBream. Confused yet? Get used to it: the small pluggable components are the heart and soul of ZTK, and the source of its flexibility. It's not surprising that people take the same approach on a larger scale: take Zope 3 apart into smaller packages and reassemble them into different frameworks such as Grok, BlueBream or repoze.bfg.

Grok 1.0 Web Development by Carlos de la Guardia

The Grok book by Carlos de la Guardia introduces the framework by demonstrating how to create a small but realistic To-do list manager. I like this technique, and it works pretty well. The author covers many topics:

Some important topics like internationalization, time zones, testing with Selenium, and (especially) database migration (which is pretty specific for ZODB) were not covered.

If you want to learn about Grok, this book will be useful, but there's a caveat: there's the usual slew of typographical mistakes and other errors I've come to expect from books published by Packt. It's their third book I've seen; all three had surprisingly high numbers of errors. Some had more, others had fewer. The Grok book was on the high side and the first one where I was tempted to record a "WTFs per page" metric. The mistakes are easy to notice and correct, so they didn't impede my understanding of the book's content. Disclaimer: I've been working with Zope 3 for the last six-or-so years, so I was pretty familiar with the underlying technologies, just not the thin Grok convenience layer. If minor errors annoy you, stay away. I haven't noticed any major factual errors, although there were what I would consider some pretty important omissions:

Overall, Grok is pretty nice, especially compared to vanilla Zope 3. However, when compared to frameworks like Pylons or Django, Grok appears more complex and seemingly requires you to do additional work for unclear gain. For example, chapter 8 has you writing three components for every new form you add: one for the form itself, one for a pagelet wrapping the form, and one for a page containing the pagelet. Most of that code is very similar with only the names being different. I'm sure there are situations where this kind of extreme componentization pays off (e.g. it lets you override particular bits on particular pages to satisfy a particular client's requests, without affecting any other clients), but the book doesn't convincingly demonstrate those advantages. Again, I may be biased here since I've been enjoying those advantages for the past six years, without ever having felt the pain of doing similar customizations with a less flexible framework. (It's a gap in my professional experience that I'm itching to fill.)

Update: some other reviews on Planet Python.

Update 2: Another review (well, part 1 of one, but I got tired waiting for part 2).

posted at 22:30 | tags: , , | permanent link to this entry | 3 comments

Sat, 13 Mar 2010

Review: Python Testing: Beginner's Guide

I've been testing (as well as writing) Python code for the last eight years, so a book with the words Begginer's Guide prominently displayed on the cover isn't something I'd've decided to buy for myself. Nevertheless I jumped at the offer of receiving a free e-copy for reviewing it.

Python Testing: Beginner's Guide by Danien Arbuckle

Short summary: it's good book. I learned a thing or two from it. I don't know well it would work as an introductionary text for someone new to unit testing (or Python). Some of the bits seemed overcomplicated and underexplained, parts of the example code/tests seemed to contain design decisions received from mysterious sources.

Incidentally, Packt uses a simple yet effective method for watermarking e-books: my name and street address are displayed in the footer of every page. What's funny is that the two non-ASCII characters in the street name are replaced with question marks. It's not a data entry problem: the website that let me download those books shows my address correctly, so it must be happening somewhere in the PDF production process. I didn't expect this kind of Unicode buggyness from a publisher. Then again there were occasional strange little typographical errors in the text, like not leaving a space in front of an opening parenthesis in an English sentence, or using a never-seen-before +q= operator in Python code. I was also left wondering how the following sentence (page 225) could slip past the editing process:

doctest ignores everything between the Traceback (most recent last call).

Thankfully those small mistakes did not detract from the overall message of the book.

I liked the author's technique of showing subtly incorrect code, letting the reader look at it and miss all the bugs, and then showing how unit or integration tests catch the bugs the reader missed. I'm pretty sure there's at least one remaining bug that the author missed in the example package (storing a schedule doesn't erase old data), which could serve for a new chapter on regression testing if there's a second edition.

Summary of topics covered:

I found the TDD cycle a bit larger than I generally like, but I believe it's a matter of taste, and perhaps a shorter cycle wouldn't work as well in a written medium.

I found it a bit jarring how the Twill chapter intrudes between the two chapters showing unit testing and integration testing of the same sample package. I think it would've been better to swap the order of chapters 8 and 9.

I liked the technique presented for picking subsets of the code for integration tests, although I wonder how well it would work on a larger project.

Topics not covered:

As you can see these holes are all rather small.

Probably the biggest weakness of the book is the complexity of some things shown:

Seeing the repetitive and redundant mock code in the first few doctest examples I started asking what's the point?, but the book failed to provide a compelling answer (the answer provided—it's easier to locate bugs—works just as well for integration tests that focus on individual classes). And there are good answers for that question, like instant feedback from your unit test suite. Are they worth the additional development effort? Maybe that depends on the developer. I don't think they would help me, so I tend to stick with low-level integration tests I call "unit tests" (as well as system tests; it's always a mistake to keep all your tests in a single level). I'm slightly worried that this book might give the wrong impression (testing is hard) and turn away beginning Python programmers from writing tests altogether.

Overall I do not feel that I have wasted my time reading Python Testing. I look forward to reading the other reviews that showed up on Planet Python. I gathered that not all reviewers were happy with the book, but avoided reading their reviews in order not to influence my own.

Update: I especially liked this review by Brian Jones. The lack of awkward page breaks in code examples is something that I only noticed after reading a different book, which is full of such awkward breaks, sigh.

Update 2: The book links are now affiliate links; I get a small amount from any purchase you make through them.

posted at 21:54 | tags: , | permanent link to this entry | 3 comments

Sat, 06 Mar 2010

You've got to love profiling

Yesterday I slashed 50% of run time from our applications functional test suite by modifying a single function. I had no idea that function was responsible for 50% of the run time until I started profiling.

Profiling a Python program is getting easier and easier:

$ python -m cProfile -o prof.data bin/test -f

runs our test runner (which is a Python script) under the profiler and stores the results in prof.data.

$ runsnake prof.data

launches the RunSnakeRun profile viewer, which displays the results visually:

RunSnakeRun square map display
The square map display of RunSnakeRun, with the 'render_restructured_text' function highlighted.

Who knew that ReStructuredText rendering could be such a time waster? A short caching decorator and the test suite is twice as fast. The whole exercise took me less than an hour. I should've done it sooner.

Other neat tools:

posted at 20:49 | tags: | permanent link to this entry | 6 comments

Fri, 05 Mar 2010

Bye, bye, free time!

Things I've taken up to do in the nearest future:

I really ought to read Getting Things Done. Reading it has been on my todo-list for years.

posted at 23:02 | tags: | permanent link to this entry | 4 comments

Wed, 03 Mar 2010

Weekly Zope developer IRC meetings

On Tuesday we started what will hopefully become a tradition: weekly IRC meetings for Zope developers. Topics covered include buildbot organization and maintenance, open issues with the ZTK development process, and the fate of Zope 3.5 (= BlueBream 1.0).

There are IRC logs of the meeting, and Christian Theune posted a summary to the mailing list.

My take on this can be summed up as: Zope ain't dead yet! The project has fragmented a bit (Zope 2, Zope Toolkit, Grok, BlueBream, Repoze), but we all share a set of core packages and we want to keep them healthy.

Next meeting is also happening on a Tuesday, at 15:00 UTC on #zope in FreeNode.

posted at 13:09 | tags: , | permanent link to this entry | 0 comments

Thu, 07 Jan 2010

Latin-1 or Windows-1252?

Michael Foord wrote about some Latin-1 control character fun in a blog that's hard to read (the RSS feed syndicated on Planet Python is truncated, grr!) and hard to reply (no comments on the blog! my Chromium's AdBlock+ hid the comment link so I couldn't find it), but never mind that.

Unfortunately the data from the customers included some \x85 characters, which were breaking the CSV parsing.

0x85 is a control character (NEXT LINE or NEL) in Latin-1, but it's a printable character (HORIZONTAL ELLIPSIS) in Microsoft's code page 1252, which is often mistaken for Latin-1. I would venture a suggestion that the encoding of the customer data was not latin-1 but rather cp1252.

>>> '\x85'.decode('cp1252')
u'\u2026'
posted at 23:29 | tags: | permanent link to this entry | 3 comments

Fri, 18 Dec 2009

GTimeLog: not dead yet!

Back in 2004 I wrote a small Gtk+ app to help me keep track of my time, and called it GTimeLog. I shared it with my coworkers, put it on the web (on the general "release early, release often" principles), and it got sort-of popular before I found the time to polish it into a state where I wouldn't be ashamed to show it to other people.

Fast-forward to 2008: there are actual users out there (much to my surprise), I still haven't added the originally-envisioned spit and polish, haven't done anything to foster a development community, am wracked by guilt of not doing my maintainerly duties properly, which leads to depression and burnout. So I do the only thing I can think of: run away from the project and basically ignore its existence for a year. Unreviewed patches accumulate in my inbox.

It seems that the sabbatical helped: yesterday, triggered by a new Debian bug report, I sat down, fixed the bug, implemented a feature, applied a couple of patches languishing in the bug tracker, and released version 0.3 (which was totally broken thanks to setuptools magic that suddenly stopped working; so released 0.3.1 just now). Then went through my old unread email, created bugs in Launchpad and sent replies to everyone. Except Pierre-Luc Beaudoin, since his @collabora.co.uk email address bounced. If anyone knows how to contact him, I'd appreciate a note.

version is now shown in the about dialog

There are also some older changes that I made before I emerged out of the funk and so hadn't widely announced:

posted at 01:22 | tags: , | permanent link to this entry | 5 comments

Wed, 09 Dec 2009

Unix is an IDE, or my Vim plugins

Unix is an IDE. I do my development (Python web apps mostly) with Vim with a bunch of custom plugins, shell (in GNOME Terminal: tabs rule!), GNU make, ctags, find + grep, svn/bzr/hg/git.

The current working directory is my project configuration/state. I run tests here (bin/test), I search for code here (vim -t TagName, find + grep), I run applications here (make run or bin/appname). I can multitask freely, for example, if I'm in the middle of typing an SVN commit message, I can hit Ctrl+Shift+T, get a new terminal tab in the same working directory, and look something up. No aliases/environment variables/symlinks/scripts making changes to config files. I can work on multiple projects at the same time. I can work remotely (over ssh).

Gary Bernhardt's screencasts on Vimeo show how productive you can get if you learn Vim and tailor it to your needs. I have Vim scripts that let me

Some of these come from www.vim.org, some I've written myself, some I've taken and modified a little bit to avoid an irritating quirk or add a missing feature. Some things I don't have (and envy Emacs or IDE users for having -- like an integrated debugger for Python apps, and, generally, integration with other tools, running in the background).

It's been my plan for a long time to polish my plugins, release them somewhere (github? bitbucket? launchpad?) and upload to vim.org, but as it doesn't seem to be happening, I thought I'd at least put an svn export of my ~/.vim on the web.

posted at 01:23 | tags: , , | permanent link to this entry | 5 comments

Tue, 01 Dec 2009

Displaying multiline text in Zope 3

zope.schema has Text and TextLine. The former is for multiline text, the latter is for a single line, as the name suggests. Zope 3 forms will use a text area for Text fields and an input box for TextLine fields. Display widgets, however, apply no special formatting (other than HTML-quoting of characters like <, > and &), and since newlines are treated the same way as spaces in HTML, your multiline text gets collapsed into a single paragraph.

Here's a pattern I've been using in Zope 3 to display multiline user-entered text as several paragraphs:

import cgi

from zope.component import adapts
from zope.publisher.browser import BrowserView
from zope.publisher.interfaces import IRequest


class SplitToParagraphsView(BrowserView):
    """Splits a string into paragraphs via newlines."""

    adapts(None, IRequest)

    def paragraphs(self):
        if self.context is None:
            return []
        return filter(None, [s.strip() for s in self.context.splitlines()])

    def __call__(self):
        return "".join('<p>%s</p>\n' % cgi.escape(p)
                        for p in self.paragraphs())

View registration

<configure
    xmlns="http://namespaces.zope.org/zope">

  <view
      for="*"
      name="paragraphs"
      type="zope.publisher.interfaces.browser.IBrowserRequest"
      factory=".views.SplitToParagraphsView"
      permission="zope.Public"
      />

</configure>

and usage

<p tal:replace="structure object/attribute/@@paragraphs" />

Update: The view really ought to be registered twice: once for basestring and once for NoneType. I was too lazy to figure out the dotted names for those (or check if zope.interface has external interface declarations for them), so I registered it for "*". You should know that this makes the view available for arbitrary objects (but won't work for most of them, since they don't have a splitlines method), and that it is, sadly, accessible to users who may try to hack your system by typing things like @@paragraphs in the browser's address bar. Ignas Mikalajūnas offers an alternative solution using TALES path adapters.

posted at 20:52 | tags: , | permanent link to this entry | 1 comments

Mon, 21 Sep 2009

Pylons and SQL schema migration

I'm at the point in my hobby project where I'd like to be able to change my models without losing all my test data. And I'm too lazy to do manual dumps and edit the SQL in place before reimporting it.

I want a system

I've been glancing at SQLAlchemy-Migrate, since I've been brought up to believe NIHing is Bad. But Migrate is scary. I have to admit that the longer I stare at its documentation, the less I can describe why I think so. All those shell commands—but there's an API for invoking them from Python, so maybe I can achieve my goals. I'll have to try and see.

posted at 19:44 | tags: , , | permanent link to this entry | 9 comments

Tue, 15 Sep 2009

Pylons with zc.buildout, continued

Last time I mentioned that running bin/buildout with the -N flag makes it run faster (since it skips looking for newer versions to upgrade). You can tell buildout to do this by default by putting 'newest = false' into the [buildout] section of buildout.cfg. We'll be running bin/buildout a lot now, since we'll be making changes to the project environment, so this will save wear and tear on the '-', 'N' and Shift keys. (And, by the way, I'm not trying to soak up Google juice by repeating the word 'buildout' a lot, honest!)

I will omit bzr commits from this narrative as it's getting long; you can assume that every self-contained change was committed separately.

tests

First, I want a bin/test script to run the test suite. Pylons uses nose, so we need to tell buildout to install the nosetests script (under a different name, since I'm used to typing bin/test no matter what test runner a project happens to use):

$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-15 19:49:11 +0000
+++ buildout.cfg	2009-09-15 19:49:18 +0000
@@ -8,5 +8,8 @@
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       nose
        asharing
 interpreter = python
+scripts = paster
+          nosetests=test

$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/paster'.
Generated script '/tmp/AlliterationSharing/bin/test'.
...
$ bin/test

----------------------------------------------------------------------
Ran 0 tests in 0.276s

OK

ctags

Documentation is good, but sometimes you want to look at the source code of the framework. There's a tool called ctags that builds a database of identifiers. The popular text editors Vim and Emacs can then use the tags database to jump to a definition of any name with a single keystroke (Ctrl-] in vim, M-. in emacs).

Building the tags database is complicated by each Python package being installed into a separate directory. There's a buildout recipe called z3c.recipe.tag that finds those directories and lets you build a unified tags file. We'll also ask buildout to make sure it unzips any packages distributed as .egg files, since ctags doesn't process those:

$ bzr diff
@@ -1,8 +1,9 @@
 [buildout]
 develop = .
-parts = pylons
+parts = pylons ctags
 
 newest = false
+unzip = true
 
 [pylons]
 recipe = zc.recipe.egg
@@ -13,3 +14,7 @@
 interpreter = python
 scripts = paster
           nosetests=test
+
+[ctags]
+recipe = z3c.recipe.tag:tags
+eggs = ${pylons:eggs}

$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/ctags'.
...
$ bin/ctags

omelette

ctags lets you find classes and functions by name; it doesn't let you find packages or modules. There's another recipe, collective.recipe.omelette that creates a tree of symlinks mirroring the Python package structure (here 'unzip = true' also comes in handy):

$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-15 20:04:42 +0000
+++ buildout.cfg	2009-09-15 20:05:30 +0000
@@ -1,6 +1,6 @@
 [buildout]
 develop = .
-parts = pylons ctags
+parts = pylons ctags omelette
 
 newest = false
 unzip = true
@@ -18,3 +18,7 @@
 [ctags]
 recipe = z3c.recipe.tag:tags
 eggs = ${pylons:eggs}
+
+[omelette]
+recipe = collective.recipe.omelette
+eggs = ${pylons:eggs}

$ bin/buildout 
...
$ ls -l parts/omelette
...

The symlink tree is created under parts/omelette/. For example, if you want to see what webhelper tags were available, you can open parts/omelette/webhelper/html/builder.py in your editor and see.

Makefile

This is getting long (and not everyone may be interested1), but one long post is easier to skip than five medium ones in a row, so I'll continue.

1 Sorry, Planet Maemo! There's an RSS feed of posts tagged 'maemo', if you can figure out the URL, which is very well hidden by PyBlosxom, *sigh*.

Wouldn't it be nice if new developers could check out your project and start it up with just a couple of commands? Make is a time-tested tool that works well for this:

$ cat Makefile
# Just remember that you need to use real tabs, not spaces, in a Makefile

PYTHON = python

.PHONY: all
all: bin/paster

.PHONY: run
run: bin/paster
        bin/paster serve development.ini --reload

.PHONY: test check
test check: bin/test
        bin/test

.PHONY: tags
tags: bin/ctags
        bin/ctags

bin/paster bin/test bin/python bin/ctags: bin/buildout
        bin/buildout

bin/buildout: bootstrap.py
        $(PYTHON) bootstrap.py

Now all you need to do after checking out is run 'make' to set up a working development environment. 'make run' or 'make test' will also do that, if necessary, so this one-liner is sufficient to get a working Hello World application on port 5000:

$ bzr branch lp:~mgedmin/+junk/AlliterationSharing && cd AlliterationSharing && make run

Try it! You'll get a Bazaar branch with all the history of this little blog project.

posted at 23:31 | tags: , , | permanent link to this entry | 7 comments

Sun, 13 Sep 2009

Starting a Pylons project with zc.buildout

For software development I prefer buildout to virtualenv. This is because buildout has a text file describing the state of your working environent, which can be versioned and used later to recreate it, as well as during development to modify the environment slightly.

To start a new Pylons project, first create an empty directory. Let's call our new project AlliterationSharing1, because everybody is sick of 'foo' and 'bar'.

1 Generated by randomly picking two words from /usr/share/dict/words, then chosen over among 120 other variants that weren't as good.

$ mkdir -p ~/src/AlliterationSharing
$ cd ~/src/AlliterationSharing

Now create a file called buildout.cfg with the following content:

$ cat buildout.cfg
[buildout]
parts = pylons

[pylons]
recipe = zc.recipe.egg
eggs = Pylons
       PasteScript
interpreter = python

Download bootstrap.py to it and run it to get bin/buildout. Note: you can chose which Python version you want to use by running bootstrap.py with it. All other scripts under bin/ will be generated by buildout and will use the same Python interpreter.

$ wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py
$ python bootstrap.py
Creating directory '.../AlliterationSharing/bin'.
Creating directory '.../AlliterationSharing/parts'.
Creating directory '.../AlliterationSharing/eggs'.
Creating directory '.../AlliterationSharing/develop-eggs'.
Generated script '.../AlliterationSharing/bin/buildout'.

Run bin/buildout to install Pylons into your sandbox.

$ bin/buildout
Installing pylons.
Generated script '.../AlliterationSharing/bin/paster'.
Generated interpreter '.../AlliterationSharing/bin/python'.

Aside: buildout has this very nice feature where it can share Python packages between projects. This will save you enormous amounts of time that would otherwise be spent downloading and unpacking eggs. To make use of this facility, create a file ~/.buildout/default.cfg with

$ cat ~/.buildout/default.cfg 
[buildout]
eggs-directory = /home/mg/tmp/buildout-eggs
# XXX replace /home/mg with the full path of *your* home directory
# it would be much nicer if buildout let me use ~ or $HOME
# see https://bugs.launchpad.net/zc.buildout/+bug/190260

Another useful trick is to pass the -N flag to bin/buildout, which will tell it not to bother looking for newer versions of packages on the Internet when there's already an existing version installed in your eggs directory.

Back to business: now you've got two new scripts: bin/python and bin/paster. You can use the first one to play with the interactive Python console where you can now import pylons and all the dependencies; it has no other value.

Now is a good point to add the files you've created into a version control system. I'll arbitrarily use Bazaar.

$ bzr init .
$ bzr add bootstrap.py buildout.cfg
$ bzr ignore bin parts eggs develop-eggs .installed.cfg
$ bzr commit -m "Create AlliterationSharing project"

Run bin/paster create -t pylons to create a skeleton project.

$ bin/paster create -t pylons asharing
$ bzr ignore *.egg-info
$ bzr add asharing
$ bzr commit -m "Generated project files with paster create"

Now paster creates a directory structure that I don't like:

AlliterationSharing/
  buildout.cfg
  bin/
  asharing/
    setup.py
    README.txt
    MANIFEST.in
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/

I'd like the README and setup.py to be in the top level, and I dislike repeating 'asharing' twice in directory names. I'll move some files around

$ cd asharing/
$ bzr mv development.ini docs MANIFEST.in README.txt setup.* test.ini ../
$ bzr rm ez_setup.*
$ cd ..
$ bzr mv asharing src
$ bzr ci -m "Moved some files around"

Now the tree looks like this:

AlliterationSharing/
  buildout.cfg
  setup.py
  README.txt
  MANIFEST.in
  bin/
  src/
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/

We have to tell setup.py where to find the source tree

$ bzr diff
=== modified file 'MANIFEST.in'
--- MANIFEST.in	2009-09-13 13:04:00 +0000
+++ MANIFEST.in	2009-09-13 13:05:59 +0000
@@ -1,3 +1,3 @@
-include asharing/config/deployment.ini_tmpl
-recursive-include asharing/public *
-recursive-include asharing/templates *
+include src/asharing/config/deployment.ini_tmpl
+recursive-include src/asharing/public *
+recursive-include src/asharing/templates *

=== modified file 'setup.py'
--- setup.py	2009-09-13 13:04:00 +0000
+++ setup.py	2009-09-13 13:04:40 +0000
@@ -17,7 +17,8 @@
         "SQLAlchemy>=0.5",
     ],
     setup_requires=["PasteScript>=1.6.3"],
-    packages=find_packages(exclude=['ez_setup']),
+    packages=find_packages('src', exclude=['ez_setup']),
+    package_dir={'': 'src'},
     include_package_data=True,
     test_suite='nose.collector',
     package_data={'asharing': ['i18n/*/LC_MESSAGES/*.mo']},

(I'm not sure if you also need to change package_data and/or setup.cfg; it's possible that I left i18n in a broken state. Can somebody comment on this?)

And we have to tell buildout that we've got a new Python package to enable in the project environment

$ bzr diff buildout.cfg 
=== modified file 'buildout.cfg'
--- buildout.cfg	2009-09-13 12:57:21 +0000
+++ buildout.cfg	2009-09-13 13:08:05 +0000
@@ -1,8 +1,10 @@
 [buildout]
+develop = .
 parts = pylons
 
 [pylons]
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       asharing
 interpreter = python

Now you can re-run bin/buildout and start your hello-world project

$ bzr commit -m "Include the new package in the build"
$ bin/buildout -N
$ bin/paster serve --reload development.ini

Happy hacking!

To be continued: telling buildbot to create bin/test; using ctags and omelette.

posted at 16:13 | tags: , , | permanent link to this entry | 17 comments

Mon, 03 Aug 2009

Local changes to buildout.cfg

Most of Python packages in the Zope world use Buildout:

svn co svn+ssh://svn.zope.org/repos/main/plone.z3cform/trunk plone.z3cform
cd plone.z3cform
python2.4 bootstrap.py
bin/buildout
bin/test -pvc

Now suppose you want to change the buildout environment somehow, e.g. use the current development version of zope.testing instead of whatever is specified in buildout.cfg. Don't edit the existing buildout.cfg (you might accidentally commit your local debug changes), instead create a new cfg file, e.g. test.cfg:

[buildout]
extends = buildout.cfg
develop += ../zope.testing

[versions]
# override any existing version pins
zope.testing =

Now re-run buildout

bin/buildout -c test.cfg
bin/test -pvc

And the tests should be run with the newest zope.testing.code.

Only this does not work with plone.z3cform, and I have no clue why. It generally works with other packages (at least those that use the zc.recipe.testrunner rather than collective.recipe.z2testrunner). Buildout is like that sometimes :(

posted at 20:26 | tags: | permanent link to this entry | 0 comments

Sat, 25 Jul 2009

Python-related updates for the last couple of months

Went to EuroPython, met new people, had a great time.

Updated gtkeggdeps, the interactive Python package dependency browser. Collaborated with Thomas Lotze, who maintains the engine (tl.eggdeps) that gtkeggdeps wraps, to resolve API mismatches. Moved the sources to launchpad.net, added a test suite, made it use zc.buildout for convenient development.

Moved the source repository of gtimelog, the simple desktop time tracker, to launchpad.net. Failed to do anything else with it. :-(

Tried to work on xdot, wrestled with git-svn merges, failed abysmally. Asked upstream to upload xdot to PyPI.

Released ZODB Browser, but this deserves a separate post.

Sent a bunch of pyflakes patches from my old branch upstream, created trac tickets for the rest. Wrestled with bzr-svn merges, failed abysmally.

posted at 02:14 | tags: | permanent link to this entry | 0 comments

Thu, 21 May 2009

Surprising old-style class behaviour

Some anonymous Planet Python poster (at least I couldn't find the author's name on the blog) Christian Wyglendowski asks about a surprising difference between old-style and new-style classes. Since the comments on their blog are closed (which you find out only after pressing Submit), I'll answer here.

The question, slightly paraphrased: given a class

class LameContainerOld:
    def __init__(self):
        self._items = {'bar':'test'}
 
    def __getitem__(self, name):
        return self._items[name]
 
    def __getattr__(self, attr):
        return getattr(self._items, attr)

why does the 'in' operator work

>>> container = LameContainerOld()
>>> 'foo' in container
False
>>> 'bar' in container
True

when the equivalent new-style class raises a KeyError: 0 exception? Also, why does __getattr__ appear to be called to get the bound __getitem__ method of the dict?

>>> container.__getitem__
<bound method LameContainerNew.__getitem__ of {'bar': 'test'}>

What actually happens here is that LameOldContainer.__getattr__ gets called for special methods such as __contains__ and __repr__. This is why (1) the 'in' check works, and (2) it appears, at first glance, that you get the wrong __getitem__ bound method. If you pay close attention to the output, you'll see that it's the __getitem__ of LameOldContainer; it's just that repr(LameOldContainer()) gets proxied through to the dict.__repr__ when you don't expect it:

>>> container
{'bar': 'test'}

Special methods never go through __getattr__ for new-style classes, therefore neither __contains__ nor __repr__ are proxied if you make the container inherit object. If there's no __contains__ method, Python falls back to the sequence protocol and starts calling __getitem__ for numbers 0 through infinity, or until it gets an IndexError exception.

posted at 22:12 | tags: | permanent link to this entry | 2 comments