Random notes from mg

Sun, Apr 4, 2010

Review: Grok 1.0 Web Development

Disclaimer: I received a free review copy of this book. The book links are affiliate links; I get a small amount from any purchase you make through them.

Grok is a Python web framework, built on top of the Zope Toolkit, which is the core of what used to be called Zope 3 and is now rebranded as BlueBream. Confused yet? Get used to it: the small pluggable components are the heart and soul of ZTK, and the source of its flexibility. It's not surprising that people take the same approach on a larger scale: take Zope 3 apart into smaller packages and reassemble them into different frameworks such as Grok, BlueBream or repoze.bfg.

The Grok book by Carlos de la Guardia introduces the framework by demonstrating how to create a small but realistic To-do list manager. I like this technique, and it works pretty well. The author covers many topics:

creation of a new project
simple views with Zope Page Templates
automatic form generation from schemas (with tweaks)
catalogs and indexes (my favourite chapter)
security: users, roles, permissions; authentication and authorization
extremely pluggable page layouts with viewlets and pagelets
basic ZODB, blobs, ZEO, database packing, backups with repozo
SQL databases, integration with SQLAlchemy (including a common transactional model)
component architecture: adapters and utilities
Martian: extending Grok by defining custom component directives
very short intro to testing (zope.testing, unit tests and doctests, functional tests with zope.testbrowser) and debugging (pdb; AJAXy debugger, which looks exactly like the Pylons one with an uglier skin)
deployment (my second favourite chapter): paster, apache and mod_proxy, mod_wsgi, pound, squid, varnish, scalable deployments.

Some important topics like internationalization, time zones, testing with Selenium, and (especially) database migration (which is pretty specific to ZODB) were not covered.

If you want to learn about Grok, this book will be useful, but there's a caveat: there's the usual slew of typographical mistakes and other errors I've come to expect from books published by Packt. It's the third book of theirs I've seen; all three had surprisingly high numbers of errors. Some had more, others had fewer. The Grok book was on the high side and the first one where I was tempted to record a "WTFs per page" metric. The mistakes are easy to notice and correct, so they didn't impede my understanding of the book's content. Disclaimer: I've been working with Zope 3 for the last six-or-so years, so I was pretty familiar with the underlying technologies, just not the thin Grok convenience layer. If minor errors annoy you, stay away. I haven't noticed any major factual errors, although there were what I would consider some pretty important omissions:

ZODB is not as transparent as people tell you. There are many gotchas, especially if you want to refactor your code without throwing away old databases.
bin/buildout is free to recursively remove anything under parts. Keeping your database there is fine only if you don't mind occasionally starting from scratch.
repozo does not back up blobs.
The ZODB transaction conflict resolution depends on being able to repeat requests several times; this is important if your code has external side effects (e.g. sends emails, creates files, pings 3rd party websites over HTTP). Packages like megrok.rdb or zope.sendmail take care of this; it'd be nice to be shown how to do that for your own code before you discover this issue the hard way when your app starts charging people's credit cards three times every now and then.
You need to make sure you send out object events at appropriate times, or your catalog indexes won't be updated.
Permission and role grants are persistent: if you delete a user and then create a new one with the same username, the new user will have all the roles and permissions granted to the old one. If you implement user deletion, you need to explicitly remove old grants.
The Zope security model expects every object to have a valid __parent__ attribute; permission/role grants will not work properly on objects without a __parent__. Most of the time this is taken care of automatically, but when it's not, you can get really confusing errors.
applySkin should only be used for browser requests; blindly calling it from a traversal event handler can break WebDAV/XML-RPC. (Incidentally, I should file a bug about that; it should abort if you pass a non-browser request instead of silently converting it into a browser request.)
Allowing end-users to specify ++skin++ in the URL can be a security hole.
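The point about repeated requests is easiest to see in a toy model. The sketch below is not ZODB or Zope API: ConflictError, run_with_retries and the fake commit are hypothetical stand-ins written with the standard library only. It shows the general idea of queuing side effects and performing them only after a successful commit, so that a retried request does not repeat them.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for a write conflict detected at commit time."""


def run_with_retries(request_handler, commit, max_attempts=3):
    """Run request_handler until commit() succeeds, then perform side effects.

    request_handler receives a fresh, empty list on every attempt and appends
    zero-argument callables to it instead of performing side effects directly.
    """
    for attempt in range(1, max_attempts + 1):
        pending = []               # side effects queued by this attempt only
        request_handler(pending)
        try:
            commit()
        except ConflictError:
            continue               # drop the queue and repeat the whole request
        for effect in pending:     # commit succeeded: now it is safe to act
            effect()
        return attempt
    raise ConflictError("gave up after %d attempts" % max_attempts)


# Demo: the first two commits conflict, the third one succeeds.
commit_results = [ConflictError(), ConflictError(), None]

def fake_commit():
    result = commit_results.pop(0)
    if result is not None:
        raise result

sent_naive = []      # emails "sent" directly from the handler
sent_deferred = []   # emails "sent" only after a successful commit

def handler(pending):
    sent_naive.append("receipt")                             # repeated on every retry
    pending.append(lambda: sent_deferred.append("receipt"))  # runs once

attempts = run_with_retries(handler, fake_commit)
print(attempts, len(sent_naive), len(sent_deferred))
```

In this toy run the naive list ends up with one entry per attempt, while the deferred list gets exactly one. In a real application the queue would have to be tied to the actual transaction machinery rather than to a hand-rolled loop like this one.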

Overall, Grok is pretty nice, especially compared to vanilla Zope 3. However, when compared to frameworks like Pylons or Django, Grok appears more complex and seemingly requires you to do additional work for unclear gain. For example, chapter 8 has you writing three components for every new form you add: one for the form itself, one for a pagelet wrapping the form, and one for a page containing the pagelet. Most of that code is very similar with only the names being different. I'm sure there are situations where this kind of extreme componentization pays off (e.g. it lets you override particular bits on particular pages to satisfy a particular client's requests, without affecting any other clients), but the book doesn't convincingly demonstrate those advantages. Again, I may be biased here since I've been enjoying those advantages for the past six years, without ever having felt the pain of doing similar customizations with a less flexible framework. (It's a gap in my professional experience that I'm itching to fill.)

Update: some other reviews on Planet Python.

Update 2: Another review (well, part 1 of one, but I got tired waiting for part 2).


Sat, Mar 13, 2010

Review: Python Testing: Beginner's Guide

I've been testing (as well as writing) Python code for the last eight years, so a book with the words Beginner's Guide prominently displayed on the cover isn't something I'd've decided to buy for myself. Nevertheless I jumped at the offer of receiving a free e-copy for reviewing it.

Short summary: it's a good book. I learned a thing or two from it. I don't know how well it would work as an introductory text for someone new to unit testing (or Python). Some of the bits seemed overcomplicated and underexplained, and parts of the example code/tests seemed to contain design decisions received from mysterious sources.

Incidentally, Packt uses a simple yet effective method for watermarking e-books: my name and street address are displayed in the footer of every page. What's funny is that the two non-ASCII characters in the street name are replaced with question marks. It's not a data entry problem: the website that let me download those books shows my address correctly, so it must be happening somewhere in the PDF production process. I didn't expect this kind of Unicode bugginess from a publisher. Then again there were occasional strange little typographical errors in the text, like not leaving a space in front of an opening parenthesis in an English sentence, or using a never-seen-before +q= operator in Python code. I was also left wondering how the following sentence (page 225) could slip past the editing process:

doctest ignores everything between the Traceback (most recent last call).

Thankfully those small mistakes did not detract from the overall message of the book.
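For comparison, what doctest actually skips is the stack trace between the "Traceback (most recent call last):" header and the final exception line. Here is a minimal sketch using only the standard library; the parse_port function and its error message are made up for the illustration.

```python
import doctest

def parse_port(text):
    """Convert text to a port number (hypothetical example function)."""
    if not text.isdigit():
        raise ValueError("port must be a number")
    return int(text)

# The indented stack-trace lines after the header are ignored by doctest;
# only the header and the final "ValueError: ..." line have to match.
EXAMPLE = '''
>>> parse_port("80")
80
>>> parse_port("http")
Traceback (most recent call last):
  File "anything", line 1, in whatever
    this text is never compared
ValueError: port must be a number
'''

test = doctest.DocTestParser().get_doctest(
    EXAMPLE, {"parse_port": parse_port}, "traceback_demo", "<demo>", 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

Both examples pass even though the stack-trace lines are nonsense, which is why most people simply write "..." there.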

I liked the author's technique of showing subtly incorrect code, letting the reader look at it and miss all the bugs, and then showing how unit or integration tests catch the bugs the reader missed. I'm pretty sure there's at least one remaining bug that the author missed in the example package (storing a schedule doesn't erase old data), which could serve for a new chapter on regression testing if there's a second edition.

Summary of topics covered:

Terms: unit testing, integration testing, system testing.
Basics of doctest and unittest, their strengths and weaknesses.
Using mocks (with Mocker).
Using Nose.
Test-Driven Development with lots of example code.
Using Twill.
Integration testing with lots of example code.
Using coverage.
Post-commit hooks to run tests with Bazaar, Mercurial, Git, Darcs, Subversion.
Continuous integration with Buildbot.

I found the TDD cycle a bit larger than I generally like, but I believe it's a matter of taste, and perhaps a shorter cycle wouldn't work as well in a written medium.

I found it a bit jarring how the Twill chapter intrudes between the two chapters showing unit testing and integration testing of the same sample package. I think it would've been better to swap the order of chapters 8 and 9.

I liked the technique presented for picking subsets of the code for integration tests, although I wonder how well it would work on a larger project.

Topics not covered:

Functional testing (which is very close but not exactly the same as system testing).
Regression testing (page 46 contains advice about this without mentioning the term regression testing).
Continuous integration with Hudson (simpler to set up than buildbot, easily covers 80% of cases).

As you can see these holes are all rather small.

Probably the biggest weakness of the book is the complexity of some things shown:

writing mocks for pure unit tests
mocking other instances of the same class under test
even occasionally mocking self, which needs tricks like calling a method's im_func directly
mocking __reduce_ex__ so you can pickle mocks in an integration test, instead of using real classes or simple stubs.
testing the same code multiple times: unit tests, several sets of integration tests that test ever-increasing subsets of classes
Buildbot instead of Hudson

Seeing the repetitive and redundant mock code in the first few doctest examples I started asking what's the point?, but the book failed to provide a compelling answer (the answer provided—it's easier to locate bugs—works just as well for integration tests that focus on individual classes). And there are good answers for that question, like instant feedback from your unit test suite. Are they worth the additional development effort? Maybe that depends on the developer. I don't think they would help me, so I tend to stick with low-level integration tests I call "unit tests" (as well as system tests; it's always a mistake to keep all your tests in a single level). I'm slightly worried that this book might give the wrong impression (testing is hard) and turn away beginning Python programmers from writing tests altogether.

Overall I do not feel that I have wasted my time reading Python Testing. I look forward to reading the other reviews that showed up on Planet Python. I gathered that not all reviewers were happy with the book, but avoided reading their reviews in order not to influence my own.

Update: I especially liked this review by Brian Jones. The lack of awkward page breaks in code examples is something that I only noticed after reading a different book, which is full of such awkward breaks, sigh.

Update 2: The book links are now affiliate links; I get a small amount from any purchase you make through them.


Sat, Mar 6, 2010

You've got to love profiling

Yesterday I slashed 50% of run time from our application's functional test suite by modifying a single function. I had no idea that function was responsible for 50% of the run time until I started profiling.

Profiling a Python program is getting easier and easier:

$ python -m cProfile -o prof.data bin/test -f

runs our test runner (which is a Python script) under the profiler and stores the results in prof.data.

$ runsnake prof.data

launches the RunSnakeRun profile viewer, which displays the results visually:

The square map display of RunSnakeRun, with the 'render_restructured_text' function highlighted.

Who knew that reStructuredText rendering could be such a time waster? A short caching decorator and the test suite is twice as fast. The whole exercise took me less than an hour. I should've done it sooner.
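The actual decorator isn't shown in this post, so the following is only a sketch of what such a caching decorator might look like. The body of render_restructured_text below is a cheap placeholder standing in for the real rendering work, and the cached and calls names are made up for the example.

```python
import functools

def cached(func):
    """Remember results of func, keyed by its (hashable) positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed so tests can inspect or clear it
    return wrapper


calls = []  # records every time the expensive function really runs

@cached
def render_restructured_text(source):
    # Placeholder for the expensive rendering step (assumption, not real code).
    calls.append(source)
    return "<p>%s</p>" % source

first = render_restructured_text("Hello")
second = render_restructured_text("Hello")
print(first == second, len(calls))
```

A cache like this never expires, which is fine for a test run but would need a size limit or invalidation in a long-running process. On current Python versions functools.lru_cache from the standard library covers the same ground.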

Other neat tools:

pstats from the standard library lets you load and display profiler results from the command line (try python -m pstats prof.data).
pyprof2calltree converts Python profiler data files to a format that the popular profiler visualization tool kcachegrind can understand. It's somewhat less useful now that RunSnakeRun exists.
profilehooks by yours truly has decorators for easily profiling individual functions instead of entire scripts.
keas.profile and repoze.profile hook up the profiler as WSGI middleware for easy profiling of web apps.
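As a rough illustration of profiling a single function and reading the results with pstats, here is a standard-library-only sketch. The profiled decorator and the slow_sum function are hypothetical names; this is not how profilehooks is implemented, just the same idea in miniature.

```python
import cProfile
import functools
import io
import pstats

def profiled(func):
    """Run each call of func under cProfile and keep the latest report."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(func, *args, **kwargs)
        finally:
            out = io.StringIO()
            stats = pstats.Stats(profiler, stream=out)
            stats.sort_stats("cumulative").print_stats(5)  # top 5 entries
            wrapper.last_report = out.getvalue()
    wrapper.last_report = ""
    return wrapper

@profiled
def slow_sum(n):
    # Deliberately simple workload so there is something to measure.
    return sum(i * i for i in range(n))

total = slow_sum(10000)
print(slow_sum.last_report)
```

The report is the same kind of table that python -m pstats prof.data shows interactively, limited here to the five most expensive entries by cumulative time.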


Fri, Mar 5, 2010

Bye, bye, free time!

Things I've signed up to do in the near future:

Read and review Python Testing: Beginner's Guide and Grok 1.0 Web Development for Packt. (The links are trackable to my blog, but I'm not getting anything out of it. Other than free copies of the e-books, which I already received, in exchange for a promise to review them on this blog.)
Help Reportlab folks set up continuous integration (most likely Hudson, since Buildbot, while powerful, has a steep learning curve).
Think about becoming the buildbotmaster for Zope. Originally I intended to volunteer to set up a few buildbots for various Zopeish projects (ZTK, BlueBream, Grok, Zope 2) since half of the existing ones were down or broken. Then various people fixed some of the broken ones and other people chimed in mentioning existing buildbots that nobody else knew about. There is a need for somebody to coordinate all this activity: make sure we have up-to-date test results for all kinds of projects, aggregate them in one place, chase up build slaves for exotic OSes (i.e. Windows)... I don't think I'm well suited for this kind of organisational activity.
Push along the various scratch-my-itch open source projects (GTimeLog, irclog2html, zodbbrowser).
No idea what, but I've been wanting to do something for Maemo. Something small, given the copious amounts of free time I have.
Then there's the paying work. On the plus side, there are opportunities for fun there (today I cut functional test run time in half by adding a small caching decorator in front of a single function. RunSnakeRun and cProfile rule!).
You know what, scratch the Zope buildbotmaster idea. Maybe I can do something technical there, e.g. a cron script to ping the various buildbots, scrape HTML/parse emails and aggregate build results. Maybe.
I hope I don't get burnout again. Because that would suck. Again. Been there, done that, didn't even get a T-shirt.

I really ought to read Getting Things Done. Reading it has been on my todo-list for years.


Wed, Mar 3, 2010

Oopsie

Sorry for flooding Planet Maemo -- it was a side effect of changing this feed's URL to only include posts tagged "maemo". I'm not sure if the fault is PyBlosxom's or the aggregator's.

As a penance, here's a Terminal trick for you:

LABELS='[Tab,Esc,Enter,PgUp,PgDn,F2,VKB]'
KEYS='[Tab,Escape,KP_Enter,Page_Up,Page_Down,F2,Return]'
gconftool -s /apps/osso/xterm/key_labels --type list --list-type string "$LABELS"
gconftool -s /apps/osso/xterm/keys --type list --list-type string "$KEYS"

This changes the toolbar to have three extra keys (Enter, F2, and a key that acts like Enter when the hardware keyboard is open, and opens the virtual keyboard if the hardware keyboard is closed).

Update: added screenshot:

Nokia N900 Terminal app with new toolbar buttons
