Random notes from mg
Thu, 12 Apr 2012
I remember when the logging package seemed big and complicated and
forbidding. And then I remember when I finally "got" it, started using it,
even liked it. And then I discovered that I didn't really understand
the model after all.
Consider this: we have two loggers
root
\
-- mylogger
configured as follows:
import logging
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.FileHandler("info.log"))
mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(logging.FileHandler("debug.log"))
What happens when I do mylogger.debug('Hi')?
Answer: the message appears in both debug.log and info.log.
That was surprising to me. I'd always thought that a logger's level was
a gatekeeper to all of that logger's handlers. In other words, I always
thought that when a message was propagating from a logger to its parent (here
from mylogger to root), it was also being filtered against the parent's log
level. That turns out not to be the case.
What actually happens is that a message is tested against the level of
the logger where it was initially logged, and if it passes the check, it
gets passed to all the handlers of that logger and all its ancestors with no
further checks. Unless the propagation is stopped somewhere by one of the
loggers having propagate set to False. And of course each handler has its own
level filtering. And I'm ignoring filters
and the global level override.
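Here's a minimal sketch of that behaviour that avoids writing log files. The ListHandler class and the "demo_parent" logger names are made up for illustration; a parent/child pair stands in for root/mylogger so the real root logger is left alone.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: collects messages in a list instead of a file."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# The parent plays the role of the root logger in the example above.
parent = logging.getLogger("demo_parent")
parent.setLevel(logging.INFO)
parent.propagate = False  # keep the demo away from the real root logger
parent_handler = ListHandler()
parent.addHandler(parent_handler)

child = logging.getLogger("demo_parent.child")
child.setLevel(logging.DEBUG)
child_handler = ListHandler()
child.addHandler(child_handler)

# Checked against the child's level only, then handed to both handlers.
child.debug("Hi")
# Checked against the parent's level (INFO) and dropped.
parent.debug("dropped")
```

After running this, both handlers have seen "Hi", even though the parent's level is INFO.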
Part of the confusion was caused by my misunderstanding of
logging.NOTSET. I assumed, incorrectly, that it was just a
regular logging level, one even less severe than DEBUG. So when I
wrote code like this:
import logging
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.FileHandler("info.log"))
mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.NOTSET)
mylogger.debug("debug message")
I saw the debug message being suppressed and assumed it was because of the
root logger's level. Which is correct, in a way, just not the way I thought
about it.
NOTSET does not mean "pass all messages through"; it
means "inherit the log level from the parent logger". The documentation
actually describes
this, although in a rather convoluted way. My own fault for misunderstanding
it, I guess.
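One way to see the inheritance is to ask the logger for its effective level. This is a small sketch; the "notset_demo" logger names are invented for the example.

```python
import logging

parent = logging.getLogger("notset_demo")
parent.setLevel(logging.INFO)

child = logging.getLogger("notset_demo.child")
child.setLevel(logging.NOTSET)  # "ask my parent", not "let everything through"

# getEffectiveLevel() walks up the hierarchy until it finds a level that is set.
inherited_level = child.getEffectiveLevel()
debug_enabled = child.isEnabledFor(logging.DEBUG)
info_enabled = child.isEnabledFor(logging.INFO)
```

Here the child reports INFO as its effective level, so debug messages are rejected by the child itself before any propagation happens.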
Fri, 09 Mar 2012
If you use
logging.config.fileConfig
(e.g. because you use paster serve something.ini to deploy your WSGI
apps) you should know about this.
By default fileConfig disables all pre-existing loggers
if they (or their parent loggers) are not explicitly mentioned in your
.ini file.
This can result in unintuitive behaviour:
(if you don't see the embedded example, you can find it at https://gist.github.com/1642893).
If you have Python 2.6 or later (and you should), you can turn this off
by passing disable_existing_loggers=False to fileConfig().
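A rough sketch of the difference, assuming a minimal configuration that mentions only the root logger. The config text and the "early_demo" logger name are invented for this example, and the config is fed in from a string instead of a real .ini file. Note that it reconfigures the real root logger.

```python
import io
import logging
import logging.config

# Assumed minimal config: only the root logger is mentioned.
CONFIG = """\
[loggers]
keys = root

[handlers]
keys = console

[formatters]
keys = plain

[logger_root]
level = INFO
handlers = console

[handler_console]
class = StreamHandler
level = NOTSET
formatter = plain
args = (sys.stderr,)

[formatter_plain]
format = %(levelname)s %(name)s %(message)s
"""

# A logger that exists before fileConfig() is called, like one created by an
# early "import myapp".
early = logging.getLogger("early_demo")

# With the flag turned off, the pre-existing logger is left enabled.
logging.config.fileConfig(io.StringIO(CONFIG), disable_existing_loggers=False)
disabled_when_flag_off = bool(early.disabled)

# With the default, it gets disabled because the config doesn't mention it.
logging.config.fileConfig(io.StringIO(CONFIG))
disabled_by_default = bool(early.disabled)
```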
But what if it's not you calling fileConfig() but your framework (e.g.
the above-mentioned paster serve)?
Now usually paster serve tries to configure logging before
importing any of your application modules, so there should be no pre-existing
loggers to disable. Sometimes, however, this doesn't work for one reason or
another, and you end up with your production server suppressing warnings and
errors that should not be suppressed. I haven't actually figured out yet who's
responsible for those early imports in the application I'm working on (until
today I assumed, incorrectly, that paster imports the module containing your
WSGI app before it calls fileConfig).
If you're not sure if this bug can bite you or not, check that you don't
have any disabled loggers by doing something like
import logging
assert not any(getattr(logger, 'disabled', False)
               for logger in logging.getLogger().manager.loggerDict.values())
while your application is running.
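If the assertion fails you'll probably want to know which loggers are affected. A small variation on the same idea follows; the disabled_logger_names function and the "disabled_demo" logger are hypothetical names for this sketch. The getattr() is there because loggerDict can also contain placeholder objects that have no disabled attribute.

```python
import logging


def disabled_logger_names():
    """Return the sorted names of all loggers whose 'disabled' flag is set."""
    registry = logging.getLogger().manager.loggerDict
    return sorted(name for name, logger in registry.items()
                  if getattr(logger, 'disabled', False))


# Simulate what fileConfig() does to a pre-existing logger.
demo = logging.getLogger("disabled_demo")
demo.disabled = True
names_while_disabled = disabled_logger_names()

demo.disabled = False
names_after_reenabling = disabled_logger_names()
```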
Sun, 19 Dec 2010
objgraph got
zodbbrowser got
- support for all ZODB databases, not just those with a
Zope 3/Bluebream-style root folder/local site.
- the ability to cope better with broken objects (due to the way ZODB
works, not having some of your modules on the Python path can break
unpickling; zodbbrowser now handles this kind of situation better).
- assorted smaller improvements.
- a slow but inevitable shift of focus from "use it as a plugin for your
Zope 3 app" to "it's a standalone tool for inspecting ZODB contents".
(Both use cases are still supported, and will be for the foreseeable
future.)
imgdiff got
- its first public release.
- some experimental code to actually find and highlight the differing
parts of the images:
This works better when both images are the same size, although there's
experimental (and somewhat buggy) code to try all possible alignments.
I could use some help here; image processing is not something I'm
familiar with, and searching
StackOverflow didn't help beyond reminding me of the existence
of PIL's ImageChops.difference(), which is for same-sized images only.
Many of the results there are about comparing photos, where things like
lighting levels matter. What I need is a diff for computer-generated
images, where some things may be shifted around a bit, by different
amounts, but are essentially the same. Are there any two-dimensional
sequence diff algorithms?
Sat, 07 Aug 2010
Dozer is mostly known for
its memory profiling capabilities, but the as-yet unreleased version has
more. I've talked
about log capturing, now it's time for
Profiling
This WSGI middleware profiles every request with the cProfile module.
To see the profiles, visit a hidden URL /_profiler/showall:

What you see here is heavily tweaked in my fork branch
of Dozer; upstream version had no Cost column and
didn't vary the background of Time by age (that
last bit helps me see clumps of requests).
Here's what an individual profile looks like:

The call tree nodes can be expanded and collapsed by clicking on the
function name. There's a hardcoded limit of 20 nesting levels (upstream had a
limit of 15); sadly, that appears not to be enough for practical purposes,
especially if you start profiling Zope 3 applications...
You can also take a look at the WSGI environment:

Sadly, nothing about the response is captured by Dozer. I'd've liked to
show the Content-Type and perhaps Content-Length in the profile list.
The incantation in development.ini is
[filter-app:profile]
use = egg:Dozer#profile
profile_path = /tmp/profiles
next = main
Create an empty directory /tmp/profiles and make sure other users
cannot write to it. Dozer stores captured profiles as Python
pickles, which are insecure
and allow arbitrary
command execution.
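To illustrate why a writable pickle directory is dangerous, here's a harmless sketch (the Sneaky class is invented for the example). A pickle can name any importable callable and have it called at load time; this one picks the builtin sorted(), but an attacker would pick something nastier.

```python
import pickle


class Sneaky:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # (callable, args): pickle.loads() will call sorted([3, 1, 2]).
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())

# Loading doesn't give back a Sneaky instance; it gives back whatever the
# chosen callable returned.
result = pickle.loads(payload)
```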
To enable the profiler, run paster like this:
$ paster serve development.ini -n profile
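For a rough idea of what per-request profiling middleware involves, here's a minimal sketch. This is not Dozer's code: the ProfilingMiddleware class, its reports attribute and the toy hello_app are all assumptions made up for illustration, and the reports are kept in memory instead of being pickled to profile_path.

```python
import cProfile
import io
import pstats


class ProfilingMiddleware:
    """Hypothetical WSGI middleware: profile each request with cProfile."""

    def __init__(self, app):
        self.app = app
        self.reports = []  # (path, text report) pairs, newest last

    def __call__(self, environ, start_response):
        profiler = cProfile.Profile()
        # Consume the whole response inside the profiler so that lazily
        # generated bodies are measured too.
        body = profiler.runcall(
            lambda: list(self.app(environ, start_response)))
        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(10)
        self.reports.append((environ.get("PATH_INFO", ""), out.getvalue()))
        return body


def hello_app(environ, start_response):
    """Toy WSGI application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


wrapped = ProfilingMiddleware(hello_app)
response_body = wrapped({"PATH_INFO": "/demo"}, lambda status, headers: None)
```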
Bonus feature: call graphs
Dozer also writes a call graph in Graphviz "dot" format in the profile
directory. Here's the graph corresponding to the profile you saw earlier,
as displayed by the excellent XDot:

See the fork where the "hot" red path splits into two?

On the left we have Routes deciding to spend 120 ms (70% total time)
recompiling its route maps. On the right we have the actual request dispatch.
The actual controller action is called a bit further down:

Here it is, highlighted. 42 ms (24% total time), almost all of which is
spent in SQLAlchemy, loading the model object (a 2515 byte image stored as a
blob) from SQLite.
A mystery: pickle errors
When I first tried to play with the Dozer profiler, I was attacked by
innumerable exceptions. Some of those were due to a lack of configuration
(profile_path) or invalid configuration (directory not existing), or not
knowing the right URL (going to /_profiler raised TypeError). I
tried to make Dozer's profiler more forgiving or at least produce clearer
error messages in my fork branch,
e.g. going to /_profiler now displays the profile list.
However some errors were very mysterious: some pickles, written by Dozer
itself, could not be unpickled. I added a try/except that put those at the end
of the list, so you can see and delete them.

Does anybody have any clues as to why profile.py
might be writing out broken pickles?
Update: as Ben says in the comments, my changes have
been accepted upstream. Yay!
Dozer is mostly known for
its memory profiling capabilities, but the as-yet unreleased version has
more:
Log capturing
This WSGI middleware intercepts logging calls for every request. Here
we see a toy Pylons application I've
been working on in my spare time. Dozer added an info bar at the top:

When you click on it, you get to see all the log messages produced for this
request. I've set SQLAlchemy's loglevel to INFO in my
development.ini, which produces:

(Why on Earth does SQLAlchemy think I want to see the memory address of the
Engine object in my log files, I don't know. The parentheses contain
argument values for parametrized queries, of which there are none on this
page.)
Upstream version displays absolute timestamps (of the YYYY-MM-DD
HH:MM:SS.ssssss variety) in the first column; my fork shows deltas in
milliseconds. The incantation in development.ini is
[filter-app:logview]
use = egg:Dozer#logview
next = main
which makes it disabled by default. To enable, you run paster like this:
$ paster serve development.ini -n logview
(Upstream version lacks the paste entry point for logview; it's in my
fork, for which I submitted a pull request weeks ago like a good open-source
citizen. Incidentally, patches for stuff I maintain have been known to
languish for years in my inbox, so I'm not one to throw
stones.)
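The general idea of capturing log records for one request and showing millisecond deltas can be sketched like this. It is not Dozer's implementation; CapturingHandler, capture_logs and fake_request are invented names, and a plain function call stands in for a WSGI request.

```python
import logging


class CapturingHandler(logging.Handler):
    """Hypothetical handler that stores every record it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def capture_logs(func):
    """Call func() and return (result, rows).

    Each row is (milliseconds since the first record, level name, message).
    """
    handler = CapturingHandler()
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        result = func()
    finally:
        root.removeHandler(handler)
    if not handler.records:
        return result, []
    start = handler.records[0].created
    rows = [((r.created - start) * 1000.0, r.levelname, r.getMessage())
            for r in handler.records]
    return result, rows


def fake_request():
    """Stand-in for a request handler that logs a couple of messages."""
    log = logging.getLogger("capture_demo")
    log.setLevel(logging.INFO)
    log.info("SELECT 1")
    log.info("rendering %s", "index")
    return "200 OK"


result, rows = capture_logs(fake_request)
```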
Next: profiling with Dozer.
Update: Tom Longson blogged
about this back in 2008! And his CSS is prettier.
Fri, 06 Aug 2010
irclog2html, the IRC log to HTML
converter, is now (finally!) available from the Python Package Index.
In other news, logs2html now copies irclog.css to the
destination directory (if it doesn't exist there already). I've been noticing
logs produced with irclog2html on random places, and sometimes they were
unstyled; hopefully this will become rare now.
Wed, 07 Jul 2010
If you're using virtualenv, and after a system upgrade you get errors like
...
File "...", line ...
from hashlib import md5
File "/usr/lib/python2.6/hashlib.py", line 63, in __get_builtin_constructor
import _md5
ImportError: No module named _md5
this means that the copy of the python executable in your virtualenv/bin
directory is outdated and you should update it:
$ cp /usr/bin/python2.6 /path/to/venv/bin/python
or, better yet, recreate the virtualenv.
Sun, 18 Apr 2010
Martijn Faassen defends web
frameworks in a rather longish post (you can tell it's 5 AM in the
morning and I've nearly defeated the unread post queue in Google
Reader). I'd like to propose a condensed
version. Consider this slogan:
Simple things should be easy; complicated things should be possible.
Frameworks make simple things easy. Good frameworks
keep the complicated thing possible; poorly-designed frameworks make the
complicated thing more difficult than necessary; bad frameworks make even
simple things complicated.
Doing everything from scratch merely makes things possible, but rarely
easy.
Sun, 04 Apr 2010
Disclaimer: I received a free review copy of this book. The book links are
affiliate links; I get a small amount from any purchase you make through them.
Grok is a Python web framework, built on
top of the Zope Toolkit, which is the core of what used to be called Zope 3 and
is now rebranded as BlueBream. Confused yet? Get used to it: the small
pluggable components are the heart and soul of ZTK, and the source of its
flexibility. It's not surprising that people take the same approach on a larger
scale: take Zope 3 apart into smaller packages and reassemble them into
different frameworks such as Grok, BlueBream or repoze.bfg.
The Grok
book by Carlos de la Guardia introduces the framework by demonstrating
how to create a small but realistic To-do list manager. I like this technique,
and it works pretty well. The author covers many topics:
- creation of a new project
- simple views with Zope Page Templates
- automatic form generation from schemas (with tweaks)
- catalogs and indexes (my favourite chapter)
- security: users, roles, permissions; authentication and authorization
- extremely pluggable page layouts with viewlets and pagelets
- basic ZODB, blobs, ZEO, database packing, backups with repozo
- SQL databases, integration with SQLAlchemy (including a common
transactional model)
- component architecture: adapters and utilities
- Martian: extending Grok by defining custom component directives
- very short intro to testing (zope.testing, unit tests and doctests,
functional tests with zope.testbrowser) and debugging (pdb; AJAXy
debugger, which looks exactly like the Pylons one with an uglier skin)
- deployment (my second favourite chapter): paster, apache and mod_proxy,
mod_wsgi, pound, squid, varnish, scalable deployments.
Some important topics like internationalization, time zones, testing with
Selenium, and (especially) database migration (which is pretty specific for
ZODB) were not covered.
If you want to learn about Grok, this book will be useful,
but there's a caveat: there's the usual slew of typographical mistakes and
other errors I've come to expect from books published by Packt. It's their
third book I've seen; all three had surprisingly high numbers of errors. Some
had more, others had fewer. The Grok book was on the high side and the first
one where I was tempted to record a "WTFs per page" metric.
The mistakes are easy to notice and correct, so they didn't impede my
understanding of the book's content. Disclaimer: I've been working with
Zope 3 for the last six-or-so years, so I was pretty familiar with the
underlying technologies, just not the thin Grok convenience layer. If
minor errors annoy you, stay away. I haven't noticed any major
factual errors, although there were what I would consider some pretty important
omissions:
- ZODB is not as transparent as people tell you. There are many gotchas,
especially if you want to refactor your code without throwing away old
databases.
- bin/buildout is free to recursively remove anything under
parts. Keeping your database there is fine only if you don't mind
occasionally starting from scratch.
- repozo does not back up blobs.
- The ZODB transaction conflict resolution depends on being able to
repeat requests several times; this is important if your code has external
side effects (e.g. sends emails, creates files, pings 3rd party websites over
HTTP). Packages like megrok.rdb or zope.sendmail take care of this; it'd be
nice to be shown how to do that for your own code before you discover this
issue the hard way when your app starts charging people's credit cards three
times every now and then.
- You need to make sure you send out object events at appropriate times, or
your catalog indexes won't be updated.
- Permission and role grants are persistent: if you delete a user and then
create a new one with the same username, the new user will have all the roles
and permissions granted to the old one. If you implement user deletion, you
need to explicitly remove old grants.
- The Zope security model expects every object to have a valid
__parent__
attribute; permission/role grants will not work properly on objects without a
__parent__. Most of the time this is taken care of
automatically, but when it's not, you can get really confusing errors.
- applySkin should only be used for browser requests; blindly
calling it from a traversal event handler can break WebDAV/XML-RPC.
(Incidentally, I should file a bug about that; it should abort if you pass a
non-browser request instead of silently converting it into a browser
request.)
- Allowing end-users to specify
++skin++ in the URL can be a
security hole.
Overall, Grok is pretty nice, especially compared to vanilla Zope 3.
However, when compared to frameworks like Pylons or Django, Grok appears more
complex and seemingly requires you to do additional work for unclear gain. For
example, chapter 8 has you writing three components for every new form you add:
one for the form itself, one for a pagelet wrapping the form, and one for a
page containing the pagelet. Most of that code is very similar with only the
names being different. I'm sure there are situations where this kind of
extreme componentization pays off (e.g. it lets you override particular bits on
particular pages to satisfy a particular client's requests, without affecting
any other clients), but the book doesn't convincingly demonstrate those
advantages. Again, I may be biased here since I've been enjoying those
advantages for the past six years, without ever having felt the pain of doing
similar customizations with a less flexible framework. (It's a gap in my
professional experience that I'm itching to fill.)
Update: some other
reviews
on Planet Python.
Update 2: Another
review (well, part 1 of one, but I got tired waiting for part 2).
Sat, 13 Mar 2010
I've been testing (as well as writing) Python code for the last eight years,
so a book with the words Beginner's Guide prominently displayed on
the cover isn't something I'd've decided to buy for myself. Nevertheless
I jumped at the offer of receiving a free e-copy for reviewing it.
Short summary: it's a good book. I learned a thing or two
from it. I don't know how well it would work as an introductory text for
someone new to unit testing (or Python). Some of the bits seemed
overcomplicated and underexplained, parts of the example code/tests seemed to
contain design decisions received from mysterious sources.
Incidentally, Packt uses a simple yet
effective method for watermarking e-books: my name and street address are
displayed in the footer of every page. What's funny is that the two non-ASCII
characters in the street name are replaced with question marks. It's not a
data entry problem: the website that let me download those books shows my
address correctly, so it must be happening somewhere in the PDF production
process. I didn't expect this kind of Unicode bugginess from a publisher.
Then again there were occasional strange little typographical errors in the
text, like not leaving a space in front of an opening parenthesis in an English
sentence, or using a never-seen-before +q= operator in Python code. I
was also left wondering how the following sentence (page 225) could slip past
the editing process:
doctest ignores everything between the Traceback (most recent last call).
Thankfully those small mistakes did not detract from the overall message of
the book.
I liked the author's technique of showing subtly incorrect code, letting the
reader look at it and miss all the bugs, and then showing how unit or
integration tests catch the bugs the reader missed. I'm pretty sure there's at
least one remaining bug that the author missed in the example package (storing
a schedule doesn't erase old data), which could serve for a new chapter on
regression testing if there's a second edition.
Summary of topics covered:
- Terms: unit testing, integration testing, system testing.
- Basics of doctest and unittest, their strengths and weaknesses.
- Using mocks (with Mocker).
- Using Nose.
- Test-Driven Development with lots of example code.
- Using Twill.
- Integration testing with lots of example code.
- Using coverage
- Post-commit hooks to run tests with Bazaar, Mercurial, Git, Darcs,
Subversion.
- Continuous integration with Buildbot
I found the TDD cycle a bit larger than I generally like, but I believe it's
a matter of taste, and perhaps a shorter cycle wouldn't work as well in a
written medium.
I found it a bit jarring how the Twill chapter intrudes between the two
chapters showing unit testing and integration testing of the same sample
package. I think it would've been better to swap the order of chapters 8 and
9.
I liked the technique presented for picking subsets of the code for
integration tests, although I wonder how well it would work on a larger
project.
Topics not covered:
- Functional testing (which is very close but not exactly the same as
system testing).
- Regression testing (page 46 contains advice about this without mentioning
the term regression testing).
- Continuous integration with Hudson (simpler to set up than buildbot,
easily covers 80% of cases).
As you can see these holes are all rather small.
Probably the biggest weakness of the book is the complexity of some
things shown:
- writing mocks for pure unit tests
- mocking other instances of the same class under test
- even occasionally mocking self, which needs tricks like
calling a method's im_func directly
- mocking __reduce_ex__ so you can pickle mocks in an
integration test, instead of using real classes or simple
stubs.
- testing the same code multiple times: unit tests, several sets of
integration tests that test ever-increasing subsets of classes
- Buildbot instead of
Hudson
Seeing the repetitive and redundant mock code in the first few doctest
examples I started asking what's the point?, but the book failed to
provide a compelling answer (the answer provided—it's easier to locate
bugs—works just as well for integration tests that focus on individual
classes). And there are good answers for that question, like instant feedback
from your unit test suite. Are they worth the additional development effort?
Maybe that depends on the developer. I don't think they would help me, so I
tend to stick with low-level integration tests I call "unit tests" (as well as
system tests; it's always a mistake to keep all your tests in a single level).
I'm slightly worried that this book might give the wrong impression (testing is
hard) and turn away beginning Python programmers from writing tests
altogether.
Overall I do not feel that I have wasted my time reading Python
Testing. I look forward to reading the
other
reviews
that showed up on Planet Python. I gathered that not all reviewers were happy
with the book, but avoided reading their reviews in order not to influence my
own.
Update: I especially liked this
review by Brian Jones. The lack of awkward page breaks in code examples
is something that I only noticed after reading a different book, which is full
of such awkward breaks, sigh.
Update 2: The book links are now affiliate links; I get a small
amount from any purchase you make through them.
Sat, 06 Mar 2010
Yesterday I slashed 50% of run time from our application's functional test
suite by modifying a single function. I had no idea that function was
responsible for 50% of the run time until I started profiling.
Profiling a Python program is getting easier and easier:
$ python -m cProfile -o prof.data bin/test -f
runs our test runner (which is a Python script) under the profiler and stores
the results in prof.data.
$ runsnake prof.data
launches the RunSnakeRun
profile viewer, which displays the results visually:
The square map display of RunSnakeRun, with the 'render_restructured_text'
function highlighted.
Who knew that ReStructuredText rendering could be such a time waster? A
short caching decorator and the test suite is twice as fast. The whole
exercise took me less than an hour. I should've done it sooner.
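A short caching decorator of the kind mentioned above might look roughly like this. It's a sketch, not the code from the actual project: the cached decorator and the render function (a stand-in for the slow ReStructuredText renderer) are made up, and it assumes the function takes only hashable positional arguments. The standard library's functools.lru_cache does the same job with more features.

```python
import functools


def cached(func):
    """Remember results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            result = cache[args] = func(*args)
            return result

    return wrapper


calls = []  # records how many times the "slow" function really ran


@cached
def render(text):
    """Stand-in for an expensive rendering function."""
    calls.append(text)
    return "<p>%s</p>" % text


first = render("hello")
second = render("hello")  # served from the cache
third = render("world")
```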
Other neat tools:
- pstats
from the standard library lets you load and display profiler results from the
command line (try python -m pstats prof.data).
- pyprof2calltree
converts Python profiler data files to a format that the popular profiler
visualization tool kcachegrind
can understand. It's somewhat less useful now that RunSnakeRun exists.
- profilehooks by
yours truly has decorators for easily profiling individual functions instead
of entire scripts.
- keas.profile and repoze.profile hook
up the profiler as WSGI middleware for easy profiling of web apps.
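The cProfile and pstats modules can also be driven from Python code instead of the command line. Here's a minimal sketch; slow_sum is a made-up function to have something to profile, and the data file goes into a temporary directory.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_sum(n):
    """Something mildly expensive to profile."""
    return sum(i * i for i in range(n))


path = os.path.join(tempfile.mkdtemp(), "prof.data")

# Profile a single call and save the raw data, like "-o prof.data" does.
profiler = cProfile.Profile()
total = profiler.runcall(slow_sum, 10000)
profiler.dump_stats(path)

# Load the data file again and print the five most expensive entries,
# sorted by cumulative time, into a string.
out = io.StringIO()
stats = pstats.Stats(path, stream=out)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
report = out.getvalue()
```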
Fri, 05 Mar 2010
Things I've taken up to do in the near future:
- Read and review Python
Testing: Beginner's Guide and Grok
1.0 Web Development for Packt. (The links are trackable to my blog,
but I'm not getting anything out of it. Other than free copies of the
e-books, which I already received, in exchange for a promise to review them
on this blog.)
- Help Reportlab
folks set up continuous integration (most likely Hudson, since Buildbot, while powerful, has a steep
learning curve).
- Think about becoming the buildbotmaster for Zope. Originally I intended
to volunteer to set up a few buildbots for various Zopeish projects
(ZTK, BlueBream, Grok, Zope 2) since half of the existing
ones were down or broken. Then various people fixed some of the
broken ones and other people chimed in mentioning existing buildbots that
nobody else knew about. There is a need for somebody to coordinate all
this activity: make sure we have up-to-date test results for all kinds of
projects, aggregate them in one place, chase up build slaves for exotic
OSes (i.e. Windows)... I don't think I'm well suited for this kind of
organisational activity.
- Push along the various scratch-my-itch open source projects (GTimeLog, irclog2html,
zodbbrowser).
- No idea what, but I've been wanting to do something for Maemo. Something small, given the copious
amounts of free time I have.
- Then there's the paying work. On the plus side, there are opportunities
for fun there (today I slashed functional test run time by a half, by
adding a small caching decorator in front of a single function.
RunSnakeRun
and cProfile
rule!)
- You know what, scratch the Zope buildbotmaster idea. Maybe I can do
something technical there, e.g. a cron script to ping the various buildbots,
scrape HTML/parse emails and aggregate build results. Maybe.
- I hope I don't get burnout
again. Because that would suck. Again. Been there, done that, didn't
even get a T-shirt.
I really ought to read Getting Things Done. Reading it has been on my
todo-list for years.
Wed, 03 Mar 2010
On Tuesday we started what will hopefully become a tradition: weekly IRC
meetings for Zope developers. Topics covered include buildbot organization and
maintenance, open issues with the ZTK development process, and the fate of Zope
3.5 (= BlueBream 1.0).
There are IRC logs of the meeting, and Christian Theune posted a summary
to the mailing list.
My take on this can be summed up as: Zope ain't dead yet! The project has
fragmented a bit (Zope 2, Zope Toolkit, Grok, BlueBream, Repoze), but we all
share a set of core packages and we want to keep them healthy.
Next meeting is also happening on a Tuesday, at 15:00 UTC on #zope in
FreeNode.
Thu, 07 Jan 2010
Michael Foord wrote about some
Latin-1 control character fun in a blog that's hard to read (the RSS feed
syndicated on Planet Python is truncated, grr!) and hard to reply to (no comments
on the blog! my Chromium's AdBlock+ hid the comment link so I couldn't
find it), but never mind that.
Unfortunately the data from the customers included some \x85 characters,
which were breaking the CSV parsing.
0x85 is a control character (NEXT LINE or NEL) in Latin-1, but it's a
printable character (HORIZONTAL ELLIPSIS) in Microsoft's code page 1252, which
is often mistaken for Latin-1. I would venture a suggestion that the encoding
of the customer data was not latin-1 but rather cp1252.
>>> '\x85'.decode('cp1252')
u'\u2026'
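The same check in Python 3, where bytes and text are separate types, as a small sketch. It also shows one way a stray NEL can upset line-oriented parsing: str.splitlines() treats it as a line boundary. Whether that is what broke the CSV parsing in the original story is only a guess.

```python
import unicodedata

raw = b"\x85"

as_latin1 = raw.decode("latin-1")  # U+0085, the NEL control character
as_cp1252 = raw.decode("cp1252")   # U+2026, a printable character

latin1_category = unicodedata.category(as_latin1)
cp1252_name = unicodedata.name(as_cp1252)

# NEL counts as a line break for str.splitlines(); the ellipsis doesn't.
split_on_nel = "one\x85two".splitlines()
split_on_ellipsis = "one\u2026two".splitlines()
```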
Fri, 18 Dec 2009
Back in 2004 I wrote a small Gtk+ app to help me keep track of my time, and
called it GTimeLog. I shared it with
my coworkers, put it on the web (on the general "release early, release often"
principles), and it got sort-of popular before I found the time to polish it
into a state where I wouldn't be ashamed to show it to other people.
Fast-forward to 2008: there are actual users out there (much to my
surprise), I still haven't added the originally-envisioned spit and polish,
haven't done anything to foster a development community, am wracked by guilt of
not doing my maintainerly duties properly, which leads to depression and
burnout. So I do the only thing I can think of: run away from the project and
basically ignore its existence for a year. Unreviewed patches accumulate in my
inbox.
It seems that the sabbatical helped: yesterday, triggered by a new Debian bug report, I sat down,
fixed the bug, implemented a feature, applied a
couple of patches
languishing in the bug tracker, and released version 0.3 (which
was totally broken thanks to setuptools magic that suddenly stopped
working; so released 0.3.1 just now). Then went through my old unread email,
created bugs in Launchpad and sent
replies to everyone. Except Pierre-Luc
Beaudoin, since his @collabora.co.uk email address bounced. If anyone
knows how to contact him, I'd appreciate a note.

There are also some older changes that I made before I emerged out of the
funk and so hadn't widely announced:
- There's a mailing
list for user and developer discussions (if there still are any ;).
- GTimeLog's source code
now lives on Launchpad (actually, I mentioned this on my
blog once).
Wed, 09 Dec 2009
Unix is an IDE. I do my
development (Python web apps mostly) with Vim
with a bunch of custom plugins, shell
(in GNOME Terminal: tabs rule!), GNU make, ctags, find + grep,
svn/bzr/hg/git.
The current working directory is my project configuration/state. I run
tests here (bin/test), I search for code here (vim -t TagName, find + grep), I
run applications here (make run or bin/appname). I can multitask
freely, for example, if I'm in the middle of typing an SVN commit message, I
can hit Ctrl+Shift+T, get a new terminal tab in the same working directory, and
look something up. No aliases/environment variables/symlinks. I can work on multiple projects at the
same time. I can work remotely (over ssh).
Gary Bernhardt's screencasts on
Vimeo show how productive you can get if you learn Vim and tailor it
to your needs. I have Vim scripts that let me
- See the name of the class and function that I'm editing in the statusbar,
even if the class/function definition is offscreen:
pythonhelper.vim.
- See all pyflakes warnings and errors in a list as soon as I press F2 to
save the file: python_check_syntax.vim.
- Add a "from foo.bar import Something" line at the top of the file if I
press F5 when my cursor is on Something, looking up the package and module
from ctags: python-imports.vim.
- Switch between production code and unit tests with a single key if the
project uses one of several conventions for tests (e.g. ./foo.py
<-> ./tests/test_foo.py):
py-test-switcher.vim.
- Generate a command line for running one particular unit test (the one
my cursor is inside) and copy it into the system clipboard, so I can
run that test by Alt-Tabbing into my terminal window and pasting.
py-test-runner.vim.
- Open the right file and move the cursor to the right line if I
triple-click a line of traceback in a shell (or an email) then press F7 in
my gvim window:
py-test-locator.vim.
- Compare my version of the code with the pristine version in source control
in an interactive side-by-side diff that lets me revert bits I no longer
want:
vcscommand.vim.
- Highlight which lines of the source are covered by my tests, if I have
coverage information in trace.py format:
py-coverage-highlight.vim.
- Show the signature of a function/class's __init__ when I type the name
of that class/function and an open parenthesis (looked up from tags):
py-function-signature.vim.
- Fold code into an outline so I only see names of methods or classes
instead of their full bodies:
vimrc, function PythonFoldLevel.
- Fold diff files so I can see whole hunks/files and can delete those with
a single key (well, two keys -- dd). Useful for reviewing large
diffs (tens of thousands of lines):
vimrc, function DiffFoldLevel.
Some of these come from www.vim.org, some
I've written myself, some I've taken and modified a little bit to avoid an
irritating quirk or add a missing feature. Some things I don't have (and envy
Emacs or IDE users for having -- like an integrated debugger for Python apps,
and, generally, integration with other tools, running in the background).
It's been my plan for a long time to polish my plugins, release them
somewhere (github? bitbucket? launchpad?) and upload to vim.org, but as it
doesn't seem to be happening, I thought I'd at least put an svn
export of my ~/.vim on the web.
Tue, 01 Dec 2009
zope.schema has Text and TextLine. The former is for multiline text, the
latter is for a single line, as the name suggests. Zope 3 forms will use a
text area for Text fields and an input box for TextLine fields. Display
widgets, however, apply no special formatting (other than HTML-quoting of
characters like <, > and &), and since newlines are treated the same
way as spaces in HTML, your multiline text gets collapsed into a single
paragraph.
Here's a pattern I've been using in Zope 3 to display multiline user-entered
text as several paragraphs:
import cgi
from zope.component import adapts
from zope.publisher.browser import BrowserView
from zope.publisher.interfaces import IRequest

class SplitToParagraphsView(BrowserView):
    """Splits a string into paragraphs via newlines."""

    adapts(None, IRequest)

    def paragraphs(self):
        if self.context is None:
            return []
        return filter(None, [s.strip() for s in self.context.splitlines()])

    def __call__(self):
        return "".join('<p>%s</p>\n' % cgi.escape(p)
                       for p in self.paragraphs())
View registration
<configure
    xmlns="http://namespaces.zope.org/zope">
  <view
      for="*"
      name="paragraphs"
      type="zope.publisher.interfaces.browser.IBrowserRequest"
      factory=".views.SplitToParagraphsView"
      permission="zope.Public"
      />
</configure>
and usage
<p tal:replace="structure object/attribute/@@paragraphs" />
Update: The view really ought to be registered twice: once
for basestring and once for NoneType. I was too lazy to figure out the dotted
names for those (or check if zope.interface has external interface declarations
for them), so I registered it for "*". You should know that this makes the
view available for arbitrary objects (but won't work for most of them, since
they don't have a splitlines method), and that it is, sadly, accessible to
users who may try to hack your system by typing things like @@paragraphs in the
browser's address bar. Ignas Mikalajūnas offers an alternative
solution using TALES path adapters.
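The text-splitting part of the view can also be tried outside Zope. What follows is a minimal sketch rather than the view itself: the function names are made up for illustration, and it uses html.escape from the Python 3 standard library instead of cgi.escape (which was removed in Python 3.8).

```python
from html import escape


def split_to_paragraphs(text):
    """Return the non-empty, stripped lines of text (hypothetical helper)."""
    if text is None:
        return []
    return [line.strip() for line in text.splitlines() if line.strip()]


def render_paragraphs(text):
    """Wrap every non-empty line in <p>...</p>, HTML-quoting <, > and &."""
    # quote=False mirrors the default behaviour of the old cgi.escape
    return "".join("<p>%s</p>\n" % escape(p, quote=False)
                   for p in split_to_paragraphs(text))
```

Calling render_paragraphs on a two-line string with a blank line in between produces two <p> elements, and None produces an empty string.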
Mon, 21 Sep 2009
I'm at the point in my hobby project where I'd like to be able to change my models
without losing all my test data. And I'm too lazy to do manual dumps and edit
the SQL in place before reimporting it.
I want a system
- that is transparent to the user: if my database is at schema
version 1, and my code is at version 3, I want it to be automatically
upgraded to version 3 on server startup.
- that is not too hard on the programmer: dropping a numbered Python or SQL
script in a directory ought to be sufficient to define a transition from
schema version X to schema version X+1.
- that handles errors gracefully: makes a backup of the database
with the old schema version; runs my script in a transaction and aborts
that transaction if the conversion fails (while showing me enough
information to debug the problem).
- that allows prototyping without having to increment the schema number for every
little change I make to the models; I should be the one who decides that a new
schema is ready to go out to the world.
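To make the first three wishes concrete, here is a rough sketch of what such an upgrader might look like. Everything in it is an assumption made for illustration: it targets SQLite through the standard sqlite3 module, keeps the schema number in SQLite's PRAGMA user_version, and expects scripts named 1.sql, 2.sql, ... where N.sql upgrades the schema to version N. It handles SQL scripts only and says nothing about the prototyping wish.

```python
import shutil
import sqlite3
from pathlib import Path


def get_version(conn):
    """Read the schema number stored in the database file (0 for a new one)."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def upgrade(db_path, scripts_dir):
    """Apply every pending N.sql script in order; return the final version.

    Hypothetical convention: each statement in a script ends with ';'.
    """
    db_path = Path(db_path)
    scripts = sorted(
        (p for p in Path(scripts_dir).glob("*.sql") if p.stem.isdigit()),
        key=lambda p: int(p.stem))
    # isolation_level=None: we issue BEGIN/COMMIT ourselves inside the script
    conn = sqlite3.connect(str(db_path), isolation_level=None)
    try:
        for script in scripts:
            target = int(script.stem)
            current = get_version(conn)
            if target <= current:
                continue  # already applied
            if db_path.exists():
                # keep a copy of the database at the old schema version
                shutil.copy2(db_path, "%s.v%d.bak" % (db_path, current))
            sql = "BEGIN;\n%s\nPRAGMA user_version = %d;\nCOMMIT;" % (
                script.read_text(), target)
            try:
                conn.executescript(sql)
            except sqlite3.Error:
                # undo the half-applied script, then let the error surface
                if conn.in_transaction:
                    conn.execute("ROLLBACK")
                raise
        return get_version(conn)
    finally:
        conn.close()
```

Calling upgrade() once at server startup would cover the "transparent to the user" wish; a failing script leaves the database at the last good version and re-raises the original exception for debugging.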
I've been glancing at SQLAlchemy-Migrate, since I've
been brought up to believe NIHing is
Bad. But Migrate is scary. I have to admit that the longer I stare
at its documentation, the less I can describe why I think so. All
those shell commands—but there's an API for invoking them from Python, so maybe I can
achieve my goals. I'll have to try and see.
Tue, 15 Sep 2009
Last time I
mentioned that running bin/buildout with the -N flag makes it run faster
(since it skips looking for newer versions to upgrade). You can tell
buildout to do this by default by putting 'newest = false' into the [buildout]
section of buildout.cfg. We'll be running bin/buildout a lot now, since we'll
be making changes to the project environment, so this will save wear and tear
on the '-', 'N' and Shift keys. (And, by the way, I'm not trying to soak up
Google juice by repeating the word 'buildout' a lot, honest!)
I will omit bzr commits from this narrative as it's getting long; you can
assume that every self-contained change was committed separately.
tests
First, I want a bin/test script to run the test
suite. Pylons uses nose, so we need to tell buildout to install the nosetests
script (under a different name, since I'm used to typing bin/test no matter
what test runner a project happens to use):
$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-15 19:49:11 +0000
+++ buildout.cfg 2009-09-15 19:49:18 +0000
@@ -8,5 +8,8 @@
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       nose
        asharing
 interpreter = python
+scripts = paster
+          nosetests=test
$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/paster'.
Generated script '/tmp/AlliterationSharing/bin/test'.
...
$ bin/test
----------------------------------------------------------------------
Ran 0 tests in 0.276s
OK
ctags
Documentation is good, but sometimes you want to look at the source code of
the framework. There's a tool called ctags that builds a database of
identifiers. The popular text editors Vim
and Emacs can then use the
tags database to jump to a definition of any name with a single keystroke
(Ctrl-] in vim, M-. in emacs).
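If you have never looked inside a tags file, the idea is simply a sorted index from names to the places where they are defined. The toy below illustrates that idea using Python's own ast module; it is not how ctags works internally (ctags has its own parsers and file format), and the function name is invented for this example.

```python
import ast


def index_definitions(source, filename="<string>"):
    """Return sorted (name, filename, line) entries for classes and functions."""
    tree = ast.parse(source, filename)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef,
                             ast.AsyncFunctionDef)):
            entries.append((node.name, filename, node.lineno))
    # tags files are kept sorted by name so editors can look names up quickly
    return sorted(entries)
```

An editor with such an index only needs to look up the word under the cursor and jump to the recorded file and line.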
Building the tags database is complicated by each Python package being
installed into a separate directory. There's a buildout recipe called
z3c.recipe.tag that finds those directories and lets you build a unified tags
file. We'll also ask buildout to make sure it unzips any packages
distributed as .egg files, since ctags doesn't process those:
$ bzr diff
@@ -1,8 +1,9 @@
[buildout]
develop = .
-parts = pylons
+parts = pylons ctags
newest = false
+unzip = true
[pylons]
recipe = zc.recipe.egg
@@ -13,3 +14,7 @@
interpreter = python
scripts = paster
nosetests=test
+
+[ctags]
+recipe = z3c.recipe.tag:tags
+eggs = ${pylons:eggs}
$ bin/buildout
...
Generated script '/tmp/AlliterationSharing/bin/ctags'.
...
$ bin/ctags
omelette
ctags lets you find classes and functions by name; it doesn't let you find
packages or modules. There's another recipe, collective.recipe.omelette, that
creates a tree of symlinks mirroring the Python package structure (here
'unzip = true' also comes in handy):
$ bzr diff
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-15 20:04:42 +0000
+++ buildout.cfg 2009-09-15 20:05:30 +0000
@@ -1,6 +1,6 @@
[buildout]
develop = .
-parts = pylons ctags
+parts = pylons ctags omelette
newest = false
unzip = true
@@ -18,3 +18,7 @@
[ctags]
recipe = z3c.recipe.tag:tags
eggs = ${pylons:eggs}
+
+[omelette]
+recipe = collective.recipe.omelette
+eggs = ${pylons:eggs}
$ bin/buildout
...
$ ls -l parts/omelette
...
The symlink tree is created under parts/omelette/. For example, if you want
to see what webhelpers tags are available, you can open
parts/omelette/webhelpers/html/builder.py in your editor and see.
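The idea behind the symlink tree is simple enough to sketch in a few lines of Python. This is only an illustration of the concept, not the recipe's code: the function name is made up, and it ignores complications such as namespace packages that are spread across several eggs.

```python
import os


def build_symlink_tree(egg_dirs, target_dir):
    """Symlink top-level packages and modules from egg_dirs into target_dir.

    Returns the names that were linked, in the order they were found.
    """
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for egg_dir in egg_dirs:
        for name in sorted(os.listdir(egg_dir)):
            source = os.path.join(egg_dir, name)
            is_package = os.path.isfile(os.path.join(source, "__init__.py"))
            is_module = name.endswith(".py") and os.path.isfile(source)
            if not (is_package or is_module):
                continue  # skip EGG-INFO and other non-code entries
            link = os.path.join(target_dir, name)
            if not os.path.lexists(link):
                os.symlink(os.path.abspath(source), link)
                linked.append(name)
    return linked
```

This also shows why 'unzip = true' matters: a zipped .egg is a single file, so there is no directory to list or link to.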
Makefile
This is getting long (and not everyone may be interested), but one long post is easier to skip than five
medium ones in a row, so I'll continue.
Wouldn't it be nice if new developers could check out your project and start
it up with just a couple of commands? Make is a time-tested tool that works
well for this:
$ cat Makefile
# Just remember that you need to use real tabs, not spaces, in a Makefile
PYTHON = python

.PHONY: all
all: bin/paster

.PHONY: run
run: bin/paster
	bin/paster serve development.ini --reload

.PHONY: test check
test check: bin/test
	bin/test

.PHONY: tags
tags: bin/ctags
	bin/ctags

bin/paster bin/test bin/python bin/ctags: bin/buildout
	bin/buildout

bin/buildout: bootstrap.py
	$(PYTHON) bootstrap.py
Now all you need to do after checking out is run 'make' to set up a working
development environment. 'make run' or 'make test' will also do that, if
necessary, so this one-liner is sufficient to get a working Hello World
application on port 5000:
$ bzr branch lp:~mgedmin/+junk/AlliterationSharing && cd AlliterationSharing && make run
Try it! You'll get a Bazaar branch with all the history of this little
blog project.
Sun, 13 Sep 2009
For software development I prefer buildout to virtualenv. This is
because buildout has a text file describing the state of your working
environment, which can be versioned and used later to recreate it, as well
as during development to modify the environment slightly.
To start a new Pylons project, first create an empty directory. Let's
call our new project AlliterationSharing, because everybody is sick of 'foo'
and 'bar'.
$ mkdir -p ~/src/AlliterationSharing
$ cd ~/src/AlliterationSharing
Now create a file called buildout.cfg with the following content:
$ cat buildout.cfg
[buildout]
parts = pylons

[pylons]
recipe = zc.recipe.egg
eggs = Pylons
       PasteScript
interpreter = python
Download bootstrap.py into it and run it to get bin/buildout. Note: you can
choose which Python version you want to use by running bootstrap.py with it.
All other scripts under bin/ will be generated by buildout and will use the
same Python interpreter.
$ wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py
$ python bootstrap.py
Creating directory '.../AlliterationSharing/bin'.
Creating directory '.../AlliterationSharing/parts'.
Creating directory '.../AlliterationSharing/eggs'.
Creating directory '.../AlliterationSharing/develop-eggs'.
Generated script '.../AlliterationSharing/bin/buildout'.
Run bin/buildout to install Pylons into your sandbox.
$ bin/buildout
Installing pylons.
Generated script '.../AlliterationSharing/bin/paster'.
Generated interpreter '.../AlliterationSharing/bin/python'.
Aside: buildout has this very nice feature where it can share Python
packages between projects. This will save you enormous amounts of time that
would otherwise be spent downloading and unpacking eggs. To make use of this
facility, create a file ~/.buildout/default.cfg with
$ cat ~/.buildout/default.cfg
[buildout]
eggs-directory = /home/mg/tmp/buildout-eggs
# XXX replace /home/mg with the full path of *your* home directory
# it would be much nicer if buildout let me use ~ or $HOME
# see https://bugs.launchpad.net/zc.buildout/+bug/190260
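Since the path has to be spelled out in full, one way to avoid typing it by hand is to generate the file. The snippet below is a small sketch using only the standard library; the function name and the tmp/buildout-eggs default are illustrative assumptions, not anything buildout provides.

```python
import configparser
import io
import os


def render_default_cfg(home=None, subdir="tmp/buildout-eggs"):
    """Return the text of a default.cfg with an absolute eggs-directory."""
    if home is None:
        home = os.path.expanduser("~")  # expand ~ ourselves
    config = configparser.ConfigParser(interpolation=None)
    config["buildout"] = {"eggs-directory": os.path.join(home, subdir)}
    out = io.StringIO()
    config.write(out)
    return out.getvalue()
```

Writing the returned text to ~/.buildout/default.cfg (after creating the ~/.buildout directory) gives the same result as the hand-written file above, minus the comments.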
Another useful trick is to pass the -N flag to bin/buildout, which will tell
it not to bother looking for newer versions of packages on the Internet when
there's already an existing version installed in your eggs directory.
Back to business: now you've got two new scripts: bin/python and bin/paster.
You can use the first one to play with the interactive Python console where you
can now import pylons and all the dependencies; it has no other value.
Now is a good point to add the files you've created into a version control
system. I'll arbitrarily use Bazaar.
$ bzr init .
$ bzr add bootstrap.py buildout.cfg
$ bzr ignore bin parts eggs develop-eggs .installed.cfg
$ bzr commit -m "Create AlliterationSharing project"
Run bin/paster create -t pylons to create a skeleton project.
$ bin/paster create -t pylons asharing
$ bzr ignore '*.egg-info'
$ bzr add asharing
$ bzr commit -m "Generated project files with paster create"
Now paster creates a directory structure that I don't like:
AlliterationSharing/
  buildout.cfg
  bin/
  asharing/
    setup.py
    README.txt
    MANIFEST.in
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/
I'd like the README and setup.py to be in the top level, and I dislike
repeating 'asharing' twice in directory names. I'll move some files around
$ cd asharing/
$ bzr mv development.ini docs MANIFEST.in README.txt setup.* test.ini ../
$ bzr rm ez_setup.*
$ cd ..
$ bzr mv asharing src
$ bzr ci -m "Moved some files around"
Now the tree looks like this:
AlliterationSharing/
  buildout.cfg
  setup.py
  README.txt
  MANIFEST.in
  bin/
  src/
    asharing/
      __init__.py
      config/
      controllers/
      templates/
      public/
We have to tell setup.py where to find the source tree
$ bzr diff
=== modified file 'MANIFEST.in'
--- MANIFEST.in 2009-09-13 13:04:00 +0000
+++ MANIFEST.in 2009-09-13 13:05:59 +0000
@@ -1,3 +1,3 @@
-include asharing/config/deployment.ini_tmpl
-recursive-include asharing/public *
-recursive-include asharing/templates *
+include src/asharing/config/deployment.ini_tmpl
+recursive-include src/asharing/public *
+recursive-include src/asharing/templates *
=== modified file 'setup.py'
--- setup.py 2009-09-13 13:04:00 +0000
+++ setup.py 2009-09-13 13:04:40 +0000
@@ -17,7 +17,8 @@
"SQLAlchemy>=0.5",
],
setup_requires=["PasteScript>=1.6.3"],
- packages=find_packages(exclude=['ez_setup']),
+ packages=find_packages('src', exclude=['ez_setup']),
+ package_dir={'': 'src'},
include_package_data=True,
test_suite='nose.collector',
package_data={'asharing': ['i18n/*/LC_MESSAGES/*.mo']},
(I'm not sure if you also need to change package_data and/or setup.cfg; it's
possible that I left i18n in a broken state. Can somebody comment on
this?)
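Why both changes in setup.py? find_packages('src') tells setuptools where to search for packages, and package_dir={'': 'src'} tells it that top-level packages live under src/ rather than next to setup.py. The toy function below approximates the search half using only the standard library. It is a rough stand-in written for this post, not the setuptools implementation, and the function name is invented.

```python
import os


def find_packages_under(where):
    """List dotted names of packages (directories with __init__.py) in where."""
    packages = []
    for dirpath, dirnames, filenames in os.walk(where):
        dirnames.sort()
        if dirpath == where:
            continue  # the search root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # not a package, so don't look inside it
            continue
        relative = os.path.relpath(dirpath, where)
        packages.append(relative.replace(os.sep, "."))
    return packages
```

Pointed at the project root after the move, a search like this finds nothing, because src/ has no __init__.py; pointed at src/, it finds asharing and its subpackages.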
And we have to tell buildout that we've got a new Python package to enable
in the project environment
$ bzr diff buildout.cfg
=== modified file 'buildout.cfg'
--- buildout.cfg 2009-09-13 12:57:21 +0000
+++ buildout.cfg 2009-09-13 13:08:05 +0000
@@ -1,8 +1,10 @@
 [buildout]
+develop = .
 parts = pylons
 [pylons]
 recipe = zc.recipe.egg
 eggs = Pylons
        PasteScript
+       asharing
 interpreter = python
Now you can re-run bin/buildout and start your hello-world project
$ bzr commit -m "Include the new package in the build"
$ bin/buildout -N
$ bin/paster serve --reload development.ini
Happy hacking!
To be continued: telling buildout to create bin/test; using ctags and omelette.