a blog by Marius Gedminas

Nokia N950

Last Thursday I received a package containing something called the Nokia N950 development kit. Sweet sweet hardware, shame it's not going to be sold to end users. The software is visibly an unfinished pre-release version, but shows great potential. There are almost no 3rd-party apps, which is why Nokia is loaning these N950s to random developers.

I intend to port GTimeLog to it. Although my more immediate need is to have FBReader, so that I can stop carrying both this one and my N900 with me everywhere. Also, vim would be nice.

I've already hacked up Lithuanian support to the virtual and hardware keyboards, thanks to the very nice design of Maliit. As a comparison, I've had my N900 for a year and a half, and I still can't type Lithuanian on it. XKB is not fun.

Stuff I've been doing recently

objgraph got

zodbbrowser got

  • support for all ZODB databases, not just those with a Zope 3/Bluebream-style root folder/local site.
  • the ability to cope better with broken objects (due to the way ZODB works, not having some of your modules on the Python pack can break unpickling; zodbbrowser now handles this kind of situation better).
  • assorted smaller improvements.
  • a slow but inevitable shift of focus from "use it as a plugin for your Zope 3 app" to "it's a standalone tool for inspecting ZODB contents". (Both use cases are still supported, and will be for the foreseeable future.)

imgdiff got

  • its first public release.
  • some experimental code to actually find and highlight the differing parts of the images:

    example of highlighted image differences

    This works better when both images are the same size, although there's experimental (and somewhat buggy) code to try all possible alignments. I could use some help here; image processing is not something I'm familiar with, and searching StackOverflow didn't help beyond reminding me of the existence of PIL's ImageChops.difference(), which is for same-sized images only. Many of the results there are about comparing photos, where things like lighting levels matter. What I need is a diff for computer-generated images, where some things may be shifted around a bit, by different amounts, but are essentially the same. Are there any two-dimensional sequence diff algorithms?

Profiling with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more. I've talked about log capturing, now it's time for

Profiling

This WSGI middleware profiles every request with the cProfile module. To see the profiles, visit a hidden URL /_profiler/showall:

List of profiles

What you see here is heavily tweaked in my fork branch of Dozer; upstream version had no Cost column and didn't vary the background of Time by age (that last bit helps me see clumps of requests).

Here's what an individual profile looks like:

One profile

The call tree nodes can be expanded and collapsed by clicking on the function name. There's a hardcoded limit of 20 nesting levels (upstream had a limit of 15), sadly that appears not to be enough for practical purposes, especially if you start profiling Zope 3 applications...

You can also take a look at the WSGI environment:

WSGI environment expanded

Sadly, nothing about the response is captured by Dozer. I'd've liked to show the Content-Type and perhaps Content-Length in the profile list.

The incantation in development.ini is

[filter-app:profile]
use = egg:Dozer#profile
profile_path = /tmp/profiles
next = main

Create an empty directory /tmp/profiles and make sure other users cannot write to it. Dozer stores captured profiles as Python pickles, which are insecure and allow arbitrary command execution.

To enable the profiler, run paster like this:

$ paster serve development.ini -n profile

Bonus feature: call graphs

Dozer also writes a call graph in Graphviz "dot" format in the profile directory. Here's the graph corresponding to the profile you saw earlier, as displayed by the excellent XDot:

Call graph

See the fork where the "hot" red path splits into two?

Call graph, zoomed in

On the left we have Routes deciding to spend 120 ms (70% total time) recompiling its route maps. On the right we have the actual request dispatch. The actual controller action is called a bit further down:

Call graph, zoomed in

Here it is, highlighted. 42 ms (24% total time), almost all of which is spent in SQLAlchemy, loading the model object (a 2515 byte image stored as a blob) from SQLite.

A mystery: pickle errors

When I first tried to play with the Dozer profiler, I was attacked by innumerable exceptions. Some of those were due to a lack of configuration (profile_path) or invalid configuration (directory not existing), or not knowing the right URL (going to /_profiler raised TypeError). I tried to make Dozer's profiler more forgiving or at least produce clearer error messages in my fork branch, e.g. going to /_profiler now displays the profile list.

However some errors were very mysterious: some pickles, written by Dozer itself, could not be unpickled. I added a try/except that put those at the end of the list, so you can see and delete them.

Pickle errors

Does anybody have any clues as to why profile.py might be writing out broken pickles?

Update: as Ben says in the comments, my changes have been accepted upstream. Yay!

Capturing logs with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more:

Log capturing

This WSGI middleware intercepts logging calls for every request. Here we see a toy Pylons application I've been working on in my spare time. Dozer added an info bar at the top:

Dozer's infobar

When you click on it, you get to see all the log messages produced for this request. I've set SQLAlchemy's loglevel to INFO in my development.ini, which produces:

Dozer's log viewer

(Why on Earth does SQLAlchemy think I want to see the memory address of the Engine object in my log files, I don't know. The parentheses contain argument values for parametrized queries, of which there are none on this page.)

Upstream version displays absolute timestamps (of the YYYY-MM-DD HH:MM:SS.ssssss variety) in the first column; my fork shows deltas in milliseconds. The incantation in development.ini is

[filter-app:logview]
use = egg:Dozer#logview
next = main

which makes it disabled by default. To enable, you run paster like this:

$ paster serve development.ini -n logview

(Upstream version lacks the paste entry point for logview; it's in my fork, for which I submitted a pull request weeks ago like a good open-source citizen. Incidentally, patches for stuff I maintain have been known to languish for years in my inbox, so I'm not one to throw stones.)

Next: profiling with Dozer.

Update: Tom Longson blogged about this back in 2008! And his CSS is prettier.