Random notes from mg

a blog by Marius Gedminas

Marius is a Python hacker. He works for Programmers of Vilnius, a small Python/Zope 3 startup. He has a personal home page at http://gedmin.as. His email is marius@gedmin.as. He does not like spam, but is not afraid of it.

Sat, 07 Aug 2010

Profiling with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more. I've talked about log capturing, now it's time for

Profiling

This WSGI middleware profiles every request with the cProfile module. To see the profiles, visit a hidden URL /_profiler/showall:

List of profiles

What you see here is heavily tweaked in my fork branch of Dozer; upstream version had no Cost column and didn't vary the background of Time by age (that last bit helps me see clumps of requests).

Here's what an individual profile looks like:

One profile

The call tree nodes can be expanded and collapsed by clicking on the function name. There's a hardcoded limit of 20 nesting levels (upstream had a limit of 15), sadly that appears not to be enough for practical purposes, especially if you start profiling Zope 3 applications...

You can also take a look at the WSGI environment:

WSGI environment expanded

Sadly, nothing about the response is captured by Dozer. I'd've liked to show the Content-Type and perhaps Content-Length in the profile list.

The incantation in development.ini is

[filter-app:profile]
use = egg:Dozer#profile
profile_path = /tmp/profiles
next = main

Create an empty directory /tmp/profiles and make sure other users cannot write to it. Dozer stores captured profiles as Python pickles, which are insecure and allow arbitrary command execution.

To enable the profiler, run paster like this:

$ paster serve development.ini -n profile

Bonus feature: call graphs

Dozer also writes a call graph in Graphviz "dot" format in the profile directory. Here's the graph corresponding to the profile you saw earlier, as displayed by the excellent XDot:

Call graph

See the fork where the "hot" red path splits into two?

Call graph, zoomed in

On the left we have Routes deciding to spend 120 ms (70% total time) recompiling its route maps. On the right we have the actual request dispatch. The actual controller action is called a bit further down:

Call graph, zoomed in

Here it is, highlighted. 42 ms (24% total time), almost all of which is spent in SQLAlchemy, loading the model object (a 2515 byte image stored as a blob) from SQLite.

A mystery: pickle errors

When I first tried to play with the Dozer profiler, I was attacked by innumerable exceptions. Some of those were due to a lack of configuration (profile_path) or invalid configuration (directory not existing), or not knowing the right URL (going to /_profiler raised TypeError). I tried to make Dozer's profiler more forgiving or at least produce clearer error messages in my fork branch, e.g. going to /_profiler now displays the profile list.

However some errors were very mysterious: some pickles, written by Dozer itself, could not be unpickled. I added a try/except that put those at the end of the list, so you can see and delete them.

Pickle errors

Does anybody have any clues as to why profile.py might be writing out broken pickles?

Update: as Ben says in the comments, my changes have been accepted upstream. Yay!

posted at 06:06 | tags: | permanent link to this entry | 5 comments

Capturing logs with Dozer

Dozer is mostly known for its memory profiling capabilities, but the as-yet unreleased version has more:

Log capturing

This WSGI middleware intercepts logging calls for every request. Here we see a toy Pylons application I've been working on in my spare time. Dozer added an info bar at the top:

Dozer's infobar

When you click on it, you get to see all the log messages produced for this request. I've set SQLAlchemy's loglevel to INFO in my development.ini, which produces:

Dozer's log viewer

(Why on Earth does SQLAlchemy think I want to see the memory address of the Engine object in my log files, I don't know. The parentheses contain argument values for parametrized queries, of which there are none on this page.)

Upstream version displays absolute timestamps (of the YYYY-MM-DD HH:MM:SS.ssssss variety) in the first column; my fork shows deltas in milliseconds. The incantation in development.ini is

[filter-app:logview]
use = egg:Dozer#logview
next = main

which makes it disabled by default. To enable, you run paster like this:

$ paster serve development.ini -n logview

(Upstream version lacks the paste entry point for logview; it's in my fork, for which I submitted a pull request weeks ago like a good open-source citizen. Incidentally, patches for stuff I maintain have been known to languish for years in my inbox, so I'm not one to throw stones.)

Next: profiling with Dozer.

Update: Tom Longson blogged about this back in 2008! And his CSS is prettier.

posted at 05:27 | tags: | permanent link to this entry | 5 comments

Fri, 06 Aug 2010

irclog2html is now on PyPI

irclog2html, the IRC log to HTML converter, is now (finally!) available from the Python Package Index.

In other news, logs2html now copies irclog.css to the destination directory (if it doesn't exist there already). I've been noticing logs produced with irclog2html on random places, and sometimes they were unstyled; hopefully this will become rare now.

posted at 04:32 | tags: | permanent link to this entry | 0 comments