Random notes from mg

a blog by Marius Gedminas

Marius is a Python hacker. He works for Programmers of Vilnius, a small Python/Zope 3 startup. He has a personal home page at http://gedmin.as. His email is marius@gedmin.as. He does not like spam, but is not afraid of it.

Thu, 30 Dec 2004

Profiling/tracing a single function

Sometimes you want to profile just a single function in your Python program. Here's a module that lets you do just that: profilehooks.py. Sample usage:

#!/usr/bin/python
from profilehooks import profile

class SampleClass:

    def silly_fibonacci_example(self, n):
        """Return the n-th Fibonacci number.

        This is a method rather than a function just to illustrate that
        you can use the 'profile' decorator on methods as well as global
        functions.

        Needless to say, this is a contrived example.
        """
        if n < 1:
            raise ValueError('n must be >= 1, got %s' % n)
        if n in (1, 2):
            return 1
        else:
            return (self.silly_fibonacci_example(n - 1) +
                    self.silly_fibonacci_example(n - 2))
    silly_fibonacci_example = profile(silly_fibonacci_example)


if __name__ == '__main__':
    fib = SampleClass().silly_fibonacci_example
    print fib(10)

(If you have Python 2.4, you can use @profile as a decorator just before the function definition instead of rebinding silly_fibonacci_example.)

Demonstration:

mg: ~$ python sample.py
55

*** PROFILER RESULTS ***
silly_fibonacci_example (sample.py:6)
function called 109 times

         325 function calls (5 primitive calls) in 0.004 CPU seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    108/2    0.001    0.000    0.004    0.002 profilehooks.py:79(<lambda>)
    108/2    0.001    0.000    0.004    0.002 profilehooks.py:131(__call__)
    109/1    0.001    0.000    0.004    0.004 sample.py:6(silly_fibonacci_example)
        0    0.000             0.000          profile:0(profiler)

This decorator is useful when you do not want the profiler output to include time spent waiting for user input in interactive programs, or time spent waiting for requests in a network server.
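The core idea can be sketched with the standard library alone. What follows is a minimal illustration written for Python 3, not the actual profilehooks implementation: profile_calls, report and call_count are hypothetical names, and the depth counter is just one assumed way to keep recursive calls from restarting a profiler that is already running.

```python
import cProfile
import functools
import io
import pstats


def profile_calls(fn):
    """Hypothetical decorator: profile only the time spent inside fn."""
    profiler = cProfile.Profile()
    state = {'depth': 0, 'calls': 0}

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        state['calls'] += 1
        if state['depth'] > 0:
            # Recursive call: the profiler is already collecting data.
            return fn(*args, **kwargs)
        profiler.enable()
        state['depth'] += 1
        try:
            return fn(*args, **kwargs)
        finally:
            state['depth'] -= 1
            profiler.disable()

    def report():
        """Return the collected statistics as a string."""
        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats('tottime', 'calls').print_stats()
        return out.getvalue()

    wrapper.report = report
    wrapper.call_count = lambda: state['calls']
    return wrapper


@profile_calls
def silly_fibonacci(n):
    """Return the n-th Fibonacci number (contrived on purpose)."""
    if n < 1:
        raise ValueError('n must be >= 1, got %s' % n)
    if n in (1, 2):
        return 1
    return silly_fibonacci(n - 1) + silly_fibonacci(n - 2)


if __name__ == '__main__':
    print(silly_fibonacci(10))
    print('function called %d times' % silly_fibonacci.call_count())
    print(silly_fibonacci.report())
```

Because the profiler is enabled only while the outermost call is running, anything the program does between calls (waiting for input, sleeping in an event loop) never reaches the statistics.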

In a similar vein, you can produce code coverage reports for a function:

#!/usr/bin/python
import doctest
from profilehooks import coverage

def silly_factorial_example(n):
    """Return the factorial of n."""
    if n < 1:
        raise ValueError('n must be >= 1, got %s' % n)
    if n == 1:
        return 1
    else:
        return silly_factorial_example(n - 1) * n
silly_factorial_example = coverage(silly_factorial_example)


if __name__ == '__main__':
    print silly_factorial_example(1)

Demonstration:

mg: ~$ python sample2.py
1

*** COVERAGE RESULTS ***
silly_factorial_example (sample2.py:5)
function called 1 times

       def silly_factorial_example(n):
           """Return the factorial of n."""
    1:     if n < 1:
>>>>>>         raise ValueError('n must be >= 1, got %s' % n)
    1:     if n == 1:
    1:         return 1
           else:
>>>>>>         return silly_factorial_example(n - 1) * n

2 lines were not executed.

I found this useful for checking whether a given function or method was adequately covered by unit tests.
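Per-function line coverage of this kind can be approximated with sys.settrace. The sketch below is written for Python 3 and is an assumption about how one might do it, not a description of how profilehooks works internally; record_lines and executed_offsets are hypothetical names. It records line offsets relative to the def line, so offset 2 is the second line after def.

```python
import functools
import sys


def record_lines(fn):
    """Hypothetical decorator: remember which lines of fn were executed."""
    code = fn.__code__
    executed = set()

    def local_tracer(frame, event, arg):
        # Called for every event inside a frame that runs fn's code.
        if event == 'line':
            executed.add(frame.f_lineno - code.co_firstlineno)
        return local_tracer

    def global_tracer(frame, event, arg):
        # Called when any new frame starts; only follow fn's own frames.
        if frame.f_code is code:
            return local_tracer
        return None

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Note: any tracer already installed (a debugger, say) is
        # suspended for the duration of the call and restored afterwards.
        previous = sys.gettrace()
        sys.settrace(global_tracer)
        try:
            return fn(*args, **kwargs)
        finally:
            sys.settrace(previous)

    wrapper.executed_offsets = lambda: sorted(executed)
    return wrapper


def silly_factorial(n):
    """Return the factorial of n."""
    if n < 1:
        raise ValueError('n must be >= 1, got %s' % n)
    if n == 1:
        return 1
    return silly_factorial(n - 1) * n
silly_factorial = record_lines(silly_factorial)


if __name__ == '__main__':
    print(silly_factorial(1))
    print('executed line offsets:', silly_factorial.executed_offsets())
```

Calling the function with 1 leaves the raise line and the recursive return line out of the recorded set, which is the same gap the report above marks with >>>>>>.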

Update: profilehooks is now a proper easy_install'able Python package.

posted at 00:16

Sat, 11 Dec 2004

Diffing dicts

Say you are comparing two large dicts in a unit test, for example:

    form = extract_form(rendered_html)
    self.assertEquals(form, {'field1': u'value1',
                             'field2': u'value2',
                             ...
                             'field42': u'value42'})

When this test fails, a useful trick is to ask the test runner to drop into Pdb inside assertEquals (SchoolTool and Zope 3 test runners have a command line option -d for this) and type the following:

(Pdb) from sets import Set
(Pdb) pp list(Set(first.items()) ^ Set(second.items()))

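You will get a list of the (key, value) pairs that appear in only one of the two dicts, so a key whose values differ shows up twice, once with each value: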

[('field.comp.c3.b.NEW', u''),
 ('field.comp.c1.b.NEW', u''),
 ('field.comp.c1.title', 'New stuff'),
 ('field.comp.c1.b.b2', u'A2'),
 ('field.comp.c3.b.b1', u'New behaviour'),
 ('field.comp.c3.title', u'New stuff'),
 ('SUBMIT', u'Save'),
 ('field.comp.c1.title', u'Comp 1'),
 ('field.comp.c2.b.b1', 'B1'),
 ('field.comp.c3.description', u'New description'),
 ('field.comp.c2.description', 'Comp two'),
 ('field.comp.c1.b.b1', u'A1'),
 ('field.comp.c2.title', 'Comp 2'),
 ('SUBMIT', 'Submit'),
 ('field.comp.c2.b.b2', 'B2'),
 ('field.comp.c1.description', 'New description'),
 ('field.comp.c1.description', u'Comp one'),
 ('field.comp.c1.b.b1', 'New behaviour')]

If there are many differences, sorting the list is a good idea:

(Pdb) sorted = lambda l: (l.sort(), l)[1]
(Pdb) pp sorted(list(Set(first.items()) ^ Set(second.items())))

Python 2.4 makes this simpler (builtin set, builtin sorted).
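With the builtins, the whole trick fits in one expression. A minimal sketch, written for Python 3; dict_diff and the sample dicts are made up for illustration, and it assumes that all values are hashable and can be compared with each other:

```python
from pprint import pprint


def dict_diff(first, second):
    """Return the sorted (key, value) pairs found in only one of the dicts."""
    return sorted(set(first.items()) ^ set(second.items()))


if __name__ == '__main__':
    # Hypothetical form data, loosely modelled on the output above.
    expected = {'SUBMIT': 'Submit', 'title': 'Comp 1', 'same': 'x'}
    actual = {'SUBMIT': 'Save', 'title': 'Comp 1', 'same': 'x', 'extra': ''}
    pprint(dict_diff(expected, actual))
```

Sorting puts the two entries for a changed key next to each other, which makes a long diff much easier to read.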

posted at 12:46