a blog by Marius Gedminas

Accidentally Hugo

I’ve converted my blog to Hugo. I’d been meaning to switch from PyBlosxom to something else for a while now, due to the pain (e.g. today I noticed most of my blog pages failing with a 500 error for reasons, somewhere in the comments plugin, that I don’t even want to know).

I’m sure Hugo will cause me pain too (it already has), but at least it’ll be new, different pain.

Comments will live in Disqus. I’ve imported the old ones (again, with some pain). Apologies for the mangled names, which I’m sure aren’t my fault. Maybe I’ll fix them someday.

I’ve tried to keep old links working, but Hugo makes it kind of difficult. sigh

Computers are made of pain.

Rackspace OSS

I now have a Jenkins server running tests for most of my open-source projects: https://jenkins.gedmin.as/. It was not too difficult to set up:

  1. Send an email to Jesse Noller about that free Rackspace Cloud hosting for OSS projects, expecting to be rebuffed because all my projects are small and insignificant.
  2. Receive a positive response with the instructions. Yay!
  3. Sign up for an account, type in all my credit card details and personal phone numbers, get it verified by a phone call (I hate phone calls, but apparently it prevents fraud or something). Send the account name/ID back to Jesse.
  4. Log in to the MyCloud account control panel/dashboard/thingie, create a server, ask for Ubuntu 12.04 LTS.
  5. Wait a couple of minutes, ssh as root, do the usual setup (locale-gen, adduser, ~/.ssh/authorized_keys, dotfiles, etckeeper, postfix, my PPA for sysadmin automation stuff).
  6. There was an amusing interlude where I tried to set up the Rackspace Cloud Monitoring agent according to their instructions, failed, had to open a support ticket, give their techs a login with root, then wait less than 24 hours until it was resolved. I still don't know what went wrong, but it works now.
  7. Get and install a free SSL certificate from StartSSL.
  8. Install and configure Jenkins (latest upstream version, just in case).
  9. Point my DNS to the server's IP.

It's been running this way for a month now, with no problems. So, thanks, Rackspace!

Grand plans for the future: have Jenkins do daily/weekly mirrors of all my GitHub repos, look for commits made since the last release tag, filter out boring ones ("bump version number"), and send me reminders that "project X needs a new release: you've had new features sitting there unreleased for N days now". Or maybe just a web page with a table of "N commits since last release" entries linking to the GitHub history.
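
The "commits since the last release tag" part could look something like this (a rough sketch only; the function name, the path handling, and the "boring" filter are placeholders, not anything I've actually written):

import subprocess


def commits_since_last_release(repo_path):
    # most recent tag reachable from HEAD
    last_tag = subprocess.check_output(
        ['git', 'describe', '--tags', '--abbrev=0'], cwd=repo_path).strip()
    log = subprocess.check_output(
        ['git', 'log', '--oneline', '%s..HEAD' % last_tag], cwd=repo_path)
    # filter out the boring commits
    interesting = [line for line in log.splitlines()
                   if 'bump version' not in line.lower()]
    return last_tag, interesting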

P.S. I can't preview this since PyBlosxom fails to run on my laptop with a cryptic error. I'm debating whether to debug that or to migrate to a static blog compiler.

Adding "Edit on GitHub" links to Sphinx pages

Documentation pages on ReadTheDocs have a nice sidebar with extra "Show on GitHub" and "Edit on GitHub" links. Here's how you can have those for your own Sphinx documentation:

  1. Create _ext and _templates subdirectories.
  2. Create a file _ext/edit_on_github.py that hooks into the html-page-context event to add a couple of template variables (a sketch follows this list).
  3. Create a file _templates/sourcelink.html that uses those two variables to insert "Show on GitHub" and "Edit on GitHub" links in the sidebar.
  4. Edit conf.py and add os.path.abspath('_ext') to sys.path.
  5. Add edit_on_github to the list of extensions.
  6. Use edit_on_github_project to specify your GitHub username and repository (separated by a slash).
  7. Optionally use edit_on_github_branch to specify the desired branch (it defaults to 'master').
  8. Make sure _templates is in the templates path.
  9. make html and enjoy
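
For reference, here's roughly what the extension could look like. This is a minimal sketch, not the exact code I'm running: the config value names match the ones above, but the URL construction, the page.html check, and the assumption that sources are .rst files at the repository root are all illustrative.

# _ext/edit_on_github.py -- a minimal sketch of the idea
import os


def get_github_url(app, view, path):
    # view is 'blob' for "Show on GitHub" and 'edit' for "Edit on GitHub"
    return 'https://github.com/{project}/{view}/{branch}/{path}'.format(
        project=app.config.edit_on_github_project,
        view=view,
        branch=app.config.edit_on_github_branch,
        path=path)


def html_page_context(app, pagename, templatename, context, doctree):
    if templatename != 'page.html':
        return
    if not app.config.edit_on_github_project:
        return
    # assumption: sources live at the repository root and use the .rst suffix
    path = pagename + '.rst'
    # _templates/sourcelink.html reads these two context variables
    context['show_on_github_url'] = get_github_url(app, 'blob', path)
    context['edit_on_github_url'] = get_github_url(app, 'edit', path)


def setup(app):
    app.add_config_value('edit_on_github_project', '', True)
    app.add_config_value('edit_on_github_branch', 'master', True)
    app.connect('html-page-context', html_page_context)

And the matching conf.py additions (steps 4-7), with placeholder values:

# conf.py
import os
import sys
sys.path.insert(0, os.path.abspath('_ext'))

extensions = ['edit_on_github']                  # plus whatever you already have
edit_on_github_project = 'username/repository'   # your GitHub user/repo
edit_on_github_branch = 'master'
templates_path = ['_templates']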

(screenshot: the sidebar with the new "Show on GitHub" and "Edit on GitHub" links)

Logging levels and hierarchies

I remember when the logging package seemed big and complicated and forbidding. And then I remember when I finally "got" it, started using it, even liked it. And then I discovered that I didn't really understand the model after all.

Consider this: we have two loggers

root
 └── mylogger

configured as follows:

import logging
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.FileHandler("info.log"))
mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(logging.FileHandler("debug.log"))

What happens when I do mylogger.debug('Hi')?

Answer: the message appears in both debug.log and info.log.

That was surprising to me. I'd always thought that a logger's level was a gatekeeper to all of that logger's handlers. In other words, I always thought that when a message was propagating from a logger to its parent (here from mylogger to root), it was also being filtered against the parent's log level. That turns out not to be the case.

What actually happens is that a message is tested against the level of the logger where it was initially logged, and if it passes that check, it gets passed to all the handlers of that logger and of all its ancestors, with no further logger-level checks. Unless the propagation is stopped somewhere by one of the loggers having propagate set to False. And of course each handler has its own level filtering. (And I'm ignoring filters and the global level override here.)
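
If you want the filtering I originally expected, one option (a minimal sketch building on the example above, and only one of several ways to do it) is to set the level on the root logger's handler instead of relying on the root logger's own level, since handler levels are checked for every record that reaches them:

import logging

root = logging.getLogger()
root.setLevel(logging.INFO)
info_handler = logging.FileHandler("info.log")
info_handler.setLevel(logging.INFO)   # handler-level check applies to every record
root.addHandler(info_handler)

mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(logging.FileHandler("debug.log"))

mylogger.debug('Hi')  # ends up in debug.log only: the root handler rejects DEBUG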

Part of the confusion was caused by my misunderstanding of logging.NOTSET. I assumed, incorrectly, that it was just a regular logging level, one even less severe than DEBUG. So when I wrote code like this:

import logging
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.FileHandler("info.log"))
mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.NOTSET)  # <-- that's the default level, actually
mylogger.debug("debug message")

I saw the debug message being suppressed and assumed it was because of the root logger's level. Which is correct, in a way, just not the way I thought about it.

NOTSET does not mean "pass all messages through", it means "inherit the log level from the parent logger". The documentation actually describes this, although in a rather convoluted way. My own fault for misunderstanding it, I guess.
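
A quick way to see the inheritance in action (a tiny sketch, reusing the root and mylogger setup from above):

import logging

root = logging.getLogger()
root.setLevel(logging.INFO)
mylogger = logging.getLogger('mylogger')
mylogger.setLevel(logging.NOTSET)

print(mylogger.level)                # 0, i.e. NOTSET
print(mylogger.getEffectiveLevel())  # 20, i.e. INFO, inherited from root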

trace.py and os.fork()

Here's a simple Python script:

import subprocess
with open('hi.txt', 'w') as f:
    subprocess.Popen(['echo', 'Hello'], stdout=f).wait()

Here's what it does:

$ python2.7 hi.py
$ cat hi.txt
Hello

And here's what happens if you try to use the trace module to trace the execution:

$ python2.7 -m trace --trace hi.py
(lots of trace output omitted)

So far so good, but take a look at hi.txt:

$ cat hi.txt
subprocess.py(1170):                             _dup2(errwrite, 2)
 --- modulename: subprocess, funcname: _dup2
subprocess.py(1164):                                 if a == b:
subprocess.py(1166):                                 elif a is not None:
... 154 lines of unexpected trace output omitted ...
os.py(379):         try:
os.py(380):             func(fullname, *argrest)
Hello

The tracing hook installed by trace.py survives the os.fork() and interferes with subprocess output redirection. Now you know.
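
Here's a minimal sketch of the mechanism (my own reduction, not code from trace.py or subprocess.py): a trace function installed with sys.settrace() is inherited across os.fork(), so it keeps firing in the child process, which in the subprocess case happens after stdout has already been redirected to the file:

import os
import sys


def tracer(frame, event, arg):
    # writes to whatever sys.stdout is in the *current* process
    sys.stdout.write('trace: %s %s\n' % (event, frame.f_code.co_name))
    return tracer


def child_work():
    pass


sys.settrace(tracer)

pid = os.fork()
if pid == 0:
    # the child inherits the trace hook installed by the parent, so this
    # call is traced, just like the child-side code in subprocess.py
    child_work()
    os._exit(0)
os.waitpid(pid, 0)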

Imagine what this does to scripts that execute shell commands and try to parse their output.

Who wants to file a bug and/or provide a patch? I'm afraid I don't have enough round tuits...