Mailman is a wonderful
mailing list manager, but when you have thousands of spam messages sitting in
the moderation queue, its web interface is not enough.
The messages live as Python pickles on the file system, in the Mailman data
directory. The file name pattern is heldmsg-listname-number.pck. Newer
versions of Mailman come with a script, discard, that takes a list of path
names on the command line and discards them all. In other words, to get rid
of all held messages, all you have to do is type
    /usr/lib/mailman/bin/discard /var/lib/mailman/data/heldmsg*
(you may have to change the directory names to suit your Mailman
installation).
However, I want to be really sure that the messages I'm discarding are spam.
The most straightforward way to do that is to extract the RFC 2822 messages
from Mailman's pickles and pipe them to spamassassin. I could not find
a script for message extraction included with Mailman, so I had to write my
own (mmextract.py):
    #!/usr/bin/env python
    """
    Extract an email message from a Mailman pickle.

    Usage: mmextract.py filename > outputfile
    """

    import sys
    import cPickle

    # Mailman's classes must be importable for the pickle to load.
    sys.path.insert(0, '/usr/lib/mailman')


    def main(argv=sys.argv):
        if len(argv) < 2:
            print __doc__
            return
        # Pickles are binary data, so open the file in binary mode.
        msg = cPickle.load(open(argv[1], 'rb'))
        print msg.as_string()


    if __name__ == '__main__':
        main()
The rest is a matter of simple shell scripting:
    for fn in /var/lib/mailman/data/heldmsg*; do
        ./mmextract.py "$fn" | spamassassin -L -e > /dev/null || echo "$fn"
    done | xargs /usr/lib/mailman/bin/discard
(untested, but it should work).
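The same filtering step can also be sketched in Python. This is only an illustration, assuming Python 3 and a spamassassin binary on the PATH; the function names is_spam and spam_paths are made up for this example, and the message bytes are expected to come from something like mmextract.py.

```python
import subprocess


def is_spam(message_bytes, command=("spamassassin", "-L", "-e")):
    """Return True if the checker exits with a non-zero status.

    With -e, spamassassin exits non-zero when it judges the message to be
    spam. Any other failure of the command is also treated as spam here,
    so check the command works before discarding anything.
    """
    result = subprocess.run(
        list(command),
        input=message_bytes,
        stdout=subprocess.DEVNULL,
    )
    return result.returncode != 0


def spam_paths(messages, command=("spamassassin", "-L", "-e")):
    """Given (path, message_bytes) pairs, return the paths judged spam."""
    return [path for path, data in messages if is_spam(data, command)]
```

The returned paths could then be handed to Mailman's discard script, just like the output of the shell loop.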
The most annoying Subversion error message:
    $ svn up
    ...
    svn: Won't delete locally modified directory 'foo/bar'
    svn: Left locally modified or unversioned files
This happens when a subdirectory of 'foo/bar' is removed from
the upstream repository, and Subversion tries and fails to remove it locally.
It fails because it finds some files that match svn:ignore (e.g.
editor backup files, compiled object files, compiled Python modules).
Now you have to figure out which subdirectory of 'foo/bar' Subversion wants
to remove. Then you have to manually remove junk files from it. Then you
have to repeatedly try various combinations of svn up and
svn cleanup until Subversion finally agrees to continue the
interrupted svn up operation.
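One way to hunt for the junk files is to run svn status --no-ignore, which also lists ignored items (flagged I) next to unversioned ones (flagged ?). Here is a rough Python sketch that filters such output; the helper name junk_paths is made up for this example, and it assumes the path follows the status columns on each line.

```python
def junk_paths(status_output, under):
    """Return ignored ('I') and unversioned ('?') paths below `under`."""
    prefix = under.rstrip("/") + "/"
    found = []
    for line in status_output.splitlines():
        # The first column holds the item status; skip everything else.
        if not line or line[0] not in "I?":
            continue
        # For 'I' and '?' lines the remaining status columns are blank.
        path = line[1:].strip()
        if path.startswith(prefix):
            found.append(path)
    return found
```

Feeding it the output of svn status --no-ignore together with 'foo/bar' should narrow down which leftovers block the update.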
Please, Subversion folks -- if you cannot delete a directory because it
contains just junk files (those ignored in the output of svn
status), just print a meaningful warning message (and name the correct
directory rather than its parent!) and continue with the update.
Debian bug 246131.
Update: It seems to be fixed in
Subversion 1.1. Unfortunately Subversion 1.1 is not in Debian.
If you want to set up a shared Subversion repository, accessible over
SSH, you need to make the following three directories group-writeable (and
setgid):
- /path/to/svn/repository/db
- /path/to/svn/repository/locks
- /path/to/svn/repository/dav (not sure about this, it's likely that it is
not necessary if you only want SSH access)
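As a sketch of what "group-writeable and setgid" means in terms of mode bits, here is a small Python helper. The names shared_mode and make_shared are made up for this example, the repository path is whatever yours is, and it only touches the three directories themselves, not their contents.

```python
import os
import stat

# Group read/write/execute plus the setgid bit.
SHARED_BITS = stat.S_IRWXG | stat.S_ISGID


def shared_mode(mode):
    """Return the permission bits of `mode` with group rwx and setgid added."""
    return stat.S_IMODE(mode) | SHARED_BITS


def make_shared(repository, subdirs=("db", "locks", "dav")):
    """chmod those repository subdirectories that actually exist."""
    for name in subdirs:
        path = os.path.join(repository, name)
        if os.path.isdir(path):
            os.chmod(path, shared_mode(os.stat(path).st_mode))
```

A directory that starts out as 0755 ends up as 2775, which is the same result as chmod g+ws.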
You also need to make sure that all user accounts that access the repository
have the correct umask (002 instead of the default 022). If you do not do
that, the repository will break when two different developers access it, and
you'll have to go fix the permissions and run svnadmin recover.
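To see why the umask matters, here is a small Python experiment (a sketch for a Unix-like system; the function name mode_of_new_file is made up for this example). It creates a file under a given umask and reports the permissions the file ends up with.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "probe")
            # open() requests mode 0o666; the umask bits are cleared from it.
            with open(path, "w"):
                pass
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        # Always restore the previous umask of this process.
        os.umask(old)
```

With umask 022 the new file is not group-writeable (0644), while with umask 002 it is (0664). Files created inside the repository by one developer behave the same way, which is why the next developer gets locked out.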
Setting the umask is tricky because there are a lot of places where you
think you could set it, but most of them do not work. Also, testing is
difficult because interactive SSH sessions act differently from
non-interactive ones. Here are some red herrings:
- Debian's /etc/.bash_profile claims that "the default umask is set in
/etc/login.defs", but SSH sessions apparently completely ignore
/etc/login.defs.
- /etc/profile is ignored in non-interactive SSH sessions.
- Creating a wrapper for svnserve that sets the umask and executes the
real svnserve does not solve the problem if you put the wrapper in
/usr/local/bin, because /usr/bin comes first in the default PATH setting.
Again, the PATH definition in /etc/login.defs is ignored for SSH
sessions, so you have to fiddle with bash startup files -- and if you do
that, you might as well simply set the umask.
The correct solution is to put umask 002 in
/etc/bash.bashrc, and make sure that users' .bashrc files do not override
it.
Sending a properly encoded email that contains non-ASCII characters is not
as trivial as it should be. Here's more or less what I want:
    sender = u'Sender \u263A <sender@example.com>'
    recipient = u'Recipient \u263B <recipient@example.com>'
    subject = u'Smile! \u263A'
    body = u'Smile!\n\u263B'
    send_email(sender, recipient, subject, body)
The hard part is getting all the Unicode strings to be properly encoded in
the email. Details like multiple recipients, additional headers, attachments,
SMTP configuration and error handling are ignored for the purposes of this
article.
Here's the solution:
    from smtplib import SMTP
    from email.MIMEText import MIMEText
    from email.Header import Header
    from email.Utils import parseaddr, formataddr


    def send_email(sender, recipient, subject, body):
        """Send an email.

        All arguments should be Unicode strings (plain ASCII works as well).

        Only the real name part of sender and recipient addresses may contain
        non-ASCII characters.

        The email will be properly MIME encoded and delivered through SMTP to
        localhost port 25. This is easy to change if you want something
        different.

        The charset of the email will be the first one out of US-ASCII,
        ISO-8859-1 and UTF-8 that can represent all the characters occurring
        in the email.
        """
        # Header tries US-ASCII first, then the charset given here, and
        # finally falls back to UTF-8.
        header_charset = 'ISO-8859-1'

        # The body charset has to be chosen by hand.
        for body_charset in 'US-ASCII', 'ISO-8859-1', 'UTF-8':
            try:
                body.encode(body_charset)
            except UnicodeError:
                pass
            else:
                break

        # Split the (optional) real name from the email address.
        sender_name, sender_addr = parseaddr(sender)
        recipient_name, recipient_addr = parseaddr(recipient)

        # Encode the real names for use in headers.
        sender_name = str(Header(unicode(sender_name), header_charset))
        recipient_name = str(Header(unicode(recipient_name), header_charset))

        # Email addresses themselves must be plain ASCII.
        sender_addr = sender_addr.encode('ascii')
        recipient_addr = recipient_addr.encode('ascii')

        # Create the message ('plain' stands for Content-Type: text/plain).
        msg = MIMEText(body.encode(body_charset), 'plain', body_charset)
        msg['From'] = formataddr((sender_name, sender_addr))
        msg['To'] = formataddr((recipient_name, recipient_addr))
        msg['Subject'] = Header(unicode(subject), header_charset)

        # Use the bare addresses, without real names, for the SMTP envelope.
        smtp = SMTP("localhost")
        smtp.sendmail(sender_addr, [recipient_addr], msg.as_string())
        smtp.quit()
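The charset fallback used for the body can be looked at in isolation. The sketch below assumes Python 3 (where all strings are Unicode and the email package uses lowercase module names), so it is not a drop-in piece of the function above; pick_charset and encoded_header are names made up for this example.

```python
from email.header import Header


def pick_charset(text, candidates=("US-ASCII", "ISO-8859-1", "UTF-8")):
    """Return the first candidate charset that can encode all of `text`."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeError:
            continue
        return charset
    raise ValueError("no candidate charset can encode the text")


def encoded_header(text):
    """Return `text` as a header value, RFC 2047 encoded only if needed."""
    return Header(text, pick_charset(text)).encode()
```

Plain ASCII text passes through unchanged, while text containing \u263A ends up as a UTF-8 encoded word.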
I wish I could write it like this:
    from smtplib import SMTP
    from email.MIMEText import MIMEText


    def send_email(sender, recipient, subject, body):
        """Science-fictional simple version of send_email."""
        msg = MIMEText(body)
        msg['From'] = sender
        msg['To'] = recipient
        msg['Subject'] = subject
        smtp = SMTP("localhost")
        smtp.sendmail(sender, recipient, msg.as_string())
        smtp.quit()
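For what it's worth, the email package in current Python 3 releases gets fairly close to this. The following is a sketch, not a replacement for the Python 2 code above: it assumes Python 3.6 or later, and build_message is a name made up for this example. Each address is passed as a (real name, user, domain) tuple to sidestep address parsing.

```python
from email.headerregistry import Address
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build an EmailMessage; sender and recipient are (name, user, domain)."""
    msg = EmailMessage()
    msg["From"] = Address(*sender)
    msg["To"] = Address(*recipient)
    msg["Subject"] = subject
    # set_content() picks a charset and transfer encoding for the body.
    msg.set_content(body)
    return msg
```

Non-ASCII real names and subjects are encoded when the message is serialized, and the result can be handed to the send_message() method of smtplib.SMTP instead of calling sendmail() with a string.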