Update: The story continues, but a solution is not in sight yet.
I upgraded a buildbot slave to Ubuntu 8.04 (Hardy) recently and now I'm getting a strange intermittent failure: sometimes cp -r /local/dir /nfs/mounted/dir fails ("process killed by signal 1", i.e. SIGHUP).
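For reference, here is a minimal Python sketch of how a "killed by signal 1" report maps to SIGHUP. It is not buildbot's code: the helper name describe_exit and the self-signalling child process are made up for illustration, and it assumes a POSIX system where SIGHUP is signal number 1.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description.

    On POSIX, subprocess reports a child killed by signal N as -N.
    """
    if returncode < 0:
        sig = signal.Signals(-returncode)
        return f"process killed by signal {sig.value} ({sig.name})"
    return f"process exited with status {returncode}"


# Hypothetical stand-in for the failing build step: a child process that
# restores the default SIGHUP action and then sends SIGHUP to itself.
CHILD_CODE = (
    "import os, signal; "
    "signal.signal(signal.SIGHUP, signal.SIG_DFL); "
    "os.kill(os.getpid(), signal.SIGHUP)"
)

child = subprocess.run([sys.executable, "-c", CHILD_CODE])
print(describe_exit(child.returncode))
```

The point is only that the message describes how the process died, not who sent the signal or why.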
I wonder if NFS is relevant or incidental to the issue?
Google finds an old thread from 2005, with a workaround (usepty=False), but I'd like to understand the problem before applying random fixes.
So far three different build steps doing cp -r have failed over the course of 10 days. I've now changed them all to cp -rv, so that if one fails again I can at least see whether the failure happened in the middle of the copy or at the end.
Update: so far 4 build steps have failed on 6 separate occasions:
May 5 02:31: cp -r local-dir1 nfs-mounted-dir1
May 6 02:31: cp -r local-dir1 nfs-mounted-dir1
May 6 04:33: cp -r local-dir2 nfs-mounted-dir2
May 15 02:00: cp -r local-dir3 nfs-mounted-dir3
May 17 04:32: rm -rf nfs-mounted-dir4
May 20 04:31: rm -rf nfs-mounted-dir4
I see no particular correlation between step duration and failure. For example, the rm -rf step usually takes between 2.2 and 4.6 seconds, and the two SIGHUPs happened after 2.4 seconds.
None of the failing commands produce any output. After I added -v to the cp steps, they stopped failing, but that could be just a coincidence.
We're having an email conversation with Jean-Paul Calderone ("exarkun") about the possibility of this being PTY-related, with no clear resolution so far.
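To make the PTY angle concrete, here is a small Python sketch of one well-known way a process attached to a PTY can receive SIGHUP: the master side of the PTY is closed while the child is still running. This is not a diagnosis of the buildbot failures, and it is not how buildbot spawns commands. The function name signal_after_master_close is made up, and the behaviour shown assumes Linux-style terminal hangup semantics.

```python
import os
import pty
import signal
import time


def signal_after_master_close():
    """Fork a child on a fresh PTY, close the master, report what killed it.

    Returns the number of the signal that terminated the child, or None
    if the child was not killed by a signal.
    """
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: it is a session leader and the PTY slave is its
        # controlling terminal (stdin/stdout/stderr all point at it).
        try:
            # Make sure SIGHUP has its default (terminate) action even if
            # the parent was started with SIGHUP ignored, e.g. under nohup.
            signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.write(1, b"ready\n")  # tell the parent that setup is done
            time.sleep(30)           # pretend to be a long-running command
        finally:
            os._exit(0)

    # Parent: wait until the child is really attached to the PTY, then
    # close the master side, which hangs up the child's terminal.
    os.read(master_fd, 16)
    os.close(master_fd)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


print(signal_after_master_close())
```

Whether anything like this happens inside the buildslave is exactly the open question. The sketch only shows that a PTY closing early is enough to produce a SIGHUP without anybody calling kill.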
And, hey, now this blog supports comments ;)