It went like this:

1. The disk is mounted read-write.
2. We delete the /lost+found directory.
3. We remount the disk read-only.
4. We run fsck.
5. fsck recreates the /lost+found directory.
6. We remount read-write.
7. Disk corruption results.

The problem seems to be caused by the kernel not updating its directory cache for the root directory. When fsck recreates lost+found on disk, the on-disk root directory and the kernel's cached copy of it get out of sync. Then you remount read-write, the kernel modifies its stale cached copy, and then flushes it over the top of what fsck wrote. Boom.
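To make the failure mode concrete, here's a toy model of it in Python. It is only a sketch of what I think is going on: `disk`, `cache`, `kernel_read` and `kernel_flush` are made-up names for illustration, not real kernel structures, and a real filesystem is far more involved than a dict of sets.

```python
# Toy model only: every name here is hypothetical, not a kernel structure.
disk = {"/": {"music", "lost+found"}}  # on-disk root directory entries
cache = {}                             # the kernel's in-memory copy


def kernel_read(path):
    # Load from disk only on first access; afterwards trust the cache.
    if path not in cache:
        cache[path] = set(disk[path])
    return cache[path]


def kernel_flush(path):
    # Write the cached copy back, overwriting whatever is on disk.
    disk[path] = set(cache[path])


# Steps 1-2: mounted read-write, we delete lost+found through the kernel.
kernel_read("/").discard("lost+found")
kernel_flush("/")

# Steps 3-5: remounted read-only. fsck writes to the raw device,
# bypassing the cache, and recreates lost+found.
disk["/"].add("lost+found")

# Steps 6-7: remounted read-write. The kernel edits its stale copy
# and flushes it, clobbering the entry fsck just wrote.
kernel_read("/").add("newfile")
kernel_flush("/")

print(sorted(disk["/"]))
```

In the toy version the lost+found entry just silently disappears from the root directory. On a real disk, whatever fsck allocated for the new directory would presumably still be marked as in use with nothing pointing at it, which is the kind of inconsistency that turns into corruption.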

Now, this sequence of events is rare enough that it's unlikely to happen to anyone else, but I'd be aware of it anyway -- simply remounting the disks read-only doesn't cause the kernel to flush its changes out, or to discard its cached copy.

This particular bug resulted in Steve Sanders losing all of his music -- the first lot of disk corruption caused _everything_ to be moved into lost+found, which we then emptied. Whoops.
_________________________
-- roger