Posted by: julf
Any ext3 gurus out there? - 17/12/2004 16:56
Arrrgghh... The classic case - a lot of work setting up a system, twiddling everything into place, coaxing all the pieces of software to work together - and finally, after weeks of work, just before doing a major backup, the system crashes and does not boot. It seems to be a hard disk failure (getting "partial read") in the ext3 filesystem journal. Any hints? Good tools?
Posted by: Daria
Re: Any ext3 gurus out there? - 17/12/2004 17:45
I suppose force-mounting it as ext2 is right out?
Posted by: tman
Re: Any ext3 gurus out there? - 19/12/2004 00:14
How about getting an identical or bigger disk and just doing a sector-by-sector copy? At least that way the sectors with errors will just be blanked out on the copy, and fsck should be able to do a better job.
Posted by: mlord
Re: Any ext3 gurus out there? - 05/01/2005 13:49
Hi Julf,
The message means pretty much what it says: "UncorrectableError" is a hard sector failure on the media. This can often be corrected by overwriting the entire disk with, say, zeros (which of course wipes everything on it, so copy off whatever is still readable first): cat /dev/zero > /dev/hda
As part of the overwrite, the drive firmware will fix the errors and try to remap the bad sectors automatically.
But first, it would be good to have a look at the S.M.A.R.T. logs, and run a low-level drive test. The S.M.A.R.T. data may tell WHY the sectors went bad, but most likely it won't.
Under Linux, smartmontools are needed, and the "smartctl" command in particular.
The IBM Drive Fitness Test (self-booting diskette image) does the same stuff, and can also low-level reformat IBM drives.
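For the smartctl route, here is a rough sketch of the checks I would start with, wrapped in Python only so they can be scripted. The flags (-H, -A, -l error, -t long) are standard smartctl options; the function names and the /dev/hda default are just assumptions for illustration, and the commands normally need root.

```python
import shutil
import subprocess


def smartctl_commands(device):
    """Return the smartctl invocations for a basic health check of `device`."""
    return [
        ["smartctl", "-H", device],           # overall health verdict
        ["smartctl", "-A", device],           # attributes (temperature, if the drive reports it)
        ["smartctl", "-l", "error", device],  # the drive's own error log
        ["smartctl", "-t", "long", device],   # start an extended self-test
    ]


def run_checks(device="/dev/hda"):
    """Run each command and print its output; device name is an assumption."""
    if shutil.which("smartctl") is None:
        raise RuntimeError("smartctl not found; install smartmontools")
    for cmd in smartctl_commands(device):
        result = subprocess.run(cmd, capture_output=True, text=True)
        print("$", " ".join(cmd))
        print(result.stdout)
```

The extended self-test runs in the background on the drive; its result shows up later in the self-test log (smartctl -l selftest).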
Cheers
Posted by: mlord
Re: Any ext3 gurus out there? - 05/01/2005 15:21
Yes, the S.M.A.R.T. log info about drive temperature could be very interesting to look at!
The other stuff, "DriveReady SeekComplete Error" from my driver message, is just an English translation of the low-level status bits. In this case the only important one is "Error", which is then expanded upon on the following line: "UncorrectableError" (media error).
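To make the bit-to-name translation concrete, here is a small sketch of how such a message can be decoded. The bit positions follow the standard ATA status and error registers; the names quoted above are the driver's, while the remaining names and the function names are my approximations, and the example values (0x51, 0x40) are illustrative rather than taken from the log in this thread.

```python
# ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# ATA error register bits; only meaningful when the Error status bit is set.
ERROR_BITS = [
    (0x80, "BadSectorOrCRC"),
    (0x40, "UncorrectableError"),
    (0x20, "MediaChanged"),
    (0x10, "SectorIdNotFound"),
    (0x08, "MediaChangeRequested"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def format_ide_message(status, error):
    """Build driver-style text lines for a status/error register pair."""
    lines = ["status=0x%02x { %s }" % (status, " ".join(decode(status, STATUS_BITS)))]
    if status & 0x01:  # the error register is only worth printing if Error is set
        lines.append("error=0x%02x { %s }" % (error, " ".join(decode(error, ERROR_BITS))))
    return lines
```

Calling format_ide_message(0x51, 0x40) yields one line for the status bits and one for the error bits, which is the two-line shape described above.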
Cheers
Posted by: mlord
Re: Any ext3 gurus out there? - 06/01/2005 12:14
The kernel normally buffers pretty much everything. If one uses O_DIRECT when opening files, then the page cache will be bypassed, and I/O will happen directly on the userspace buffers. This is great for doing bulk copies and the like. But the kernel still batches sequential sectors together, as otherwise I/O throughput would be terrible (think, 1MB/sec rather than 80MB/sec from a modern drive).
For sector-by-sector copy/recovery of a partially bad disk, I prefer to use a very low-level driver API to force sector-at-a-time reads whenever an error is reported. I wrote a throwaway tool for this once, but have lost it. It used the IDE_TASKFILE interface (ioctls).
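The general shape of that approach can be sketched as below. This is not the lost tool and does not use IDE_TASKFILE: it is a minimal sketch using ordinary reads, so the kernel may still batch and retry underneath. It reads in large chunks, drops to one sector at a time when a chunk fails, and blanks out whatever still cannot be read. The 512-byte sector size, the chunk size, and all function names are assumptions.

```python
import os

SECTOR = 512          # assumed logical sector size
CHUNK = 128 * SECTOR  # assumed size of a fast, batched read


def rescue_copy(read_at, write_at, total_bytes, chunk=CHUNK, sector=SECTOR):
    """Copy total_bytes via read_at(offset, length) and write_at(offset, data).

    Returns the byte offsets of the sectors that could not be read.
    """
    bad = []
    offset = 0
    while offset < total_bytes:
        length = min(chunk, total_bytes - offset)
        try:
            data = read_at(offset, length)  # fast path: many sectors at once
        except OSError:
            data = None
        if data is not None:
            write_at(offset, data)
        else:
            # Slow path: retry the failed range one sector at a time.
            for pos in range(offset, offset + length, sector):
                n = min(sector, offset + length - pos)
                try:
                    piece = read_at(pos, n)
                except OSError:
                    piece = bytes(n)  # blank out the unreadable sector
                    bad.append(pos)
                write_at(pos, piece)
        offset += length
    return bad


def copy_device(src_path, dst_path, total_bytes):
    """Hypothetical wrapper that runs rescue_copy over two device paths."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY)
    try:
        return rescue_copy(
            lambda off, n: os.pread(src, n, off),
            lambda off, data: os.pwrite(dst, data, off),
            total_bytes,
        )
    finally:
        os.close(src)
        os.close(dst)
```

Passing the reader and writer in as callables keeps the fallback logic separate from the device access, so the same loop could sit on top of a lower-level read routine if one were available.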
Cheers