Boy, am I glad that I didn't have time to inject myself into this thread before. :-) I can see where you are coming from, Jeff, but I think Monty is dead on with his answer too. Fortunately, I've met Billy IRL and in IRC, so I have a clue whether he could follow a technical answer. I totally didn't get the idea of the 10-story drop test other than as humor, but others were trying to mollify the paranoia about data failure on your average hard drive and to explain a bit of what the manufacturers do to test their own products. After understanding that, you may not want to go to the trouble of testing on your own. This is why backups are so important: all hardware is failure prone, and so is software.
That said, there are some interesting utilities in the System Rescue CD distro. It is a Live CD based on Gentoo with a highly customized boot menu that lets you boot DBAN (Darik's Boot and Nuke) among other tiny images. The standard SystemRescueCd boot takes you to a basic Linux with many disk tools at your service. I've used the distro to rescue a drive partition table that got messed up by a failed Linux install. YMMV.
There is a decent wiki at www.sysresccd.org with documentation. http://www.sysresccd.org/System-tools
TestDisk looks to be one program you could use from SystemRescueCd. Other LiveCDs include this program along with other system recovery tools: Knoppix, GParted LiveCD, and even the Ultimate Boot CD. http://www.cgsecurity.org/wiki/TestDisk http://www.cgsecurity.org/wiki/SMART_Monitoring
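Since the SMART Monitoring page above is about the counters the drive keeps on itself, here is a rough Python sketch of pulling the remap-related numbers out of the attribute table that smartmontools' `smartctl -A` prints for ATA drives. The function name, the list of watched attributes, and the sample table are my own illustration, not output from anyone's real drive, and some drives report different attribute names.

```python
import re

# Attribute names as smartctl -A usually prints them for ATA drives.
# Which ones to watch is an assumption made for this sketch.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            # RAW_VALUE can carry trailing annotations; keep the leading integer.
            match = re.match(r"\d+", fields[9])
            if match:
                found[fields[1]] = int(match.group())
    return found

# Made-up sample in the smartctl -A table layout, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for name, raw in parse_smart_attributes(SAMPLE).items():
        print(name, raw)
```

On a real box you would feed it the text from running `smartctl -A /dev/sdd` as root instead of SAMPLE. A raw Reallocated_Sector_Ct that keeps climbing between runs is the usual sign that the drive is quietly swapping in spares.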
Also check out Aida, a powerful hardware enumeration/diagnostic/discovery tool, like Sandra.
On 7/2/07, Monty J. Harder <> wrote:
On 7/1/07, Jeffrey McCright <> wrote:
Ladies and Gentlemen, TAKE A HINT! This is why newbies don't look long to the KCLUG for support, and quite possibly one of the reasons why people don't give Linux a second look. I have several friends whom I have directed to the KCLUG and after posting one or two questions to the list, have given up on the list. I no longer point non-technical people to the list as they get frustrated and offended. There it is.
I don't think a person who can write this (emphasis mine) is a 'newbie'.
Can anyone recommend a program or method of stress testing hard drives? I check memory/cpu with memtest86, but I would like some way to stress test a hard drive. Currently I dd urandom over it for a few days or DBAN it, but I'm looking for something more thought out. Preferably, a program that can run on a live system so I'd just attach the drive to be tested, and point the test at /dev/sdd or whatever dev it was on. Preserving data on the drive is (obviously) not a concern.
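One step up from plain dd of urandom is to write a pattern you can regenerate and then read it back and compare, which is roughly what the destructive mode of `badblocks -w` from e2fsprogs does. Below is a minimal Python sketch of that idea, not a finished tool: the function names, the 1 MiB chunk size, and the SHA-256 counter scheme for producing the pattern are all assumptions made for illustration. It overwrites whatever path you point it at.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # bytes written and compared per step (arbitrary choice)

def pattern_chunks(seed, total_bytes):
    """Yield a reproducible pseudorandom byte stream of total_bytes in all."""
    counter = 0
    remaining = total_bytes
    while remaining > 0:
        # SHA-256 over seed + counter: deterministic, so the verify pass can
        # regenerate the same data instead of storing a copy of it.
        block = b"".join(
            hashlib.sha256(seed + (counter + i).to_bytes(8, "big")).digest()
            for i in range(CHUNK // 32)
        )
        counter += CHUNK // 32
        piece = block[:remaining]
        remaining -= len(piece)
        yield piece

def write_pass(path, seed, total_bytes):
    """DESTRUCTIVE: overwrite the first total_bytes of an existing path."""
    with open(path, "r+b") as f:
        for piece in pattern_chunks(seed, total_bytes):
            f.write(piece)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

def verify_pass(path, seed, total_bytes):
    """Return the byte offsets of chunks that differ from the pattern."""
    bad = []
    offset = 0
    with open(path, "rb") as f:
        for expected in pattern_chunks(seed, total_bytes):
            if f.read(len(expected)) != expected:
                bad.append(offset)
            offset += len(expected)
    return bad
```

Usage would be something like `write_pass("/dev/sdd", b"pass1", size)` followed by `verify_pass("/dev/sdd", b"pass1", size)`, as root, with size taken from the device. One caveat: right after a write, the read-back may be served from the kernel's page cache rather than the platters, so a real test would need to drop caches or bypass them between passes.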
Testing modern hard drives is complicated by the fact that the onboard controller manages defects internally. When the drive writes a sector of data, it reads it back to verify that it can be done correctly. If it can't, it locates a spare sector and tries to write there instead. Once it finds a good spare sector, it records in its internal data structures that it has remapped the sector to the good location.
The upshot of all of this is that you have not one clue that there is a bad sector, because the drive lies about it. Any utility to test a HD needs to know how to tell the drive to stop lying, which may vary between manufacturers or even models for all I know. That's operating at a VERY low level, underneath what the device driver would be doing, so it's going to require kernel-mode hackery and either a special kernel like memtest, or a special driver module that can be loaded to provide that level of access to the drive. If that's even possible.
I've heard good things about SpinRite, but it's far from free as in beer, and is written in DOS to be booted and run from a floppy or CD. Maybe someone will write a free utility that provides all that functionality, even after Linux has loaded drivers to talk to the lying hard drives. And ponies.