How to spoil your day

Sunday evening, I was in the data center. Hagrid had been running with a failed sda in its RAID array since just after Christmas (when I was on vacation), and I was going to replace it. Thanks to RAID1, it had kept humming along. It's almost impossible to find 120GB disks any longer, so I thought it was time to upgrade to 2*500GB.

Monday, this shone on me (smartctl -d ata -a /dev/sda):
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.

Again, sda was at it. A brand new, honking, 500GB SATA2 disk, failing. Power supply? Fubared motherboard? Now my thoughts of buying a MacBook, or whatever new-fangled device Apple launches on January 15 at MacWorld, have clearly gone down the drain. I’m guessing a new server is in order. Well, at least it will be 64-bit, and every bit capable of running Xen.

In case anyone’s looking for a good reference to S.M.A.R.T. error messages, the Wikipedia entry on S.M.A.R.T. is pretty good.


2 Comments

  1. Don McArthur says:

    Bah. I’ve got an old IDE Fujitsu hard drive that SMART has been warning me about for almost 3 years, thru installations of CentOS, Fedora and two versions of Ubuntu. It's running on an Intel mobo with some goofy flash anti-virus, which I think is the cause. Ignore it, do backups, have faith, soldier on.

  2. byte says:

    Too late, Don. Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

    And dmesg has filled up with the relevant I/O errors, so the disk is useless.

    No soldiering on this time (being a server and all, with limited access). Sigh

