Over the years, I have formed the habit of regularly keeping an up-to-date backup copy of all my critical data. But with some project deadlines looming, I gave myself a good excuse not to follow that routine recently. After all, I haven’t lost any data to a hard drive failure in more than ten years, and what were the chances that my trustworthy Western Digital WD2500JD hard drive would fail me? So I had not made any backups for the past two months.

And of course, it happened. Last Thursday evening, after installing some routine system updates, I rebooted my computer. But instead of the familiar log-in screen, I was greeted by a seemingly endless reboot loop. At first, I thought that the latest security updates might have messed something up, so I hit F8 and tried to boot into safe mode. But again, the computer kept rebooting halfway through the startup process. Since I had Windows XP installed on another hard drive, I booted into XP to try to figure out the problem. It seemed that the Windows 2003 Server drive had failed: the system seemed unable to identify the hard drive, and I could hear the dreaded clicking sound coming from the Windows 2003 drive.

I powered off my computer immediately. The first thought that crossed my mind was to create a disk image of the failed drive, so that I could always get back to the state it was in right after it failed.

To do this, I booted into SUSE Linux (yet another OS, on a third hard drive) and used dd_rescue to create a disk image on a 320GB external hard drive. Luckily, the drive image was created without any fuss. I then tried to mount the failed disk normally at /windows/D, but that was unsuccessful: after a bunch of error messages, the partition seemed to be mounted, but the system saw only 14GB of empty space. I then tried mounting the partition read-only. After some delay and more clicking, I was able to do an "ls" and see all the root-level contents.
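For readers curious about what a rescue-imaging tool does conceptually, here is a minimal Python sketch of the idea: read the source block by block, and when a block cannot be read, record its offset and write zeros so the image stays aligned. This is not how dd_rescue itself is implemented, and it is no substitute for it on a failing drive. The function name `image_with_skips` and the 64KB default block size are my own assumptions for illustration.

```python
import os


def image_with_skips(src_path, dst_path, block_size=64 * 1024):
    """Copy src_path to dst_path block by block, zero-filling unreadable blocks.

    Returns the list of byte offsets whose blocks could not be read.
    """
    bad_offsets = []
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file and,
        # on Linux, of a block device as well.
        total = os.lseek(src_fd, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < total:
                want = min(block_size, total - offset)
                try:
                    data = os.pread(src_fd, want, offset)
                except OSError:
                    # Unreadable region: note it and pad with zeros so that
                    # later data lands at the correct offset in the image.
                    bad_offsets.append(offset)
                    data = b"\x00" * want
                if not data:
                    break  # source ended earlier than expected
                dst.write(data)
                offset += len(data)
    finally:
        os.close(src_fd)
    return bad_offsets
```

A real tool does considerably more than this (retries, smaller block sizes near bad areas, logging so an interrupted run can resume), which is why I reached for an existing utility rather than writing my own.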

Encouraged, I decided to copy everything off the damaged partition to another hard drive, in order of importance. The copying turned out to be extremely slow (averaging around 1MB per minute), and every file seek seemed to be accompanied by multiple clicks. At that speed, it would have taken three months to copy everything off the drive! Luckily, there were only 4GB or so of new or modified files since my last backup two months earlier, and by this morning I had copied all of that data out.
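The selective copy described above can be sketched in Python as follows. This is only an illustration under my own assumptions, not the exact commands I ran: the function names `copy_recent_files` and `estimate_days` are hypothetical, and the directory list and cutoff timestamp are whatever fits your situation. The idea is to visit directories in priority order, copy only files modified since the last backup, and skip anything that fails to read instead of stalling on it.

```python
import os
import shutil


def copy_recent_files(src_roots, dst_root, cutoff_ts):
    """Copy files modified at or after cutoff_ts, visiting src_roots in order.

    src_roots should be listed most important first. Files that raise a
    read error are skipped so that one bad file does not stop the run.
    Returns the list of source paths that were copied.
    """
    copied = []
    for root in src_roots:
        base = os.path.basename(os.path.normpath(root))
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                src = os.path.join(dirpath, name)
                try:
                    if os.stat(src).st_mtime < cutoff_ts:
                        continue  # unchanged since the last backup
                    rel = os.path.relpath(src, root)
                    dst = os.path.join(dst_root, base, rel)
                    os.makedirs(os.path.dirname(dst), exist_ok=True)
                    shutil.copy2(src, dst)  # copy2 also preserves timestamps
                    copied.append(src)
                except OSError:
                    continue  # unreadable file: move on to the next one
    return copied


def estimate_days(size_mb, mb_per_minute):
    """Rough copy time in days at a constant transfer rate."""
    return size_mb / mb_per_minute / (60 * 24)


if __name__ == "__main__":
    # About 4GB at roughly 1MB per minute works out to a little under 3 days.
    print(round(estimate_days(4000, 1.0), 1))
```

Restricting the copy to recently modified files is what made the job feasible: at that transfer rate, the few gigabytes of changed data took days, not months.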

So, in the end, I did not lose any data. I was lucky this time: the physical damage seemed to have occurred on the outer tracks of the drive, where the operating system was. The inner tracks, where all my important data was located, were largely intact.

The lesson from my experience is simple: back up your systems regularly, and do not take chances. Modern disks have much higher areal density, and you could easily lose gigabytes of data should a drive failure occur. And do not panic if your hard drive fails. Shut down your system immediately to avoid further damage to the drive. At the early stage of a drive failure, your chances of getting everything back are pretty good. One of the biggest mistakes is to try the failed drive again and again; doing so may corrupt more data and make a successful recovery more difficult. And of course, do not try to repartition or reformat the drive, as doing so would almost certainly make data recovery much more difficult, if not totally impossible. Last but not least, if vital data is at stake, you might want to send the drive to a professional data recovery facility, as such places have more tools and experience in handling this kind of situation.
