Is my harddisk (almost) dead?

I just started to play with HD Tune, just to get a tool to look at my S.M.A.R.T. data. Not that I trust SMART a lot, but I wanted to see what's there.

So that's how I discovered my main drive has a bunch of reallocated sectors. Ouch!

HDTune_Health_WDC_WD2500JD-00HBC0

What is a reallocated sector? Whenever a sector becomes "bad" for some reason, a modern harddisk "remaps" it to a spare location elsewhere on the disk. That's why you almost never see "bad clusters" on a modern harddisk. See for example the result of the CHKDSK command on the same drive - it says that everything is fine:

C:\Windows\system32>chkdsk c:
The type of the file system is NTFS.

WARNING!  F parameter not specified.
Running CHKDSK in read-only mode.

CHKDSK is verifying files (stage 1 of 3)...
  118464 file records processed.
File verification completed.
  129 large file records processed.
  0 bad file records processed.
  2 EA records processed.
  75 reparse records processed.
CHKDSK is verifying indexes (stage 2 of 3)...
  460369 index entries processed.
Index verification completed.
  5 unindexed files processed.
CHKDSK is verifying security descriptors (stage 3 of 3)...
  118464 security descriptors processed.
Security descriptor verification completed.
  16840 data files processed.
CHKDSK is verifying Usn Journal...
  35955328 USN bytes processed.
Usn Journal verification completed.
Windows has checked the file system and found no problems.

244196351 KB total disk space.
104969600 KB in 101394 files.
     50680 KB in 16841 indexes.
         0 KB in bad sectors.
    227651 KB in use by the system.
     65536 KB occupied by the log file.
138948420 KB available on disk.

      4096 bytes in each allocation unit.
  61049087 total allocation units on disk.
  34737105 allocation units available on disk.

There are two problems with reallocated sectors. First, each harddisk comes with a limited "pool" of empty sectors that can be used as reallocated sectors. When you run out of those, then the automatic protection goes away, so you will start seeing more and more bad sectors at the OS level.

Second (and most importantly) there is a performance problem with reallocated sectors. Because some sectors are remapped to another area of the disk, sequential I/O over those sectors turns into random I/O, which has very different performance characteristics. How big is the performance impact? Pretty big. Let's say that you have no reallocated sectors in a 40 MB interval. If you want to read 1 MB in this interval, you will read it at a standard sequential I/O rate, say about 50 MB/s on a regular SATA disk. So reading will take 1/50 s = 20 ms.

Now, let's pretend we have a reallocated sector in the middle of the 1 MB that we want to read. The reading time will include those 20 ms above plus two additional seeks: one to reach the remapped sector and one to come back. Since a seek is usually around 8-9 ms (or even more) for a standard SATA harddisk, you get around 40 ms for reading the 1 MB, which is roughly twice as long. So, from a bandwidth perspective, you are reading that 1 MB at about 25 MB per second, which is half of the actual speed. So what you might see is a sudden decrease in sequential I/O bandwidth whenever there is a reallocated sector around.
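
The arithmetic above can be written down as a small Python sketch. The function name and the 8.5 ms seek time are my own picks for illustration (the middle of the 8-9 ms range), and the model simply assumes every reallocated sector costs exactly two extra seeks, which is a simplification:

```python
def effective_read(size_mb, seq_rate_mb_s, seek_ms, remapped_sectors):
    """Return (total time in ms, effective MB/s) for one sequential read.

    Simplified model: each remapped sector adds two seeks, one to the
    spare area and one back to the original track.
    """
    transfer_ms = size_mb / seq_rate_mb_s * 1000.0
    total_ms = transfer_ms + 2 * seek_ms * remapped_sectors
    return total_ms, size_mb / (total_ms / 1000.0)


# 1 MB at 50 MB/s, no remapped sectors vs. one remapped sector.
clean_ms, clean_rate = effective_read(1, 50, 8.5, 0)
slow_ms, slow_rate = effective_read(1, 50, 8.5, 1)

print(f"clean: {clean_ms:.0f} ms, {clean_rate:.0f} MB/s")
print(f"one remapped sector: {slow_ms:.0f} ms, {slow_rate:.0f} MB/s")
```

With these numbers the clean read takes 20 ms and the read that hits one remapped sector takes 37 ms, so the effective rate falls from 50 MB/s to about 27 MB/s, in line with the "roughly half" estimate.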

The interesting thing is that you can "spot" the approximate location of reallocated sectors by doing a series of sequential reads over the entire harddisk (from beginning to end) and seeing where you get sudden drops in I/O performance. Fortunately, the same tool mentioned above - HD Tune - has a benchmark mode which allows precisely this.
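
To make the idea of "spotting" drops concrete, here is a rough Python sketch that works on a list of transfer-rate samples taken along the disk. This is not how HD Tune does it; the function name, the window size, the 75% threshold and the sample data are all made up for illustration. Each sample is compared against the median of its neighbours, so the normal smooth decline from the outer to the inner tracks is not flagged:

```python
from statistics import median


def find_drops(rates, window=5, threshold=0.75):
    """Return indices of samples that fall below threshold * local median.

    rates: transfer rates (MB/s) sampled from the start to the end of the disk.
    window: how many neighbours on each side form the local baseline.
    """
    drops = []
    for i, rate in enumerate(rates):
        lo = max(0, i - window)
        hi = min(len(rates), i + window + 1)
        # Baseline excludes the sample itself.
        neighbours = rates[lo:i] + rates[i + 1:hi]
        if neighbours and rate < threshold * median(neighbours):
            drops.append(i)
    return drops


# Made-up data: a smooth decline from 60 MB/s to about 30 MB/s ...
healthy = [60 - 0.3 * i for i in range(100)]

# ... and the same curve with three artificial dips added.
damaged = list(healthy)
damaged[40] *= 0.5
damaged[41] *= 0.55
damaged[77] *= 0.5

print(find_drops(healthy))  # []
print(find_drops(damaged))  # [40, 41, 77]
```

The flagged indices only tell you roughly where on the disk to look. A dip can also come from other activity on the drive during the test, so this is a hint, not proof of a reallocated sector.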

Here is the picture of my backup drive (a Hitachi 250 GB drive):

HDTune_Benchmark_HDS722525VLSA80

You can see that the transfer rate decreases smoothly from 60 MB/s (near the outer edge of the disk) to 30 MB/s (near the center of the disk). There are no sudden drops. In contrast, here is a disk with a lot of reallocated sectors:

HDTune_Benchmark_TOSHIBA_MK4025GAS

You can see a lot of drops along the blue sequential read path, with a variability of about 50% in some cases.

Note - if you are doing these sequential I/O tests, make sure that your harddisk is not in use by any other applications or by the page file - that would cause additional seeks which would "pollute" the graph.

For that reason, I haven't run the benchmark on my main drive yet, since I know that paging will distort the results - I'll try this later ...