It happens, be prepared!
Yes, sometimes what (or who) you trust betrays you!
It happens to hard disks too, more often than you would expect.
But the risk of data (and money) loss can be dramatically reduced
by using some kind of redundancy, i.e. a RAID array. And it works!
Here I report my experience on my LAN server,
on which I set up two 120 GB hard disks configured as a RAID-1 (mirroring) array using Linux software RAID.
M/B       | ABIT KT7A-RAID (with HighPoint HPT370 onboard IDE controller)
Processor | AMD Duron 850 MHz
Memory    | 640 MB SDRAM
Storage   | IBM-DJNA-371350, QUANTUM FIREBALL CR8.4A, Maxtor 6Y120L0, Maxtor 6Y120P0
System    | SuSE Linux 8.1, Linux kernel 2.4.19-4GB
The DJNA is used as the system disk, while the two Maxtor units are configured in mirroring using RAID-1.
Originally they were two 6Y120L0, bought together, apparently from an unlucky series, as both died: the
first after just a week (and I replaced it with the 6Y120P0), the second after one year and ten days.
Here is how /etc/raidtab
looks:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
chunk-size 4
device /dev/hde1
raid-disk 0
device /dev/hdg1
raid-disk 1
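For reference, with the raidtools package an array described by this file would normally be created with something like:
# initialise /dev/md0 as described in /etc/raidtab (raidtools)
mkraid /dev/md0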
Both disks are partitioned into one big partition of type 0xfd
(raid autodetect):
Disk /dev/hdg: 16 heads, 63 sectors, 238216 cylinders
Units = cylinders of 1008 * 512 bytes
Device Boot Start End Blocks Id System
/dev/hdg1 1 238216 120060832+ fd Linux raid autodetect
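Output like the above can be printed at any time with:
# list the partition table of the second mirror disk
fdisk -l /dev/hdg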
It would have been wiser not to use the whole disk space for the partition,
keeping it a few MB smaller. Suppose a disk fails and it is impossible to find another
one of the same model: a disk of the same nominal size but from a different manufacturer or series
may not be exactly the same size,
and due to a different geometry it may be slightly smaller,
preventing us from using it to replace the failed one and forcing us to buy a bigger (and more expensive) one.
A partition to be added as a spare disk to a RAID-1 array must be at least as large as the
existing RAID-1 array. Partitions taking part in a RAID-1 array can be of different sizes,
but the size of the resulting device will be equal to the size of the smallest one,
so we can add a partition bigger than the array's size, but not a smaller one.
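As an illustration only (not what was done here), a new disk could be partitioned slightly short of its full capacity with sfdisk; the size below is a made-up figure for a ~120 GB drive:
# create one partition of type 0xfd, about 150 MB smaller than the disk,
# so a slightly smaller replacement disk can still join the array later
# (-uM = sizes in megabytes; adjust the figure to the actual disk)
echo "0,117100,fd" | sfdisk -uM /dev/hde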
One of the two hard disks of the array (/dev/hde
) crashed abruptly, with mechanical noise.
The fault was detected by the system and the disk was removed from the array.
The RAID device continued to work, but in degraded mode, with only one disk.
If a spare disk had been available, reconstruction of the array would have
begun at once.
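For the record, a spare is declared in /etc/raidtab with an extra device entry; a sketch assuming a hypothetical third disk /dev/hdi:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
persistent-superblock 1
chunk-size 4
device /dev/hde1
raid-disk 0
device /dev/hdg1
raid-disk 1
# hypothetical third disk used as hot spare
device /dev/hdi1
spare-disk 0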
Here is the sequence of messages from syslog,
from the detection of the disk failure to the
removal of the failed disk from the array:
Oct 17 17:24:02 starbase kernel: hde: timeout waiting for DMA
Oct 17 17:24:02 starbase kernel: hde: 0 bytes in FIFO
Oct 17 17:24:02 starbase kernel: ide_dmaproc: chipset supported ide_dma_timeout func only: 14
Oct 17 17:24:02 starbase kernel: hde: status timeout: status=0xd0 { Busy }
Oct 17 17:24:02 starbase kernel: hde: drive not ready for command
Oct 17 17:24:37 starbase kernel: ide2: reset timed-out, status=0x80
Oct 17 17:24:37 starbase kernel: hde: status timeout: status=0x80 { Busy }
Oct 17 17:24:37 starbase kernel: hde: drive not ready for command
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199141632
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199141640
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199141648
...
...
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199143760
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199143768
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 119544
Oct 17 17:25:07 starbase kernel: raid1: hde1: rescheduling block 119544
Oct 17 17:25:07 starbase kernel: md: updating md0 RAID superblock on device
Oct 17 17:25:07 starbase kernel: md: hdg1 [events: 00000277]<6>(write) hdg1's sb offset: 120060736
Oct 17 17:25:07 starbase kernel: md: recovery thread got woken up ...
Oct 17 17:25:07 starbase kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Oct 17 17:25:07 starbase kernel: md: recovery thread finished ...
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199143256
Oct 17 17:25:07 starbase kernel: end_request: I/O error, dev 21:01 (hde), sector 199143776
Oct 17 17:25:08 starbase kernel: md: (skipping faulty hde1 )
Oct 17 17:25:08 starbase kernel: raid1: hdg1: redirecting sector 119544 to another mirror
The system continued to work perfectly and the fault was handled transparently.
/proc/mdstat
reported that only one disk of the array was still active:
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdg1[1]
120060736 blocks [2/1] [_U]
At this point the system was shut down and the faulty drive was replaced.
The downtime was just the time needed to replace the broken disk,
that is, about 5 minutes.
After the reboot the RAID device was working exactly as before, still in degraded mode.
A surface scan of the new disk was made to check it.
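A read-only scan can be done with something like:
# non-destructive read test of the whole disk, with progress output
badblocks -sv /dev/hde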
The next step was to repartition the new disk with a partition of the same size and
type as on the other disk (see above). Then the partition was added to the
RAID array with the command:
raidhotadd /dev/md0 /dev/hde1
The disk was integrated into the array and reconstruction began immediately.
Meanwhile the RAID device was still perfectly working and usable.
/proc/mdstat
reported the progress of the reconstruction.
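The reconstruction can be followed continuously with something like:
# re-read /proc/mdstat every 30 seconds
watch -n 30 cat /proc/mdstat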
Here are some snapshots of the process:
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hde1[2] hdg1[1]
120060736 blocks [2/1] [_U]
[>....................] recovery = 0.8% (981952/120060736) finish=70.5min speed=28134K/sec
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hde1[2] hdg1[1]
120060736 blocks [2/1] [_U]
[===>.................] recovery = 19.4% (23349504/120060736) finish=56.8min speed=28343K/sec
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hde1[2] hdg1[1]
120060736 blocks [2/1] [_U]
[================>....] recovery = 82.9% (99550592/120060736) finish=11.9min speed=28667K/sec
Once the reconstruction finished, the RAID configuration in the superblocks of the
devices participating in the array was updated, and the new disk was fully integrated into the array:
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hde1[0] hdg1[1]
120060736 blocks [2/2] [UU]
unused devices: <none>
Here is an extract from the syslog
messages generated by the kernel md driver
during the reconstruction of the array:
Oct 19 00:18:56 starbase kernel: md: trying to hot-add hde1 to md0 ...
Oct 19 00:18:56 starbase kernel: md: bind<hde1,2>
Oct 19 00:18:56 starbase kernel: RAID1 conf printout:
Oct 19 00:18:56 starbase kernel: --- wd:1 rd:2 nd:1
Oct 19 00:18:56 starbase kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdg1
Oct 19 00:18:56 starbase kernel: disk 2, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
...
...
Oct 19 00:18:56 starbase kernel: disk 25, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 26, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: md: updating md0 RAID superblock on device
Oct 19 00:18:56 starbase kernel: md: hde1 [events: 00000282]<6>(write) hde1's sb offset: 120060736
Oct 19 00:18:56 starbase kernel: md: hdg1 [events: 00000282]<6>(write) hdg1's sb offset: 120060736
Oct 19 00:18:56 starbase kernel: md: recovery thread got woken up ...
Oct 19 00:18:56 starbase kernel: md0: resyncing spare disk hde1 to replace failed disk
Oct 19 00:18:56 starbase kernel: RAID1 conf printout:
Oct 19 00:18:56 starbase kernel: --- wd:1 rd:2 nd:2
Oct 19 00:18:56 starbase kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdg1
Oct 19 00:18:56 starbase kernel: disk 2, s:1, o:0, n:2 rd:2 us:1 dev:hde1
Oct 19 00:18:56 starbase kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
...
...
Oct 19 00:18:56 starbase kernel: disk 25, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: disk 26, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 00:18:56 starbase kernel: md: syncing RAID array md0
Oct 19 00:18:56 starbase kernel: md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
Oct 19 00:18:56 starbase kernel: md: using maximum available idle IO bandwith
(but not more than 100000 KB/sec) for reconstruction.
Oct 19 00:18:56 starbase kernel: md: using 508k window, over a total of 120060736 blocks.
...
...
Oct 19 01:31:27 starbase kernel: md: md0: sync done.
Oct 19 01:31:27 starbase kernel: RAID1 conf printout:
Oct 19 01:31:27 starbase kernel: --- wd:1 rd:2 nd:2
Oct 19 01:31:27 starbase kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Oct 19 01:31:27 starbase kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdg1
Oct 19 01:31:27 starbase kernel: disk 2, s:1, o:1, n:2 rd:2 us:1 dev:hde1
Oct 19 01:31:27 starbase kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 01:31:27 starbase kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
...
...
Oct 19 01:31:27 starbase kernel: disk 25, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 01:31:27 starbase kernel: disk 26, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Oct 19 01:31:27 starbase kernel: md: updating md0 RAID superblock on device
Oct 19 01:31:27 starbase kernel: md: hde1 [events: 00000283]<6>(write) hde1's sb offset: 120060736
Oct 19 01:31:27 starbase kernel: md: hdg1 [events: 00000283]<6>(write) hdg1's sb offset: 120060736
Oct 19 01:31:27 starbase kernel: md: recovery thread finished ...
As a last step, fsck
was run on the filesystem on the RAID device,
which showed no corruption.
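Assuming an ext2/ext3 filesystem mounted on a hypothetical mount point /data, the check could look something like:
umount /data        # the filesystem must not be mounted while checking
e2fsck -f /dev/md0  # force a full check even if the filesystem looks clean
mount /data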
The whole recovery process was straightforward and painless.
The system was down only for the time needed to replace the failed disk,
while the RAID device kept working perfectly during the whole process.
RAID can help a lot in avoiding data loss, but it is not the solution to every
problem, and a good backup policy must always be adopted.