[LFCS] Managing Software RAID

mdadm is a super cool command in Linux used to manage MD devices, aka Linux Software RAID. Before we jump in, let's see what RAID is.
--RAID: Redundant Array of Independent Disks
--with a single disk, corruption means data loss
--using RAID, if one disk fails, another will take over

The man page says this:

RAID devices are virtual devices created from two or more real block devices. This allows multiple devices (typically disk drives or partitions thereof) to be combined into a single device to hold (for example) a single filesystem. Some RAID levels include redundancy and so can survive some degree of device failure.

Understanding RAID levels,
RAID 0- striping { one big device built from multiple disks; no redundancy, no easy recovery }
RAID 1- mirroring { 2 disks holding identical copies }
RAID 5- striping with distributed parity { data is written along with parity info; if one disk fails, the data can be rebuilt }
RAID 6- striping with dual distributed parity { redundant parity is written; an advancement of RAID 5 }
RAID 10- mirrored and striped { minimum of 4 disks: striping across mirrored pairs }
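As a quick sanity check on the levels above, the usable capacity of each can be sketched with shell arithmetic (assuming n equal-size disks of s GB each; the variable names are just for illustration):

```shell
# Usable capacity per RAID level, for n disks of s GB each
n=4; s=1
raid0=$(( n * s ))        # striping: all capacity, but no redundancy
raid1=$(( s ))            # mirroring: one disk's worth, rest are copies
raid5=$(( (n - 1) * s ))  # one disk's worth lost to distributed parity
raid6=$(( (n - 2) * s ))  # two disks' worth lost to dual parity
raid10=$(( n * s / 2 ))   # mirrored pairs, then striped
echo "RAID0=${raid0}G RAID1=${raid1}G RAID5=${raid5}G RAID6=${raid6}G RAID10=${raid10}G"
```

So 4 x 1 GB disks give 4G at RAID 0 but only 2G at RAID 6 or RAID 10 — redundancy always costs capacity.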

Sample question: Create a RAID 5 device using 3 disk devices of 1 GB each. Also allocate an additional device as a spare.
-- Put a file system on it and mount it on /raid
-- Fail one of the devices, monitor what is happening
-- Replace the failed device with spare device

Solution:

$ cat /proc/partitions
$ fdisk -l { list the partition table for a device; if no device is specified, list the partitions of all devices on the system }
$ fdisk /dev/sdc { then, inside fdisk: }
    n { create a new partition, size +1G }
    m { help }
    t, then L { list type codes; enter "fd" for Linux raid autodetect }
    w { write the entries to disk to persist }
$ partprobe { inform the OS of partition table changes }
$ vim /etc/fstab { before we proceed, verify the disks are not used for any mount. In my case, one was in use as a swap device, so I got a "device is busy" error. Remove the entry and reboot }
$ mdadm --create /dev/md1 --verbose --auto=yes -l 5 --raid-devices=3 -x 1 /dev/sdc1 /dev/sdc2 /dev/sdc3 /dev/sdc4
$ mdadm --detail /dev/md1 { list details after creation; should show 3 active devices + 1 spare device }
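The question also asks to put a filesystem on the array and mount it on /raid; a minimal sketch, assuming ext4 and that /dev/md1 assembled cleanly (run as root):

```shell
mkfs.ext4 /dev/md1                                     # format the RAID device as ext4
mkdir -p /raid                                         # create the mount point
mount /dev/md1 /raid                                   # mount the array
echo '/dev/md1 /raid ext4 defaults 0 0' >> /etc/fstab  # persist across reboots
```

Note that the filesystem goes on the md device itself, not on the member partitions.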

$ mdadm --fail /dev/md1 /dev/sdc1 { to simulate the failure }
$ mdadm --remove /dev/md1 /dev/sdc1 { remove the faulty one }
$ mdadm --add /dev/md1 /dev/sdc1 { add the device back to the pool as spare device if healthy }

Other disk-related commands,
$ blkid, blkid /dev/sdc
$ df -h, df -h -T, df -hT /home
$ du -h /home, du -sh /home/mydir
$ mount /dev/sdc5 /mnt, cd /mnt, touch file1 { after mounting, add an entry in /etc/fstab to persist }
$ mount -a { mount all filesystems mentioned in fstab }
$ mkfs.ext4 /dev/sda4 { format a partition as ext4, after creating the partition }
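One step the walkthrough above skips: without a config entry, the array may re-assemble under a different name after a reboot (e.g. /dev/md127). A hedged sketch — the config path is an assumption that varies by distro (/etc/mdadm/mdadm.conf on Debian/Ubuntu, /etc/mdadm.conf on RHEL-family):

```shell
mdadm --detail --scan >> /etc/mdadm/mdadm.conf  # record the array so it keeps its name at boot
update-initramfs -u                             # Debian/Ubuntu: rebuild initramfs so early boot sees it
```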

Command output:

root@mikky100:~# mdadm --fail /dev/md1 /dev/sdc1 { Simulate the failure }
mdadm: set /dev/sdc1 faulty in /dev/md1

root@mikky100:~# mdadm --detail /dev/md1 { view the details after the failure; we should see the spare disk being rebuilt }
/dev/md1:
Version : 1.2
Creation Time : Mon Jun 11 06:10:34 2018
Raid Level : raid5
Array Size : 1951744 (1906.32 MiB 1998.59 MB)
Used Dev Size : 975872 (953.16 MiB 999.29 MB)
Raid Devices : 3
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Mon Jun 11 17:06:09 2018
State : clean, degraded, recovering
Active Devices : 2
Working Devices : 3
Failed Devices : 1
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 512K

Rebuild Status : 3% complete

Name : mikky100:1 (local to host mikky100)
UUID : 772f743c:b1209727:6910411d:690d6294
Events : 20

Number Major Minor RaidDevice State
3 8 36 0 spare rebuilding /dev/sdc4
1 8 34 1 active sync /dev/sdc2
4 8 35 2 active sync /dev/sdc3

0 8 33 - faulty /dev/sdc1

root@mikky100:~# mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Mon Jun 11 06:10:34 2018
Raid Level : raid5
Array Size : 1951744 (1906.32 MiB 1998.59 MB)
Used Dev Size : 975872 (953.16 MiB 999.29 MB)
Raid Devices : 3
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Mon Jun 11 17:08:13 2018
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : mikky100:1 (local to host mikky100)
UUID : 772f743c:b1209727:6910411d:690d6294
Events : 37

Number Major Minor RaidDevice State
3 8 36 0 active sync /dev/sdc4
1 8 34 1 active sync /dev/sdc2
4 8 35 2 active sync /dev/sdc3

0 8 33 - faulty /dev/sdc1

root@mikky100:~# mdadm --add /dev/md1 /dev/sdc1 { add the disk back as spare }

root@mikky100:~# mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Mon Jun 11 06:10:34 2018
Raid Level : raid5
Array Size : 1951744 (1906.32 MiB 1998.59 MB)
Used Dev Size : 975872 (953.16 MiB 999.29 MB)
Raid Devices : 3
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Mon Jun 11 17:12:21 2018
State : clean
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 512K

Name : mikky100:1 (local to host mikky100)
UUID : 772f743c:b1209727:6910411d:690d6294
Events : 39

Number Major Minor RaidDevice State
3 8 36 0 active sync /dev/sdc4
1 8 34 1 active sync /dev/sdc2
4 8 35 2 active sync /dev/sdc3

5 8 33 - spare /dev/sdc1