I’m looking for a definitive guide on how to repair a degraded array. I thought I knew the process but, it turns out, I don’t.
I had a drive fail on my RAID 1 array. I started the array with the single remaining disk and made a current backup of the data. After several days of trying different things and a long string of errors, I ended up just deleting the array, the partitions - everything - and starting over from scratch, which is definitely the wrong way to go about things.
So, I need either a definitive guide or someone to confirm/correct the steps below so I can fix this properly in the future. In the steps below, I’ll be using md0 as the array, sda as the bad drive, and sde as the replacement. I’m also assuming that the entire drive was used in the array rather than a partition.
- Once you realize there is a problem with an array: DO NOT RESTART THE COMPUTER.
- Check the status of the array: cat /proc/mdstat
- Check the detail of the array: mdadm --detail /dev/md0
- Fail the bad drive: mdadm /dev/md0 --fail /dev/sda
- Remove the bad drive from the array: mdadm /dev/md0 --remove /dev/sda
- Add the replacement drive: mdadm /dev/md0 --add /dev/sde
Once the sync is complete, all should be good. Confirm by checking the status of the array.
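To make the sequence above easier to review, here is a rough sketch of it as a shell function. It only strings together the commands already listed; the helper names `run` and `replace_member` and the `DRY_RUN` variable are made up for illustration, and the device names are the ones assumed in this post. By default it only prints the commands, so nothing touches the array unless DRY_RUN is explicitly set to 0.

```shell
#!/bin/sh
# Sketch only: the replacement steps from this post, in order.
# run, replace_member and DRY_RUN are hypothetical names, not part of mdadm.

run() {
    # Always print the command. Execute it only when DRY_RUN is set to
    # something other than 1 (unset counts as 1, i.e. print only).
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

replace_member() {
    # $1 = array (e.g. /dev/md0), $2 = failed drive, $3 = replacement drive
    array=$1
    bad=$2
    new=$3

    run cat /proc/mdstat                 # status before touching anything
    run mdadm --detail "$array"          # detail of the array
    run mdadm "$array" --fail "$bad"     # mark the bad drive as failed
    run mdadm "$array" --remove "$bad"   # remove it from the array
    run mdadm "$array" --add "$new"      # add the replacement, resync starts
    run cat /proc/mdstat                 # check status / resync progress
}
```

Calling `replace_member /dev/md0 /dev/sda /dev/sde` prints the six commands for review; running it as root with `DRY_RUN=0` in front would actually execute them.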
Does anything need to be changed in mdadm.conf or fstab?
What am I missing?