Last updated on October 6, 2020 by Gabriel Cánepa
Redundant Array of Independent Disks (RAID) is a storage technology that combines multiple hard disks into a single logical unit to provide fault tolerance and/or improve disk I/O performance. Depending on how data is stored in an array of disks (e.g., with striping, mirroring, parity, or any combination thereof), different RAID levels are defined (e.g., RAID-0, RAID-1, RAID-5, etc.). RAID can be implemented either in software or with a hardware RAID card. On modern Linux, basic software RAID functionality is available by default.
In this post, we'll discuss the software setup of a RAID-1 array (also known as a mirroring array), where identical data is written to the two devices that form the array. While it is possible to implement RAID-1 with partitions on a single physical hard drive (as with other RAID levels), it won't be of much use if that single hard drive fails. In fact, that's why most RAID levels normally use multiple physical drives to provide redundancy. In the event of any single drive failure, the virtual RAID block device should continue functioning without issues, and allow us to replace the faulty drive without significant production downtime and, more importantly, with no data loss. However, it does not replace the need to save periodic system backups in external storage.
Since the actual storage capacity (size) of a RAID-1 array is the size of the smallest drive, normally (if not always) you will find two identical physical drives in a RAID-1 setup.
The tool that we are going to use to create, assemble, manage, and monitor our software RAID-1 is called mdadm (short for multiple disks admin). On Linux distros such as Fedora, CentOS, RHEL or Arch Linux, mdadm is available by default. On Debian-based distros, mdadm has to be installed with the package manager.
Where mdadm comes pre-installed (Fedora, CentOS/RHEL 7), all you have to do is start the RAID monitoring service and configure it to auto-start upon boot:
# systemctl start mdmonitor
# systemctl enable mdmonitor
For CentOS/RHEL 6 or earlier, use these commands instead:
# service mdmonitor start
# chkconfig mdmonitor on
On Debian and its derivatives, mdadm can be installed with:
# aptitude install mdadm
On Ubuntu, you will be asked to configure the postfix MTA for sending out email notifications (as part of RAID monitoring). You can skip it for now.
On Debian, the installation will start with the following explanatory message to help us decide whether or not we are going to install the root filesystem on a RAID array. What we need to enter on the next screen will depend on this decision. Read it carefully:
Since we will not use our RAID-1 for the root filesystem, we will leave the answer blank.
When asked whether we want to start (reassemble) our array automatically during each boot, choose Yes. Note that we will need to add an entry to the /etc/fstab file later in order for the array to be properly mounted during the boot process as well.
Now it's time to prepare the physical devices that will be used in our array. For this setup, I have plugged in two 8 GB USB drives that have been identified as /dev/sdb and /dev/sdc in the dmesg output:
# dmesg | less
[ 60.014863] sd 3:0:0:0: [sdb] 15826944 512-byte logical blocks: (8.10 GB/7.54 GiB)
[ 75.066466] sd 4:0:0:0: [sdc] 15826944 512-byte logical blocks: (8.10 GB/7.54 GiB)
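The size in parentheses follows directly from the block count the kernel reports. As a quick sanity check, here is a minimal sketch that does the arithmetic in plain shell. The function names blocks_to_bytes and blocks_to_gb are made up for this example; only the block count comes from the dmesg output above.

```shell
#!/bin/sh
# Sketch: convert the kernel's count of 512-byte logical blocks into sizes.
# Function names are illustrative, not part of any tool.

blocks_to_bytes() {
    # $1 = number of 512-byte logical blocks
    echo $(( $1 * 512 ))
}

blocks_to_gb() {
    # Decimal gigabytes (1 GB = 10^9 bytes), two decimal places
    awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / 1e9 }'
}

blocks_to_bytes 15826944   # prints 8103395328
blocks_to_gb 15826944      # prints 8.10
```

On a live system, lsblk -b -d -o NAME,SIZE should report the same byte count for each whole disk without any manual math.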
We will use fdisk to create a primary partition on each disk that will occupy its entire size. The following steps show how to perform this task on /dev/sdb, and assume that this drive hasn't been partitioned yet (otherwise, we can delete the existing partition(s) to start off with a clean disk):
# fdisk /dev/sdb
Press p to print the current partition table. (If one or more partitions are found, they can be deleted with the d option. Then the w option is used to apply the changes.)
Since no partitions are found, we will create a new partition [n] as a primary partition [p], assign partition number [1] to it, and then indicate its size. You can press the Enter key to accept the proposed default values, or enter a value of your choosing. When you are done, use the w option to write the new partition table to disk.
Now repeat the same process for /dev/sdc.
If we have two drives of different sizes, say 750 GB and 1 TB for example, we should create a primary partition of 750 GB on each of them, and use the remaining space on the bigger drive for another purpose, independent of the RAID array.
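To make the sizing rule concrete, here is a small sketch that works out how big the mirrored partitions can be and how much space is left over on the larger drive. The helper raid1_plan is hypothetical and only does the arithmetic; it assumes both sizes are whole numbers in the same unit (1 TB is treated as 1000 GB here).

```shell
#!/bin/sh
# Sketch: given two drive sizes in the same unit, print the RAID-1 partition
# size (the smaller drive) and the space left over on the larger drive.
# raid1_plan is an illustrative name, not an mdadm feature.

raid1_plan() {
    # $1, $2 = drive sizes as whole numbers in the same unit (e.g. GB)
    if [ "$1" -le "$2" ]; then
        small=$1; large=$2
    else
        small=$2; large=$1
    fi
    echo "array=$small leftover=$(( large - small ))"
}

raid1_plan 750 1000   # prints: array=750 leftover=250
```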
Once you are done with creating the primary partition on each drive, use the following command to create a RAID-1 array:
# mdadm -Cv /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdc1
The above command is equivalent to:
# mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
Alternatively, if you want to include a spare device that can take over for a faulty disk in the future, you can add --spare-devices=1 /dev/sdd1 to the above command.
Answer y when prompted if you want to continue creating the array, then press Enter.
You can check the progress with the following command:
# cat /proc/mdstat
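If you would rather have a script watch the initial sync than re-run cat by hand, the percentage can be pulled out of the mdstat text. The following is a minimal sketch, assuming the usual layout of /proc/mdstat (an "md0 : ..." header line followed by indented detail lines). The function name md_progress is made up, and the sample text in the here-document is illustrative rather than captured from the system used in this post.

```shell
#!/bin/sh
# Sketch: print the resync/recovery percentage of one array from
# mdstat-formatted text on stdin. Prints nothing if no rebuild is running.

md_progress() {
    # $1 = array name without the /dev/ prefix, e.g. md0
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_array = 1; next }  # header line of our array
        in_array && NF == 0   { exit }                # blank line ends its stanza
        in_array && /(resync|recovery) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { print $i; exit }     # first field ending in %
        }
    '
}

# Illustrative sample input (numbers are invented for the example):
md_progress md0 <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      7909376 blocks super 1.2 [2/2] [UU]
      [==>..................]  resync = 12.5% (991232/7909376) finish=10.2min speed=11264K/sec

unused devices: <none>
EOF
# prints 12.5%

# On a real system:  md_progress md0 < /proc/mdstat
```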
Another way to obtain more information about a RAID array (both while it's being assembled and after the process is finished) is:
# mdadm --query /dev/md0
# mdadm --detail /dev/md0 (or mdadm -D /dev/md0)
Of the information provided by 'mdadm -D', perhaps the most useful is that which shows the state of the array. The active state means that there is currently I/O activity happening. Other possible states are clean (all I/O activity has been completed), degraded (one of the devices is faulty or missing), resyncing (the system is recovering from an unclean shutdown such as a power outage), or recovering (a new drive has been added to the array, and data is being copied from the other drive onto it), to name the most common states.
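Since the state line is what we usually care about, it can be handy to extract it in a script, for example to raise an alarm when the array is degraded. Here is a minimal sketch that assumes the detail output contains a line of the form "State : ...". The function names md_state and md_is_degraded are hypothetical, and the sample text is an abbreviated, illustrative excerpt rather than real output from this setup.

```shell
#!/bin/sh
# Sketch: read `mdadm --detail` output on stdin and report the array state.

md_state() {
    # Print the value of the "State :" line with surrounding blanks removed
    sed -n 's/^[[:space:]]*State : *\(.*[^[:space:]]\)[[:space:]]*$/\1/p' | head -n 1
}

md_is_degraded() {
    # Exit status 0 if the state mentions "degraded", non-zero otherwise
    md_state | grep -q 'degraded'
}

# Illustrative, abbreviated sample input:
md_state <<'EOF'
/dev/md0:
        Raid Level : raid1
      Raid Devices : 2
             State : clean, degraded
    Active Devices : 1
EOF
# prints: clean, degraded

# On a real system (as root):
#   mdadm --detail /dev/md0 | md_is_degraded && echo "md0 needs attention"
```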
The next step is formatting (with ext4 in this example) the array:
# mkfs.ext4 /dev/md0
Now let's mount the array, and verify that it was mounted correctly:
# mount /dev/md0 /mnt
# mount
The mdadm tool comes with RAID monitoring capability built in. When mdadm is set to run as a daemon (which is the case with our RAID setup), it periodically polls existing RAID arrays, and reports on any detected events via email notification or syslog logging. Optionally, it can also be configured to invoke contingency commands (e.g., retrying or removing a disk) upon detecting any critical errors.
By default, mdadm scans all existing partitions and MD arrays, and logs any detected event to /var/log/syslog. Alternatively, you can specify devices and RAID arrays to scan in mdadm.conf, located at /etc/mdadm/mdadm.conf (Debian-based) or /etc/mdadm.conf (Red Hat-based), in the following format. If mdadm.conf does not exist, create one.
DEVICE /dev/sd[bcde]1 /dev/sd[ab]1
ARRAY /dev/md0 devices=/dev/sdb1,/dev/sdc1
ARRAY /dev/md1 devices=/dev/sdd1,/dev/sde1
.....
# optional email address to notify events
MAILADDR [email protected]
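Instead of typing ARRAY lines by hand, you can let mdadm print them with mdadm --detail --scan, which identifies each array by UUID rather than by member device names (useful with USB drives, whose /dev/sdX names can change between boots). The sketch below appends only those ARRAY lines that are not already in the file, so it is safe to run more than once. The helper name append_array_lines is made up for this example, and the config path in the usage comment is the Debian-style one mentioned above.

```shell
#!/bin/sh
# Sketch: read ARRAY lines on stdin and append the ones that are missing
# to the given mdadm.conf. append_array_lines is an illustrative name.

append_array_lines() {
    # $1 = path to mdadm.conf
    conf=$1
    touch "$conf"    # make sure the file exists before grepping it
    while IFS= read -r line; do
        case $line in
            ARRAY\ *)
                # -x: match whole line, -F: treat the line as a fixed string
                grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
                ;;
        esac
    done
}

# Intended use (as root, Debian-style path):
#   mdadm --detail --scan | append_array_lines /etc/mdadm/mdadm.conf
```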
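After modifying the mdadm configuration, restart the mdadm daemon.
On Debian, Ubuntu or Linux Mint:
# service mdadm restart
On Fedora, CentOS/RHEL 7:
# systemctl restart mdmonitor
On CentOS/RHEL 6:
# service mdmonitor restart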
Now we will add an entry to /etc/fstab to mount the array on /mnt automatically during boot (you can specify any other mount point):
# echo "/dev/md0 /mnt ext4 defaults 0 2" >> /etc/fstab
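One caveat with the plain echo approach is that running it twice leaves a duplicate line in /etc/fstab, and on some systems the array can come up under a different name (for example /dev/md127) if it is not listed in mdadm.conf. A more defensive variant, sketched below, refers to the filesystem by UUID and appends the entry only if it is not there yet. The helper names fstab_entry and add_fstab_entry are hypothetical; the mount options are the same ones used in the command above.

```shell
#!/bin/sh
# Sketch: build a UUID-based fstab line and append it only if it is missing.

fstab_entry() {
    # $1 = filesystem UUID, $2 = mount point, $3 = filesystem type
    printf 'UUID=%s %s %s defaults 0 2\n' "$1" "$2" "$3"
}

add_fstab_entry() {
    # $1 = path to the fstab file, remaining arguments as for fstab_entry
    file=$1; shift
    entry=$(fstab_entry "$@")
    grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}

# Intended use (as root); blkid prints the UUID of the filesystem on the array:
#   add_fstab_entry /etc/fstab "$(blkid -s UUID -o value /dev/md0)" /mnt ext4
```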
To verify that the mount works okay, we now unmount the array, restart mdadm, and remount. We can see that /dev/md0 has been mounted as per the entry we just added to /etc/fstab:
# umount /mnt
# service mdadm restart (on Debian, Ubuntu or Linux Mint)
  or systemctl restart mdmonitor (on Fedora, CentOS/RHEL7)
  or service mdmonitor restart (on CentOS/RHEL6)
# mount -a
Now we are ready to access the RAID array via the /mnt mount point. To test the array, we'll copy the /etc/passwd file (any other file will do) into it.
On Debian, we need to tell the mdadm daemon to automatically start the RAID array during boot by setting its AUTOSTART variable to true.
We will simulate a faulty drive and remove it with the following commands. Note that in a real life scenario, it is not necessary to mark a device as faulty first, as it will already be in that state in case of a failure.
First, unmount the array:
# umount /mnt
Now, notice how the output of mdadm -D /dev/md0 indicates the changes after performing each command below.
# mdadm /dev/md0 --fail /dev/sdb1      # Marks /dev/sdb1 as faulty
# mdadm --remove /dev/md0 /dev/sdb1    # Removes /dev/sdb1 from the array
Afterwards, when you have a replacement drive, add it back to the array:
# mdadm /dev/md0 --add /dev/sdb1
The rebuild of the data onto /dev/sdb1 then starts immediately.
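Because the fail, remove and add steps always go together, it can help to preview them before touching a real array. The following is a minimal sketch of a dry-run wrapper: by default it only prints the commands, and it executes them only when the RUN variable is set to 1. Both the replace_member function and the RUN variable are inventions of this example, not mdadm features.

```shell
#!/bin/sh
# Sketch: print (or, with RUN=1, execute as root) the three mdadm steps
# used to swap out a member of an array.

replace_member() {
    # $1 = array, $2 = partition to fail and remove, $3 = partition to add
    md=$1; old=$2; new=$3
    for cmd in \
        "mdadm $md --fail $old" \
        "mdadm $md --remove $old" \
        "mdadm $md --add $new"
    do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1    # stop at the first failing step
        else
            echo "$cmd"         # dry run: show what would be executed
        fi
    done
}

# Dry run for the example in this post (the new drive reuses the same name):
replace_member /dev/md0 /dev/sdb1 /dev/sdb1
```

Running the script as is prints the three commands in order. Once they look right, the same call with RUN=1 in the environment carries them out.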
Note that the steps detailed above apply to systems with hot-swappable disks. If you do not have such technology, you will also have to stop the array and shut down your system first in order to replace the part:
# mdadm --stop /dev/md0
# shutdown -h now
Then add the new drive and re-assemble the array:
# mdadm /dev/md0 --add /dev/sdb1
# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
Hope this helps.