Using Linux Software RAID and LVM to build your own NAS

** Update 2014 **
I am transitioning away from RAID arrays (and no longer use LVM). For smaller amounts of data I am migrating to btrfs (RAID 10 or RAID 6); for larger amounts of data a cluster (using Ceph or Swift) is usually the answer.
** End Update **

I am a big fan of RAID arrays and I use them everywhere. After having been bitten by failed, end-of-life hardware devices, I have become a fan of software RAID. In particular I am quite happy with the Linux software RAID capabilities when paired with the Logical Volume Manager (LVM).

First, I create some partitions on the drives I want to use; in my case, I have four 1TB drives that I am putting into an array. You do not need to create partitions, as Linux can use the whole drive as it is. Where you may have issues is if you plug the drive into a computer that does not support Linux RAID: it may prompt you to format what appears to be a blank drive. If you use partitions, at least the OS will see an unsupported partition and probably do nothing. This is not a big deal, but it can help to prevent you from shooting yourself in the foot, especially when you have more than a few drives on the workbench and more than one OS in daily use.

One other use for partitions is that you can play with this technology by using a single drive. You can create an array from a couple of partitions on the same drive. While this is not really useful in the real world (there is no redundancy and you are therefore not protected against data loss), it is still handy for learning about this technology.
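An even safer way to experiment, not mentioned above, is to skip real drives entirely and build a throwaway array out of file-backed loop devices. This is a sketch under stated assumptions: the file names, the `/dev/md9` device, and the use of `/dev/loop0` through `/dev/loop3` are all arbitrary choices for illustration (on a busy system the first free loop devices may differ).

```shell
# Sandbox: a disposable RAID 5 array built from four 100 MB image files.
# Assumes loop devices 0-3 are free; check with `losetup -a` first.
for i in 0 1 2 3; do
  truncate -s 100M raid-test-$i.img
  sudo losetup /dev/loop$i raid-test-$i.img
done
sudo mdadm --create /dev/md9 --level=5 --raid-devices=4 \
    /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
```

When you are done, `mdadm --stop /dev/md9` and `losetup -d` on each loop device tears the whole thing down with no real disk ever at risk.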

I use cfdisk but any partitioning tool will work. Once we have our drives partitioned we can go ahead and create the array. For this example I created a single partition on each drive (in the array) that spanned the entire drive.
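If you would rather script the partitioning than drive cfdisk by hand, sfdisk can do the same thing non-interactively. A sketch, where `/dev/sdX` is a placeholder for one of the (empty, safe-to-wipe) array members:

```shell
# DANGER: this replaces the partition table on the target drive.
# ",,fd" means: default start, span the rest of the drive, partition
# type fd ("Linux raid autodetect").
echo ',,fd' | sudo sfdisk /dev/sdX
```

Repeat for each drive going into the array.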

  sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1
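Immediately after this command the array starts an initial resync in the background; the array is usable right away, but a quick way to watch the build progress is `/proc/mdstat`:

```shell
cat /proc/mdstat              # one-shot view of all md arrays and resync progress
watch -n 5 cat /proc/mdstat   # or refresh the view every five seconds
```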

To see the results, we use mdadm again:

  sudo mdadm --detail /dev/md0

Here are the results on my system:

  /dev/md0:
          Version : 00.90.03
    Creation Time : Wed May 27 16:07:44 2009
       Raid Level : raid5
       Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
    Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Raid Devices : 4
    Total Devices : 4
  Preferred Minor : 0
      Persistence : Superblock is persistent

      Update Time : Thu Jul 23 12:24:41 2009
            State : clean
   Active Devices : 4
  Working Devices : 4
   Failed Devices : 0
    Spare Devices : 0

           Layout : left-symmetric
       Chunk Size : 64K

             UUID : d7dc90fe:ef8d763c:e30e8841:87878c43 (local to host phoenix)
           Events : 0.32

      Number   Major   Minor   RaidDevice State
         0       8        1        0      active sync   /dev/sda1
         1       8       33        1      active sync   /dev/sdc1
         2       8       49        2      active sync   /dev/sdd1
         3       8       65        3      active sync   /dev/sde1
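One housekeeping step worth adding here (it is not part of the original commands): record the array in mdadm's configuration file so it is reassembled under the same name at boot. The file's location varies by distribution; `/etc/mdadm/mdadm.conf` is the Debian/Ubuntu path, other distributions use `/etc/mdadm.conf`:

```shell
# Append an ARRAY line describing /dev/md0 (and any other arrays) by UUID.
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
```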

What we have done is to create a device, /dev/md0, that you can use like any other block device. You could at this point format the device (this is in fact what a lot of simple NAS devices like the DNS-323 do), but instead I will layer a logical volume on top. One of the things that this allows us to do is to resize the array should we add capacity in the future. LVM has a lot of other features; it is worth reading their documentation to see what it can do.

  sudo pvcreate /dev/md0
  sudo vgcreate -s 16M lvm-raid /dev/md0

pvcreate initializes the disk, partition, or array for use with LVM. vgcreate actually creates and names the volume group; in this case we are calling it "lvm-raid".

The "-s 16M" sets the physical extent size to 16 megabytes. The sizing rule of thumb dates from LVM1, which only allowed 65,536 physical extents per volume: take the total size of your array in megabytes, divide by 65,536, and round up to the next power of two (4, 8, 16, 32, etc.) in megabytes. LVM2 no longer enforces that limit, as the 178,850 extents in this volume group show, but a modest extent count still keeps the metadata small.
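That arithmetic can be sketched in shell, using the array size from the mdadm output above (2930279808 KiB, roughly 2,861,601 MiB). Note that the old LVM1 rule would actually call for 64M extents on an array this size; the 16M used here works because LVM2 tolerates the larger extent count.

```shell
#!/bin/sh
# PE sizing under the old LVM1 limit of 65,536 extents per volume.
array_mb=2861601                            # 2930279808 KiB, converted to MiB
min_pe=$(( (array_mb + 65535) / 65536 ))    # smallest PE size (MiB) that fits, rounded up
pe=1
while [ "$pe" -lt "$min_pe" ]; do           # round up to the next power of two
  pe=$(( pe * 2 ))
done
echo "${pe}M"                               # the value you would pass to vgcreate -s
```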

Now we need to create the volume within the group that we will actually be putting our file system on. In the future if I wanted to grow the logical volume in size, I could add another array and then add the new array to this volume group. I could create a new logical volume or simply extend the existing one. Anyway, we need to create that logical volume.

  sudo vgdisplay lvm-raid
  Free  PE / Size 178850 / 2.73 TB

I want to use all of the available space:

  sudo lvcreate -l 178850 lvm-raid -n lvm0

I now have a logical volume called lvm0 that I can put a file system onto.

  sudo lvdisplay /dev/lvm-raid/lvm0
    --- Logical volume ---
    LV Name                /dev/lvm-raid/lvm0
    VG Name                lvm-raid
    LV UUID                VnXNbn-SY0A-A2H8-EuJr-021p-GIU0-pTYovm
    LV Write Access        read/write
    LV Status              available
    # open                 1
    LV Size                2.73 TB
    Current LE             178850
    Segments               1
    Allocation             inherit
    Read ahead sectors     0
    Block device           254:0

  sudo mkfs.ext3 /dev/lvm-raid/lvm0
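The new file system still has to be mounted before use; a sketch, assuming the /opt/data mount point that shows up in the df output below:

```shell
sudo mkdir -p /opt/data
sudo mount /dev/lvm-raid/lvm0 /opt/data
# To mount it automatically at boot, add a line like this to /etc/fstab:
#   /dev/lvm-raid/lvm0  /opt/data  ext3  defaults  0  2
```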

Let's look at the array now that I have copied a little bit of data over to it:

  $ df -h /opt/data
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/mapper/lvm--raid-lvm0
                        2.8T   84G  2.5T   4% /opt/data
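When the day comes to add capacity, the growth path described earlier (create a second array, add it to the volume group, extend the logical volume) looks roughly like this. The /dev/md1 device is a hypothetical future array, and the resize2fs step is my addition:

```shell
# Hypothetical expansion: a second array, /dev/md1, joins the volume group.
sudo pvcreate /dev/md1
sudo vgextend lvm-raid /dev/md1
# Hand every newly freed extent to the existing logical volume...
sudo lvextend -l +100%FREE /dev/lvm-raid/lvm0
# ...then grow the file system to match (ext3 supports online growth).
sudo resize2fs /dev/lvm-raid/lvm0
```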

That about wraps it up. All of the info in this document can be found elsewhere on the net.
Some of the other sources go into even greater detail. Some of the specific sources that I used were:

As always, google or clusty are your friends.

Do not worry if this is a little confusing; you can find an in-depth description of LVM here: