February 19, 2013
Linux RAID with mdadm - Do's and Don'ts
I've been keeping my personal data safe with Linux software RAIDs for almost a decade. I've even convinced many friends to do the same. Lost data is so frustrating... in fact, losing data was one of the forces that pushed me to abandon M$ windowz and become a daily Linux user.
History
It was the early 2000’s and everyone was sharing multimedia files. The p2p networks had anything you could want and DVD writers + sneakernet allowed even the bandwidth poor to get anything from friends.
I bought a huge 160GB drive to act as the primary dumping ground for my new digital treasure, and I filled it to the brim, or so I thought. While editing some videos with a friend, I discovered chunks of other video files interspersed with the file I was editing. It turned out windowz was only using 32-bit addressing on my disk; once I passed 128GB it started truncating the addresses and writing back over other data while reporting that everything was OK.
I thought RAID 5 would be cool, until I realized how expensive and loud that would be. Taking a step back, I decided I could afford 2 drives in a mirrored configuration, called RAID 1. I originally RAIDed my drives with the so-called RAID controller on my motherboard. Bad idea - it wasn't a real RAID controller: it used the CPU to do all the heavy lifting, the admin interface sucked, and no other vendor's controller wanted anything to do with those drives. Linux to the rescue!
Glorious Software RAID
Creating a software RAID in Linux is easy, and it has allowed me to move the same pair of drives between multiple systems with different hardware and kernels without a problem for nearly a decade. Additionally, in a pinch I can mount each drive independently without any RAID software at all.
You may need to install mdadm. On Ubuntu or Debian use:
sudo apt-get install mdadm
Preparing Your Drives
If your drives are brand new you can skip to step 2. For this example I'll call the drives /dev/sdX and /dev/sdY, but replace them with the actual names/letters from your system. These instructions also assume that the entire drive will be used for your RAID.
1) Check for previous raid superblocks
If you get the following response, then you are likely in good shape.
> sudo mdadm --examine /dev/sdX
mdadm: No md superblock detected on /dev/sdX.
But if you get the following, you'll need to do some cleanup first.
> sudo mdadm --examine /dev/sdY
/dev/sdY:
Magic : a92b4efc
Version : 0.90.00
UUID : 2a321d73:92a29f89:91a9f934:c8ab7b11 (local to host)
Creation Time : Tue Jan 25 17:02:50 2011
Raid Level : raid1
Used Dev Size : 625131776 (596.17 GiB 640.13 GB)
Array Size : 625131776 (596.17 GiB 640.13 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Update Time : Sun Feb 10 11:39:41 2013
State : clean
Active Devices : 2
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Checksum : 83ef590e - correct
Events : 122
Number Major Minor RaidDevice State
this 1 8 48 1 active sync /dev/sdY
Remove the old superblock.
> sudo mdadm --misc --zero-superblock /dev/sdY
Erase any remaining drive metadata by zeroing the start of the disk.
> sudo dd if=/dev/zero of=/dev/sdY bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 1.86259 s, 56.3 MB/s
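If you would rather not guess how many megabytes to overwrite, the wipefs tool from util-linux can list and erase every filesystem and RAID signature it recognizes. Here is a sketch against a throwaway image file so nothing real is at risk; on an actual drive you would target /dev/sdY (with sudo) instead:

```shell
# Demo on an image file; on a real drive, substitute /dev/sdY and use sudo.
truncate -s 16M demo.img     # throwaway 16 MiB "drive"
mkfs.ext4 -q -F demo.img     # stamp an ext4 signature onto it
wipefs demo.img              # list the signatures wipefs can see
wipefs -a demo.img           # erase all of them
```

wipefs also knows about md ("linux_raid_member") signatures, so on many setups it can stand in for both the --zero-superblock and dd steps above.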
2) Create partitions
You'll want to do these steps on both drives.
> sudo fdisk /dev/sdX
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xe0fd90a7.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
> Command (m for help): o
Building a new DOS disklabel with disk identifier 0xacab143c.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
> Command (m for help): p
Disk /dev/sdX: 640.1 GB, 640133946880 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250261615 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xacab143c
Device Boot Start End Blocks Id System
> Command (m for help): n
Partition type:
p primary (0 primary, 0 extended, 4 free)
e extended
> Select (default p): p
> Partition number (1-4, default 1): 1
> First sector (2048-1250261614, default 2048): 2048
Do not set up your partition all the way to the last sector. I left 2 GiB unused at the end of my partition. I once had a drive run out of spare sectors; rather than warning me, it truncated the data and resized itself, and it suddenly became too small to pair with the other drive. Leaving even more free space will help if you ever need to replace a failed disk with one that is not identical to the other (see Fixing a Busted RAID Array below).
As you can see above, my sectors are each 512 bytes and I wanted 2 GiB free.
1024 * 1024 * 1024 * 2 = 2147483648 bytes to save
2147483648 / 512 = 4194304 sectors to save
last sector - sectors to save = end of the partition
1250261614 - 4194304 = 1246067310 last sector
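The same arithmetic as a quick shell sketch (the totals are taken from my fdisk output above - substitute your own):

```shell
# Compute the partition's last sector, leaving 2 GiB unused at the end.
TOTAL_SECTORS=1250261615                # "total ... sectors" from fdisk
SECTOR_SIZE=512                         # logical sector size, in bytes
SLACK_BYTES=$((2 * 1024 * 1024 * 1024)) # 2 GiB to leave unpartitioned
LAST_SECTOR=$(( (TOTAL_SECTORS - 1) - SLACK_BYTES / SECTOR_SIZE ))
echo "$LAST_SECTOR"                     # prints 1246067310
```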
> Last sector, +sectors or +size{K,M,G} (2048-1250261614, default 1250261614): 1246067310
> Command (m for help): t
Selected partition 1
> Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
> Command (m for help): p
Disk /dev/sdX: 640.1 GB, 640133946880 bytes
238 heads, 28 sectors/track, 187614 cylinders, total 1250261615 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xacab143c
Device Boot Start End Blocks Id System
/dev/sdX1 2048 1246067310 623032631+ fd Linux raid autodetect
> Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
Repeat this process for the second drive, but use the same last sector value that you used for the first drive so the partitions are identical.
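Alternatively, sfdisk (also part of util-linux) can dump one drive's MBR partition table and replay it onto the other, guaranteeing identical partitions. Sketched here with throwaway image files; on real hardware the command would be sudo sfdisk -d /dev/sdX | sudo sfdisk /dev/sdY, and a mistyped device name is destructive, so check twice. (The script-line syntax below assumes a reasonably recent sfdisk.)

```shell
# Two image files stand in for the drives; use /dev/sdX and /dev/sdY for real.
truncate -s 16M first.img second.img
# One partition of type fd (Linux raid autodetect) filling the first "drive".
echo 'start=2048, type=fd' | sfdisk first.img
# Dump the first table and replay it onto the second "drive".
sfdisk -d first.img | sfdisk second.img
sfdisk -d second.img   # shows the same fd partition
```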
Create the RAID array
1) Create the RAID array
Note: You want the partitions in your RAID, not the raw disks, so use /dev/sdX1, not /dev/sdX.
> sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX1 /dev/sdY1
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: size set to 622901376K
> Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
2) Format the partition
I used the default Linux filesystem, ext4.
> sudo mkfs.ext4 -v /dev/md0
mke2fs 1.42 (29-Nov-2011)
fs_types for mke2fs.conf resolution: 'ext4'
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
38936576 inodes, 155725344 blocks
7786267 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
4753 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
Configure for use
1) Setup mdadm.conf
Each RAID array is assigned a unique ID, which mdadm can use to assemble the array during boot. You can use mdadm to read the unique ID of your array:
>sudo mdadm --misc --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Feb 18 10:26:45 2013
Raid Level : raid1
Array Size : 622901376 (594.05 GiB 637.85 GB)
Used Dev Size : 622901376 (594.05 GiB 637.85 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Mon Feb 18 15:38:08 2013
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 449ec249:ca6af101:8ca61121:a4f427b9 <--- unique id
Events : 4394
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdX1
2 8 49 1 active sync /dev/sdY1
Update mdadm.conf
> sudo vim /etc/mdadm/mdadm.conf
Then in the array section I added the line:
ARRAY /dev/md0 UUID=449ec249:ca6af101:8ca61121:a4f427b9
2) Mount and Use It
I'll let you take it from here. Enjoy!
Don't forget /etc/fstab
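As a sketch, the /etc/fstab line might look like this (the mount point /mnt/raid is my own example - substitute yours):

```
# Hypothetical fstab entry: mount /dev/md0 at /mnt/raid as ext4.
/dev/md0    /mnt/raid    ext4    defaults    0    2
```

Create the mount point first with sudo mkdir -p /mnt/raid, then run sudo mount -a to check that the entry works.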
Fixing a Busted RAID Array
As mentioned above, I once made the mistake of using the entire drive for my raid array and then found myself with 2 drives that would not assemble.
Below is how I fixed it.
1) Make a backup
If you can make an additional backup, do it. I didn't have enough space, so I backed up what I cared about most - but the process below should let you rebuild your RAID array without losing any data.
2) Free up one drive in the array
If your array is already degraded then you can skip to step 3 using the already rejected drive.
If your array has not yet failed but you know it is flawed, you can manually fail and then remove one of the drives. This will leave the array running in degraded mode on a single drive.
> sudo mdadm --manage /dev/md0 --fail /dev/sdX
> sudo mdadm --manage /dev/md0 --remove /dev/sdX
3) Clean up and configure
Follow the steps under Preparing Your Drives to clean up and configure /dev/sdX.
4) Create a new incomplete raid array
Using the newly prepped partition you can create a new raid array.
> sudo mdadm --create /dev/md1 -v --level=1 --raid-devices=2 /dev/sdX1 missing
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: size set to 622901376K
>Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
Format the new array as well:
> sudo mkfs.ext4 /dev/md1
5) Mount and copy the data
I'd mount the original array /dev/md0 read-only and the new array /dev/md1 read-write, then copy all the data from the old array to the new one.
6) Check that everything was copied
I'm a little paranoid, so I dumped the file paths and sizes of the data on each array and compared them with diff:
> sudo find mount_point/of_md0 -type f -printf "%P %s\n" | sort > drive_md0.txt
> sudo find mount_point/of_md1 -type f -printf "%P %s\n" | sort > drive_md1.txt
> diff -u drive_md0.txt drive_md1.txt
--- drive_md0.txt 2013-02-18 15:18:55.721848694 -0500
+++ drive_md1.txt 2013-02-18 15:19:30.909173089 -0500
@@ -33439,7 +33439,6 @@
afolder/file-x.jpg 57344
afolder/file-y.jpg 430729
afolder/file-z.jpg 46778
-.DS_Store 6148
exported/img-040.jpg 1442109
exported/img-180.jpg 1703897
exported/img-186.jpg 2062038
As you can see, I managed to leave behind a hidden file, .DS_Store, which I don't care about.
7) Kill the original array
We are now going to kill the last disk of the original array. It's OK to triple-check your backups first; I did.
Unmount /dev/md0, then stop it:
> sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0
This releases the last remaining drive /dev/sdY from the original RAID array.
8) Clean up and configure
Follow the steps under Preparing Your Drives to clean up and configure /dev/sdY.
9) Add the other drive to the new array
This will add the drive so that it can begin syncing
> sudo mdadm /dev/md1 --manage --add /dev/sdY1
mdadm: added /dev/sdY1
> cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdY1[2] sdX1[0]
622901376 blocks super 1.2 [2/1] [U_]
[>....................] recovery = 0.7% (4703744/622901376) finish=93.7min speed=109830K/sec
10) Follow the steps in Configure for use
And update mdadm.conf and /etc/fstab to reference the new array.