Breaking a Linux Software RAID 1 for Import using VMware Converter

I rarely post super-geeky stuff on here, but I couldn’t find any good instructions on the Internet for this critical task, and my co-workers and I had to piece together a set of steps that worked for us. I wanted to share what we learned, hoping to save the next person all of the work.

Linux has supported software RAID, particularly RAID 1 (disk mirroring), for a long time. Disk mirroring is a great way to gain some insurance against a single disk failure bringing down a critical system, as everything written to one disk is also written to the other disk. Many servers use hardware RAID, which mirrors the disks at a lower level than the operating system can see, making it easier to gain this redundancy. However, hardware RAID has always been more expensive than software RAID, and so there are quite a few servers out there using software RAID to protect their data.

This week, we had one of our last physical (non-virtual) servers (RedHat Linux 4 AS) that needed to be virtualized. Due to the size of the data stored on that system and how it uses an external disk array, it was important that we virtualize it in place, using VMware’s excellent Converter Standalone to import the running machine, so that there was no downtime while importing the data. However, the Converter Standalone will not import Linux systems using software RAID, due to problems accessing the underlying data structures of the disk through the metadevices presented by the software RAID. (You know you are having this problem when the Converter complains about not being able to access the /boot partition.) The best solution was to break the mirrored software RAID and boot the system off of one disk, so that all of the necessary partitions could be imported and the system could be virtualized.

Unfortunately, as important and seemingly common as breaking a mirrored software RAID is in Linux, I couldn’t find any good, comprehensive, working instructions on how to do it, and breaking a software RAID is a tricky business. It is very, very easy to end up with a non-booting system and no easy way to repair it. So, to help out the next person who runs into it, I’m posting the steps that we took to break the mirrored software RAID and set the system to boot off of only one disk, so that VMware’s Converter Standalone would work on it.

First, get an idea of what you are dealing with. Logged in as root, inspect the system:

[root@cr2 cr]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/md2 456G 45G 389G 11% /
/dev/md0 487M 35M 427M 8% /boot
none 4.0G 0 4.0G 0% /dev/shm
/dev/sda1 2.0T 560G 1.4T 30% /archive
[root@cr2 cr]# more /etc/fstab
# This file is edited by fstab-sync - see 'man fstab-sync' for details
/dev/md2 / ext3 defaults 1 1
/dev/md0 /boot ext3 defaults 1 2
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
/dev/md1 swap swap defaults 0 0
/dev/sda1 /archive ext3 defaults 1 0
/dev/hda /media/cdrecorder auto pamconsole,exec,noauto,managed 0 0
[root@cr2 cr]# more /etc/mtab
/dev/md2 / ext3 rw 0 0
none /proc proc rw 0 0
none /sys sysfs rw 0 0
none /dev/pts devpts rw,gid=5,mode=620 0 0
usbfs /proc/bus/usb usbfs rw 0 0
/dev/md0 /boot ext3 rw 0 0
none /dev/shm tmpfs rw 0 0
/dev/sda1 /archive ext3 rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0

In this case, we have 3 metadevices (md0 as /boot, md1 as swap, and md2 as root /). You can get further details about your RAID configuration using the mdadm tool, as well as mdstat:

[root@cr2 cr]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdc2[1] sdb2[0]
 2200832 blocks [2/2] [UU]
md2 : active raid1 sdc3[1] sdb3[0]
 485668928 blocks [2/2] [UU]
md0 : active raid1 sdc1[1] sdb1[0]
 513984 blocks [2/2] [UU]
unused devices: <none>

The two commands to remember here are:

cat /proc/mdstat     (to list every metadevice and its member partitions, as shown above)
mdadm --detail /dev/md2     (to inquire about the disk members of the /dev/md2 metadevice)
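
If you want the array-to-partition mapping in a compact form before you start, here is a minimal sketch that pulls it out of /proc/mdstat. The function name list_md_members and its optional file argument are my own illustration, not part of mdadm, and it assumes active arrays laid out like the listing above.

```shell
#!/bin/sh
# Print "array: member member ..." for every md array in an mdstat-style file.
# list_md_members and its optional file argument are illustrative names.
list_md_members() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" {
        line = $1 ":"
        # On an active array, fields 5 and up look like "sdc2[1]";
        # strip everything from the "[" onward to leave the partition name.
        for (i = 5; i <= NF; i++) {
            dev = $i
            sub(/\[.*$/, "", dev)
            line = line " " dev
        }
        print line
    }' "${1:-/proc/mdstat}"
}

# Only runs when the kernel exposes md status.
if [ -r /proc/mdstat ]; then
    list_md_members
fi
```

On the system in this article, that would give one line per array, such as md2 followed by sdc3 and sdb3.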

When you are finally ready to do this, make sure you have a full backup of your system, and then stop all running services, especially any that would possibly write data, such as databases.

Next, we need to use mdadm to mark each partition on one of the disks (we chose /dev/sdc) as “failed”, remove it from its RAID array, and wipe its RAID superblock. To do this, we ran these commands:

mdadm --fail /dev/md2 /dev/sdc3
mdadm --remove /dev/md2 /dev/sdc3
mdadm --zero-superblock /dev/sdc3
mdadm --fail /dev/md1 /dev/sdc2
mdadm --remove /dev/md1 /dev/sdc2
mdadm --zero-superblock /dev/sdc2
mdadm --fail /dev/md0 /dev/sdc1
mdadm --remove /dev/md0 /dev/sdc1
mdadm --zero-superblock /dev/sdc1
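
Since it is the same three commands for every array, a dry run can help catch a mistyped partition before anything is changed. This is only a sketch: break_mirror_cmds is a made-up helper name, and the array:partition pairs are the ones from our system, so substitute your own.

```shell
#!/bin/sh
# Print (do not run) the fail/remove/zero-superblock sequence for each pair.
# Arguments are "array:partition" pairs; break_mirror_cmds is an
# illustrative name.
break_mirror_cmds() {
    for pair in "$@"; do
        md=${pair%%:*}      # text before the first colon, e.g. /dev/md2
        part=${pair#*:}     # text after the first colon, e.g. /dev/sdc3
        echo "mdadm --fail $md $part"
        echo "mdadm --remove $md $part"
        echo "mdadm --zero-superblock $part"
    done
}

# Review the printed commands first; nothing is executed here.
break_mirror_cmds /dev/md2:/dev/sdc3 /dev/md1:/dev/sdc2 /dev/md0:/dev/sdc1
```

Once the printed list matches what you expect, you can run the commands by hand as root.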

At this point, the software RAID still exists, but the /dev/sdc disk has been removed from it. The partitions on /dev/sdc still hold their copy of the data, but they are now standalone.

Next, we need to modify the partition table on /dev/sdc to change it from software RAID to standard Linux partitions.

fdisk /dev/sdc

Select “p” to print the partition table, then “t” to change the type of a partition. Select the partition number from the list. We changed /boot and / to be standard Linux partitions (used here for ext3), which is code 83, and the swap partition was changed to 82. Be sure to select “w” at the end to write all of these changes to the /dev/sdc disk when you are done.
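
To double-check the result, you can list the partition table again and look for anything still marked as RAID. A small sketch, assuming your fdisk labels type fd as “Linux raid autodetect”; raid_typed_partitions is just an illustrative name.

```shell
#!/bin/sh
# Read "fdisk -l" output on stdin and print partitions still typed as RAID.
# Assumes fdisk labels type fd as "Linux raid autodetect";
# raid_typed_partitions is an illustrative name.
raid_typed_partitions() {
    awk '/Linux raid autodetect/ { print $1 }'
}

# Example (as root): fdisk -l /dev/sdc | raid_typed_partitions
# No output means no partition on the disk is still marked as type fd.
```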

Next, we need to mount the / and /boot partitions of /dev/sdc so that we can edit the files on them and make it possible to boot from /dev/sdc.

mkdir /mntboot
mkdir /mntroot
mount /dev/sdc3 /mntroot/
mount /dev/sdc1 /mntboot/
vi /mntroot/etc/fstab

Change fstab so that the /dev/sdc partitions will be automounted upon boot, rather than the /dev/md devices. Also, move the mdadm.conf file on /dev/sdc out of the way, so that it cannot be used when booting /dev/sdc.

mv /mntroot/etc/mdadm.conf /mntroot/etc/mdadm.bak
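
If you would rather not hand-edit fstab, the substitution can be scripted. This is a sketch under our layout only (md0 to sdc1, md1 to sdc2, md2 to sdc3); md_to_sdc is a hypothetical helper name, and it keeps a .orig copy next to the file so you can compare.

```shell
#!/bin/sh
# Rewrite md device names in the first field of an fstab-style file to the
# matching /dev/sdc partitions, keeping a .orig copy of the file.
# The mapping is this article's layout; md_to_sdc is an illustrative name.
md_to_sdc() {
    file=$1
    cp -p "$file" "$file.orig" || return 1
    sed -e 's|^/dev/md0\([[:space:]]\)|/dev/sdc1\1|' \
        -e 's|^/dev/md1\([[:space:]]\)|/dev/sdc2\1|' \
        -e 's|^/dev/md2\([[:space:]]\)|/dev/sdc3\1|' \
        "$file.orig" > "$file"
}

# Example: md_to_sdc /mntroot/etc/fstab
#          diff /mntroot/etc/fstab.orig /mntroot/etc/fstab
```

Lines that do not start with an md device, such as the /dev/sda1 entry for /archive, are left alone.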

Now, we need to modify grub so that the bootloader will load Linux using /dev/sdc and not the /dev/md device. Notice that I will be doing this on the existing /dev/md running filesystem.

vi /etc/grub.conf

Replace the /dev/md2 (or whatever your root partition is) references with /dev/sdc3 (in our case). Save the file and close it.

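Also make the same change to the copy of grub.conf on /dev/sdc. Since /dev/sdc1 (the /boot partition) is mounted on /mntboot, that file should be /mntboot/grub/grub.conf.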
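
Before moving on, it is worth confirming that no md references were missed in the files you edited. A minimal sketch using grep; leftover_md_refs is an illustrative name, and the file list in the example is only a suggestion.

```shell
#!/bin/sh
# Print every line that still mentions an md device in the given files.
# Returns 0 when the files are clean, 1 when a reference remains, and 2
# when grep could not read a file. leftover_md_refs is an illustrative name.
leftover_md_refs() {
    # The extra /dev/null makes grep print the file name even for one file.
    grep -n '/dev/md[0-9]' "$@" /dev/null
    case $? in
        0) return 1 ;;   # at least one reference remains
        1) return 0 ;;   # no references found
        *) return 2 ;;   # grep reported an error
    esac
}

# Example: leftover_md_refs /etc/grub.conf /mntroot/etc/fstab
```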

Next, we need to run mkinitrd to rebuild the initial ramdisk (initrd) for the kernel, so that /dev/sdc will be used on boot. To find the kernel version to pass to it, look at the kernel you are booting from in /etc/grub.conf. For us, the mkinitrd command looked like this:

mkinitrd -f -v /boot/initrd-2.6.9-103.ELsmp.img 2.6.9-103.ELsmp
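
The version string appears twice in that command, so it is easy to mistype. Here is a sketch that builds the command from a single version argument. It assumes the /boot/initrd-<version>.img naming shown above, and mkinitrd_cmd is a made-up helper; with no argument it falls back to the running kernel from uname -r, which may not be the kernel your grub.conf entry boots.

```shell
#!/bin/sh
# Print the mkinitrd command for a kernel version (default: the running one).
# Assumes the /boot/initrd-<version>.img naming used in this article;
# mkinitrd_cmd is an illustrative name.
mkinitrd_cmd() {
    kver=${1:-$(uname -r)}
    echo "mkinitrd -f -v /boot/initrd-$kver.img $kver"
}

# Prints the command only; run it yourself as root once it looks right.
mkinitrd_cmd 2.6.9-103.ELsmp
```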

After you run that, you will see verbose output as the initrd is rebuilt. When it is complete, you are ready to reboot the server. You should boot up on /dev/sdc and be ready to run the VMware Converter Standalone import.
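
After the reboot, a quick look at the mount table tells you whether you really came up on the plain partitions. A small sketch that reads /etc/mtab; mounted_device and its optional file argument are illustrative names of my own.

```shell
#!/bin/sh
# Print the device mounted on a mount point, read from an mtab-style file.
# mounted_device and its optional file argument are illustrative names.
mounted_device() {
    awk -v mp="$1" '$2 == mp { dev = $1 } END { if (dev != "") print dev }' \
        "${2:-/etc/mtab}"
}

# In our layout, expect /dev/sdc3 for / and /dev/sdc1 for /boot.
if [ -r /etc/mtab ]; then
    mounted_device /
    mounted_device /boot
fi
```

If either line still shows a /dev/md device, the system booted from the old array and the Converter will most likely hit the same problem.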