Posted by: isaraffee | August 9, 2009

Exploring RAID in OpenSUSE 11.1 Part 5

Removing Software RAID Configuration

To remove a device from an array, you must first mark the device as failed:

neptune:~ # mdadm /dev/md0 --fail /dev/sda3

mdadm: set /dev/sda3 faulty in /dev/md0
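
Marking a member faulty is only the first half of taking it out of an array; mdadm also has a --remove option that detaches a member once it is faulty. This post only performs the --fail step, so the following is a sketch of the full sequence and not what was run here. The helper name removal_steps is made up for illustration, and it only prints the commands so that nothing is changed on the system:

```shell
# Hypothetical helper: print (not run) the two commands that take a
# member out of an md array: mark it faulty, then detach it.
removal_steps() {
    array=$1     # e.g. /dev/md0
    member=$2    # e.g. /dev/sda3
    echo "mdadm $array --fail $member"
    echo "mdadm $array --remove $member"
}

# Show the sequence for the array and partition used in this post.
removal_steps /dev/md0 /dev/sda3
```

Printing the commands first makes it easy to review them before pasting them into a root shell.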

You can then check your mail to see whether the system alerted you to the RAID failure:

neptune:~ # mailx

Heirloom mailx version 12.2 01/07/07. Type ? for help.

"/var/spool/mail/root": 3 messages 3 new

>N 1 root@linux.local Fri Apr 24 10:31 29/845 Fail event on /dev/md0:neptune

N 2 root@linux.local Fri Apr 24 10:31 29/851 Fail event on /dev/md0:neptune

N 3 root@linux.local Fri Apr 24 10:31 29/851 Fail event on /dev/md0:neptune

Yes, the email notification works.
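
The alert only arrives if mdadm's monitor knows where to send it, which is normally set with a MAILADDR line in mdadm.conf. As a rough sketch, the configured address can be read back like this. The function name mail_target is hypothetical, the file location /etc/mdadm.conf is an assumption about this system, and the match is deliberately simple (it expects the keyword written out in full, in upper case):

```shell
# Hypothetical helper: print the address on the first MAILADDR line of
# an mdadm.conf-style file read from standard input.
mail_target() {
    awk '$1 == "MAILADDR" { print $2; exit }'
}

# Usage on a live system (path is an assumption, not run here):
#   mail_target < /etc/mdadm.conf
```

If nothing is printed, no address was found in the file that was read.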

The content of the email is as follows:

From root@linux.local Fri Apr 24 10:31:01 2009

X-Original-To: root@linux.local

Delivered-To: root@linux.local

From: mdadm monitoring <root@linux.local>

To: root@linux.local

Subject: Fail event on /dev/md0:neptune

Date: Fri, 24 Apr 2009 10:31:01 +0800 (SGT)

This is an automatically generated mail message from mdadm

running on neptune

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sda3.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]

md0 : active raid1 sda2[0] sda3[1](F)

4192896 blocks [2/1] [U_]

unused devices: <none>

The email reports that /dev/sda3 has failed.
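
The same condition is visible in /proc/mdstat without waiting for mail: a degraded array has an underscore in its status field, as in the [U_] above. Here is a minimal sketch that lists degraded arrays, assuming the mdstat layout shown in the email. The function name degraded_arrays is made up for illustration:

```shell
# Hypothetical helper: print the name of every md array whose status
# field (for example [U_]) contains an underscore, meaning that a
# member is missing or failed. Reads mdstat-formatted text on stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array
        /\[[U_]+\]/ && name != "" {
            status = $NF                     # last field, e.g. [U_]
            if (status ~ /^\[[U_]*_[U_]*\]$/) print name
            name = ""
        }
    '
}

# Usage on a live system (not run here):
#   degraded_arrays < /proc/mdstat
```

With the mdstat text from the email as input, this should report md0.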

Note

To wipe everything out and start over, you have to zero out the superblock on each device; otherwise the device will continue to be treated as a member of a RAID array.

neptune:~ # mdadm --zero-superblock /dev/sda3

Even with /dev/sda3 marked as failed, you can still access the RAID mounted on /mnt and read the file contents.

neptune:~ # cd /mnt

neptune:/mnt # ls

lost+found test.txt

neptune:/mnt # cat test.txt

Howdy, RAID 😉

I am back from ICT

Now let’s edit the file test.txt and see if the file content can be restored after we fix the disk problem.

test.txt now reads as follows:

neptune:/mnt # cat test.txt

Howdy, RAID 😉

I am back from ICT

I am now testing failure of one of the disk and writing to RAID

Now unmount /mnt before stopping the RAID:

neptune:~ # umount /mnt

Stop the RAID:

neptune:~ # mdadm --stop /dev/md0

mdadm: stopped /dev/md0

Next, mount one of the RAID's member partitions read-only so that it can be read directly:

neptune:~ # mount -t ext3 -o ro /dev/sda2 /mnt

Next, open the file test.txt to see if the newly appended line is included.

neptune:~ # cat /mnt/test.txt

Howdy, RAID 😉

I am back from ICT

I am now testing failure of one of the disk and writing to RAID.

Yes, the line is there. Now let’s unmount /mnt and then mount /dev/sda3 to see whether the file was updated on it even after we marked the partition as failed.

First, start the RAID:

neptune:~ # mdadm -A /dev/md0

mdadm: /dev/md/0 has been started with 1 drive (out of 2).

Check the status of the RAID in /proc/mdstat:

md0 : active raid1 sda2[0]

4192896 blocks [2/1] [U_]

unused devices: <none>

It shows that the RAID is running with only one of its two members.

Next, add the partition /dev/sda3 back to the array:

neptune:~ # mdadm /dev/md0 --add /dev/sda3

mdadm: re-added /dev/sda3

The array will take some time to rebuild, just as when you create a new array.

To see the RAID being rebuilt, type:

neptune:~ # cat /proc/mdstat

Personalities : [raid1]

md0 : active raid1 sda3[2] sda2[0]

4192896 blocks [2/1] [U_]

[==>..................] recovery = 14.9% (625984/4192896) finish=6.4min speed=9180K/sec

unused devices: <none>

neptune:~ # cat /proc/mdstat

Personalities : [raid1]

md0 : active raid1 sda3[2] sda2[0]

4192896 blocks [2/1] [U_]

[================>....] recovery = 83.8% (3517568/4192896) finish=1.0min speed=10740K/sec

Finally, the RAID is rebuilt:

neptune:~ # cat /proc/mdstat

Personalities : [raid1]

md0 : active raid1 sda3[1] sda2[0]

4192896 blocks [2/2] [UU]
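
Instead of re-running cat by hand, the recovery line can be picked out of /proc/mdstat and polled. This is a minimal sketch, assuming the "recovery = NN.N%" format shown in the output above. The function name rebuild_percent is hypothetical:

```shell
# Hypothetical helper: print the recovery percentage found in
# mdstat-formatted text on standard input. Prints nothing when no
# recovery line is present, for example once the rebuild is complete.
rebuild_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Usage on a live system (not run here): poll every 10 seconds
# until the recovery line disappears.
#   while p=$(rebuild_percent < /proc/mdstat); [ -n "$p" ]; do
#       echo "rebuild at $p%"
#       sleep 10
#   done
```

The loop ends as soon as the recovery line is gone, which matches the final [2/2] [UU] state shown above.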

Now let’s check the disk partitions /dev/sda2 and /dev/sda3 and see if the test.txt file is updated.

Let’s just check the /dev/sda3 partition:

neptune:~ # umount /mnt

neptune:~ # mount -t ext3 -o ro /dev/sda3 /mnt

neptune:~ # cat /mnt/test.txt

Howdy, RAID 😉

I am back from ICT

From the output above, we can conclude that the copy of test.txt on the failed partition was not updated.

So now let’s restore the disk partition and see if the file in /dev/sda3 will be updated.

You have to stop the RAID first:

neptune:~ # mdadm --stop /dev/md0

mdadm: stopped /dev/md0

neptune:~ # mount -t ext3 -o ro /dev/sda3 /mnt

neptune:~ # cat /mnt/test.txt

Howdy, RAID 😉

I am back from ICT

I am now testing failure of one of the disk and writing to RAID.

Yes, the contents of the file were updated after the partition (/dev/sda3) was rebuilt.
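
Comparing the two copies by eye works for a three-line file; for anything larger, cmp can do the comparison. This is a small sketch, assuming both copies of test.txt are reachable at the same time. The function name same_content and the mount points in the usage comment are hypothetical:

```shell
# Hypothetical helper: succeed (exit status 0) when the two files
# named as arguments are byte-for-byte identical.
same_content() {
    cmp -s "$1" "$2"
}

# Usage sketch (mount points /mnt/a and /mnt/b are assumptions):
#   same_content /mnt/a/test.txt /mnt/b/test.txt && echo "members agree"
```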
