Replacing a failed Exadata Storage Server System Drive

I was involved with the procedure to replace a failed system drive in an Exadata Storage Server. The actual physical procedure of removing the old failed drive and inserting the new drive is conducted by an Oracle engineer, but I was connected to the box while this process occurred and monitored the situation afterwards.

It is worthwhile first recapping what the storage cells look like. The V2 box I was working on has twelve 600GB SAS drives.

The O/S for a storage cell is stored on the first two drives, known as the system drives. A small portion of each is carved out for storing the current and backup system images; Joel Goodman has written up a fantastic account of what the various partitions on the two system drives are used for. The rest of each system drive is presented to ASM as griddisks.
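
If you want to see this layout for yourself, listing the partition tables of the two system drives shows the small O/S partitions alongside the large partition given over to the griddisk. A quick sketch; /dev/sda and /dev/sdb are assumptions here, as the device names can vary from cell to cell:

[root@cel06 ~]# fdisk -l /dev/sda
[root@cel06 ~]# fdisk -l /dev/sdb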

The various partitions on the system drives in a cell form RAID mirror pairs using the Linux software RAID tool mdadm. When replacing a system drive it is important to ensure these mirrored pairs get resynchronised, or you will compromise the availability of your cell.
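
A quick way to enumerate all of the software RAID arrays on a cell, together with their metadata versions and UUIDs, is an mdadm scan:

[root@cel06 ~]# mdadm --detail --scan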

Replacing a system drive therefore involves a bit more risk than replacing any other storage cell drive, and there are more things to check to ensure your cell comes back to full health.

I should point out that the drive had not completely failed, but had gone into predictive failure. The first thing to look at is the CellCLI command list physicaldisk:

CellCLI> list physicaldisk 20:0 detail

         name:                   20:0 
         deviceId:               8 
         diskType:               HardDisk 
         enclosureDeviceId:      20 
         errMediaCount:          102 
         errOtherCount:          0 
         foreignState:           false 
         luns:                   0_0 
         makeModel:              "SEAGATE ST360057SSUN600G" 
         physicalFirmware:       0805 
         physicalInsertTime:     2010-06-15T01:33:16+01:00 
         physicalInterface:      sas 
         physicalSerial:         E0D6MA 
         physicalSize:           558.9109999993816G 
         slotNumber:             0 
         status:                 predictive failure

So you can see the status has gone to predictive failure.
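
Incidentally, rather than inspecting disks one at a time, you can ask CellCLI for a one-line status summary of every physical disk in the cell; anything with a status other than normal deserves a closer look with the detail option:

[root@cel06 ~]# cellcli -e "list physicaldisk attributes name, status"

After the engineer had swapped the drives, you can use mdadm to check on the health of your partitions: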


[root@cel06 ~]# mdadm -Q --detail /dev/md6 
/dev/md6: 
        Version : 0.90 
  Creation Time : Mon Jun 14 16:59:44 2010 
     Raid Level : raid1 
     Array Size : 10482304 (10.00 GiB 10.73 GB) 
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB) 
   Raid Devices : 2 
  Total Devices : 3 
Preferred Minor : 6 
    Persistence : Superblock is persistent
    Update Time : Mon Oct 10 14:11:36 2011 
          State : clean, degraded, recovering 
 Active Devices : 1 
Working Devices : 2 
 Failed Devices : 1 
  Spare Devices : 1
 Rebuild Status : 27% complete
           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c 
         Events : 0.1426
    Number   Major   Minor   RaidDevice State 
       2      65      214        0      spare rebuilding   /dev/sdad6 
       1       8       22        1      active sync   /dev/sdb6
       3       8        6        -      faulty spare

So you can see that the RAID partition is being rebuilt onto the replacement drive.
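
While the resync runs, /proc/mdstat gives a convenient single-screen view of rebuild progress across every md device on the cell, so you don't have to query each array individually:

[root@cel06 ~]# watch -n 30 cat /proc/mdstat

Eventually it did rebuild. But even some time later, mdadm was still complaining about the faulty spare: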

[root@cel06 ~]# mdadm -Q --detail /dev/md6 
/dev/md6: 
        Version : 0.90 
  Creation Time : Mon Jun 14 16:59:44 2010 
     Raid Level : raid1 
     Array Size : 10482304 (10.00 GiB 10.73 GB) 
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB) 
   Raid Devices : 2 
  Total Devices : 3 
Preferred Minor : 6 
    Persistence : Superblock is persistent

    Update Time : Wed Oct 19 14:26:00 2011 
          State : clean 
 Active Devices : 2 
Working Devices : 2 
 Failed Devices : 1 
  Spare Devices : 0

           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c 
         Events : 0.5338

    Number   Major   Minor   RaidDevice State 
       0      65      214        0      active sync   /dev/sdad6 
       1       8       22        1      active sync   /dev/sdb6

       2       8        6        -      faulty spare

So we are still stuck with the faulty spare output. It looks like it may be a bug in some versions of the storage cell software. To clear it up, do the following:

[root@cel06 ~]# mdadm --manage /dev/md6 --remove failed
mdadm: hot removed 8:6
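
As an aside, the system drives carry several of these mirrored partitions, so the same stale faulty entry can linger on more than one array. Here is a hedged sketch to sweep them all, assuming the arrays are named /dev/md1, /dev/md2 and so on as they were on this cell:

# hot-remove the stale failed device from any md array that
# still reports a faulty member
for md in /dev/md[0-9]*; do
    if mdadm -Q --detail "$md" | grep -q faulty; then
        echo "clearing failed device from $md"
        mdadm --manage "$md" --remove failed
    fi
done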

Then, wonderfully, the output from the query command is all clear:

[root@cel06 ~]# mdadm -Q --detail /dev/md6
/dev/md6:
        Version : 0.90
  Creation Time : Mon Jun 14 16:59:44 2010
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Fri Oct 21 10:13:35 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c
         Events : 0.5338

    Number   Major   Minor   RaidDevice State
       0      65      214        0      active sync   /dev/sdad6
       1       8       22        1      active sync   /dev/sdb6
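
As a final health check, it is worth confirming that the griddisks carved from the replaced drive are back online and that ASM has finished resynchronising them; the asmmodestatus attribute should read ONLINE for every griddisk:

[root@cel06 ~]# cellcli -e "list griddisk attributes name, status, asmmodestatus"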

5 thoughts on “Replacing a failed Exadata Storage Server System Drive”

    • Hi Uday,

      It’s all about the number of devices you have. Naming starts at /dev/sda, /dev/sdb, etc., then continues with /dev/sdaa, /dev/sdab, and so on.

      jason.

  1. Hi Jason,

     Could you please explain the reasons why an Exadata storage server system drive crashes?

     • It’s just a hard drive, right?

       For hard drives, it’s not a question of if it will fail, but when it will fail.

       jason.
