Fixing up ASM Disk Header Corruption

I’m sure this sort of stuff would never happen to you, you would be far too smart for that.

I was involved in a migration project moving data from one set of drives to another. For space reasons, only the drives were being replaced; the actual chassis was remaining in place. This involved replacing a handful of drives at a time, migrating the data, then rinse and repeat until the capacity of the array had been increased. So this project did not require Albert Einstein levels of genius, just a little bit of forethought and planning.

Documentation, Documentation, Documentation

[Diagram: the layers of abstraction from the physical disks, through PowerPath and ASMLib, up to the ASM diskgroup presented to the RDBMS]

You can see that there are several degrees of abstraction in going from the physical disks all the way up to the diskgroup that ASM presents to the RDBMS. In this example the RDBMS is using a diskgroup called DATA to store the various datafiles that make up the database.

This system was also using ASMLib, as well as EMC PowerPath for device multipathing.

Now, none of this should have been a problem: in a well-documented system, the linkage between which physical devices were being used in which diskgroups would have been clear. Unfortunately, this system had grown somewhat organically over time, accumulating more and more devices.

I went around checking which devices were in use, just in case any were free: the more free physical devices, the better for migrating onto the new larger drives.

In particular I was checking which devices were marked as being used by ASM. In our use of ASMLib, the naming convention was that each device was stamped with a volume name of the form VOL#. So in theory each device in use should have been labelled like that; any device not in use by ASM should have been reclaimable.

Corruption Leading to Confusion

In performing this check I was using the /etc/init.d/oracleasm querydisk command and feeding in a device path:


[jason@bdc]$ sudo /etc/init.d/oracleasm querydisk /dev/emcpowera1 
Disk "/dev/emcpowera1" is marked an ASM disk with the label "VOL1"

So that is all well and good, and then I ran into the following:


[jason@bdc]$ sudo /etc/init.d/oracleasm querydisk /dev/emcpowerm1
Disk "/dev/emcpowerm1" is marked an ASM disk with the label ""

Huh? Now that did seem odd. I was sure all devices in use had a VOL# label. So I did what a DBA in a hurry to migrate drives might do: I decided this device could not be in use and tried to delete it:


[jason@bdc]$ sudo /etc/init.d/oracleasm deletedisk /dev/emcpowerm1 
Removing ASM disk "/dev/emcpowerm1":                       [FAILED]

When In a Hole – Stop Digging

At this point I should have stopped and really had a think. In fact, I should have checked the disk header to see exactly what was going on with the device. I did not. I incorrectly assumed this was a device that had once been in use and was in use no longer, and I removed it at the storage level.

After this, ASM started up fine and the database even got to the mount stage. But do you think the diskgroup the datafiles were on would come online? Nope. It was a goner.

I’d just removed a volume that the diskgroup containing the RDBMS datafiles was depending on. Not only had I removed it from the server, I’d even gone as far as to unbind the LUN at the storage array level. Just to make sure it really was a goner.

It was looking like a career-limiting move. Thankfully, after 7 hours on the telephone to EMC support, the LUN was resurrected. But that was not the end of the story: ASM still could not understand what to do with this device stamped with “”. I now checked the header of the device:

[od dump of the corrupted device header: the ORCLDISK tag is present but the volume name is missing]

So this device was actually called VOL7 and was part of the DATA4 diskgroup, which contained the datafiles for the RDBMS. Now compare this to a device that is labelled correctly:

[od dump of a healthy device header: the ORCLDISK tag followed by the volume name]

It seems part of the disk header had become corrupted. The following line:


0000040 O R C L D I S K        

Should in fact contain the following:


0000040 O R C L D I S K V O L 7 

Somehow the VOL7 part of this line has been removed.
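As an aside, the layout of that healthy line can be reproduced with a scratch file. In the ASM disk header, the first 32 bytes hold the generic block header, so the ASMLib provision string (the ORCLDISK tag plus the volume name) starts at byte 32, which od prints as octal offset 0000040. A minimal sketch, using a throwaway file in place of a real device (the /tmp path is purely illustrative):

```shell
# Recreate the healthy header line with a scratch file (not a real ASM disk).
dd if=/dev/zero of=/tmp/fake_header bs=32 count=1 2>/dev/null   # 32 bytes of block-header padding
printf 'ORCLDISKVOL7' >> /tmp/fake_header                       # ORCLDISK tag + volume name
od -c /tmp/fake_header                                          # the label line appears at offset 0000040
```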

KFED to the Rescue!

So the database was down, and a volume was missing from the diskgroup because its disk header was corrupted. Not a good place to be, but I was sure the data was still intact; it was just a matter of fixing up the header and all would be well. I had heard of kfed before this, and I was wondering if this would be the key. I ran it against my corrupt device:

[kfed dump of the corrupted device header]

I could see that the problem line was the following:


kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
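For completeness, the way to get a dump like the one above is kfed's read mode, which writes the whole header block out as text; a sketch, under the assumption that kfed is available in the ASM Oracle home and with an illustrative output filename (on some older releases kfed first has to be built via ins_rdbms.mk):

```shell
# Sketch: dump the ASM disk header block to a text file for inspection.
# $ORACLE_HOME/bin/kfed and the output filename are assumptions for illustration.
$ORACLE_HOME/bin/kfed read /dev/emcpowerm1 text=emcpowerm1_hdr.txt
grep provstr emcpowerm1_hdr.txt    # locate the kfdhdb.driver.provstr line
```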

While running a Metalink search for kfed, I came across Note 787082.1 which, while about a completely separate bug, shows you how to fix up the provstr of the disk header:



[jason@bdc]$ sudo /etc/init.d/oracleasm force-renamedisk /dev/emcpowero1 VOL7

And that was it! ASM could now find all the volumes it needed to bring the diskgroup back online, and the database came back fine. I'm pretty sure any reboot of this server would have led to this device being unrecognised by ASM, so it was really just an accident waiting to happen. Still, the value of maintaining good documentation can never be overstated.

20 thoughts on “Fixing up ASM Disk Header Corruption”

  1. This is again a lifesaver example about ASM, Jason.
    I personally don’t think the same as you on this one: “I’m sure this sort of stuff would never happen to you, you would be far too smart for that.” I actually think it can happen to any DBA, and it is really nice to have a solution for the case

  2. Hi Coskan,

    Thanks for reading and thanks for your kind words! I think I was a little gung-ho in removing the device – 7 hours sweating with EMC to get it back was a little on the nerve-racking side.

    Certainly not sure why the device lost the header information though, so hopefully if this does happen to another DBA, then yeah maybe the blog posting will help someone else out there!

    jason.

  3. Well, I don’t think a DBA is a real DBA until they have destroyed their first system – and not an expert DBA until they do it and get it all back again!
    Seriously, well done on getting it all back. That’s a nice tidbit about getting info about disc headers and tying it back to ASM and I must remember about KFED. Thanks Jason.

  4. Hi Jason,

    This was really a nice example. I always have a nightmare of losing a LUN from a large diskgroup, so I have started the practice of taking backups of ASM disk headers as per the Oracle Metalink suggestion.

    But your example is really awesome. Another doubt/question I have in mind which I would like to share:

    a) For a large database (8/9 TB), how many disk groups would you use?
    I have seen a few documents/presentations where people suggest having 2 disk groups, one for datafiles
    and another for the flash recovery area.

    I also want to have one diskgroup for all my datafiles so that we get a good distribution of data
    and well-balanced I/O.

    What are your thoughts on this?

    My only nightmare is that if I lose one of my disks/LUNs then the whole diskgroup cannot be mounted.

    Whereas if we use multiple disk groups and only one disk group is affected, there is a chance that it contains only index datafiles or other non-essential datafiles (non-system, non-undo).

    Regards
    J

    • Hello,

      Thanks!

      I’d suggest having 2 diskgroups, one for datafiles and one for the FRA. Either use hardware RAID if available, or ASM normal redundancy if not, to guarantee diskgroup availability should you encounter hardware failure. ASM will automatically stripe data across all disks within the diskgroup to give you the best balance of I/O.

      jason.

      I have a rule of thumb to have no diskgroup larger than 10 terabytes, which in effect means splitting up the data disk group if you have a very large system.

      The main reason is the time it takes to do rebalances or other under-the-cover disk maintenance, like swapping over to newer disks you are introducing.

      A minor reason is that we hit (admittedly under 10.1) issues with RMAN and backing up files over 4TB which made me wary of having anything “too big”. I don’t think Oracle Corp are too great at testing VLDB extremes.

      A final consideration is that you might have tiered storage. I’ve had ASM sit on top of storage that was a mixture of fast fibre channel and slower, cheaper disc. We created separate disk groups on each and placed tablespaces according to how important they were in terms of performance.

      • Hi Martin,

        Excellent points!

        Totally agree if you have a mixture of drive types to not have these in the same diskgroup.

        jason.

  5. Hi Martin – thanks for the excellent reply. I totally agree with you.

    Let me share more on this.

    This is our main critical production database (8/9 TB), hence we only use FC (even for the archive area; we do not maintain an FRA). In our environment the practice is that if a database is more than 2 TB we prefer BCV instead of running RMAN directly on prod; we use RMAN on the BCV server (in the case of ASM) to send backups to tape.

    By using 4 parallel channels we get 500 GB per hour, which is OK for us.

    My one concern in choosing 1 disk group for a 9 TB database is availability. In the current setup we are using raw devices, which come from 55+ LUNs -> PV -> VG -> LV/raw devices.

    Suppose one disk fails and it's not recoverable for some reason, despite RAID 1+0 being implemented. (I am sorry if I am over-analyzing, but I have seen this in the past.)

    So in this case, there is a good chance the failed disk will only destroy one or two LUNs (or more, based on layout), and those LUNs could belong to a non-critical tablespace; in that case we can easily open
    the database and later start working on the recovery part.

    I understand that database availability with a few offline files is questionable, but still (in the case of consolidation/schema-based applications) it's worth it.

    But if the same thing happens after going to ASM with one diskgroup, I do not see how that diskgroup can be mounted until we fix the problem; the whole database will be dismounted.

    Do you think my worry is irrelevant here?

  6. Hello Jason,

    Your blog is very good; this tip solved a hell of a big problem (otherwise it would have meant restoring a 2 TB database).

    Thank you.

  7. Awesome Jarneil,
    I was looking for exactly this. took some time to surf then got it.
    Thanks for posting this.
    — Vivek

  8. Many thanks, you saved my day. I had exactly the same problem and solved it successfully with your great solution.
