Expanding an ASM disk

One of the major advantages of ASM is the ability to reconfigure the storage online. In theory you should be able to add disks, remove disks, and resize disks all the while your ASM and RDBMS instances just keep humming along.

However, I don’t think it is actually possible to expand a lun without downtime if you are using ASMLib. Part of the problem seems to be that ASMLib requires you to create a partition, and, certainly on RedHat 4 Update 3 with kernel 2.6.9-34.ELsmp, changing a partition table requires that ASM is not using the disk the partition table resides on.

Recently I found this out the hard way when I attempted to increase the size of a lun that was being used by ASM. Expanding the lun on the storage was fairly straightforward on the EMC Clariion on which the data was residing.

I’m not really sure if this is the best way of mapping OS device -> ASM disk:

[jason@bdb ~]$ sudo /etc/init.d/oracleasm querydisk VOL4
Disk "VOL4" is a valid ASM disk on device [8, 1]

I believe this to be the major and minor number of the device, so you can look in /dev to see what device this corresponds to:

[jason@bdb ~]$ ls -l /dev/sda1
brw-rw---- 1 root disk 8, 1 Apr 3 16:29 /dev/sda1
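The same lookup can be scripted rather than eyeballed in /dev. What follows is only a minimal sketch, not part of the original procedure: it assumes the kernel exposes the usual /proc/partitions table (columns major, minor, #blocks, name), and the function name dev_for_majmin is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the /dev name for a given major/minor pair.
# Reads /proc/partitions by default; a different table file can be passed
# as the third argument (handy for trying it out on sample data).
dev_for_majmin() {
    major=$1
    minor=$2
    table=${3:-/proc/partitions}
    # /proc/partitions columns: major minor #blocks name
    awk -v mj="$major" -v mn="$minor" \
        '$1 == mj && $2 == mn { print "/dev/" $4 }' "$table"
}
```

Called as dev_for_majmin 8 1, this should print /dev/sda1 on a system laid out like the one above, and nothing at all if no device has that pair.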

Or indeed thanks to Charles Kim you can run the querydisk the opposite way round:

[jason@bdb ~]$ sudo /etc/init.d/oracleasm querydisk /dev/sda1
Disk "/dev/sda1" is marked an ASM disk with the label "VOL4"

I cannot see in any V$ASM view where this mapping from ASM disk -> OS device is exposed; perhaps ASMLib is getting in the way here. What I can say is that ASM disk VOL4 maps to /dev/sda1, which is a partition on /dev/sda. I then increased the size of the lun that this device was created from.

Then comes the scary part, getting the OS to see the increased lun. This was running on rhel 4 update 3, and after a reboot I could see the following via fdisk:

[jason@bdb ~]$ sudo /sbin/fdisk /dev/sda

Command (m for help): p

Disk /dev/sda: 429.4 GB, 429496729600 bytes
255 heads, 63 sectors/track, 52216 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       32635   262140606   83  Linux

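So the OS can see that the device /dev/sda now has 52216 cylinders, but only 32635 (the original lun size) have been allocated to the partition /dev/sda1. You can actually just delete the partition and recreate it without losing any data, as long as the new partition starts at the same cylinder as the old one, because fdisk only rewrites the partition table and leaves the data blocks alone: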

Command (m for help): d
Selected partition 1

This has deleted the partition, as printing the table again confirms:

Command (m for help): p

Disk /dev/sda: 429.4 GB, 429496729600 bytes
255 heads, 63 sectors/track, 52216 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Now you have to recreate it:

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-52216, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-52216, default 52216):
Using default value 52216

Now we can see the /dev/sda1 partition is up to the full capacity of the underlying lun:

Command (m for help): p

Disk /dev/sda: 429.4 GB, 429496729600 bytes
255 heads, 63 sectors/track, 52216 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       52216   419424988+  83  Linux
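The Blocks figures fdisk prints can be reproduced from the geometry it reports. The sketch below is a worked check rather than anything fdisk itself exposes: it assumes the partition starts at sector 63 (the first track being left for the partition table), which is consistent with the numbers shown, and the function name is hypothetical.

```shell
#!/bin/sh
# Geometry fdisk reported for /dev/sda: 255 heads, 63 sectors/track,
# 512-byte sectors, so one cylinder is 255 * 63 = 16065 sectors.
HEADS=255
SECTORS_PER_TRACK=63

# Hypothetical helper: 1K blocks in a partition that starts at cylinder 1
# and ends at the given cylinder. Assumption: the first track (63 sectors)
# is skipped, so the partition begins at sector 63.
blocks_for_last_cylinder() {
    last_cyl=$1
    sectors=$(( last_cyl * HEADS * SECTORS_PER_TRACK - SECTORS_PER_TRACK ))
    # Two 512-byte sectors per 1K block; integer division drops the odd
    # half block that fdisk flags with a trailing "+".
    echo $(( sectors / 2 ))
}

blocks_for_last_cylinder 32635   # 262140606, the old partition
blocks_for_last_cylinder 52216   # 419424988, the new partition
```

The second result corresponds to the 419424988+ shown above; the "+" is the leftover 512-byte sector.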

Don’t forget to write the changes out:


Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

I actually had ASM shut down at this point, because on an earlier attempt to write the new partition table fdisk had stated that the device was busy and that the kernel would still use the old partition table:


WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

Once ASM (and obviously the RDBMS instance relying on this ASM instance) was down, I was able to write the partition table. Without changing the partition table, ASM would not recognise that the lun had been increased. After the partition table was written, getting ASM to increase what it thought was the size of the disk was quite simple:


SQL> select group_number, name, TOTAL_MB, FREE_MB
  2  from V$asm_disk_stat;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB
------------ ------------------------------ ---------- ----------
           1 VOL1                                61439      61187
           2 VOL2                                61439      61164
           3 VOL3                                61439      61164
           4 VOL4                               255996     157374
           4 VOL5                               153597      95230

SQL> alter diskgroup DATA4 resize all rebalance power 4;

Diskgroup altered.

SQL> select group_number, name, TOTAL_MB, FREE_MB
  2  from V$asm_disk_stat;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB
------------ ------------------------------ ---------- ----------
           1 VOL1                                61439      61187
           2 VOL2                                61439      61164
           3 VOL3                                61439      61164
           4 VOL4                               409594     310962
           4 VOL5                               153597      95240
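As a sanity check, the TOTAL_MB values for VOL4 line up with the partition sizes fdisk reported in 1K blocks. A minimal sketch of the conversion follows, assuming ASM counts a megabyte as 1024 of those blocks and truncates the remainder; the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert fdisk's 1K blocks to whole megabytes,
# assuming 1 MB = 1024 blocks and that any remainder is dropped.
blocks_to_mb() {
    echo $(( $1 / 1024 ))
}

blocks_to_mb 262140606   # 255996, TOTAL_MB for VOL4 before the resize
blocks_to_mb 419424988   # 409594, TOTAL_MB for VOL4 after the resize
```

If the figure ASM reports after the resize does not match the partition size, the kernel is probably still using the old partition table.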

So what am I saying? Well, it seems that with ASMLib it is hard to resize a disk completely online with ASM, because the Linux partition table cannot be rewritten while ASM has the device open.

Perhaps this is a disadvantage of ASMLib compared to running without it, though for my money ASMLib seems to be the favoured Oracle solution; certainly I think it is pushed in the documentation.


17 thoughts on “Expanding an ASM disk”

  1. Hi Chris,

    True enough, it has required some downtime. One problem with your solution is that I have never consistently found a way of having a Linux OS find a new device. With good ol’ Solaris I do a devfsadm and there is my device rebuilt with my shiny new LUN.

    What the Linux equivalent of devfsadm is, I know not.

    cheers,

    jason.

  2. One method of syncing device modifications with the kernel on linux is using the partprobe command. I don’t know if it’s ASMLib which gets in the way, or if it’s the same without ASMLib. I will try and see if I can test this tonight. I’ll keep you updated.

  3. Hi Frits,

    Thanks for reading!

    partprobe, certainly on my rhel4 install, is complaining about devices being busy, and that the kernel won’t know about any changes until a reboot. Of course a new device may be a different matter, but the man page says nothing about building the device tree!

    It would be very interesting to hear how expanding a disk not under asmlib goes; I was meaning to get round to that sometime.

    cheers,

    jason.

  4. good afternoon,

    a well thought out workaround for the problem.
    But … dropping a data partition and recreating it later is a delicate operation. Even though I understand the logic, I would not advise doing this on a production system!

    regards,
    eric

  5. Hey there, the fundamental issue is Linux’s handling of disk resizes. Even without a partition table, Linux does not want to resize an active device – that is, even if you had the whole disk, Linux wouldn’t want to adjust the kernel’s idea of its size while it was busy. You see this here manifested because you want to re-read the partition table.

    Eventually Linux will have a more hotplug-friendly idea of disks, and these limitations will change. When they do, ASMLib should have the ability to see it – ASMLib merely reports the OS’s idea of disk size to ASM.

  6. Hi Joel,

    Thanks for dropping by!

    That is a shame then, seems like you cannot expand a lun online with ASM.

    Of course I expect the original design goal is that you give ASM lots of disks, NOT luns, and if you need more storage you add more disks. ASM is designed in some ways to eliminate the need for expensive hardware RAID solutions.

    cheers,

    jason.

  7. Please, take a look at:

    Doc ID: 311619.1
    Subject: How to resize a physical disk or LUN and an ASM DISKGROUP

    on Metalink

    As far as I understand, it’s not possible to make online resize in a non-RAC environment.

  8. Another way can be to create another drive on the extended free space and add it as a separate ASM disk to the diskgroup. Although it is not the best solution.

    Since the device is already available, you have to rescan the SCSI disk to pick up the new size without rebooting in RedHat.

    For example

    echo 1 > /sys/bus/scsi/drivers/sd/SCSI_ID/block/device/rescan

    you can do a cat on the scsi folder to find the SCSI_ID of the device

    cat /proc/scsi/scsi

    In my situation I have 2 paths and 2 controllers, so you get 4 SCSI IDs for the EMC LUN on the host: 2 host controllers and 2 paths to each controller. (You need to find the Host LUN, NOT THE LUN # in EMC. You find this under host properties and the storage tab in Navisphere.)

    You then can issue a rescan for each device. (I don’t think you have to do this but I did it just to make sure all 4 devices updated.)

    echo 1 > /sys/bus/scsi/drivers/sd/1:0:0:22/block/device/rescan
    echo 1 > /sys/bus/scsi/drivers/sd/1:0:1:22/block/device/rescan
    echo 1 > /sys/bus/scsi/drivers/sd/2:0:0:22/block/device/rescan
    echo 1 > /sys/bus/scsi/drivers/sd/2:0:1:22/block/device/rescan

    Now you can see the new size without rebooting.

  10. Hello,

    I have a diskgroup +DATA which uses a device /dev/mapper/mpath7 (70G).

    I use asmlib and multipath.

    I want to add space to +DATA by extending the physical device /dev/mapper/mpath7.

    The SAN administrator told me that the LUN has been extended to 100G, but when I run the multipath -l command, I still see 70G.

    I plan to do this :

    1. Tell the system that the LUN has been extended

    # echo 1 > /sys/block/device_name/device/rescan

    # multipathd -k'resize map mpath7'

    2. Do the diskgroup resize on the ASM INSTANCE

    My questions:

    - Are my steps fine?
    - I have a two node RAC: do I need to shut down one ASM instance before resizing the diskgroup?

    Thanks for any help

    • you should not have to shut down ASM, as long as you can get the OS to see the resized device.

      is that the only device in the diskgroup? if not you need a rebalance. make sure you keep similar sized luns with the same performance characteristics or you will get suboptimal performance.

  11. Thanks for replying.

    In fact it is the same LUN, we extended it. (We did not add a new disk.)

    So if I understand your answer, the steps below are fine:

    1. Tell the system that the LUN has been extended
    # echo 1 > /sys/block/device_name/device/rescan
    # multipathd -k'resize map mpath7'
