Keep Disks in your Diskgroup the same size

In my production instances, I have only ever used ASM with external redundancy. Hey, I’m paying expensive fees for fancy hardware RAID, so I might as well use it. I therefore tend to present ASM with a LUN that is normally a RAID 10 stripe set. As far as ASM is concerned this is one large disk, and it does not have to worry about failure groups.

Well, that is the ideal, but as we know we don’t live in a perfect world. On one of the production instances, due to using “hand me down” hardware, I have created two LUNs, and these are of differing sizes. Everything was running along happily until one of the LUNs ran out of space.

I had thought ASM was meant to distribute extents based on the size of the disks in the diskgroup. For example, if your diskgroup was made up of two disks, one of 60GB and one of 120GB, the 120GB disk would contain twice as many extents (Allocation Units) as the 60GB disk. This would ensure that one disk in the diskgroup was never filled up while the other still had plenty of space. Well, it seems that this does not necessarily work perfectly in practice.
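A quick way to check whether that proportional fill is actually happening is to compare how full each disk in the group is. This is only a sketch, assuming the standard V$ASM_DISK columns (TOTAL_MB, FREE_MB); the percentage column is just derived arithmetic:

SQL> select group_number, name, total_mb, free_mb,
            round(100 * (total_mb - free_mb) / total_mb, 1) pct_used
     from V$ASM_DISK
     where group_number > 0    -- ignore candidate disks not yet in a group
     order by group_number, name;

If extents really were being spread in proportion to disk size, the PCT_USED figures within a diskgroup should stay roughly level; as you will see below, mine did not.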

So I have a diskgroup, let’s call it DATA4, and it is made up of two disks, VOL4 and VOL5. When you look at V$ASM_DISKGROUP, this diskgroup has lots of lovely free space:

SQL> select group_number, name, total_mb, free_mb
     from V$ASM_DISKGROUP;

GROUP_NUMBER NAME                 TOTAL_MB    FREE_MB
------------ ------------------ ---------- ----------
           4 DATA4                  220391      24501

So if you just looked at that view, you would be hard pushed to explain why you could not allocate space in your diskgroup. However, if your diskgroup is made up of multiple disks, take a look at the following view:

SQL> select group_number, name, TOTAL_MB, FREE_MB
     from V$asm_disk_stat;

GROUP_NUMBER NAME                 TOTAL_MB    FREE_MB
------------ ------------------ ---------- ----------
           4 VOL4                   124660      24501
           4 VOL5                    95731          0

Oh great! All the available space in the diskgroup is on one of its disks. ASM is not clever enough to just allocate new extents to that disk; it will keep on with its effectively round-robin distribution of extents, which means you will get an ORA-15041 error saying the diskgroup space is exhausted. And you’ll be convinced that can’t be so if you only look at V$ASM_DISKGROUP.
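For illustration, this is the sort of statement that trips over it; the tablespace name here is just a hypothetical stand-in, but anything that needs to allocate space in the exhausted diskgroup fails the same way, even while V$ASM_DISKGROUP insists there is room:

SQL> -- hypothetical example: adding a datafile to DATA4 at this point raises ORA-15041
SQL> alter tablespace users add datafile '+DATA4' size 10G;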

Thankfully, there is help at hand to fix this in the form of the rebalance process. I had thought a rebalance was only required when the storage had physically changed, i.e. when adding a new disk, but a rebalance basically evens out where the data is stored:

SQL> alter diskgroup DATA4 rebalance;
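While the rebalance is running you can keep an eye on its progress in V$ASM_OPERATION. A minimal sketch, assuming the standard columns of that view; EST_MINUTES is only ASM’s own estimate:

SQL> select group_number, operation, state, power, sofar, est_work, est_minutes
     from V$ASM_OPERATION;

Once the rebalance finishes, the row for the diskgroup simply disappears from this view.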

You can set the speed of the rebalance using the POWER clause. Mine ran at power 1 and took 41 minutes; after it completed I saw the following in V$ASM_DISK_STAT:

SQL> select group_number, name, TOTAL_MB, FREE_MB
     from V$asm_disk_stat;

GROUP_NUMBER NAME                 TOTAL_MB    FREE_MB
------------ ------------------ ---------- ----------
           4 VOL4                   124660      13859
           4 VOL5                    95731      10642

Bingo! I can now allocate new extents in my diskgroup and I have not increased the storage available by 1 byte.
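If 41 minutes at power 1 feels slow, the POWER clause mentioned above lets you throw more resources at the rebalance, at the cost of extra I/O while it runs. A sketch only, with the value of 8 picked arbitrarily (on 10g the power level runs from 0 to 11):

SQL> alter diskgroup DATA4 rebalance power 8;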

Definitely, keeping all the disks in your diskgroup the same size will save you pain.

8 thoughts on “Keep Disks in your Diskgroup the same size”

  1. Jason, could you share the version of both ASM and database, and the compatibility setting of the ASM diskgroup?

    If the allocation is truly blind round robin, it really is a major flaw in the design, because it will mean that with disks of different sizes, some disk space will not be reachable until the diskgroup is rebalanced.

  2. Hi Frits,

    Yep, I realised this was an oversight in the article. The database and ASM are both running 10.2.0.3 on RHEL 4 U3 x86-64. The compatibility setting is 10.1.0.0.0 in both the COMPATIBILITY and DATABASE_COMPATIBILITY columns of v$asm_diskgroup.

    I believe that is the default even with 10gR2.

    Basically, that is what is happening to us: the diskgroup is running out of space even though it has a good 25GB free (around 10-15%), because, as you can see, the slightly smaller disk/LUN in the diskgroup has filled itself completely up while the other disk/LUN has all the free space.

    This certainly is not working as advertised.

  3. Yes, 10.1.0.0.0 is the default for both, even in version 11.1.0.6.0.

    I am planning to test a scenario with non-equal sized disks in a diskgroup.

    The Oracle documentation states that allocation of extents in ASM is done relative to the total(!) disk size, which means that with unequal sized disks they should fill up to the same relative level.

    That suggests that in your scenario a disk was added or resized and the rebalance needed after that either has not happened or was aborted. Could you elaborate on that?

  4. Hi Frits,

    I should mention ASMLib is also in the mix here:

    I have the 2.6.9-34 Linux kernel (and the asm version 2.0.3-1 for that kernel). asmsupport is also 2.0.3-1, while asmlib is 2.0.2-1.

    What happened, and this goes back to August 2006, was that the disk group was created and then I immediately added the 2nd LUN to it (note the disk group was NOT created with both LUNs in the same create diskgroup statement; it was a create followed by an add).

    Note there was NO data on the LUNs before I added the 2nd LUN to the disk group.

    The ASM alert log states that AFTER the 2nd disk was added a rebalance of the disk group occurred, and it claims this rebalance completed successfully. It does seem likely, though, that it did not manage to record the correct sizes of the LUNs.

    Yes, on the docs, I read that and was slightly peeved. Looking at FREE_MB when the 2nd LUN filled up, it has not quite treated both LUNs as the same size, as there is about 100GB on one but only 95GB on the other, though that is close.

  5. The penny has only just dropped. The rebalance has not actually moved data from the 2nd LUN; it has just increased the size that ASM sees for the LUN. I have not changed the size of the LUN myself, so it would seem the first rebalance, when the 2nd LUN was originally added to the disk group, did not calculate its size correctly.

  6. Hello Fritz/Jarneil,
    Good morning. I am posing a question very much related to this, which should be of interest to you, and I would like to hear from you experts. Thanks in advance. Here is the problem I am facing.
    Our 2-node RAC environment (Grid version 11.2.0.4 and RDBMS 11.2.0.3) has disks allocated at a 20G LUN size (125 of them). Right now the SAN admin is going to migrate the disks from spinning disk to SSDs, but this time he wants to give us disks of size 200GB. Both of us are equally concerned about how this will pan out in terms of rebalancing and eventually dropping the old disks (the 20G LUNs). I have seen different articles about varying LUN sizes.
    However, when we replace the current disks with SSDs, in theory, after dropping the old disks, we should be left with 200G LUNs only. He raised a concern over rebalancing not happening as expected, citing an article. I also opened an SR, but the support staff are not giving much insight apart from the regular documented knowledge.
    Your insights will be greatly appreciated as this DB is our bread and butter.
    Thanks and Regards
    Kumar Ramalingam

  7. Hello Jarneil,
    We are running a two-node RAC, 11.2.0.4 on grid and 11.2.0.3 on the database, on an M10K with Solaris 10. My current diskgroup (DG), which has 2.5TB allocated using 20G LUNs, is getting exhausted. My request for additional storage has been provisioned with 200G LUNs. My SAN admin is asking me to create a new DG and allocate space for the existing tablespaces from it. Because the datafiles for a tablespace would then be allocated from the new DG, would it affect the rebalancing inside the DG? Meaning, can I have datafiles from a 20G LUN DG as well as from a 200G LUN DG?
    The plan is to move the disks to 200G ultimately. Would it be a good start, given that Oracle says one can move datafiles between diskgroups without any downtime?
    Please advise
    Regards
    Kumar R
