ASMLib Troubleshooting

I’ve noticed a few forum questions about ASM, or indeed the OUI, not being able to see devices that are managed via ASMLib. This prompted me to “upgrade” my knowledge of ASMLib, and this post collects a few extra tools for checking on your ASMLib devices.

By the way, is anyone else out there thinking ASMLib is not getting a whole lot of love from Oracle of late? The latest updates on the ASMLib page seem to be from early 2007.

Anyway, the first troubleshooting tip is a simple one, but make sure you have all three ASMLib rpms installed:


# rpm -qa |grep asm
oracleasm-support-2.0.3-1
oracleasmlib-2.0.2-1
oracleasm-2.6.9-22.ELsmp-2.0.3-1
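
If you want to script this check, here is a rough sketch. The function name missing_asm_pkgs is my own invention, and it assumes the package naming shown above, where the kernel module rpm embeds the kernel release in its name. It reads rpm -qa output on stdin, so it can be tried without touching a real system.

```shell
# Reads `rpm -qa` output on stdin and reports which of the three ASMLib
# packages is missing for the given kernel release.
# Hypothetical helper; assumes the naming scheme oracleasm-<kernel>-<version>.
missing_asm_pkgs() {
    kernel=$1
    pkgs=$(cat)
    for prefix in "oracleasm-support-" "oracleasmlib-" "oracleasm-${kernel}-"; do
        found=no
        for p in $pkgs; do
            case $p in
                "$prefix"*) found=yes ;;
            esac
        done
        [ "$found" = yes ] || echo "missing: ${prefix}*"
    done
}
```

Typical use would be rpm -qa | missing_asm_pkgs "$(uname -r)", which prints nothing when all three packages are present for the running kernel.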

You get odd behaviour without all of ‘em. So what does each of these provide?


# rpm -ql oracleasm-support
/etc/init.d/oracleasm
/etc/sysconfig/oracleasm
/usr/lib/oracleasm/oracleasm_debug_link
/usr/sbin/asmscan
/usr/sbin/asmtool

The init.d oracleasm script is where you configure disks, and it includes various options such as listing and querying disks. It is actually just a shell script that calls the executables asmscan and asmtool. There is a configuration file in /etc/sysconfig where you can change things like the pattern used to scan for devices, and you can also exclude devices there. Excluding devices and explicitly setting the scan order can be useful for multipath devices.
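
To see what the scan settings currently are, something like the following sketch can help. The helper name asm_conf_get is made up for illustration; the setting names ORACLEASM_SCANORDER and ORACLEASM_SCANEXCLUDE are the ones that appear in the generated /etc/sysconfig/oracleasm file. It assumes simple KEY=value or KEY="value" lines with no trailing whitespace.

```shell
# Prints the value of one KEY=value setting from a sysconfig-style file,
# with surrounding double quotes stripped. Comment lines are ignored
# because the key must start at the beginning of the line.
# Hypothetical helper; the default path is the usual ASMLib config file.
asm_conf_get() {
    key=$1
    file=${2:-/etc/sysconfig/oracleasm}
    sed -n 's/^'"$key"'="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$file"
}
```

For example, asm_conf_get ORACLEASM_SCANORDER and asm_conf_get ORACLEASM_SCANEXCLUDE show the ordering and exclusion patterns in effect.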

Once you have run /etc/init.d/oracleasm configure you should see a new pseudo-filesystem mounted at /dev/oracleasm:


# df -ha |grep asm
oracleasmfs 0 0 0 - /dev/oracleasm
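
The same check can be scripted against the mount table. This is only a sketch: asmfs_mountpoint is a name I made up, and it assumes the filesystem type is reported as oracleasmfs in the third column of /proc/mounts, matching the df output above.

```shell
# Reads mount-table lines (the /proc/mounts layout: device, mount point,
# filesystem type, options, ...) on stdin and prints the mount point of
# every oracleasmfs filesystem. Prints nothing if none is mounted.
# Hypothetical helper for illustration.
asmfs_mountpoint() {
    awk '$3 == "oracleasmfs" { print $2 }'
}
```

Run it as asmfs_mountpoint < /proc/mounts; an empty result suggests the driver has not been configured or loaded yet.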


# rpm -ql oracleasmlib
/opt/oracle/extapi
/opt/oracle/extapi/64
/opt/oracle/extapi/64/asm
/opt/oracle/extapi/64/asm/orcl
/opt/oracle/extapi/64/asm/orcl/1
/opt/oracle/extapi/64/asm/orcl/1/libasm.so
/usr/sbin/oracleasm-discover

So this rpm provides you with a library and an executable. Running the executable once you have configured devices is kinda nice:


# /usr/sbin/oracleasm-discover
Using ASMLib from /opt/oracle/extapi/64/asm/orcl/1/libasm.so
[ASM Library - Generic Linux, version 2.0.2 (KABI_V2)]
Discovered disk: ORCL:VOL1 [121634784 blocks (62277009408 bytes), maxio 512]
Discovered disk: ORCL:VOL2 [20971488 blocks (10737401856 bytes), maxio 512]
Discovered disk: ORCL:VOL3 [20971488 blocks (10737401856 bytes), maxio 512]
Discovered disk: ORCL:VOL4 [419424957 blocks (214745577984 bytes), maxio 512]
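
If you just want the disk names and sizes out of that output, a small filter does the job. This is a sketch that assumes the exact "Discovered disk:" line layout shown above; the function name asm_discovered_sizes is hypothetical.

```shell
# Reads oracleasm-discover output on stdin and prints "LABEL BYTES" for
# each discovered disk. Lines that do not match the layout are skipped.
# Hypothetical helper; relies on the line format shown in this post.
asm_discovered_sizes() {
    sed -n 's/^Discovered disk: ORCL:\([^ ]*\) \[[0-9]* blocks (\([0-9]*\) bytes).*/\1 \2/p'
}
```

Usage would be /usr/sbin/oracleasm-discover | asm_discovered_sizes.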

The final rpm is the kernel module:


# rpm -ql oracleasm-2.6.9-22.ELsmp
/lib/modules/2.6.9-22.ELsmp/kernel/drivers/addon/oracleasm
/lib/modules/2.6.9-22.ELsmp/kernel/drivers/addon/oracleasm/oracleasm.ko

You want to ensure that the oracleasm module has been loaded by the kernel:


# /sbin/lsmod |grep oracleasm
oracleasm 55176 1

You can find information about the module with modinfo:


# /sbin/modinfo oracleasm
filename: /lib/modules/2.6.9-22.ELsmp/kernel/drivers/addon/oracleasm/oracleasm.ko
description: Kernel driver backing the Generic Linux ASM Library.
author: Joel Becker
version: 2.0.3
license: GPL
depends:
vermagic: 2.6.9-22.ELsmp SMP gcc-3.4
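
A kernel module is built for one specific kernel release, so it is worth confirming that the vermagic line matches the running kernel. The sketch below pulls the release out of modinfo output; the function name modinfo_kernel is my own, and it assumes the kernel release is the first word after "vermagic:" as in the output above.

```shell
# Reads `modinfo` output on stdin and prints the kernel release recorded
# in the module's vermagic line.
# Hypothetical helper for illustration.
modinfo_kernel() {
    awk '$1 == "vermagic:" { print $2 }'
}
```

You could then compare the result of /sbin/modinfo oracleasm | modinfo_kernel with the output of uname -r; if the two differ, the installed kernel module rpm was probably built for a different kernel.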

Make sure the devices you are trying to use are known to the kernel; you can check in /dev/ or look in /proc/partitions. ASMLib likes to work on partitions, which you can create on a device using fdisk.

A list of devices marked by ASMLib is generated with:


# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4

You can cross-reference this with what is in the /dev/oracleasm/disks directory:


# ls -l /dev/oracleasm/disks/
total 0
brw-rw---- 1 oracle oinstall 8, 17 Jun 24 09:13 VOL1
brw-rw---- 1 oracle oinstall 8, 49 Jun 24 09:13 VOL2
brw-rw---- 1 oracle oinstall 8, 65 Jun 24 09:13 VOL3
brw-rw---- 1 oracle oinstall 8, 97 Jun 24 09:13 VOL4
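
The cross-reference can be automated by saving both lists to files and comparing them. This is a sketch with a made-up function name, asm_label_mismatch; it expects one label per line in each file.

```shell
# Compares two files of labels, one per line.
# $1: saved output of `oracleasm listdisks`
# $2: saved output of `ls /dev/oracleasm/disks`
# Prints every label that appears in only one of the two lists.
# Hypothetical helper for illustration.
asm_label_mismatch() {
    # Labels ASMLib lists that have no node in the disks directory
    grep -Fxv -f "$2" "$1" | sed 's/^/no device node: /'
    # Device nodes that ASMLib does not list
    grep -Fxv -f "$1" "$2" | sed 's/^/not listed: /'
}
```

No output means the two views agree. The -x flag makes grep match whole lines, so VOL1 is not confused with VOL10.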

You can use querydisk to determine which device a particular ASMLib Volume corresponds to:


# /etc/init.d/oracleasm querydisk VOL1
Disk "VOL1" is a valid ASM disk on device [8, 17]
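
If you are checking many volumes, the major and minor numbers can be pulled out of that message with sed. This sketch assumes the message wording shown above; querydisk_majmin is a hypothetical name.

```shell
# Reads querydisk output on stdin and prints "MAJOR MINOR" for each line
# of the form: ... on device [MAJOR, MINOR]
# Hypothetical helper; relies on the message format shown in this post.
querydisk_majmin() {
    sed -n 's/.*on device \[\([0-9]*\), *\([0-9]*\)].*/\1 \2/p'
}
```

For example, /etc/init.d/oracleasm querydisk VOL1 | querydisk_majmin.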

You can find out which device this represents with the following:


# grep "8 17" /proc/partitions
8 17 60817392 sdb1
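
A plain grep on the two numbers can trip over column padding, or match unrelated lines such as minor 170, so matching on fields is a bit safer. Here is a sketch; the function name dev_for_majmin is made up, and it assumes the usual /proc/partitions column order of major, minor, #blocks, name.

```shell
# Reads /proc/partitions-style lines on stdin and prints the device name
# whose major and minor numbers match the two arguments exactly.
# Hypothetical helper for illustration.
dev_for_majmin() {
    awk -v maj="$1" -v min="$2" '$1 == maj && $2 == min { print $4 }'
}
```

Usage: dev_for_majmin 8 17 < /proc/partitions.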

Still paranoid that this might not be your device? Check the contents of the disk header:


# od -c /dev/sdb1 |head -10
0000000 001 202 001 001 200 036 - W 310
0000020
0000040 O R C L D I S K V O L 1
0000060
0000100 020 \n 001 003 V O L 1
0000120
0000140 D A T A 1
0000160
0000200 V O L 1
0000220
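
Going by the dump above, the ORCLDISK tag sits at octal offset 40 (byte 32) and the label follows it at byte 40. The sketch below reads just those bytes. Treat the offsets and the 24-byte label length as assumptions taken from this one dump rather than a documented layout; asm_header_label is a hypothetical name.

```shell
# Prints the ASMLib label found in the header of a device or image file.
# Returns non-zero if the ORCLDISK tag is not present at byte 32.
# Offsets and the 24-byte label length are assumptions read off the od
# output in this post, not a documented format.
asm_header_label() {
    dev=$1
    tag=$(dd if="$dev" bs=1 skip=32 count=8 2>/dev/null | tr -d '\000')
    [ "$tag" = "ORCLDISK" ] || return 1
    # Strip the NUL padding that follows the label
    dd if="$dev" bs=1 skip=40 count=24 2>/dev/null | tr -d '\000'
    echo
}
```

Run as root against the partition, for example asm_header_label /dev/sdb1. It only reads from the device.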

There is also a neat trick with blkid, which reads the labels from the disk headers:


# ./blkid | grep asm
/dev/sdb1: LABEL="VOL1" TYPE="oracleasm"
/dev/sdd1: LABEL="VOL2" TYPE="oracleasm"
/dev/sde1: LABEL="VOL3" TYPE="oracleasm"
/dev/sdg1: LABEL="VOL4" TYPE="oracleasm"
/dev/sdo1: LABEL="VOL1" TYPE="oracleasm"
/dev/sdq1: LABEL="VOL2" TYPE="oracleasm"
/dev/sdr1: LABEL="VOL3" TYPE="oracleasm"
/dev/sdt1: LABEL="VOL4" TYPE="oracleasm"
/dev/emcpowerf1: LABEL="VOL4" TYPE="oracleasm"
/dev/emcpowerp1: LABEL="VOL3" TYPE="oracleasm"
/dev/emcpowero1: LABEL="VOL2" TYPE="oracleasm"
/dev/emcpowern1: LABEL="VOL1" TYPE="oracleasm"

You can see from the above that I have multiple devices corresponding to the same physical device, and that I am using EMC PowerPath as the multipathing software.
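
With many paths it is easier to read the blkid output grouped by label. The following is a sketch only: it assumes each line has exactly the device, LABEL and TYPE fields in the order shown above (other blkid versions may print extra fields), and asm_paths_by_label is a name I made up.

```shell
# Reads blkid output on stdin and prints one line per ASMLib label,
# followed by every device path carrying that label.
# Hypothetical helper; assumes lines of the form
#   /dev/sdb1: LABEL="VOL1" TYPE="oracleasm"
asm_paths_by_label() {
    awk -F'[:"]' '
        $5 == "oracleasm" { paths[$3] = paths[$3] " " $1 }
        END { for (label in paths) print label ":" paths[label] }
    ' | sort
}
```

A label that shows up with several paths is what you would expect on a multipathed LUN.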

Note that not all versions of blkid (well, it’s actually the e2fsprogs version that matters) pick up oracleasm as a type.

As you can see, there are various techniques to check which devices you have configured via ASMLib for use with your ASM instance!


21 Comments

  1. If you are using EMC PowerPath as your multipathing software, then why does querydisk report a single path instead of showing /dev/emcpowerxxx?

    # /etc/init.d/oracleasm querydisk VOL1
    Disk "VOL1" is a valid ASM disk on device [8, 17]

    You can find out which devices this represents with the following:

    # grep "8 17" /proc/partitions
    8 17 60817392 sdb1 (shouldn’t this be emcpowerxxx?)

  2. jarneil

     /  November 14, 2008

    Hi Lance,

    Thanks for reading!

    Yes, I think the recommended method is to use the emcpower device, however powerpath is clever enough to do the multipathing even if you are not referencing the emcpower device.

    I believe there have historically been some issues with ASMLib using the emcpower devices and the solution was to use the non emcpower devices – though you would still get multipathing.

    jason.

    • Cyril

       /  June 21, 2011

      We are having this same problem

      ./oracleasm querydisk -d DISK1

      shows the major/minor node numbers of the underlying device and not the emcpowerpath device

      This is 11gR2 :(

  3. Ulf Popeno

     /  January 16, 2009

    Hi jason!

    I find your article very informative, but I just want to mention the importance of /etc/sysconfig/oracleasm if you find the wrong disks being mapped. It offers a good way to filter out unwanted disk types.

  4. John Sobecki

     /  July 6, 2009

    Hi,

    Nice write-up.

    Just set ORACLEASM_SCANORDER="emcpower sd" in /etc/sysconfig/oracleasm and you’ll see that /dev/oracleasm/disks now points to the emcpowerX device, not the individual sdX device.

    Good Day,
    John

  5. Need help: every time I try to add an ASM disk to an ASM diskgroup it hangs the server completely.

    We are using a Pillar SAN with LUNs created, and are using the following multipath device. (I’m a DBA more than anything else, but I am rather familiar with Linux; SAN hardware not so much.)

    Device       Size   Mount Point
    /dev/dpda1   11G    /u01

    The above device is working fine. Below are the ASM disks being created:

    Device       Size   Oracle ASM Disk Name
    /dev/dpdb1   198G   ORCL1
    /dev/dpdc1   21G    SIRE1
    /dev/dpdd1   21G    CART1
    /dev/dpde1   21G    SRTS1
    /dev/dpdf1   21G    CRTT1

    I try to create the first ASM disk:

    /etc/init.d/oracleasm createdisk ORCL1 /dev/dpdb1
    Marking disk "ORCL1" as an ASM disk: FAILED

    So I check the oracleasm log:

    # cat /var/log/oracleasm
    Device "/dev/dpdb1" is not a partition

    I did some research and found that this is a common problem with multipath devices, and that to work around it you have to use asmtool:

    # /usr/sbin/asmtool -C -l /dev/oracleasm -n ORCL1 -s /dev/dpdb1 -a force=yes
    asmtool: Device "/dev/dpdb1" is not a partition
    asmtool: Continuing anyway

    Now I scan and list the disks:

    # /etc/init.d/oracleasm scandisks
    Scanning the system for Oracle ASMLib disks: OK
    # /etc/init.d/oracleasm listdisks
    ORCL1

    Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdy, sector 0

    Here’s some extra info:

    # /sbin/blkid | grep asm
    /dev/sdc1: LABEL="ORCL1" TYPE="oracleasm"
    /dev/sdk1: LABEL="ORCL1" TYPE="oracleasm"
    /dev/sds1: LABEL="ORCL1" TYPE="oracleasm"
    /dev/sdaa1: LABEL="ORCL1" TYPE="oracleasm"
    /dev/dpdb1: LABEL="ORCL1" TYPE="oracleasm"

    I have learned that by excluding devices in the oracleasm configuration file I eliminate those I/O errors in /var/log/messages:

    # cat /etc/sysconfig/oracleasm
    #
    # This is a configuration file for automatic loading of the Oracle
    # Automatic Storage Management library kernel driver. It is generated
    # By running /etc/init.d/oracleasm configure. Please use that method
    # to modify this file
    #

    # ORACLEASM_ENABELED: ‘true’ means to load the driver on boot.
    ORACLEASM_ENABLED=true

    # ORACLEASM_UID: Default user owning the /dev/oracleasm mount point.
    ORACLEASM_UID=oracle

    # ORACLEASM_GID: Default group owning the /dev/oracleasm mount point.
    ORACLEASM_GID=oinstall

    # ORACLEASM_SCANBOOT: ‘true’ means scan for ASM disks on boot.
    ORACLEASM_SCANBOOT=true

    # ORACLEASM_SCANORDER: Matching patterns to order disk scanning
    ORACLEASM_SCANORDER="dp sd"

    # ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
    ORACLEASM_SCANEXCLUDE="sdc sdk sds sdaa sda"

    # ls -la /dev/oracleasm/disks/
    total 0
    drwxr-xr-x 1 root root 0 Aug 14 10:47 .
    drwxr-xr-x 4 root root 0 Aug 13 15:32 ..
    brw-rw---- 1 oracle oinstall 251, 33 Aug 14 13:46 ORCL1

    Now I can go into dbca to create the ASM instance, which starts up fine. I create a new diskgroup, see ORCL1 as a provisioned ASM disk, select it and click OK.
    CRASH!!! The box hangs and I have to reboot it.

    I have gotten myself to exactly the same point, right before clicking OK, and here is what is in the ASM alert log so far:

    Fri Aug 14 14:42:02 2009
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
    Autotune of undo retention is turned on.
    IMODE=BR
    ILAT =0
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    Starting up ORACLE RDBMS Version: 11.1.0.6.0.
    Using parameter settings in server-side spfile /u01/app/oracle/product/11.1.0/db_1/dbs/spfile+ASM.ora
    System parameters with non-default values:
    large_pool_size = 12M
    instance_type = "asm"
    diagnostic_dest = "/u01/app/oracle"
    Fri Aug 14 14:42:04 2009
    PMON started with pid=2, OS id=3300
    Fri Aug 14 14:42:04 2009
    VKTM started with pid=3, OS id=3302 at elevated priority
    VKTM running at (20)ms precision
    Fri Aug 14 14:42:04 2009
    DIAG started with pid=4, OS id=3306
    Fri Aug 14 14:42:04 2009
    PSP0 started with pid=5, OS id=3308
    Fri Aug 14 14:42:04 2009
    DSKM started with pid=6, OS id=3310
    Fri Aug 14 14:42:04 2009
    DIA0 started with pid=7, OS id=3312
    Fri Aug 14 14:42:04 2009
    MMAN started with pid=8, OS id=3314
    Fri Aug 14 14:42:04 2009
    DBW0 started with pid=9, OS id=3316
    Fri Aug 14 14:42:04 2009
    LGWR started with pid=6, OS id=3318
    Fri Aug 14 14:42:04 2009
    CKPT started with pid=10, OS id=3320
    Fri Aug 14 14:42:04 2009
    SMON started with pid=11, OS id=3322
    Fri Aug 14 14:42:04 2009
    RBAL started with pid=12, OS id=3324
    Fri Aug 14 14:42:04 2009
    GMON started with pid=13, OS id=3326
    ORACLE_BASE from environment = /u01/app/oracle
    Fri Aug 14 14:42:04 2009
    SQL> ALTER DISKGROUP ALL MOUNT
    Fri Aug 14 14:42:41 2009

  6. John Sobecki

     /  August 31, 2009

    Hi,

    On the bang/crash part, you need to set up diskdump (RHEL4) or netdump or kdump for RHEL5. A serial console is also nice. Also, what versions of the packages did you install?

    # rpm -qa | grep oracleasm

    Good Day,
    John

  7. Hi Jason

    Great Blog!

    I was going to blog about the exact same topic
    I.e. ASM on EMC SAN using PowerPath

    Glad to see you had an answer for multipathing using non-powerpath devices.

    One thing to look out for is that EMC PowerPath (EMCpower.LINUX-5.1.2.00.00-021.rhel5.x86_64.rpm) is neither compatible with nor certified against the latest Linux kernel, 2.6.18-128 / EL5 Update 3.

    It causes a kernel panic:
    Kernel panic - not syncing: Fatal exception

    - John

    • jarneil

       /  September 17, 2009

      Hi John,

      Thanks!

      It can be a full time job just keeping up with which bit of software works/is certified with which. Kinda sad as it leads to a lowest common denominator.

      jason.

  8. Ali

     /  October 6, 2009

    Thanks for this; it helped me solve my ASM issue when provisioning an extra node.

    Regards

    Ali

  9. abbas

     /  October 31, 2009

    Hi, I found your post informative, but I’m having trouble with the ASM disk I created on my Linux 4 box. dbca cannot find the ASM disk I created; I’ve changed the discovery path, but it still didn’t work. I hope the info below will be useful.
    uname -r: 2.6.9-42.0.0.0.1.ELsmp
    I also checked my ASM drivers and this is what I have from rpm -qa |grep asm:
    oracleasm-2.6.9-42.0.0.0.1.ELsmp-2.0.3-2
    oracleasm-support-2.0.3-2
    oracleasm-2.6.9-42.0.0.0.1.EL-2.0.3-2
    Please help me out!

    • jarneil

       /  November 2, 2009

      Hello,

      You need to run through some of the troubleshooting steps outlined above!

      jason.

  10. Nice blog, I like it; it’s informative.
    I will visit this blog more often.
    I especially like your article about
    ASMLib Troubleshooting.

    Cheers

  11. sreekanth

     /  May 11, 2010

    Nice blog …

    I am seeing the following error in my alert log ….

    RBAL started with pid=20, OS id=6944
    ERROR: asm_version error. err: driver/agent not installed rc:2
    Errors in file /app/dsss_odbs_bz2t/diag/rdbms/test/TEST/trace/TEST_rbal_6944.trc:
    ORA-15183: ASMLIB initialization error [driver/agent not installed]
    WARNING:FAILED to load library: /opt/oracle/extapi/32/asm/orcl/1/libasm.so
    Tue May 11 10:34:21 2010
    ERROR: asm_init(): asm_erc:-5 msg:Driver not installed pid:6907
    Errors in file /app/dsss_odbs_bz2t/diag/rdbms/test/TEST/trace/TEST_psp0_6907.trc (incident=2449):
    ORA-00600: internal error code, arguments: [kfk_load_by_idx_in_pga9], [1], [0], [], [], [], [], []
    ORA-15186: ASMLIB error function = [asm_init], error = [18446744073709551611], mesg = [Driver not installed]
    Incident details in: /app/dsss_odbs_bz2t/diag/rdbms/test/TEST/incident/incdir_2449/TEST_psp0_6907_i2449.trc
    Errors in file /app/dsss_odbs_bz2t/diag/rdbms/test/TEST/trace/TEST_psp0_6907.trc:
    ORA-00600: internal error code, arguments: [kfk_load_by_idx_in_pga9], [1], [0], [], [], [], [], []
    ORA-15186: ASMLIB error function = [asm_init], error = [18446744073709551611], mesg = [Driver not installed]
    PSP0 (ospid: 6907): terminating the instance due to error 490
    Instance terminated by PSP0, pid = 6907

    Following are the outputs from your ASMLib troubleshooting guide.

    [srv0001:dsss_odbs_bz2t:] uname -r
    2.6.18-194.el5

    [srv0001:dsss_odbs_bz2t:] rpm -qa |grep asm
    oracleasm-2.6.18-194.el5-2.0.5-1.el5
    oracleasmlib-2.0.4-1.el5
    oracleasm-support-2.1.3-1.el5

    [srv0001:dsss_odbs_bz2t:] /usr/sbin/oracleasm-discover
    Using ASMLib from /opt/oracle/extapi/32/asm/orcl/1/libasm.so
    asm_version() failed with code 2

    Do you have any idea what is wrong? Looking at the above output, I am under the impression that I installed the wrong ASMLib rpm. If I am correct, I am still unsure which rpm is the wrong one.

    Here are my OS details: Enterprise-R5-U5-Server-i386-dvd.iso

    Thanks a lot for your help ….

  12. John Sobecki

     /  May 11, 2010

    Hi sreekanth,

    Did you install the i386 oracleasm rpm for your machine?

    Can you run /etc/init.d/oracleasm configure and set it up for your oracle:dba user, and
    also configure it to start on boot?

    Check for module load:

    # lsmod | grep oracleasm

    Anything in dmesg?

    # dmesg | tail -50

    Thanks, John

  13. sreekanth

     /  May 11, 2010

    Here is the layout of my users on the box.

    groupadd -g 550 oinstall
    groupadd -g 556 asm
    groupadd -g 551 dba
    groupadd -g 560 asmdba

    useradd -m -u 552 -g oinstall -G dba,asmdba -d /home/dsss_odbs_bz2t -s /bin/bash -c "Oracle Software Owner" dsss_odbs_bz2t
    useradd -m -u 553 -g oinstall -G dba,asmdba -d /home/doem_ooem_bz2t -s /bin/bash -c "Oracle Software Owner" dsss_ooem_bz2t
    useradd -m -u 554 -g dba -G asmdba -d /home/dsss_olsn_bz2t -s /bin/bash -c "Oracle Software Owner" dsss_olsn_bz2t
    useradd -m -u 555 -g oinstall -G asm -d /home/dssso02t -s /bin/bash -c "Oracle Software Owner" dssso02t

    I am able to add these disks to ASM. Everything went fine while adding the disks and setting up the ASM instance. The problem occurred when creating the database using DBCA.

    My ASM instance runs under the dssso02t user account and the database runs under dsss_odbs_bz2t.

    When I configured the disks I used dssso02t as the ASM user and asm as the group.

    [root@srv0001 ~]# lsmod | grep oracleasm
    oracleasm 46100 1
    [root@srv0001 ~]#

    [root@srv0001 ~]# dmesg | tail -50
    sda: Write Protect is off
    sda: Mode Sense: 77 00 00 08
    sdb: Write Protect is off
    sdb: Mode Sense: 77 00 00 08
    SCSI device sda: drive cache: none
    SCSI device sdb: drive cache: none
    SCSI device sda: 88014848 512-byte hdwr sectors (45064 MB)
    SCSI device sdb: 79691776 512-byte hdwr sectors (40802 MB)
    sda: Write Protect is off
    sda: Mode Sense: 77 00 00 08
    sdb: Write Protect is off
    sdb: Mode Sense: 77 00 00 08
    SCSI device sda: drive cache: none
    sda:SCSI device sdb: drive cache: none
    sdb: sdb1
    sda1
    sd 0:0:0:0: Attached scsi disk sda
    sd 1:0:0:0: Attached scsi disk sdb
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    sd 1:0:0:0: Attached scsi generic sg1 type 0
    Bluetooth: Core ver 2.10
    NET: Registered protocol family 31
    Bluetooth: HCI device and connection manager initialized
    Bluetooth: HCI socket layer initialized
    Bluetooth: L2CAP ver 2.8
    Bluetooth: L2CAP socket layer initialized
    Bluetooth: RFCOMM socket layer initialized
    Bluetooth: RFCOMM TTY layer initialized
    Bluetooth: RFCOMM ver 1.8
    eth1: no IPv6 routers present
    Bluetooth: HIDP (Human Interface Emulation) ver 1.1
    ASM: oracleasmfs mounted with options:
    ASM: maxinstances=0
    eth2: no IPv6 routers present
    SCSI device sda: 88014848 512-byte hdwr sectors (45064 MB)
    sda: Write Protect is off
    sda: Mode Sense: 77 00 00 08
    SCSI device sda: drive cache: none
    sda: sda1
    SCSI device sdb: 79691776 512-byte hdwr sectors (40802 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 77 00 00 08
    SCSI device sdb: drive cache: none
    sdb: sdb1
    mtrr: your processor doesn’t support write-combining
    mtrr: your processor doesn’t support write-combining
    mtrr: your processor doesn’t support write-combining
    mtrr: your processor doesn’t support write-combining
    ISO 9660 Extensions: Microsoft Joliet Level 3
    ISO 9660 Extensions: RRIP_1991A

    Thanks for your help ….

  14. sreekanth

     /  May 11, 2010

    More details ….

    [/app/dssso02t/product/11.1.0/asm_1]
    [srv0001:dssso02t:+ASM] sqlplus / as sysdba

    SQL*Plus: Release 11.1.0.6.0 - Production on Tue May 11 13:00:07 2010

    Copyright (c) 1982, 2007, Oracle. All rights reserved.

    Connected to an idle instance.

    SQL> startup
    ASM instance started

    Total System Global Area 284565504 bytes
    Fixed Size 1299428 bytes
    Variable Size 258100252 bytes
    ASM Cache 25165824 bytes
    ASM diskgroups mounted

    SQL> select path from v$asm_disk;

    PATH
    --------------------------------------------------------------------------------
    ORCL:VOL1
    ORCL:VOL2

    SQL>

  15. This note was really very helpful. I was stuck with another issue where libasm.so was failing.
    I found the problem was with a file in /etc/sysconfig:
    lrwxrwxrwx 1 root root 24 Sep 7 23:30 oracleasm.rpmsave -> oracleasm-_dev_oracleasm

    It should ideally be:
    lrwxrwxrwx 1 root root 24 Sep 8 23:36 oracleasm -> oracleasm-_dev_oracleasm

    This caused libasm.so to not function properly.

    Thanks again.
    Apun

  16. MaNiSH NashikkaR

     /  September 16, 2011

    Excellent Notes…thanks for useful info…

    Regards,
    Manish Nashikkar

  1. ASMLib and undiscovered disks « Coskan’s Approach to Oracle
  2. Resources added this Week « Oracle Top 5 References's Blog
