Applying an incremental backup to a physical standby

June 3, 2008

I recently had the situation where a physical standby (running 10.2.0.3) had got itself so far out of date, that transferring the required archived redo logs would have necessitated retrieving them from tape. As a far quicker method of getting this standby back up-to-date (as opposed to a complete rebuild), I decided to apply an incremental backup to the standby.

This procedure is actually reasonably well documented. Except this leaves the controlfile at one SCN while all your datafiles have marched up to the new scn. Upon restarting the managed recovery process, the standby is still asking for ancient archived logs that your datafiles already have the changes applied.

It is interesting to contrast the instructions for 10g above with similar, though not the same instructions for performing this on 11g. The 11g instructions ask for the controlfile to be restored as well. Except the restore standby controlfile did not work on 10.2.0.3 instance.

So here is my guide to applying an rman incremental backup to a physical standby.

    Find out where your standby thinks you have applied up to:

standby> alter database recover managed standby database cancel;

standby> select current_scn from v$database;

    Make an rman incremental backup on the primary from the above scn:

primary_rman> backup incremental from scn #### database format '/backup/standby_%U';

    Once this backup file is copied to the standby, catalog it with the standby controlfile:

standby_rman> catalog start with '/backup/standby';

    Now actually apply the changes from the backup to the standby:

standby_rman> recover database noredo;

    At this point your datafiles will be up-to-date, but your controlfile is still out of date, create new standby controlfile from the primary:

primary> alter database create standby controlfile as '/tmp/standby.ctl';

Copy this to the standby and replace your current standby controlfiles with this new one. Once this is done you should be able to restart managed recovery and it will take off from your backup scn.

It is this final stage that does not seem to be covered to well with the documentation. Certainly I found I needed this new standby controlfile after applying an incremental backup to my standby.


Comparing ASM with ZFS

May 21, 2008

Here is a copy of the document I produced for UKOUG delegates attending my presentation on Comparing ASM with ZFS document.

I’m also making available the slides I used, though be WARNED these are 25MB in total (for 22 slides) so think before downloading!

I think in the end I was more comfortable talking about ASM than I was on ZFS, and I did not really do the ZFS part justice. I’m not all that convinced that the actual talk works all that well. I ran over and even then did not cover everything about ZFS that I wanted to.

With that in mind, I have put in two other abstracts for the UKOUG Birmingham conference. I think splitting up the talk into two seperate talks is really the only way to tackle it. Maybe it was like comparing apples & oranges after all.


What ASM can do for you

This will give insight into how ASM works and an understanding of its features. You will gain an insight in the benefits and drawbacks of running a database on ASM. I will cover:

  • Introduction to ASM
  • disk discovery
  • diskgroups
  • extents
  • mirroring
  • rebalancing
  • metadata
  • myths
  • and for ZFS:

    What ZFS can do for you

    This would give insight into how ZFS works and an understanding of its features. You will gain an insight in the benefits and drawbacks of running a database on ZFS. I would aim to cover:

  • Introduction to ZFS
  • Data Integrity
  • Creating a pool
  • Snapshots & Cloning
  • Particularly with the ZFS talk I’d like to do a few demos so people can see just how easy administration with ZFS really is.


    UKOUG UNIX SIG

    May 21, 2008

    I’m clearly doing something wrong. Doug Burns is off cruising down a Scandinavian fjord, while I’ll get to do Solihull! I’m afraid I thought the venue was quite poor, a bit of a pain to get to via train and boy was the lunch stingy!

    Support Update: Phil Davies

    Phil asked the audience how many were on 9i, and astonishingly 3/4 of the audience were on 9.2! This is completely and utterly at odds with the RAC SIG. I wonder if with RAC you have more incentive to upgrade, or perhaps it’s the case that most RAC installs are more recent, therefore the choice being 10gR2.

    rman do or die Roland Brown and Emily Taylor

    This was a real user experience story. It really reminded me how much I do not want to work in the public sector. They were using Veritas Netbackup for backing up their databases but they did not have the Netbackup Oracle agent. So as late as 2004/2005 they were backing up their databases by having Netbackup do a cold backup of their database files. They then had a nightmare trying to persuade their management to fork out for the database agents (total cost £20K), even though they were demanding 24×7 access to the databases! I also found it interesting the timeframe they stated that it took them to upgrade Netbackup to version 6, seems like they had as much hard a time at persuading all the interested parties (management, SA’s) to move to rman as they had actually with rman.

    Lets get Virtual: Clayton Blake & Tom Dale

    This was an amusing talk regarding using VMWARE at BERR. Again another presentation from the public sector, and again they had real hardware limitations so they chose to run several tiers of an application on the one box and use VMWARE to perform the virtualisation. They clearly had done the virtualisation quite well in terms of cloning from a “golden copy”. It seemed to make their deployment far more agile. This was quite an amusing talk and though the dolly the sheep metaphor for cloning is not new, the dolly with a red hat I found quite funny.

    Oracle ILM - Lilian Hobbs

    Lilian used to be the product manager for ILM inside Oracle but has recently left Oracle to become independent. She obviously new her ILM and presented it well. The basic idea is not to delete old data but to move it to cheaper storage. The way the size of disks are going this is obviously becoming cheaper and cheaper. This was the first time I had heard of the concept of MAID (Massive Array of Idle Disks). The idea behind this being that you have a big cabinet of cheap disks but only have some spinning at a time and if you want to access data on a disk that is idle, you just have to wait a bit longer while the drive spins up. This is a clever idea for read only or archive data.

    I could not help thinking that if you are storing these huge quantities of data that ZFS and disk scrubbing might really be of benefit.

    11g partitioning new features: Joel Goodman

    This was a fairly comprehensive look at partitioning and in particular new features available in 11g. I must confess I did not hear much of this of talk, even though I was in the room, as I was far to alarmed at my presentation which was after Joel’s.

    Comparing ASM with ZFS: ME
    It’s not fun being last presenter of the day. The audience had thinned quite substantially I thought, though I don’t think the events in the Midlands attract quite the same numbers anyway. This was the first presentation zen style presentation I had done and I must say I found it a little harder than a traditional bullet point presentation. Twice I completely forgot what I wanted to say, both times on ZFS content - I was a little disappointed I did not really get across all I wanted to regarding the design of ZFS. I thought the ASM content went much better. I also found people asking questions during the presentation really, really put me off my stride and I found it difficult to remember after answering the question exactly where I had got to.

    It will be interesting seeing the feedback on this.


    Registering archive logfiles on a standby

    May 16, 2008

    I’ve been having some issues with a physical standby running Oracle 10.2.0.3. This has caused us to have to manually register logfiles to bridge gaps in the archive sequence on the physical standby.

    One way of registering a logfile is via the following:

    SQL> alter database register logfile
    ‘/var/arch/arch_1_101.arc’;

    So this is a straightforward way of registering a logfile giving the full path to the logfile. However, what if you have a very large number of logfiles to register, the above does not really scale all that well. Thankfully there is an rman command that enables you to register all logfiles within a directory:

    rman> catalog start with '/var/arch';

    This can save you a lot of time if you need to register a large number of files.


    UKOUG RAC & HA SIG

    May 15, 2008

    Introduction

    A rainy day london for the RAC & HA Sig.

    First up as per usual was the survey of who is using what, Julian dyke

    9.2 a few
    10.1 a few
    10.2 almost eveyone
    11.1 two people in production

    solaris on sparc around 10 a similiar number to AIX and 32 bit linux

    The majority on linux x86-64. A little sprinkling of windows. No one on windows itanium. One person shouting out for VMS.

    majority on 2 nodes. a few with 3, and a few with 4. A few with 3 One guy with 10 nodes and one with 12.

    everyone on SAN hardly any with NAS.

    Going on for 1/2 or more using ASM, a sprinkling on OCFS.

    A good lot running physical standby with RAC and a surprising sprinkling using logical standby.

    Only around 20% applying critical patch updates

    Phil Davies Support Update

    Interesting problem about IOT corruption when doing a 9i to 10g upgrade Metalink note: 471479.1.

    Apparently Oracle developers now have to justify why a patch cannot be rolling.

    Joel Goodman Managing Sequences in RAC

    In some ways the name sequence is a misnomer as they are not actually guaranteed to be sequential, there may be gaps, they may even wrap around. They need not start with 1 or increment by 1.

    Sequence cant be rolled back.

    SQ Enqueue is used during cache replenishment.

    gaps occur due to sql being rolled back (seuence number can’t be rolled back it’s an autonomous transaction to get the sequence number)
    metadata ages out of row cache
    instance shutdown.

    In RAC there is the option of order or noorder. This controls the ordering of generated numbers. In increasing levels of performance:

    nocache and order
    nocache and noorder
    cache and order
    cache and noorder

    With cache in RAC each instance caches it’s own set of numbers

    Simon Haslam Fast Connection Failover

    FCF basically means subscribers (app servers) get notified of events occuring on the db tier.

    Possibility of using JDBC driver without an Oracle client with FCF.

    Simon had a demo of a rac node failing and the app server still being used for queries without any intervention. JDBC driver has some methods that can be used with FCF.

    Be interesting to see if this will work with tomcat, but as it’s at the jdbc level there is no reason it should not.

    Mo Beik Sun & HA

    This was a marketing talk, seems somewhat outdated, I’m sure there are lots of customers using Sun, Oracle Veritas, but really with 10g (&11) oracle are out to kill veritas anyway.

    Claimed Oracle 11.2 was slated for end of year 2008. Hmm, is this going to be announced at Openworld?

    He showed some graphs comparing ASM with SharedQFS, and as the number of nodes in a cluster increased (6, 7 & 8 nodes) SharedQFS seemed (in these sun tests) to actually outperform ASM. At lower number of nodes there was not really anything in it.

    Harald Van Breederode Client Connectivity in a Dataguard Environment

    how can one prevent clients from connecting to the wrong database in a dataguard environment?

    Harald’s idea is using database services and after startup triggers to set services based on events happening on the database, e.g. if the database becomes read only, then a service is started that reflects this. It is now just a case of ensuring that clients connect to the appropriate service. This can be used in case of switchovers or failovers.

    This was a really clever way of ensuring application connectivity in a dataguard environment.

    Martin Bach Oracle RAC on Oracle VM

    paravirtualisation vm’s do not (necessarily) simulate hardware
    vm performance closer to physical hardware but requires kernel changes.

    This seemed like a great way of having a play with RAC without having to have additional hardware.

    Julian Dyke’s Tour

    Julian has been presenting in the far east and Australasia. He has encountered quite a few CRS corruptions he emphasized how important it is to backup your OCR automatic backups.

    In Summary

    As usual highly worthwhile event.


    ASM Mirroring

    May 12, 2008

    ASM can provide RAID like protection for your data. ASM provides redundancy via failure groups, essentially when you create a disk group you assign disks that are members of the disk group to a particular failure group. ASM will then mirror the data between these failure groups. The idea being you keep disks that are dependent on the same hardware in the same failure group, so failure of a particular component does not impact the availability of your disk group.

    ASM actually mirrors at the extent level. When an extent is allocated there is the concept of a primary extent and a secondary extent, essentially a primary copy and a mirror copy and these are allocated within different failure groups. The above diagram shows primary extents in purple and secondary extents in red (each square represents an extent).

    By default, ASM always reads the primary copy of an extent. At first I thought this was a big limitation and would reduce the I/O bandwidth available, but of course ASM mirroring ensures the primary copies are spread across the failure groups so that the I/O is spread over the maximum number of drives.

    Frits Hoogland pointed out to me this may still be less efficient than traditional mirroring in that ASM has no chance of optimising a read based on which underlying physical device has the closest disk head position.

    To be fair, Oracle themselves to state that external redundancy is preferred unless you have particular requirements that can only be provided with a software RAID solution. Oh and yeah, preferred_read_failure_groups seem like a real win for extended clusters if that is your bag.


    ASM Extents

    May 6, 2008

    Every ASM disk is divided into allocation units (au). ASM files are stored as extents and an extent consists of one or more allocation unit, though it was only 11g that brought in variable sized extents. The ASM instance provides the RDBMS instance with an extent map that the RDBMS instance then uses when doing I/O.

    The diagram above is meant to show the extents of a pair of ASM files distributed amongst the available drives in a disk group. Essentially this is the algorithm that ASM uses to maximise the I/O performance - spread all data across the disks in a disk group.

    When you create a disk group in 11g you can specify the size of the allocation unit to be from 1MB to 64MB, the size doubling between these limits. That is you can set the size of the au for a disk group to be one of 1, 2, 4, 8, 16, 32, or 64MB.

    Clearly the larger the au size chosen the less the number of extents it will take to map a file of a given size. The larger au are clearly beneficial for large data files and cuts down on SGA required to track. Each individual extent resides on a single disk.

    Extents can vary in size from 1 au to 8 au to 64 au. The number of au a given extent will use is dependent on the number of extents allocated and the extent size increases at a threshold of 20,000 extents to 8 and then again at 40,000 extents to 64. Again this is designed to be beneficial to larger data files, requiring less extents to be tracked.

    You can see how the extents are allocated between disks in a disk group by looking at the X$KFFXP view:


    SQL> select count(*), group_kffxp, disk_kffxp

    from X$KFFXP
    group by group_kffxp, disk_kffxp
    order by group_kffxp;

    This will show you how many au have been allocated to each disk, if you have a healthy balanced system each disk in a disk group should have a similar number of au.

    A very useful script for looking at all this is available on metalink, look for Note: 351117.1, diagnosing ASM space issues, well worth having a look at.


    ASM and disk “hot spots”

    May 1, 2008

    I’ve seen repeated in various locations that ASM somehow has the ability to move data around in response to how much I/O is ocurring on each of the disks that ASM is managing. The theory goes that by doing this, ASM is able to balance the I/O amongst all the drives, thus giving your RDBMS instance(s) that are using ASM the absolute tip-top I/O performance that could possibly be achieved given your hardware limitations.

    Sounds great? Trouble is, it just is not true. This idea has gained a bit of traction in the community, and I’m sure many people think ASM is perhaps more clever than it actually is. Whether this is due to marketing terminological inexactitude, i’ll leave up to the reader to decide.

    The only metric ASM uses when determining where data should be located is the capacity of the disks in a disk group. ASM’s goal in placing data is to ensure every drive is filled to the same amount. Therefore if you have a disk group of equal size they will receive the same amount of data. The theory being that by spreading the data evenly across the drives you will achieve good I/O performance as both drives are likely to be serving the same number of I/O requests.

    ASM does expose some data on how many requests each disk in a disk group is performing, this is via V$ASM_DISK_STAT:

    SQL>select group_number, disk_number, read_time, write_time, bytes_read, bytes_written

    from V$ASM_DISK_STAT;

    GROUP_NUMBER DISK_NUMBER  READ_TIME WRITE_TIME BYTES_READ BYTES_WRITTEN
    ------------ ----------- ---------- ---------- ---------- -------------
    
    4                0       14910505.7 4626148.24 4.1821E+13 1.4998E+12
    4                1       14965432.3 5324739.98 4.1833E+13 1.6264E+12 
    

    There are two disks in this disk group , which are actually of equal size. They have both read and written a similar quantity of data, though it is not exactly equal. The average write time shows a bigger discrepancy than the read times.

    Basically, the point is that for equal sized disks in a disk group the ASM algorithm of distributing data according to capacity works reasonably well.

    But consider if you had different sized disks in a disk group. A larger disk gets more data. Is a larger disk actually quicker at returning that data? Well, probably not, and the larger the discrepancy in sizes the larger the skew of I/O there will be.

    Maybe one day, ASM will have the ability shift data based on the I/O activity of the underlying drives but until then, make sure all the disks you have in a disk group are of the same size (oh and same performance characteristic). That way you’ll protect yourself from any I/O hot spots that ASM won’t quite save you from yet!


    Dropping a disk in an ASM Disk Group

    April 29, 2008

    One of the big selling points of ASM is the ability to reconfigure the storage online. Previously I’ve blogged about expanding a disk in a disk group. Another useful feature of ASM is to use it to migrate from one set of disks in a disk group to another, or indeed from one storage array to another.

    This is basically achievable because ASM distributes data across all disks in a disk group evenly, and assuming you have enough space, you can happily drop disks in a disk group and ASM will seamlessly migrate the data to the existing disks in the disk group.


    SQL> select group_number, name, TOTAL_MB, FREE_MB
    from V$asm_disk_stat;

    GROUP_NUMBER      NAME          TOTAL_MB    FREE_MB
    ------------ ---------------- ---------- ----------
    	   1      VOL1		61439	     61187
    	   2      VOL2		61439	     61164
    	   3      VOL3		61439	     61164
    	   4      VOL4	       409594	    310962
    	   4      VOL5	       153597	     95240
    

    So, we see here that VOL4 and VOL5 are two disks (luns) in disk group 4. Previously I had expanded VOL4 and this now has enough capacity to encompass all the data resident on this disk group. I am now safe to drop VOL5 and this is an online operation:

    SQL> alter diskgroup DATA4 drop disk VOL5;

    Diskgroup altered.

    This alter diskgroup command essentially shuffles extents from the disk you are removing and distributes them to the remaining disks in your disk group. While the operation is continuing you can check V$ASM_OPERATION for the progress you are making:

    SQL> select * from v$asm_operation;

    GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE  EST_MINUTES
    ------------ ----- ---- ----- ----- ------ -------  ---------- ----------
    	   4 REBAL RUN     1	 1   100   42234       1007     41 

    Most of the columns here are self explanatory, however the SOFAR column tells you the number of Allocation Units (au) that have been moved, the EST_WORK and EST_RATE are also in au and au/minute.

    Once the rebalance has moved all the Allocation Units the disk is removed from the disk group:


    SQL> select group_number, name, TOTAL_MB, FREE_MB
    from V$asm_disk_stat;

    GROUP_NUMBER      NAME          TOTAL_MB    FREE_MB
    ------------ ---------------- ---------- ----------
    	   1      VOL1		61439	     61187
    	   2      VOL2		61439	     61164
    	   3      VOL3		61439	     61164
    	   4      VOL4	       409594	    252006
    

    Dropping a disk in a disk group seemed to work as advertised, the real benefit of course, is instead of it being just a disk you were dropping but that it was a lun representing a whole storage array, then this has real potential for allowing you to upgrade storage or even migrate to a different storage platform entirely.


    Expanding an ASM disk

    April 22, 2008

    One of the major advantages of ASM is the ability to reconfigure the storage online. In theory you should be able to add disks, remove disks, and resize disks all the while your ASM and RDBMS instances just keep humming along.

    However, I don’t think it is actually possible to expand a lun without downtime if you are using ASMLib. Part of the problem seems to be that with ASMLib you have to create a partition and certainly on RedHat 4 Update 3, using kernel 2.6.9-34.ELsmp, to change a partition table required that ASM was not using the disk that the partition table was residing on.

    Recently I found this out the hard way when I attempted to increase the size of a lun that was being used by ASM. Expanding the lun on the storage was fairly straightforward on the EMC Clariion on which the data was residing.

    I’m not really sure if this is the best way of mapping OS device -> ASM disk:

    [jason@bdb ~]$ sudo /etc/init.d/oracleasm querydisk VOL4
    Disk “VOL4″ is a valid ASM disk on device [8, 1]

    I believe this to be the major and minor number of the device, so you can look in /dev to see what device this corresponds to:

    [jason@bdb ~]$ ls -l /dev/sda1
    brw-rw—- 1 root disk 8, 1 Apr 3 16:29 /dev/sda1

    Or indeed thanks to Charles Kim you can run the querydisk the opposite way round:

    [jason@bdb ~]$ sudo /etc/init.d/oracleasm querydisk /dev/sda1
    Disk “/dev/sda1″ is marked an ASM disk with the label “VOL4″

    I cannot see in any V$ASM view where this mapping from asm disk -> OS device is exposed, perhaps ASMLib is getting in the way here. What can say is that ASM disk VOL4 maps to /dev/sda1 which is a partiton on /dev/sda. I then increased the size of the lun that this device was created from.

    Then comes the scary part, getting the OS to see the increased lun. This was running on rhel 4 update 3, and after a reboot I could see the following via fdisk:

    [jason@bdb ~]$ sudo /sbin/fdisk /dev/sda

    Command (m for help): p

    Disk /dev/sda: 429.4 GB, 429496729600 bytes
    255 heads, 63 sectors/track, 52216 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot Start End Blocks Id System
    /dev/sda1 1 32635 262140606 83 Linux

    So the OS can see that the device /dev/sda now has 52216 cylinders but only 32635 (which was the original lun size) have been allocated to the partition /dev/sda1. Now you can actually just delete the partition and recreate it without losing any data:

    Command (m for help): d
    Selected partition 1

    This has deleted the partition

    Command (m for help): p

    Disk /dev/sda: 429.4 GB, 429496729600 bytes
    255 heads, 63 sectors/track, 52216 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot Start End Blocks Id System

    Now you have to recreate it:

    Command (m for help): n
    Command action
    e extended
    p primary partition (1-4)
    p
    Partition number (1-4): 1
    First cylinder (1-52216, default 1):
    Using default value 1
    Last cylinder or +size or +sizeM or +sizeK (1-52216, default 52216):
    Using default value 52216

    Now we can see the /dev/sda1 partition is up to the full capacity of the underlying lun:

    Command (m for help): p

    Disk /dev/sda: 429.4 GB, 429496729600 bytes
    255 heads, 63 sectors/track, 52216 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot Start End Blocks Id System
    /dev/sda1 1 52216 419424988+ 83 Linux

    Don’t forget to write the changes out:


    Command (m for help): w
    The partition table has been altered!

    Calling ioctl() to re-read partition table.
    Syncing disks.

    I actually had ASM shut down at this point because fdisk had previously stated the device was busy (when trying to write the new partition table) and that the kernel would still use the old partition:


    WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
    The kernel still uses the old table.
    The new table will be used at the next reboot.
    Syncing disks.

    Once ASM (and obviously the RDBMS instance relying on this ASM instance) was down, I was able to write the partition table. Without changing the partition table, ASM would not recognise that the luns had been increased. After this partition table was written, getting ASM to increase what it thought was the size of disk was quite simple:


    SQL> select group_number, name, TOTAL_MB, FREE_MB
    from V$asm_disk_stat; 2

    GROUP_NUMBER NAME TOTAL_MB FREE_MB
    ———— —————————— ———- ———-
    1 VOL1 61439 61187
    2 VOL2 61439 61164
    3 VOL3 61439 61164
    4 VOL4 255996 157374
    4 VOL5 153597 95230

    SQL> alter diskgroup DATA4 resize all rebalance power 4;

    Diskgroup altered.

    SQL> select group_number, name, TOTAL_MB, FREE_MB
    from V$asm_disk_stat; 2

    GROUP_NUMBER NAME TOTAL_MB FREE_MB
    ———— —————————— ———- ———-
    1 VOL1 61439 61187
    2 VOL2 61439 61164
    3 VOL3 61439 61164
    4 VOL4 409594 310962
    4 VOL5 153597 95240

    So what am I saying, well it seems that with ASMLib it is hard to resize a disk completely online with ASM, but due to the fact that the Linux partition table cannot be re-written while ASM has open the device.

    Perhaps, this is a disadvantage of ASMLib compared to running without out, though for my money ASMLib seems to be the favoured Oracle solution, certainly I think it is pushed in the documentation.