It was a funny start to my day: my kids woke up in the middle of the night, and at 3:00am my fire alarm went off, which gave me quite a fright, though thankfully nothing was actually on fire. So I was pretty knackered as I got into London on a beautifully sunny day for the UKOUG RAC & HA meeting. There were so many familiar faces in the audience, which is great to see.
Preliminary Introduction – Dave Burnham
Before the survey of who is using what, Dave Burnham mentioned that the following meeting in Blythe Valley Park may include an “expert panel” discussion forum. I think this could prove to be an excellent idea, though of course it’s highly dependent on the quality of the questions discussed. There is also the possibility of doing something on virtualisation, which is something I’m very interested in at the minute.
So onto the survey, and as always most people have 2 nodes in a cluster, with a handful on 3 or 4. The highest in attendance today was 6 nodes. The vast majority are running 64-bit Linux, apart from a somewhat fed-up Martin Bach having the misfortune to be on 32-bit Linux. Not one person is running RAC on Windows, even though there was a presentation on running RAC on Windows. A good number are running RAC on Solaris SPARC.
The vast majority of users are on 10.2, with a handful having upgraded to 11.1. Most people are using Fibre Channel connectivity, with HP being the most popular storage vendor, EMC close behind, and NetApp, IBM and Sun all on similar amounts. There was a forest of hands for people using ASM; this is far and away the most popular choice. A handful are using Veritas, but again the majority are just using Oracle Clusterware with no 3rd party clusterware.
As for app servers, WebLogic is quite popular, along with Oracle App Server. Only a few are on Tomcat, and the vast, vast majority are using home-written apps on their RAC clusters. A good number are using physical standby, and one guy is running Active Data Guard. Almost everyone is using OEM. Only 1 person has used Real Application Testing with their RAC install.
Oracle Support Update – Phil Davies
The thing I took away was the increasing importance of patch bundles, several of which were mentioned, in particular the ones for Physical/Logical Standby. Phil had a throwaway point that he thought 10.2.0.5 (the terminal 10.2 release) was likely next spring. This seems incredible to me, as 10.2.0.4 was released (for x86-64) in March 2008, so that would be a 2 year gap. It’s not as if there aren’t already enough one-off patches to make up a patchset; I think there are already 500+ one-off patches on top of 10.2.0.4.
Complete Upgrade to 11g – Karen Ambrose
This presentation detailed how Karen got on upgrading a 10.2.0.2 cluster up to 11.1.0.6, including Clusterware, ASM, and the RDBMS. She chose to use DBUA and it all seemed to work fairly well, apart from the ASM instance, which she said she had to upgrade manually. They then went up to 11.1.0.7 with an ASM rolling migration. One strange point is that they are using multiple listeners, one for ASM and one for the RDBMS. Joel Goodman pointed out this was unnecessary, and I have to say there did not seem to be a compelling reason for running the 2 listeners. They wanted to go to 11g to utilise the SecureFiles feature. They have separate homes for the ASM and RDBMS instances, and have multiple RDBMS versions.
What’s the Point of Oracle Checkpoints? – Harald Van Breederode
For me, this was the stand-out presentation of the day. What on the face of it is a fairly mundane topic was explained in such a clear way, with outstanding demonstrations.
SGA buffer management is done via doubly linked lists. There is a list for buffers in use, and a list for buffers that are available to be used for I/O; this means the full SGA does not have to be scanned to find free buffers. Another doubly linked list is the checkpoint queue (CKPT-queue).
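To make the point about not scanning concrete, here is a toy sketch in Python. It is my own illustration, not anything Harald showed, and it is certainly not how Oracle implements its buffer cache. The class name `ToyBufferCache` and its methods are made up; the only idea taken from the talk is that every buffer sits on one of two linked lists, so finding a free one is a constant-time operation at the head of a list.

```python
from collections import OrderedDict, deque


class ToyBufferCache:
    """Toy model: every buffer is on exactly one of two linked lists."""

    def __init__(self, n_buffers):
        # Buffers available for I/O (deque is a linked structure, O(1) at both ends).
        self.free = deque(range(n_buffers))
        # Buffers in use: buffer id -> block name, oldest first.
        # OrderedDict keeps its ordering with a doubly linked list internally.
        self.in_use = OrderedDict()

    def read_block(self, block):
        """Load a block into a buffer without scanning the whole cache."""
        if self.free:
            buf = self.free.popleft()                  # take the head of the free list
        else:
            buf, _ = self.in_use.popitem(last=False)   # reuse the oldest in-use buffer
        self.in_use[buf] = block
        return buf

    def release(self, buf):
        """Move a buffer from the in-use list back to the free list."""
        del self.in_use[buf]
        self.free.append(buf)


cache = ToyBufferCache(2)
first = cache.read_block("block A")
second = cache.read_block("block B")
third = cache.read_block("block C")   # no free buffers left, so the oldest is reused
```

In this toy the third read reuses the buffer that held "block A", and at no point does the code loop over all buffers to find a candidate.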
A checkpoint is a synchronisation event. There are lots of different types of checkpoints:
A full checkpoint writes all dirty buffers from all instances.
A thread checkpoint writes all dirty buffers from one instance in a RAC cluster.
A tablespace checkpoint writes the dirty buffers belonging to one tablespace. It is caused by taking a tablespace offline, setting it read only, or a begin backup statement.
Parallel query does direct reads and this causes dirty buffers to be written out in a parallel query checkpoint. Can be odd that a query can cause lots of write activity.
An object checkpoint can come about due to drop table and truncate table commands. Caused by the need for Point In Time Recovery.
Incremental checkpointing continually writes some of the contents of the CKPT-queue out to disk. The idea is to dribble out the writing of the dirty buffers rather than having a checkpoint occur in one big splurge. This ensures the fast_start_mttr_target parameter can be maintained.
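The dribbling idea can be sketched as a small Python function. This is a toy of my own, not Oracle's algorithm: I am assuming the checkpoint queue is ordered by the redo position at which each buffer was first dirtied, and `max_lag` is a hypothetical stand-in for whatever limit the database derives from fast_start_mttr_target. The function name and the tuple layout are invented for illustration.

```python
from collections import deque


def incremental_checkpoint(ckpt_queue, current_redo_pos, max_lag):
    """Write out the oldest dirty buffers until the queue head is recent enough.

    ckpt_queue: deque of (first_dirty_redo_pos, buffer_id), oldest first.
    Returns the buffer ids "written" on this pass (here they are just removed).
    """
    written = []
    # Only the head of the queue is ever inspected; the queue is never scanned.
    while ckpt_queue and current_redo_pos - ckpt_queue[0][0] > max_lag:
        _, buffer_id = ckpt_queue.popleft()
        written.append(buffer_id)
    return written


queue = deque([(100, "b1"), (150, "b2"), (400, "b3"), (480, "b4")])
written = incremental_checkpoint(queue, current_redo_pos=500, max_lag=200)
```

With these made-up numbers only the two oldest buffers are written, and the two recently dirtied ones stay in the queue for a later pass, which is the "little and often" behaviour described above.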
Sizing your redo logfiles is critical for keeping checkpoints under control. Harald was emphasizing that you can’t make your redo logs too big but you can make them too small. Point was made about using archive_lag_target to control how frequently you have a log switch when you have very large log files.
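As a back-of-the-envelope illustration of the sizing point, here is a small Python sketch. The 20 minute default between log switches and the rounding to 256 MB are my own assumptions for the example, not figures from Harald's talk, and the function name is made up. Rounding up rather than down follows the advice that too big is safer than too small.

```python
import math


def redo_log_size_mb(peak_redo_mb_per_hour, target_switch_minutes=20, round_to_mb=256):
    """Estimate a redo log size that keeps switches at least the target interval apart.

    The result is rounded up to a multiple of round_to_mb, and is never
    smaller than round_to_mb.
    """
    needed_mb = peak_redo_mb_per_hour * target_switch_minutes / 60
    return max(round_to_mb, math.ceil(needed_mb / round_to_mb) * round_to_mb)


busy_system = redo_log_size_mb(3000)   # 3000 MB of redo per hour at peak
quiet_system = redo_log_size_mb(100)   # 100 MB of redo per hour at peak
```

For the busy system this needs about 1000 MB to last 20 minutes, which rounds up to 1024 MB; the quiet system gets the 256 MB floor. On a quiet system with large logs, archive_lag_target is what stops the gap between log switches growing too long.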
V$INSTANCE_RECOVERY can be very useful in finding out what is driving your checkpointing and what is an optimal size for your redo logs.
Harald stated that in some future version of Oracle, LGWR will be the only mechanism for sending redo to a standby, with no more ARCH. Though if you have a network disconnect and have to ship archived redo, I’m not sure how it will do that with LGWR.
Achieving High Availability using Open Source Technology – Andrew Hughes
This talk was about how they were using OCFS2 to provide shared storage across multiple nodes, and then using cold failover to provide some form of high availability for instances. It kinda did what they set out to achieve, but as Joel Goodman pointed out, so much more could have been achieved using Oracle Clusterware, which would automatically restart failed instances and provide VIPs to avoid the net timeout issue.
Oracle 11g RAC On Windows – Dave Bennet
I think Dave had a bit of an uphill struggle here. Not a single person in the room had a RAC database running on Windows, and the main thrust of the presentation seemed to be issues encountered in running on 32-bit Windows, and SE edition of Windows at that.
Database Links Masterclass – Joel Goodman
Attending a Joel Goodman presentation really is like trying to drink from a fire hydrant. Joel’s knowledge is truly immense, both in breadth and depth. This was no exception, though it was a shame he ran out of time with the demo. Instead of saving the demo all up for the end, I think doing a similar thing to Harald, presenting some theory and then a demo of that theory, may have been a bit easier on the audience than attempting to digest all the theory in one go!
The next RAC & HA SIG will be on 10th September in Blythe Valley Park in the West Midlands. Hope to see you there.