Oracle OpenWorld – Understanding RAC Internals

Barb Lundhild made my day today. This was a fantastic in-depth dive into 11gR2 RAC. Certainly the best RAC presentation I’ve attended this year. And to top it off, I bumped into Barb in Moscone West and had a great conversation about her presentation – she is super smart.

Barb started by asking the audience which versions of RAC they were running:

a handful running 9i

the vast majority running 10.2

some running 11.1

NO one running 11.2 yet

Upgrading to 11gR2

Clusterware & ASM now live in the same home, “Grid Infrastructure”

This is an out-of-place upgrade. The new home is owned by root – except she is now saying that the owner of your current clusterware home should also be the owner of the new Grid Infrastructure home.

The Grid Infrastructure home is recommended to be on a local file system.

Upgrade Clusterware and ASM to 11.2 at the same time

The idea is to have the sys admins manage the infrastructure (clusterware and storage), while the DBA manages the database.

She seems to be saying the clusterware is rolling upgradeable from 10.2.0.3 but you need to shut down ASM – making the point that ASM is NOT rolling upgradeable until 11.1.

She is saying that Clusterware and ASM should be upgraded at the SAME TIME, except that you could do the clusterware part rolling but must then take the downtime for ASM (when coming from 10.2).
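
A quick way to sanity-check where an upgrade has got to (assuming you run the 11.2 crsctl from the new grid home, and substitute your own node name) is to compare the per-node software version against the cluster’s active version:

crsctl query crs softwareversion nodename
crsctl query crs activeversion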

Apply DBCA patch bug 8288940

pin the node

crsctl pin css -n nodename
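
To check which nodes are pinned afterwards, olsnodes should show it – my understanding is that -n prints the node number and -t the pinned/unpinned state:

olsnodes -n -t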

OPROCD has disappeared! That did not last long.

OCLSOMON and OCLSVMON no longer exist either.

The hangcheck-timer is not required on Linux either.

Barb put up a massive diagram and claimed it was “nice” – it’s a total spaghetti mess. I guess they don’t have access to designers at Oracle.

New De-install utility

Cleaning up used to be grief; in the Oracle home there is now a de-install utility, because if an Oracle RAC install has failed you really need to clean up before starting again. It:

stops anything running
unlocks the grid infrastructure home
resets OCR/voting disks
removes contents of system directories
cleans up the various Oracle directories

Best not to run that on the wrong system.
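
For reference, the utility lives under the home you are removing; something like the following, where -checkonly is (as I understand it) a dry-run style check before you do it for real:

$ORACLE_HOME/deinstall/deinstall -checkonly
$ORACLE_HOME/deinstall/deinstall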

Managing 11gR2 databases

two management styles:

administrator managed: the DBA defines which databases run on which nodes, and which services run where

policy managed: a big thrust towards policy-managed – you define the resource requirements of the workload and enough instances are started to support that workload. The big grid goal is to remove the hard coding of a service to a particular node. This is like virtualisation for databases: you have a pool of servers you can run your instances on, and do you really care where your DBs are running within the pool? It currently requires the nodes to be of the same capacity – there is no way of saying one node has more resources than another. You have the ability to define the min and max number of nodes a DB can run on (rough srvctl sketch below).

It seems like you could define a server to be sitting around marked as free, just waiting to leap into action should a node fail. Hmm, not sure that is a great licensing strategy.

the Generic pool is used for administering databases the “old way”

for policy-managed databases the SIDs are dynamic
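
A rough sketch of the srvctl side of that in 11.2 – the pool name, database name, home path and service name below are all made up, and this assumes a min of 2 and max of 3 servers (check srvctl -h on your release):

srvctl add srvpool -g app_pool -l 2 -u 3 -i 10
srvctl add database -d orcl -o /u01/app/oracle/product/11.2.0/dbhome_1 -g app_pool
srvctl add service -d orcl -s app_svc -g app_pool -c UNIFORM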

SCAN

For policy-managed databases you don’t know which node is going to be hosting your instance. SCAN (Single Client Access Name) allows your clients to connect to the database without having to change tnsnames entries when you add/remove nodes.

define SCAN in your DNS:

RAC-scan.example.co.uk   IN A IP1
                         IN A IP2
                         IN A IP3

3 IPs are required

even in a 2-node cluster you need 3

you could instead use GNS

this could make it easier for JDBC connections to connect to any node in a cluster with a shorter connect string.
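
For example, a thin JDBC URL can just reference the SCAN name and a service (the app_svc service name here is made up), rather than listing every node VIP:

jdbc:oracle:thin:@//RAC-scan.example.co.uk:1521/app_svc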

SCAN will work fine with Data Guard.

tnsnames entries can now use CONNECT_TIMEOUT and RETRY_COUNT to ensure quick connections to either the primary or the standby
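
Something like this is what I understood her to mean – the primary/standby SCAN names and service are invented, and the timeout/retry values are only illustrative, so check the 11.2 Net Services reference for the exact placement:

APP_SVC =
  (DESCRIPTION =
    (CONNECT_TIMEOUT = 5)
    (RETRY_COUNT = 3)
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = primary-scan.example.co.uk)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = standby-scan.example.co.uk)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = app_svc))
  )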

Multiple public networks

Apparently many customers wanted this. Each network must have a network resource defined in clusterware, and there is a new init.ora parameter, listener_networks.

instances will register a service with all networks
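
In the init.ora/spfile that ends up looking roughly like this – the network names and listener aliases are made up, and I would double-check the exact attribute syntax in the 11.2 reference:

listener_networks='((NAME=net1)(LOCAL_LISTENER=listener_net1)(REMOTE_LISTENER=RAC-scan.example.co.uk:1521))',
                  '((NAME=net2)(LOCAL_LISTENER=listener_net2))'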

Managing OCR and voting disks with ASM

OCR and voting disks can now be stored in ASM; the diskgroup compatible attribute (compatible.asm) must be at least 11.2.0.0
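
To check (and raise, if needed) that attribute on an existing diskgroup you can use asmcmd from the grid home – the DATA diskgroup name is made up, and remember compatible.* attributes can’t be lowered once raised:

asmcmd lsattr -l -G DATA compatible.asm
asmcmd setattr -G DATA compatible.asm 11.2.0.0.0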

Best practice is to use same diskgroup as database

disk discovery string stored in GPnP profile

cannot stop ASM unless you use stop cluster

the OCR is stored in a similar way to a database file

only 1 OCR per diskgroup

you can have the OCR in multiple diskgroups
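
For example, adding a second OCR location in another diskgroup and then verifying it (run as root from the grid home; the diskgroup name is made up):

ocrconfig -add +OCRVOTE2
ocrcheck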

voting disks are created on specific disks and CSS knows their location

number of voting disks depends on redundancy

if you use external redundancy you will only have 1 voting disk

you can’t as yet put voting disks in multiple disk groups – which seems a bit of a regression.

voting disks are automatically backed up

she is talking about using a quorum failgroup to get a 3rd voting disk in a multi-SAN environment
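
The rough shape of that, run as SYSASM against a normal-redundancy diskgroup – the diskgroup name and the NFS-backed path are purely illustrative:

ALTER DISKGROUP data ADD QUORUM FAILGROUP nfs_quorum DISK '/voting_nfs/vote_disk3';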

voting disks are backed up into the OCR

do not use dd to back up the voting disks
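
Instead of dd, in 11.2 you list and move voting disks with crsctl, along the lines of (diskgroup name made up):

crsctl query css votedisk
crsctl replace votedisk +DATA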

If ASM dies and your voting disk and OCR are in it, then NO the cluster does not reboot

although if it fails on the node that is the OCR master that node may reboot

Grid Plug ‘n’ Play

makes the cluster more dynamic, easier to add/remove nodes
GNS – lets the cluster manage its own network


6 thoughts on “Oracle OpenWorld – Understanding RAC Internals”

  1. hi,

    Nice article.

    I have a couple of doubts about SCAN.

    In a 2-node RAC, do we need to have a separate SCAN IP for each of the 2 nodes?

    How do we implement SCAN in /etc/hosts (when using VMware)?

    Thx
    sunil

    • Hi Sunil,

      Thanks for reading.

      Yeah, I had my doubts about that. You know, I think you have to have 3 SCAN listener IPs, even for 2 nodes.

      jason.

    • Guys,

      You are completely mistaken …

      SCAN IPs will not be in your /etc/hosts file; they will be set up in your DNS server. SCAN IPs are not node specific – they are specific to your cluster.

      You should set up your SCAN name in DNS with a minimum of 1 SCAN IP and a maximum of 3, which will be picked up by clients using a round-robin algorithm.

      If you have a 2-node RAC with 3 SCAN IPs then one of your nodes will be running 2 SCAN listeners.

      If you have a 100-node RAC then the 3 SCAN listeners will be running on any 3 of the 100 nodes in your cluster. You can’t control where the SCAN listeners run in your cluster.

      I hope this helps … if you have any questions let me know.

      Thanks,
      Sreekanth

      • Hi,

        You are right: in production you should use a DNS server or GNS, and not put entries into /etc/hosts. For testing/playing it’s perfectly permissible to use /etc/hosts in your playground environment.

        You can control where your scan listener runs with the following:

        srvctl relocate scan_listener -i SCAN_ORDINAL_NUMBER -n TARGET_NODE

        where SCAN_ORDINAL_NUMBER is the number of the SCAN listener you want to move, i.e. 1-3, and TARGET_NODE is obviously the node you want to move that SCAN listener to.
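
        And to see where the SCAN listeners are currently running before you relocate one, something like the following should do it:

        srvctl status scan_listener
        srvctl config scan_listener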
