11gR2 Database Services and Instance Shutdown

I’m a big fan of accessing the database via services and there are some nice new features with database services in 11gR2. However I got a nasty shock when performing some patch maintenance with an 11.2.0.1 RAC system that had applications using services. Essentially I did not realise what happens to a service when you shutdown an instance for maintenance. Let me demonstrate:


oracle@linuxrac1:DBA1 ~> srvctl add service -d DBA -s DBA_TEST -P BASIC -e SELECT -r "DBA1" -a "DBA2"

oracle@linuxrac1:DBA1 ~> srvctl start service -d DBA -s dba_test

This has the following configuration:



oracle@linuxrac1:DBA1 ~> srvctl config service -d DBA -s DBA_TEST
Service name: DBA_TEST
Service is enabled
Server pool: DBA_DBA_TEST
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Preferred instances: DBA1
Available instances: DBA2

So the service is now online on the node where DBA1 (preferred node in definition) runs:

oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
      1        ONLINE  ONLINE       linuxrac1

any examples I’ve seen, show what happens to a service when you perform shutdown abort. First lets see what our tns connection looks like:

DBATEST =
 (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = linuxrac.nominet.org.uk)(PORT = 1521))
     )
     (LOAD_BALANCE=no)
    (CONNECT_DATA =
      (SERVICE_NAME = DBA_TEST)
      (FAILOVER_MODE =
       (TYPE=SELECT)
       (METHOD=BASIC)
      )
    )
  )

Which gives the following in V$SESSION when you connect using this definition:


system@DBATEST> select failover_type, failover_method, failed_over                   
  2  from v$session 
  3  where service_name='DBA_TEST';

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT	      BASIC	 NO

Lets abort the node:

oracle@linuxrac1:DBA1 ~> srvctl stop instance -d DBA -i DBA1 -o abort
oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
      1        ONLINE  OFFLINE 

Oh that’s not good. Look whats happened to my application:

system@DBATEST> /
select * from v$instance
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel


ERROR:
ORA-03114: not connected to ORACLE


Elapsed: 00:00:00.01

Let’s bring everything back and try a different kind of shutdown This time using the following:

oracle@linuxrac1:DBA1 ~> srvctl stop instance -d DBA -i DBA1 -f

Then we see the following behaviour with the connected “application”:

system@DBATEST> /

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT	      BASIC	 NO

Elapsed: 00:00:00.01
system@DBATEST> /

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT	      BASIC	 YES

Elapsed: 00:00:00.50

So there was a slight pause in querying when the failover occurred but you can see it has failed over correctly also the service is still up and running and failed over:

oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
      1        ONLINE  ONLINE       linuxrac2 

So that worked fine, when including the -f option.

Another way of doing maintenance to the server node while still having the applications running without interruption is to relocate the service first and then shutdown the instance:

oracle@linuxrac1:DBA1 ~> srvctl relocate service -d DBA -s DBA_TEST -i DBA1 -t DBA2

Then your application again just has a short pause when the instance shuts down but shows:

select failover_type, failover_method, failed_over
from v$session
where service_name='DBA_TEST'
/

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT	      BASIC	 NO

Elapsed: 00:00:00.00
system@DBATEST> /

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT	      BASIC	 YES

Elapsed: 00:00:00.76

This is really my preferred method of doing maintenance that requires instance shutdown. Relocate the service to the other node and then perform the shutdown. Be careful to test how your services fail over upon shutdown and maintenance.

Advertisements

4 thoughts on “11gR2 Database Services and Instance Shutdown

  1. Hi Jason,

    I’m assuming that linuxrac.nominet.org.uk is your scan. Have you tried relying on the service’s TAF connection configuration, i.e. using this connection string:

    DBATEST =
    (DESCRIPTION =
    (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = linuxrac.nominet.org.uk)(PORT = 1521))
    )
    (
    CONNECT_DATA =
    (SERVICE_NAME = DBA_TEST)
    )
    )

    There’s something slightly odd about the output of the srvctl config service command. When you created the service you used the -P and -e options, yet the config states failover method “none”.

    Regards,

    Martin

  2. Hi Martin,

    Yes, I’ve got into a kerfuffle with the output from srvctl. I tried various parameters on creating the service. I will test your idea.

    jason.

    ps your comment got treated as spam!

  3. Hi Jason,

    Good work. I’b been confused as to why my service had not been failing over until I read this.

    Giovanni

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s