I’m a big fan of accessing the database via services and there are some nice new features with database services in 11gR2. However I got a nasty shock when performing some patch maintenance with an 11.2.0.1 RAC system that had applications using services. Essentially I did not realise what happens to a service when you shutdown an instance for maintenance. Let me demonstrate:
oracle@linuxrac1:DBA1 ~> srvctl add service -d DBA -s DBA_TEST -P BASIC -e SELECT -r "DBA1" -a "DBA2" oracle@linuxrac1:DBA1 ~> srvctl start service -d DBA -s dba_test
This has the following configuration:
oracle@linuxrac1:DBA1 ~> srvctl config service -d DBA -s DBA_TEST Service name: DBA_TEST Service is enabled Server pool: DBA_DBA_TEST Cardinality: 1 Disconnect: false Service role: PRIMARY Management policy: AUTOMATIC DTP transaction: false AQ HA notifications: false Failover type: SELECT Failover method: NONE TAF failover retries: 0 TAF failover delay: 0 Connection Load Balancing Goal: LONG Runtime Load Balancing Goal: NONE TAF policy specification: BASIC Preferred instances: DBA1 Available instances: DBA2
So the service is now online on the node where DBA1 (preferred node in definition) runs:
oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
1 ONLINE ONLINE linuxrac1
any examples I’ve seen, show what happens to a service when you perform shutdown abort. First lets see what our tns connection looks like:
DBATEST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linuxrac.nominet.org.uk)(PORT = 1521))
)
(LOAD_BALANCE=no)
(CONNECT_DATA =
(SERVICE_NAME = DBA_TEST)
(FAILOVER_MODE =
(TYPE=SELECT)
(METHOD=BASIC)
)
)
)
Which gives the following in V$SESSION when you connect using this definition:
system@DBATEST> select failover_type, failover_method, failed_over 2 from v$session 3 where service_name='DBA_TEST'; FAILOVER_TYPE FAILOVER_M FAI ------------- ---------- --- SELECT BASIC NO
Lets abort the node:
oracle@linuxrac1:DBA1 ~> srvctl stop instance -d DBA -i DBA1 -o abort
oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
1 ONLINE OFFLINE
Oh that’s not good. Look whats happened to my application:
system@DBATEST> / select * from v$instance * ERROR at line 1: ORA-03113: end-of-file on communication channel ERROR: ORA-03114: not connected to ORACLE Elapsed: 00:00:00.01
Let’s bring everything back and try a different kind of shutdown This time using the following:
oracle@linuxrac1:DBA1 ~> srvctl stop instance -d DBA -i DBA1 -f
Then we see the following behaviour with the connected “application”:
system@DBATEST> / FAILOVER_TYPE FAILOVER_M FAI ------------- ---------- --- SELECT BASIC NO Elapsed: 00:00:00.01 system@DBATEST> / FAILOVER_TYPE FAILOVER_M FAI ------------- ---------- --- SELECT BASIC YES Elapsed: 00:00:00.50
So there was a slight pause in querying when the failover occurred but you can see it has failed over correctly also the service is still up and running and failed over:
oracle@linuxrac1:DBA1 ~> crsctl stat res "ora.dba.dba_test.svc" -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.dba.dba_test.svc
1 ONLINE ONLINE linuxrac2
So that worked fine, when including the -f option.
Another way of doing maintenance to the server node while still having the applications running without interruption is to relocate the service first and then shutdown the instance:
oracle@linuxrac1:DBA1 ~> srvctl relocate service -d DBA -s DBA_TEST -i DBA1 -t DBA2
Then your application again just has a short pause when the instance shuts down but shows:
select failover_type, failover_method, failed_over from v$session where service_name='DBA_TEST' / FAILOVER_TYPE FAILOVER_M FAI ------------- ---------- --- SELECT BASIC NO Elapsed: 00:00:00.00 system@DBATEST> / FAILOVER_TYPE FAILOVER_M FAI ------------- ---------- --- SELECT BASIC YES Elapsed: 00:00:00.76
This is really my preferred method of doing maintenance that requires instance shutdown. Relocate the service to the other node and then perform the shutdown. Be careful to test how your services fail over upon shutdown and maintenance.

Martin
/ November 18, 2010Hi Jason,
I’m assuming that linuxrac.nominet.org.uk is your scan. Have you tried relying on the service’s TAF connection configuration, i.e. using this connection string:
DBATEST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = linuxrac.nominet.org.uk)(PORT = 1521))
)
(
CONNECT_DATA =
(SERVICE_NAME = DBA_TEST)
)
)
There’s something slightly odd about the output of the srvctl config service command. When you created the service you used the -P and -e options, yet the config states failover method “none”.
Regards,
Martin
jarneil
/ November 29, 2010Hi Martin,
Yes, I’ve got into a kerfuffle with the output from srvctl. I tried various parameters on creating the service. I will test your idea.
jason.
ps your comment got treated as spam!