Oracle 12.1.0.2 Bundle Patching

I’ve spent a few days playing with patching 12.1.0.2 with the so-called “Database Patch for Engineered Systems and Database In-Memory”. Let’s skip over why these not necessarily related feature sets should be bundled together into what is effectively a Bundle Patch.

First I was testing going from 12.1.0.2.1 to BP2 or 12.1.0.2.2. Then as soon as I’d done that of course BP3 was released.

So this is our starting position with BP1:

GI HOME:

[oracle@rac2 ~]$ /u01/app/12.1.0/grid_1/OPatch/opatch lspatches
19392604;OCW PATCH SET UPDATE : 12.1.0.2.1 (19392604)
19392590;ACFS Patch Set Update : 12.1.0.2.1 (19392590)
19189240;DATABASE BUNDLE PATCH : 12.1.0.2.1 (19189240)

DB Home:

[oracle@rac2 ~]$ /u01/app/oracle/product/12.1.0.2/db_1/OPatch/opatch lspatches
19392604;OCW PATCH SET UPDATE : 12.1.0.2.1 (19392604)
19189240;DATABASE BUNDLE PATCH : 12.1.0.2.1 (19189240)

Simple enough, right? BP1 and the individual patch components within BP1 give you 12.1.0.2.1. Even I can follow this.

Let’s try to apply BP2 to the above. We will use opatchauto for this, and to begin with we will run an analyze:

[root@rac2 ~]# /u01/app/12.1.0/grid_1/OPatch/opatchauto apply -analyze /tmp/BP2/19774304 -ocmrf /tmp/ocm.rsp 
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0/grid_1

opatchauto log file: /u01/app/12.1.0/grid_1/cfgtoollogs/opatchauto/19774304/opatch_gi_2014-12-18_13-35-17_analyze.log

NOTE: opatchauto is running in ANALYZE mode. There will be no change to your system.

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/12.1.0/grid_1
RAC home(s):
/u01/app/oracle/product/12.1.0.2/db_1

Configuration Validation: Successful

Patch Location: /tmp/BP2/19774304
Grid Infrastructure Patch(es): 19392590 19392604 19649591 
RAC Patch(es): 19392604 19649591 

Patch Validation: Successful

Analyzing patch(es) on "/u01/app/oracle/product/12.1.0.2/db_1" ...
Patch "/tmp/BP2/19774304/19392604" analyzed on "/u01/app/oracle/product/12.1.0.2/db_1" with warning for apply.
Patch "/tmp/BP2/19774304/19649591" analyzed on "/u01/app/oracle/product/12.1.0.2/db_1" with warning for apply.

Analyzing patch(es) on "/u01/app/12.1.0/grid_1" ...
Patch "/tmp/BP2/19774304/19392590" analyzed on "/u01/app/12.1.0/grid_1" with warning for apply.
Patch "/tmp/BP2/19774304/19392604" analyzed on "/u01/app/12.1.0/grid_1" with warning for apply.
Patch "/tmp/BP2/19774304/19649591" analyzed on "/u01/app/12.1.0/grid_1" with warning for apply.

SQL changes, if any, are analyzed successfully on the following database(s): TESTRAC

Apply Summary:

opatchauto ran into some warnings during analyze (Please see log file for details):
GI Home: /u01/app/12.1.0/grid_1: 19392590, 19392604, 19649591
RAC Home: /u01/app/oracle/product/12.1.0.2/db_1: 19392604, 19649591

opatchauto completed with warnings.

Well, that does not look promising. I have no “one-off” patches in this home to cause a conflict; it should be a simple BP1 to BP2 patching exercise without any issues.

Digging into the logs we find the following:

.
.
.
[18-Dec-2014 13:37:08]       Verifying environment and performing prerequisite checks...
[18-Dec-2014 13:37:09]       Patches to apply -> [ 19392590 19392604 19649591  ]
[18-Dec-2014 13:37:09]       Identical patches to filter -> [ 19392590 19392604  ]
[18-Dec-2014 13:37:09]       The following patches are identical and are skipped:
[18-Dec-2014 13:37:09]       [ 19392590 19392604  ]
.
.

Essentially, out of the 3 patches in the home at BP1, only the Database Bundle Patch 19189240 is superseded by BP2. Maybe this annoys me more than it should; I like my patches applied by BP2 to end in 2. I also don’t like the fact that the analyze throws a warning about this.

Let’s patch:

[root@rac2 ~]# /u01/app/12.1.0/grid_1/OPatch/opatchauto apply /tmp/BP2/19774304 -ocmrf /tmp/ocm.rsp 
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0/grid_1

opatchauto log file: /u01/app/12.1.0/grid_1/cfgtoollogs/opatchauto/19774304/opatch_gi_2014-12-18_13-54-03_deploy.log

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/12.1.0/grid_1
RAC home(s):
/u01/app/oracle/product/12.1.0.2/db_1

Configuration Validation: Successful

Patch Location: /tmp/BP2/19774304
Grid Infrastructure Patch(es): 19392590 19392604 19649591 
RAC Patch(es): 19392604 19649591 

Patch Validation: Successful

Stopping RAC (/u01/app/oracle/product/12.1.0.2/db_1) ... Successful
Following database(s) and/or service(s)  were stopped and will be restarted later during the session: testrac

Applying patch(es) to "/u01/app/oracle/product/12.1.0.2/db_1" ...
Patch "/tmp/BP2/19774304/19392604" applied to "/u01/app/oracle/product/12.1.0.2/db_1" with warning.
Patch "/tmp/BP2/19774304/19649591" applied to "/u01/app/oracle/product/12.1.0.2/db_1" with warning.

Stopping CRS ... Successful

Applying patch(es) to "/u01/app/12.1.0/grid_1" ...
Patch "/tmp/BP2/19774304/19392590" applied to "/u01/app/12.1.0/grid_1" with warning.
Patch "/tmp/BP2/19774304/19392604" applied to "/u01/app/12.1.0/grid_1" with warning.
Patch "/tmp/BP2/19774304/19649591" applied to "/u01/app/12.1.0/grid_1" with warning.

Starting CRS ... Successful

Starting RAC (/u01/app/oracle/product/12.1.0.2/db_1) ... Successful

SQL changes, if any, are applied successfully on the following database(s): TESTRAC

Apply Summary:

opatchauto ran into some warnings during patch installation (Please see log file for details):
GI Home: /u01/app/12.1.0/grid_1: 19392590, 19392604, 19649591
RAC Home: /u01/app/oracle/product/12.1.0.2/db_1: 19392604, 19649591

opatchauto completed with warnings.

I do not like to see warnings when I’m patching. The log file for the apply is similar to the analyze: identical patches are skipped.

Checking where we are with GI and DB patches now:

[oracle@rac2 ~]$ /u01/app/12.1.0/grid_1/OPatch/opatch lspatches
19649591;DATABASE BUNDLE PATCH : 12.1.0.2.2 (19649591)
19392604;OCW PATCH SET UPDATE : 12.1.0.2.1 (19392604)
19392590;ACFS Patch Set Update : 12.1.0.2.1 (19392590)

[oracle@rac2 ~]$ /u01/app/oracle/product/12.1.0.2/db_1/OPatch/opatch lspatches
19649591;DATABASE BUNDLE PATCH : 12.1.0.2.2 (19649591)
19392604;OCW PATCH SET UPDATE : 12.1.0.2.1 (19392604)

The only one changed is the DATABASE BUNDLE PATCH.

The one MOS document I effectively have on “speed dial” is 888828.1, and that showed BP3 as being available on 17th December. It also had the following warning:

Before install on top of 12.1.0.2.1DBBP or 12.1.0.2.2DBBP, first rollback patch 19392604 OCW PATCH SET UPDATE : 12.1.0.2.1

Let’s run an analyze for BP3 against the BP2 homes and see what that warning means in practice:

[root@rac2 ~]# /u01/app/12.1.0/grid_1/OPatch/opatchauto apply -analyze /tmp/BP3/20026159 -ocmrf /tmp/ocm.rsp 
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0/grid_1

opatchauto log file: /u01/app/12.1.0/grid_1/cfgtoollogs/opatchauto/20026159/opatch_gi_2014-12-18_14-13-58_analyze.log

NOTE: opatchauto is running in ANALYZE mode. There will be no change to your system.

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/12.1.0/grid_1
RAC home(s):
/u01/app/oracle/product/12.1.0.2/db_1

Configuration Validation: Successful

Patch Location: /tmp/BP3/20026159
Grid Infrastructure Patch(es): 19392590 19878106 20157569 
RAC Patch(es): 19878106 20157569 

Patch Validation: Successful
Command "/u01/app/12.1.0/grid_1/OPatch/opatch prereq CheckConflictAgainstOH -ph /tmp/BP3/20026159/19878106 -invPtrLoc /u01/app/12.1.0/grid_1/oraInst.loc -oh /u01/app/12.1.0/grid_1" execution failed

Log file Location for the failed command: /u01/app/12.1.0/grid_1/cfgtoollogs/opatch/opatch2014-12-18_14-14-50PM_1.log

Analyzing patch(es) on "/u01/app/oracle/product/12.1.0.2/db_1" ...
Patch "/tmp/BP3/20026159/19878106" analyzed on "/u01/app/oracle/product/12.1.0.2/db_1" with warning for apply.
Patch "/tmp/BP3/20026159/20157569" analyzed on "/u01/app/oracle/product/12.1.0.2/db_1" with warning for apply.

Analyzing patch(es) on "/u01/app/12.1.0/grid_1" ...
Command "/u01/app/12.1.0/grid_1/OPatch/opatch napply -phBaseFile /tmp/OraGI12Home2_patchList -local  -invPtrLoc /u01/app/12.1.0/grid_1/oraInst.loc -oh /u01/app/12.1.0/grid_1 -silent -report -ocmrf /tmp/ocm.rsp" execution failed: 
UtilSession failed: After skipping conflicting patches, there is no patch to apply.

Log file Location for the failed command: /u01/app/12.1.0/grid_1/cfgtoollogs/opatch/opatch2014-12-18_14-15-30PM_1.log

Following step(s) failed during analysis:
/u01/app/12.1.0/grid_1/OPatch/opatch prereq CheckConflictAgainstOH -ph /tmp/BP3/20026159/19878106 -invPtrLoc /u01/app/12.1.0/grid_1/oraInst.loc -oh /u01/app/12.1.0/grid_1
/u01/app/12.1.0/grid_1/OPatch/opatch napply -phBaseFile /tmp/OraGI12Home2_patchList -local  -invPtrLoc /u01/app/12.1.0/grid_1/oraInst.loc -oh /u01/app/12.1.0/grid_1 -silent -report -ocmrf /tmp/ocm.rsp


SQL changes, if any, are analyzed successfully on the following database(s): TESTRAC

Apply Summary:

opatchauto ran into some warnings during analyze (Please see log file for details):
RAC Home: /u01/app/oracle/product/12.1.0.2/db_1: 19878106, 20157569

Following patch(es) failed to be analyzed:
GI Home: /u01/app/12.1.0/grid_1: 19392590, 19878106, 20157569

opatchauto analysis reports error(s).

Looking at the log file, we see that patch 19392604, already in the home, conflicts with patch 19878106 from BP3. 19392604 is the OCW PATCH SET UPDATE in BP1 (and BP2), while 19878106 is the Database Bundle Patch in BP3:

Patch 19878106 has Generic Conflict with 19392604. Conflicting files are :
                             /u01/app/12.1.0/grid_1/bin/diskmon

That seems messy. It definitely annoys me that to apply BP3 I have to take the additional step of rolling back a previous BP. I don’t recall having to do this with previous Bundle Patches, and I’ve applied a fair few of them.
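
For reference, rolling the whole bundle back with opatchauto takes the same unzipped patch location as the apply did. This is a sketch reusing the BP2 location and OCM response file from above, rather than a transcript of my run:

# run as root from the GI home OPatch, exactly as for the apply (sketch, not a transcript)
/u01/app/12.1.0/grid_1/OPatch/opatchauto rollback /tmp/BP2/19774304 -ocmrf /tmp/ocm.rsp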

I rolled the lot back with opatchauto rollback, then applied BP3 on top of the unpatched homes I was left with. Let’s look at what BP3 on top of 12.1.0.2 gives you:

GI Home:

[oracle@rac1 ~]$ /u01/app/12.1.0/grid_1/OPatch/opatch lspatches
20157569;OCW Patch Set Update : 12.1.0.2.1 (20157569)
19878106;DATABASE BUNDLE PATCH: 12.1.0.2.3 (19878106)
19392590;ACFS Patch Set Update : 12.1.0.2.1 (19392590)

DB Home:

[oracle@rac1 ~]$ /u01/app/oracle/product/12.1.0.2/db_1/OPatch/opatch lspatches
20157569;OCW Patch Set Update : 12.1.0.2.1 (20157569)
19878106;DATABASE BUNDLE PATCH: 12.1.0.2.3 (19878106)

So for BP2 we had patch 19392604 OCW PATCH SET UPDATE : 12.1.0.2.1. Now with BP3 we still have a 12.1.0.2.1 OCW Patch Set Update, but it has a new patch number!

That really irritates.

Exadata Shellshock: IB Switches Vulnerable

Andy Colvin has the lowdown on the Oracle response and fixes for the bash shellshock vulnerability.

However, when I last looked it seemed Oracle had not discussed anything regarding the IB switches being vulnerable.

The IB switches have bash running on them and Oracle have verified the IB switches are indeed vulnerable.


[root@dm01dbadm01 ~]# ssh 10.200.131.22

root@10.200.131.22's password:

Last login: Tue Sep 30 22:46:41 2014 from dm01dbadm01.e-dba.com

You are now logged in to the root shell.

It is recommended to use ILOM shell instead of root shell.

All usage should be restricted to documented commands and documented

config files.

To view the list of documented commands, use "help" at linux prompt.

[root@dm01sw-ibb0 ~]# echo $SHELL

/bin/bash

[root@dm01sw-ibb0 ~]# rpm -qf /bin/bash

bash-3.2-21.el5

We have fixed up our compute nodes as instructed by Oracle, and once you are no longer vulnerable to the exploit the test shows the following:

env 'x=() { :;}; echo vulnerable' 'BASH_FUNC_x()=() { :;}; echo vulnerable' bash -c "echo test"
bash: warning: x: ignoring function definition attempt
bash: error importing function definition for `BASH_FUNC_x'
test

Note the lack of “vulnerable” in the output.

Unfortunately, when we run the same test on the IB switches:


[root@dm01sw-ibb0 ~]# env 'x=() { :;}; echo vulnerable' 'BASH_FUNC_x()=() { :;}; echo vulnerable' bash -c "echo test"
vulnerable
bash: BASH_FUNC_x(): line 0: syntax error near unexpected token `)'
bash: BASH_FUNC_x(): line 0: `BASH_FUNC_x() () { :;}; echo vulnerable'
bash: error importing function definition for `BASH_FUNC_x'
test

It’s vulnerable, as apparently is the ILOM. There are as yet no fixes available for either of these.

Exadata: What’s Coming

This is based on the presentation Juan Loaiza gave regarding What’s new with Exadata. While a large part of the presentation focussed on what was already available, there are quite a few interesting new features that are coming down the road.

First off was a brief mention of the hardware. I’m less excited about this. The X4 has plenty of the hardware that you could want: CPU, memory and flash. You’d expect some or all of them to be bumped in the next generation.

New Hardware

This was skated over fairly quickly, but I expect an Exadata X5 in a few months. The X4 was released back in December 2013; the first X4 I saw was in January 2014. I wouldn’t be surprised if Oracle release the X5 on or around the anniversary of that release.

Very little was said about the new hardware that would be in the X5, except that the development cycle has followed what Intel has released, and that CPU core counts and flash capacity have gone up. No word was given on what CPU is going to be used in the X5.

The compute nodes on an X4-2 have Intel E5-2697 v2 chips; this is a 12-core chip running at 2.7GHz. I’d expect an increase in core count. The X3 to X4 transition increased the core count by 50%. If that happens again, we get to 18 cores. There is an Intel E5-2699 v3 with 18 cores, but that’s clocked at 2.3GHz.

However, I think I’d be less surprised if they went with the E5-2697 v3, which is a 14-core chip clocked at 2.6GHz. That would be a far more modest increase in the number of cores. The memory speed available with this chip does go up though – it’s DDR4. That might help with the In-Memory option. I also wonder if they’ll bump the amount of memory supported – this chip (like its predecessor) can go to 768GB.

As I said, it was not mentioned which chip was going to be used, only that Intel had released new chips and that Oracle would be qualifying their use for Exadata over the coming months.

New Exadata Software

There were a bunch of interesting-sounding new features coming down the road. Some of the ones that particularly caught my eye were:

The marketing-friendly term “Exafusion”. Exafusion seems to be about speeding up OLTP; labelled as “Hardware Optimized OLTP Messaging”, it’s a reimplementation of cache fusion in which messages bypass the network stack, leading to a performance improvement.

Columnar Flash Cache – This is Exadata automatically reformatting HCC data when written to flash as a pure column store for analytic workloads. Dual formats are stored.

Database snapshots on Exadata. This seems designed with pluggable databases in mind, for producing fast clones for dev/test environments. Clearly something that was a gap with ASM as used on Exadata, whereas ACFS does snapshots.

Currently the latest Linux release available on Exadata is 5.10. Upgrading across major releases has not been supported – it would have required reimaging, which is not a pretty prospect. Thankfully Oracle are going to allow and enable upgrading in place to 6.5.

There was some talk about reducing I/O outliers, both in reading from hard disk and in writing to flash.

Currently with IORM you can only enable or disable access to flash for a particular database. Full IORM seems to be coming for flash.

The final new feature that caught my eye was the long-rumoured virtualisation coming to Exadata: OVM is coming. The ODA, for example, has had VM capability for some time, so in some ways it’s an obvious extension. With the increasing number of cores, I’m expecting that lots of smaller organisations may not actually need them all, and might think it a waste to buy that hardware and not be able to use it, even if they could turn the unused cores off.

I’m hoping to NOT see OVM on an Exadata in the wild anytime soon.

Software on Silicon

One final point, almost tucked out of sight, was that Juan had a little bullet point about “software on silicon”. Now this has me confused. My understanding is that when Larry was talking about this, it was specifically SPARC. That I can understand, as Oracle controls what goes on the chip.

Ignoring the SPARC SuperCluster, there is no SPARC on Exadata. So that leaves either a closer collaboration with Intel or moving to SPARC. Collaborating more closely with Intel is a possibility, and Oracle had first dibs on the E7-8895 v2 for the X4-8.

I can’t imagine changing the compute nodes to SPARC; that wouldn’t make sense. But “software on silicon” is a bit like offloading…

Exadata software definitely keeps moving forward, and the difference between running Oracle on Exadata compared with non-Exadata grows ever wider with each “Exadata only” feature.

Online patching: The Good, the Bad, and the Ugly

I’ve worked on 24×7 systems for more than a decade, and I have a real dislike of downtime. For one, it can be a real pain to agree any downtime with the business, and while RAC can and does help when you do work in a rolling fashion, there is still risk.

The promise of online patching has been around for a long time, and it is only recently that I dipped my toe in the water with it. Unfortunately, it is not a panacea, and in this blog posting I’m going to share some of the downsides.

Of course, not all patches can be applied online. If a patch can be, the README associated with it will have an online section on how to apply it, and when you uncompress the patch there will be a directory called online.
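
A quick way to check, assuming you have unzipped the patch to a hypothetical location such as /tmp/10219624:

# an online-capable patch ships an 'online' subdirectory in its top-level directory
ls /tmp/10219624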

The Good

So first for the good side: the actual application truly can be done online, so in that sense it does what it says on the tin. Here I’m running from the unzipped patch directory, and in this example I’m using patch 10219624:

bash-3.2$ /u01/app/oracle/product/11.2.0/db_1/OPatch/opatch apply online -connectString TESTRAC1 -ocmrf /tmp/ocm.rsp 
Oracle Interim Patch Installer version 11.2.0.3.3
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.3
OUI version       : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/10219624_Jan_24_2013_08_54_08/apply2013-01-24_08-54-08AM_1.log

Applying interim patch '10219624' to OH '/u01/app/oracle/product/11.2.0/db_1'
Verifying environment and performing prerequisite checks...
All checks passed.
Backing up files...

Patching component oracle.rdbms, 11.2.0.3.0...
Installing and enabling the online patch 'bug10219624.pch', on database 'TESTRAC1'.


Verifying the update...

Patching in all-node mode.

Updating nodes 'rac2' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Patch 10219624 successfully applied
Log file location: /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/10219624_Jan_24_2013_08_54_08/apply2013-01-24_08-54-08AM_1.log

OPatch succeeded.

I’m applying this to a 2-node 11gR2 RAC cluster. You’ll notice that it is applied on ALL nodes. You can’t apply an online patch in RAC to just one node at a time, and you can’t roll back one node at a time either. Also be aware that while the patch is in the Oracle home on all nodes in the cluster, it has only been applied to the local instance.

Now, I know you are meant to give connection string details like username/password, and can then apply to all instances in a cluster at the same time, but on some systems I work on I do not have this information and rely on OS authentication only. This can lead to some pain.
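
If you do have credentials, the connect string can cover every instance in the cluster in one go. This is just a sketch, assuming the SID:username:password:node connect string format with placeholder credentials and node names, not a command I was able to run on these systems:

# apply online to both instances at once; username/password/node names are placeholders
/u01/app/oracle/product/11.2.0/db_1/OPatch/opatch apply online \
  -connectString TESTRAC1:sys:change_me:rac1,TESTRAC2:sys:change_me:rac2 \
  -ocmrf /tmp/ocm.rsp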

You can tell a patch is applied with the following:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  ENABLED

However, on the remote node:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
No patches currently installed

I accept this need not arise if you are able to authenticate properly at installation time. To fix this up you can do the following:

-bash-3.2$ /u01/app/oracle/product/11.2.0/db_1/OPatch/opatch util enableonlinepatch -connectString TESTRAC2 -id 10219624
Oracle Interim Patch Installer version 11.2.0.3.3
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.3
OUI version       : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2013-01-24_09-47-08AM_1.log

Invoking utility "enableonlinepatch"
Installing and enabling the online patch 'bug10219624.pch', on database 'TESTRAC2'.


OPatch succeeded.

The Bad

I’ve found rolling back to be slightly more problematic on the remote node with OS authentication. The rollback always removed the patch from the home across all nodes, and always removed it from the instance on the local node. While there is an opatch method documented to then disable the patch in an instance, very similar to the enableonlinepatch above (it’s Disableonlinepatch), I found it did not work with some patches: opatch reported success, but the patch was still enabled.
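
For reference, the disable call mirrors the enableonlinepatch example above. A sketch using the same patch and connect string; as noted, opatch reported success for me but the patch sometimes remained enabled:

# supposed to disable the online patch in the TESTRAC2 instance only
/u01/app/oracle/product/11.2.0/db_1/OPatch/opatch util disableonlinepatch -connectString TESTRAC2 -id 10219624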

Another point to note: restarting an instance does not remove an online applied patch. There is a directory under the $ORACLE_HOME called hpatch that holds the online applied patch libraries.
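
A minimal way to see those libraries for yourself, assuming the default location described above:

# online patch libraries persist here across instance restarts
ls -l $ORACLE_HOME/hpatch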

To get round this I had to resort to the following oradebug commands:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  ENABLED

SQL> oradebug patch disable bug10219624.pch
Statement processed.
SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  DISABLED

SQL> oradebug patch remove bug10219624.pch
Statement processed.
SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  REMOVED

That oradebug patch list showing removed then reverts to “No patches currently installed” upon instance restart.

The Ugly

This really caught me out: patches applied online are completely incompatible with a subsequent run of opatch auto. I had the situation recently whereby I had applied a patch online and then later wanted to run opatch auto to apply further patches. Before running opatch auto I always run the check for conflicts, and this did not give me a clue that opatch auto would not work with the online applied patch.

However when I ran opatch auto on Bundle Patch 11 the following occurred:

[Jan 16, 2013 9:19:16 AM]    OUI-67303:
                             Patches [   14632268   12880299   13734832 ] will be rolled back.
[Jan 16, 2013 9:19:16 AM]    Do you want to proceed? [y|n]
[Jan 16, 2013 9:19:19 AM]    Y (auto-answered by -silent)
[Jan 16, 2013 9:19:19 AM]    User Responded with: Y
[Jan 16, 2013 9:19:19 AM]    OPatch continues with these patches:   14474780
[Jan 16, 2013 9:19:19 AM]    OUI-67073:UtilSession failed:
                             OPatch cannot roll back an online patch while applying a regular patch.
                             Please rollback the online patch(es) " 14632268" manually, and then apply the regular patch(es) " 14474780".
[Jan 16, 2013 9:19:19 AM]    --------------------------------------------------------------------------------
[Jan 16, 2013 9:19:19 AM]    The following warnings have occurred during OPatch execution:
[Jan 16, 2013 9:19:19 AM]    1) OUI-67303:
                             Patches [   14632268   12880299   13734832 ] will be rolled back.
[Jan 16, 2013 9:19:19 AM]    --------------------------------------------------------------------------------
[Jan 16, 2013 9:19:19 AM]    Finishing UtilSession at Wed Jan 16 09:19:19 GMT 2013
[Jan 16, 2013 9:19:19 AM]    Log file location: /u01/app/ora/product/11.2.0.3/db_1/cfgtoollogs/opatch/opatch2013-01-16_09-19-08AM_1.log
[Jan 16, 2013 9:19:19 AM]    Stack Description: java.lang.RuntimeException:
                             OPatch cannot roll back an online patch while applying a regular patch.
                             Please rollback the online patch(es) " 14632268" manually, and then apply the regular patch(es) " 14474780"
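
The manual rollback it is asking for looks something like the following. This is a sketch for the local instance only, with the SID left as a placeholder; the exact syntax is in the online patch’s own README:

# roll the online patch out of the home and the local instance before re-running opatch auto
/u01/app/ora/product/11.2.0.3/db_1/OPatch/opatch rollback -id 14632268 -connectString <SID>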

Yes, it’s not that difficult to fix up; the frustrating thing here is that the prerequisite checks did not show any issues. It’s pretty clear that either the opatch auto developers have not given any thought to how to properly handle an online applied patch, or the online patching developers have not considered the consequences of online patching for a subsequent opatch auto run.

Online patching is almost like the holy grail: nobody wants downtime. But I just don’t think the current online patching technique is quite fully there yet, and it really doesn’t play at all with opatch auto.

Observing Exadata HCC compression changes when adding columns

This blog posting is very much a follow-on from the previous entry on how data compressed with Exadata HCC compression behaves under changing table definitions. Many thanks to Greg Rahn for the comments on the previous blog entry suggesting a simple mechanism for determining whether the compression level has changed or not.

In this blog posting we add a column to an HCC compressed table and we observe whether the number of blocks in the table changes or not.

As Greg stated in the comments on the previous blog entry, we have 3 possibilities for adding a column:

  1. add column
  2. add column with a default value
  3. add column with a default value but also specify as not null

We start with the same table as in the previous entry:

SQL : db01> create table t_part (
username varchar2(30),
user_id number,
created date )
partition by range (created)
(partition p_2009 values less than (to_date('31-DEC-2009', 'dd-MON-YYYY')) tablespace users,
partition p_2010 values less than (to_date('31-DEC-2010', 'dd-MON-YYYY')) tablespace users,
partition p_2011 values less than (to_date('31-DEC-2011', 'dd-MON-YYYY')) tablespace users,
partition p_2012 values less than (to_date('31-DEC-2012', 'dd-MON-YYYY')) tablespace users )

/

Table created.

SQL : db01> alter table t_part compress for query high

/

Table altered.
SQL : db01> insert /*+ APPEND */ into t_part select * from all_users

488 rows created.

SQL : db01> commit;

Commit complete.

So now we gather stats on the table and see how many blocks the table is consuming:

SQL : db01> exec DBMS_STATS.gather_table_stats(ownname => 'SYS',tabname => 'T_PART', estimate_percent => 100);
PL/SQL procedure successfully completed.

SQL : db01> select table_name, blocks, empty_blocks, avg_row_len , last_analyzed from dba_tables where table_name='T_PART';

TABLE_NAME BLOCKS EMPTY_BLOCKS   AVG_ROW_LEN LAST_ANAL
---------- ------ -------------- ---------- ------------
T_PART        60      0          20         18-MAY-12

This will be our starting point for each of the 3 ways of adding a column. We will always start with this table consuming 60 blocks. We will then add the column and then determine how many blocks the table is consuming after the column has been added.

If the table has undergone decompression from HCC compression, the number of blocks will go up; conversely, if it has not, the number of blocks will remain static.

First we try just adding a column, no default value:

SQL : db01> alter table t_part add city varchar2(30);

Table altered.

SQL : db01> exec DBMS_STATS.gather_table_stats(ownname => 'SYS', tabname => 'T_PART', estimate_percent => 100);

PL/SQL procedure successfully completed.
SQL : db01> select table_name, blocks, empty_blocks, avg_row_len , last_analyzed from dba_tables where table_name='T_PART';

TABLE_NAME BLOCKS EMPTY_BLOCKS AVG_ROW_LEN LAST_ANAL
---------- ------ ----------  ---------- ------------
T_PART        60      0          20         18-MAY-12

So this method has not updated the number of blocks. It’s just a dictionary change. We then drop the table with the purge option and recreate it back to the starting point of 60 blocks. Next we try adding the column with a default value:

SQL : db01> alter table t_part add city varchar2(30) default 'Oxford';
Table altered.
SQL : db01> exec DBMS_STATS.gather_table_stats(ownname => 'SYS', tabname => 'T_PART', estimate_percent => 100);

PL/SQL procedure successfully completed.
SQL : db01>select table_name, blocks, empty_blocks, avg_row_len , last_analyzed from dba_tables where table_name='T_PART';

TABLE_NAME   BLOCKS  EMPTY_BLOCKS  AVG_ROW_LEN  LAST_ANAL
------------ ------ ------------    ---------- -----------
T_PART        192       0             27       18-MAY-12

As we can see, the number of blocks has absolutely rocketed from 60 to 192; this indicates that the data is no longer compressed with HCC compression.

Finally we repeat adding a column with a default value, but this time including the not null condition:


SQL :  db01> alter table t_part add city varchar2(30) default 'Oxford' not null;

Table altered.

SQL :  db01>  exec DBMS_STATS.gather_table_stats(ownname => 'SYS', tabname => 'T_PART', estimate_percent => 100);

PL/SQL procedure successfully completed.
SQL : db01> select table_name, blocks, empty_blocks, avg_row_len , last_analyzed from dba_tables where table_name='T_PART';

TABLE_NAME BLOCKS EMPTY_BLOCKS   AVG_ROW_LEN LAST_ANAL
---------- ------ -------------- ---------- ------------
T_PART        60      0          20         18-MAY-12

We see that with the technique of including a not null clause on the add column with a default value, the number of blocks has not changed, and hence the data must still be HCC compressed, as confirmed with the DBMS_COMPRESSION.GET_COMPRESSION_TYPE function.

Essentially, if any column that you add to an HCC compressed table can be defined as not null, then you can be sure that specifying a default value will not undo your HCC compression.

If you do need to allow nulls, then getting away without a default value would be best, and perhaps only updating recent data rather than all historical data would at least preserve some data as HCC compressed. Be aware that uncompressing HCC compressed data obviously can lead to a large increase in your storage requirements.

Adding Columns and Exadata HCC compression

While everyone is aware of the issues of mixing EHCC compression and OLTP type activities, I had a customer who was interested in finding out what happens upon adding a column to a table that has EHCC compression enabled on it.

As I could not see any definitive statements in the documentation on this particular scenario I ran up some tests to see the behaviour.

First of all they are using partitioning by date range, so we create a partitioned table:

SQL: db01> create table t_part  ( 
username varchar2(30), 
user_id  number, 
created date ) 
partition by range (created) 
( partition p_2009 values less than (to_date('31-DEC-2009', 'dd-MON-YYYY')) tablespace users, 
partition p_2010 values less than (to_date('31-DEC-2010', 'dd-MON-YYYY')) tablespace users, 
partition p_2011 values less than (to_date('31-DEC-2011', 'dd-MON-YYYY')) tablespace users, 
partition p_2012 values less than (to_date('31-DEC-2012', 'dd-MON-YYYY')) tablespace users )

/

Table created

The customer is particularly interested in using partitioning for ILM type scenarios, in that they will compress historical partitions but not more up-to-date ones. Let’s enable HCC compression on the table and check that it’s on:


SQL: db01> alter table t_part compress for query high 
/

Table altered

SQL: db01> select table_name, partition_name, compression, compress_for 
from all_tab_partitions 
where table_name='T_PART' 
/

TABLE_NAME                     PARTITION_NAME                 COMPRESS COMPRESS_FOR 
------------------------------ ------------------------------ -------- ------------ 
T_PART                         P_2009                         ENABLED  QUERY HIGH 
T_PART                         P_2010                         ENABLED  QUERY HIGH 
T_PART                         P_2011                         ENABLED  QUERY HIGH 
T_PART                         P_2012                         ENABLED  QUERY HIGH

Let’s insert some data and check that the actual row is compressed (thanks to Kerry Osborne):


SQL: db01> insert /*+ APPEND */ into t_part select * from all_users 
/ 
3008 rows created
SQL: db01> commit
/
Commit complete

SQL: db01> select max(rowid) from t_part
/

MAX(ROWID) 
------------------ 
AAAexSAANAAHGoUAAN

SQL: db01> select decode( 
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
   16, 'HCC Archive High', 
   32, 'HCC Archive Low', 
   'Unknown Compression Level') compression_type 
from dual;

Enter value for rowid: AAAexSAANAAHGoUAAN 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAexSAANAAHGoUAAN'),

COMPRESSION_TYPE 
------------------------- 
HCC Query High

So we are confident we have a row that is compressed. Now we add a new column to the table and give it a default value, then check again what compression the row has:

SQL: db01> alter table t_part add city varchar2(30) default 'Oxford' 
/

Table altered.

SQL: db01> select decode( 
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
   16, 'HCC Archive High', 
   32, 'HCC Archive Low', 
    'Unknown Compression Level') compression_type 
from dual; 
Enter value for rowid: AAAexSAANAAHGoUAAN 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAexSAANAAHGoUAAN'),

COMPRESSION_TYPE 
------------------------- 
Basic/OLTP Compression

Oh Dear! Our compression has changed.

This is maybe not that surprising. But what if you have a requirement to add a column with no default value, and you only want to update more recent records? Can we avoid downgrading all records from HCC compression?

So we are using the same table and data as before. We will focus on two rows, one in the 2011 partition and one in the 2012 partition.

SQL: db01> select max(rowid) from t_part where created  > TO_DATE('31-Dec-2010', 'DD-MM-YYYY') and created < TO_DATE('01-Jan-2012', 'DD-MM-YYYY');

MAX(ROWID) 
------------------ 
AAAezbAAHAAFwIKAE/

SQL: db01> select decode( 
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
    16, 'HCC Archive High', 
    32, 'HCC Archive Low', 
    'Unknown Compression Level') compression_type 
from dual;  
Enter value for rowid: AAAezbAAHAAFwIKAE/ 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAezbAAHAAFwIKAE/'),

COMPRESSION_TYPE 
------------------------- 
HCC Query High

SQL: db01> select max(rowid) from t_part where created  > TO_DATE('31-Dec-2011', 'DD-MM-YYYY') and created < TO_DATE('31-Dec-2012', 'DD-MM-YYYY');

MAX(ROWID) 
------------------ 
AAAezcAAHAAHdoSADf

SQL: db01> select decode( 
    DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
    16, 'HCC Archive High', 
    32, 'HCC Archive Low', 
    'Unknown Compression Level') compression_type 
from dual; 
Enter value for rowid: AAAezcAAHAAHdoSADf 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAezcAAHAAHdoSADf'),

COMPRESSION_TYPE 
------------------------- 
HCC Query High

Now we add a column to the table and update the records in only the 2012 partition:

SQL: db01> alter table t_part add city varchar2(30);

Table altered.

SQL: db01> update t_part set city='Oxford' where created > to_date('31-Dec-2011', 'DD-MM-YYYY');

448 rows updated.

SQL: db01> commit;

Commit complete.

And now we again check the compression status of our two rows:

SQL: db01> select decode( 
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
   16, 'HCC Archive High', 
   32, 'HCC Archive Low', 
       'Unknown Compression Level') compression_type 
from dual;  
Enter value for rowid: AAAezbAAHAAFwIKAE/ 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAezbAAHAAFwIKAE/'),

COMPRESSION_TYPE 
------------------------- 
HCC Query High

SQL: db01> select decode( 
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
    1, 'No Compression', 
    2, 'Basic/OLTP Compression', 
    4, 'HCC Query High', 
    8, 'HCC Query Low', 
    16, 'HCC Archive High', 
    32, 'HCC Archive Low', 
        'Unknown Compression Level') compression_type 
   from dual; 
Enter value for rowid: AAAezcAAHAAHdoSADf 
old   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', '&rowid'), 
new   2: DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( 'SYS', 'T_PART', 'AAAezcAAHAAHdoSADf'),

COMPRESSION_TYPE 
------------------------- 
Basic/OLTP Compression

So that is great: we have a way of evolving table definitions without having to force the whole set of historical data out of HCC compression.

Creating ASM diskgroups on Exadata with ASMCA

I recently had the chance to create some diskgroups on an Exadata box outside the standard installation procedure. While this is not necessarily Exadata specific, I thought the technique of using ASMCA silently on the command line to create the diskgroups was sufficiently novel for a short blog posting, if for nothing else but to remind myself how to do this in future.

This example uses the disks presented from a quarter rack Exadata system and creates a diskgroup called DATA01:

asmca -silent -createDiskGroup -diskGroupName 'DATA01' -diskList o/192.168.10.14/DATA01_CD_00_cel01,o/192.168.10.14/DATA01_CD_01_cel01,
o/192.168.10.14/DATA01_CD_02_cel01,o/192.168.10.14/DATA01_CD_03_cel01,
o/192.168.10.14/DATA01_CD_04_cel01,o/192.168.10.14/DATA01_CD_05_cel01,
o/192.168.10.14/DATA01_CD_06_cel01,o/192.168.10.14/DATA01_CD_07_cel01,
o/192.168.10.14/DATA01_CD_08_cel01,o/192.168.10.14/DATA01_CD_09_cel01,
o/192.168.10.14/DATA01_CD_10_cel01,o/192.168.10.14/DATA01_CD_11_cel01,
o/192.168.10.15/DATA01_CD_00_cel02,o/192.168.10.15/DATA01_CD_01_cel02,
o/192.168.10.15/DATA01_CD_02_cel02,o/192.168.10.15/DATA01_CD_03_cel02,
o/192.168.10.15/DATA01_CD_04_cel02,o/192.168.10.15/DATA01_CD_05_cel02,
o/192.168.10.15/DATA01_CD_06_cel02,o/192.168.10.15/DATA01_CD_07_cel02,
o/192.168.10.15/DATA01_CD_08_cel02,o/192.168.10.15/DATA01_CD_09_cel02,
o/192.168.10.15/DATA01_CD_10_cel02,o/192.168.10.15/DATA01_CD_11_cel02,
o/192.168.10.16/DATA01_CD_00_cel03,o/192.168.10.16/DATA01_CD_01_cel03,
o/192.168.10.16/DATA01_CD_02_cel03,o/192.168.10.16/DATA01_CD_03_cel03,
o/192.168.10.16/DATA01_CD_04_cel03,o/192.168.10.16/DATA01_CD_05_cel03,
o/192.168.10.16/DATA01_CD_06_cel03,o/192.168.10.16/DATA01_CD_07_cel03,
o/192.168.10.16/DATA01_CD_08_cel03,o/192.168.10.16/DATA01_CD_09_cel03,
o/192.168.10.16/DATA01_CD_10_cel03,o/192.168.10.16/DATA01_CD_11_cel03 -redundancy NORMAL -compatible.asm 11.2.0.0 -compatible.rdbms 11.2.0.0 -sysAsmPassword welcome1 -silent

Note I’ve edited the above to have line breaks after every couple of disks for readability. Of course you can specify your required redundancy level and different compatible parameters. You can see the 3 cells here that make up an Exadata 1/4 rack. You can also see the InfiniBand IP address of each of these cells in the path to each griddisk.

You can see, however, the nice mapping between each of the griddisk names and the storage cells upon which they reside (e.g. DATA01_CD_00_cel01 – the DATA01 griddisk created on celldisk 00 of storage cell cel01). I really like this naming feature of Exadata; it makes life that little bit more straightforward for the administrator.

The only other thing to be aware of: when creating the DBFS_DG diskgroup, the CD_00 and CD_01 griddisks don’t exist, as that storage on each cell is used for the system disks.
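
Once asmca returns, a quick sanity check that the diskgroup was created and mounted is to query v$asm_diskgroup from the ASM instance. A minimal sketch, not part of the original procedure:

SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup where name = 'DATA01';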

UKOUG Exa Day

Had a great day at the UKOUG Exa.. Day yesterday. I was happy, by and large, with how my presentation went; it was a bit irritating having some technical issues with laptops and projectors, but hopefully the audience was entertained enough not to let that annoy them too much. I’ve included a link to the PowerPoint of the presentation; you really need to read the notes to gain an understanding of what I was trying to say!

Apologies that it is a near 7MB download, but I’d be happy to answer any questions you may have on it, and all feedback is welcome.

As for the day itself, there was a great atmosphere at the event, and many, many familiar faces from UKOUG events of the past, in particular many faces from the RAC SIG. I particularly enjoyed Frits Hoogland’s presentation on Exadata and OLTP. He even alluded to the fact that Exadata Smart Flash Logging may not be as beneficial to OLTP as you may be led to believe. The other take-away message from his presentation was to test everything – don’t take things for granted!

Tanel Poder was great to listen to and I think he could have gone on for many hours more talking about Exadata performance.

The day rounded off with great chat and beers for all the delegates courtesy of e-dba.

Exadata Smart Flash Logging

With the 11.2.2.4.0 release of the Exadata storage server software (and providing you are at least at 11.2.0.2 BP11), you will have the opportunity to utilise Exadata Smart Flash Logging. I thought I’d take a look at how much (if any) improvement this feature would provide to a busy production environment.

Have a look at this blog entry on Exadata Smart Flash Logging by Luis Moreno Campos for an introduction to how it works. Basically, two writes are now issued, one to flash and one to your disk-based redo logs; the fastest write is the winner.

First let’s check that we actually have some Exadata Smart Flash Logs available to be used:

CellCLI> list flashlog detail

name                      cel07_FLASHLOG 
cellDisk FD_00_cel07,FD_08_cel07,FD_09_cel07,FD_01_cel07,FD_15_cel07,FD_06_cel07,FD_03_cel07,FD_04_cel07,FD_07_cel07,FD_02_cel07,FD_12_cel07,FD_14_cel07,FD_13_cel07,FD_11_cel07,FD_05_cel07,FD_10_cel07
 creationTime              2012-03-17T15
 degradedCelldisks 
 effectiveSize             512M 
 efficiency                100.0 
 id                        a24a25e5-062e-4be1-bb6b-3168113a5fe8 
 size                      512M 
 status                    normal

First we can see that on this cell there is a flashlog created, of size 512M. It is carved out of each of the 16 flash DOMs in the cell. Consequently this reduces the amount you have for your flash cache, though it’s a very small reduction.

How much use are we getting out of the flashlog? Well, we can look at some metrics:

CellCLI> list metriccurrent where objectType='FLASHLOG'
 FL_ACTUAL_OUTLIERS                 FLASHLOG        0 IO requests 
 FL_BY_KEEP                         FLASHLOG        0 
 FL_DISK_FIRST                      FLASHLOG        6,168,815 IO requests 
 FL_DISK_IO_ERRS                    FLASHLOG        0 IO requests 
 FL_EFFICIENCY_PERCENTAGE           FLASHLOG        100 % 
 FL_EFFICIENCY_PERCENTAGE_HOUR      FLASHLOG        100 % 
 FL_FLASH_FIRST                     FLASHLOG        172,344 IO requests 
 FL_FLASH_IO_ERRS                   FLASHLOG        0 IO requests 
 FL_FLASH_ONLY_OUTLIERS             FLASHLOG        0 IO requests 
 FL_IO_DB_BY_W                      FLASHLOG        286,075 MB 
 FL_IO_DB_BY_W_SEC                  FLASHLOG        13.328 MB/sec 
 FL_IO_FL_BY_W                      FLASHLOG        303,793 MB 
 FL_IO_FL_BY_W_SEC                  FLASHLOG        13.761 MB/sec 
 FL_IO_W                            FLASHLOG        6,341,159 IO requests 
 FL_IO_W_SKIP_BUSY                  FLASHLOG        0 IO requests 
 FL_IO_W_SKIP_BUSY_MIN              FLASHLOG        0.0 IO/sec 
 FL_IO_W_SKIP_LARGE                 FLASHLOG        0 IO requests 
 FL_PREVENTED_OUTLIERS              FLASHLOG        415 IO requests

First off, this is taken on a very busy system:

FL_IO_FL_BY_W_SEC: 13.761 MB/sec

That is saying how much data is being written to flash by smart flash log. Well, that sounds great, but it’s not quite so simple. Remember writes go to both flash and disk.

FL_DISK_FIRST: 6,168,815 IO requests

This metric is actually telling us that 6.1M I/O requests were serviced first by disk. While:

FL_FLASH_FIRST: 172,344 IO requests

Is saying this number went to flash first. Oh, that’s not a great improvement! I make that 2.7% of writes went to flash first.
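
For clarity, that percentage comes straight from the two metrics above:

FL_FLASH_FIRST / (FL_DISK_FIRST + FL_FLASH_FIRST)
  = 172,344 / (6,168,815 + 172,344)
  = 172,344 / 6,341,159
  ≈ 2.7%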

Finally, a word on FL_PREVENTED_OUTLIERS: this is saying there were 415 writes on this cell that would have taken more than 0.5 seconds if there had been no flash logging in place.

I have also checked AWR reports from before and after having flash logging in place. There is very little change. AWR shows an average wait of 3ms for log file parallel write. Have a look at the wait event histogram for this:

We see the vast majority of writes are under 1ms. This was also the case before flash logging was enabled; it has not improved this at all.
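
If you want to see the same distribution outside of AWR, you can pull it from v$event_histogram. A minimal sketch, run as a DBA user on the instance in question:

SQL> select wait_time_milli, wait_count from v$event_histogram where event = 'log file parallel write';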

This is a busy, CPU-bound system; let’s look at the log file sync wait event histogram:

Eurgh! is the only way to describe this.

I think Kevin Closson covered this a mere half-decade ago!

Exadata Flash Storage

Exadata flash storage is provided by the Sun Flash Accelerator F20 PCIe card. Four of these cards are installed in every Exadata storage cell. There is a documentation set available to peruse.

First, we can see these devices using lsscsi:

[root@cel01 ~]# lsscsi |grep  MARVELL 
[8:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdn 
[8:0:1:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdo 
[8:0:2:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdp 
[8:0:3:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdq 
[9:0:0:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdr 
[9:0:1:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sds 
[9:0:2:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdt 
[9:0:3:0]    disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdu 
[10:0:0:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdv 
[10:0:1:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdw 
[10:0:2:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdx 
[10:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdy 
[11:0:0:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdz 
[11:0:1:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdaa 
[11:0:2:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdab 
[11:0:3:0]   disk    ATA      MARVELL SD88SA02 D20Y  /dev/sdac

You can see they are bunched into 4 groups of 4: 8:, 9:, 10:, and 11:. This reflects the fact that the 4 cards each have 4 FDOMs, so on every Exadata cell the flash is presented as 16 separate devices.

We can also use the flash_dom command:


[root@cel01 ~]# flash_dom -l

Aura Firmware Update Utility, Version 1.2.7

Copyright (c) 2009 Sun Microsystems, Inc. All rights reserved..

U.S. Government Rights - Commercial Software. Government users are subject 
to the Sun Microsystems, Inc. standard license agreement and 
applicable provisions of the FAR and its supplements.

Use is subject to license terms.

This distribution may include materials developed by third parties.

Sun, Sun Microsystems, the Sun logo, Sun StorageTek and ZFS are trademarks 
or registered trademarks of Sun Microsystems, Inc. or its subsidiaries, 
in the U.S. and other countries.



 HBA# Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC     WWID                 Serial Number

 1.  /proc/mpt/ioc0    LSI Logic SAS1068E C0     105      011b5c00     0       5080020000fe34c0     465769T+1130A405XA

        Current active firmware version is 011b5c00 (1.27.92) 
        Firmware image's version is MPTFW-01.27.92.00-IT 
        x86 BIOS image's version is MPTBIOS-6.26.00.00 (2008.10.14) 
        FCode image's version is MPT SAS FCode Version 1.00.49 (2007.09.21)


          D#  B___T  Type       Vendor   Product          Rev    Operating System Device Name 
          1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdn    [8:0:0:0] 
          2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdo    [8:0:1:0] 
          3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdp    [8:0:2:0] 
          4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdq    [8:0:3:0]

 2.  /proc/mpt/ioc1    LSI Logic SAS1068E C0     105      011b5c00     0       5080020000fe3440     465769T+1130A405X7

        Current active firmware version is 011b5c00 (1.27.92) 
        Firmware image's version is MPTFW-01.27.92.00-IT 
        x86 BIOS image's version is MPTBIOS-6.26.00.00 (2008.10.14) 
        FCode image's version is MPT SAS FCode Version 1.00.49 (2007.09.21)


          D#  B___T  Type       Vendor   Product          Rev    Operating System Device Name 
          1.  0   0  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdr    [9:0:0:0] 
          2.  0   1  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sds    [9:0:1:0] 
          3.  0   2  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdt    [9:0:2:0] 
          4.  0   3  Disk       ATA      MARVELL SD88SA02 D20Y   /dev/sdu    [9:0:3:0]
.
.

The output above has been edited for brevity. You can even have a look at the devices under /proc/mpt/ioc1 on the filesystem.

We can also of course look at these devices via cellcli:


CellCLI> list physicaldisk where diskType='FlashDisk' 
         FLASH_1_0       1113M086V3      normal 
         FLASH_1_1       1113M086V4      normal 
         FLASH_1_2       1113M086V0      normal 
         FLASH_1_3       1113M086UY      normal 
         FLASH_2_0       1113M0892K      normal 
         FLASH_2_1       1113M086TR      normal 
         FLASH_2_2       1113M0891P      normal 
         FLASH_2_3       1113M0892L      normal 
         FLASH_4_0       1113M086UP      normal 
         FLASH_4_1       1113M086UQ      normal 
         FLASH_4_2       1113M086UT      normal 
         FLASH_4_3       1113M086UN      normal 
         FLASH_5_0       1113M08AGJ      normal 
         FLASH_5_1       1112M07V6U      normal 
         FLASH_5_2       1113M08AKJ      normal 
         FLASH_5_3       1113M08AH5      normal

Again they are presented as 4 lots of 4, with a diskType of FlashDisk. Looking at the detail of one of the flash disks:


CellCLI>  list physicaldisk where diskType='FlashDisk' detail

  name:                   FLASH_5_3 
         diskType:               FlashDisk 
         errCmdTimeoutCount:     0 
         errHardReadCount:       0 
         errHardWriteCount:      0 
         errMediaCount:          0 
         errOtherCount:          0 
         errSeekCount:           0 
         luns:                   5_3 
         makeModel:              "MARVELL SD88SA02" 
         physicalFirmware:       D20Y 
         physicalInsertTime:     2011-12-07T19:00:02+00:00 
         physicalInterface:      sas 
         physicalSerial:         1113M08AH5 
         physicalSize:           22.8880615234375G 
         sectorRemapCount:       0 
         slotNumber:             "PCI Slot: 5; FDOM: 3" 
         status:                 normal

I’ve edited the above to show just the detail of the FLASH_5_3 device, basically the last FDOM slot on the highest numbered PCI slot. You can see the size of each of the FDOMs is 22.8880615234375G, which multiplied by 16 gives 366.21G.

We can also look at the lun level:

CellCLI> list lun where id='5_3' detail 
         name:                   5_3 
         cellDisk:               FD_15_cel01 
         deviceName:             /dev/sdy 
         diskType:               FlashDisk 
         id:                     5_3 
         isSystemLun:            FALSE 
         lunAutoCreate:          FALSE 
         lunSize:                22.8880615234375G 
         overProvisioning:       100.0 
         physicalDrives:         FLASH_5_3 
         status:                 normal

You can see each lun has a celldisk name associated with it, and a sensible naming convention. Finally drilling down into the celldisk detail:

CellCLI> list celldisk where name='FD_15_cel01' detail 
         name:                   FD_15_cel01 
         comment: 
         creationTime:           2012-01-10T10:13:06+00:00 
         deviceName:             /dev/sdy 
         devicePartition:        /dev/sdy 
         diskType:               FlashDisk 
         errorCount:             0 
         freeSpace:              0 
         id:                     8ddbd2c8-8446-4735-8948-d8aea5744b35 
         interleaving:           none 
         lun:                    5_3 
         size:                   22.875G 
         status:                 normal

The final point of interest on the flash cards is the white module in the middle at the top of the card. That is the Energy Storage Module (ESM), and it has a set lifetime. According to the F20 docs, on a V2 its lifetime was expected to be 3 years. You can monitor the health and lifetime of your modules with the following ipmi command:

[root@cel01 ~]# for RISER in RISER1/PCIE1 RISER1/PCIE4 RISER2/PCIE2 RISER2/PCIE5; do ipmitool sunoem cli "show /SYS/MB/$RISER/F20CARD/UPTIME"; done

Connected. Use ^D to exit. 
-> show /SYS/MB/RISER1/PCIE1/F20CARD/UPTIME

 /SYS/MB/RISER1/PCIE1/F20CARD/UPTIME 
    Targets:

    Properties: 
        type = Power Unit 
        ipmi_name = PCIE1/F20/UP 
        class = Threshold Sensor 
        value = 9844.000 Hours 
        upper_nonrecov_threshold = 26220.000 Hours 
        upper_critical_threshold = 25806.000 Hours 
        upper_noncritical_threshold = 25254.000 Hours 
        lower_noncritical_threshold = N/A 
        lower_critical_threshold = N/A 
        lower_nonrecov_threshold = N/A 
        alarm_status = cleared

    Commands: 
        cd 
        show

-> Session closed 
Disconnected

I’ve edited the output above down to just one riser card, just to prevent boredom. You are looking to ensure the value (here showing value = 9844.000 Hours) is less than the upper_noncritical_threshold, which in this case it is. If the value is greater than the threshold, have the ESM replaced.

So far I’ve found the flash cards on both V2 and X2-2 to be very reliable; I’d be interested in hearing others’ thoughts on their reliability.
