Exadata Shellshock: IB Switches Vulnerable

Andy Colvin has the lowdown on the Oracle response and fixes for the bash shellshock vulnerability.

However, when I last looked, it seemed Oracle had not said anything about whether the IB switches are vulnerable.

The IB switches have bash running on them, and Oracle have verified that they are indeed vulnerable.


[root@dm01dbadm01 ~]# ssh 10.200.131.22
root@10.200.131.22's password:
Last login: Tue Sep 30 22:46:41 2014 from dm01dbadm01.e-dba.com
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.
[root@dm01sw-ibb0 ~]# echo $SHELL
/bin/bash
[root@dm01sw-ibb0 ~]# rpm -qf /bin/bash
bash-3.2-21.el5

We have patched our compute nodes as instructed by Oracle, and once you are no longer vulnerable to the exploit the test shows the following:

env 'x=() { :;}; echo vulnerable' 'BASH_FUNC_x()=() { :;}; echo vulnerable' bash -c "echo test"
bash: warning: x: ignoring function definition attempt
bash: error importing function definition for `BASH_FUNC_x'
test

Note the lack of “vulnerable” in the output.
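
If you want to double-check what is actually installed after patching, a quick sanity check on a compute node looks something like the following – a minimal sketch, and the bash package version reported will depend on the fix Oracle shipped:

[root@dm01dbadm01 ~]# rpm -q bash
[root@dm01dbadm01 ~]# env 'x=() { :;}; echo vulnerable' bash -c "echo test"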

Unfortunately, when we run the same test on the IB switches:


[root@dm01sw-ibb0 ~]# env 'x=() { :;}; echo vulnerable' 'BASH_FUNC_x()=() { :;}; echo vulnerable' bash -c "echo test"
vulnerable
bash: BASH_FUNC_x(): line 0: syntax error near unexpected token `)'
bash: BASH_FUNC_x(): line 0: `BASH_FUNC_x() () { :;}; echo vulnerable'
bash: error importing function definition for `BASH_FUNC_x'
test
[root@dm01sw-ibb0 ~]# bash: warning: x: ignoring function definition attempt
-bash: bash:: command not found
[root@dm01sw-ibb0 ~]# bash: error importing function definition for `BASH_FUNC_x'
> test
> 

It’s vulnerable, as apparently is the ILOM. There are as yet no fixes available for either of these.

Exadata: What’s Coming

This is based on the presentation Juan Loaiza gave regarding What’s new with Exadata. While a large part of the presentation focussed on what was already available, there are quite a few interesting new features that are coming down the road.

First off was a brief mention of the hardware. I’m less excited about this: the X4 already has plenty of the hardware you could want in terms of CPU, memory and flash, and you’d expect some or all of them to be bumped in the next generation.

New Hardware

This was skated over fairly quickly, but I expect an Exadata X5 in a few months. The X4 was released back in December 2013; the first X4 I saw was in January 2014. I wouldn’t be surprised if Oracle release the X5 on or around the anniversary of that release.

Very little was said about the new hardware that would be in the X5, except that the development cycle follows what Intel releases, that CPU core counts have gone up and that flash capacity has gone up. There was no word on which CPU will be used in the X5.

The compute nodes on an X4-2 have Intel E5-2697 v2 chips: a 12-core chip running at 2.7GHz. I’d expect an increase in core count. The X3 to X4 transition increased the core count by 50%; if that happens again, we get to 18 cores. There is an Intel E5-2699 v3 with 18 cores, but that’s clocked at 2.3GHz.

However, I think I’d be less surprised if they went with the E5-2697 v3, which is a 14-core chip clocked at 2.6GHz. That would be a far more modest increase in the number of cores. The memory speed available with this chip does go up though – it’s DDR4 – which might help with the In-Memory option. I also wonder if they’ll bump the amount of memory supported: this chip (like its predecessor) can go to 768GB.

As I said, it was not mentioned which chip was going to be used, only that Intel had released new chips and that Oracle would be qualifying their use for Exadata over the coming months.

New Exadata Software

There were a bunch of interesting-sounding new features coming down the road. The ones that particularly caught my eye were:

The marketing-friendly term “Exafusion”. Exafusion seems to be about speeding up OLTP; labelled “Hardware Optimized OLTP Messaging”, it’s a reimplementation of cache fusion in which messages bypass the network stack, leading to a performance improvement.

Columnar Flash Cache – Exadata automatically reformatting HCC data, as it is written to flash, into a pure column store for analytic workloads. Dual formats are stored.

Database snapshots on Exadata. This seems designed with pluggable databases in mind, for producing fast clones for dev/test environments. This was clearly a gap with ASM as used on Exadata, whereas ACFS already does snapshots.
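
For comparison, an ACFS snapshot today is a one-liner. A minimal sketch, with a made-up snapshot name and mount point (-w makes the snapshot writable):

acfsutil snap create -w devclone1 /u02/app/oracle/acfsmount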

Currently the latest Linux release available on Exadata is 5.10. Upgrading across major releases is not supported – it would have required reimaging, which is not a pretty prospect. Thankfully, Oracle are going to allow and enable upgrading in place to 6.5.

There was some talk about reducing I/O outliers, both when reading from hard disk and when writing to flash.

Currently with IORM you can only enable or disable access to flash for a particular database. Full IORM seems to be coming for flash.
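
To illustrate the current on/off control, here is a hedged CellCLI sketch with a made-up database name – check the storage server documentation for the exact directive syntax on your cell software version:

CellCLI> ALTER IORMPLAN dbplan=((name=PRODDB, flashcache=off), (name=other, flashcache=on))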

The final new feature that caught my eye was the long-rumoured virtualisation coming to Exadata: OVM is coming. The ODA, for example, has had VM capability for some time, so it’s in some ways an obvious extension. With the increasing number of cores, I expect lots of smaller organisations may not actually need them all, and may feel that even if they could turn unused cores off, it’s a waste to buy hardware they cannot use.

I’m hoping to NOT see OVM on an Exadata in the wild anytime soon.

Software on Silicon

One final point, almost tucked out of sight, was that Juan had a little bullet point about “software on silicon”. Now this has me confused. My understanding is that when Larry was talking about this, it was specifically about SPARC. That I can understand, as Oracle controls what goes on the chip.

Ignoring the SPARC SuperCluster, there is no SPARC in Exadata. So that leaves either a closer collaboration with Intel or a move to SPARC. Closer collaboration with Intel is a possibility – Oracle had first dibs on the E7-8895 v2 for the X4-8.

I can’t imagine the compute nodes changing to SPARC; that wouldn’t make sense. But “software on silicon” is a bit like offloading…

Exadata software definitely keeps moving forward, and the difference between running Oracle on Exadata and on non-Exadata grows ever wider with each “Exadata only” feature.

Free Oracle event with beer: London 3/7/2014

A quick note to drum up interest in a free Oracle event being hosted by e-dba in London on the evening of 3rd July 2014.

The aim is to have a series of informal get togethers over beer, pizza and some good Oracle speakers.

The theme of the first event is based around upgrading.

We have Jonathan Lewis presenting on general principles and strategies for upgrading the RDBMS.

I will be doing a couple of live demos, including an in-place 12c upgrade in 15 minutes, and hopefully showing how to migrate from an 11gR2 database to a new 12c pluggable database with a full transportable Data Pump import – a new migration method in 12c.

Not to mention the discussion on Swingbench and SLOB.

Did I mention the free beer?

Hope to see you there.

Oracle ASM 12c: New Features

Last week I was lucky enough to be presenting at the UKOUG AIM SIG. There was a decent enough crowd in attendance, and there were some really interesting talks from some really good speakers. In particular, I found Chris Lawless an engaging speaker on replication, and Dave Webster really held the audience’s attention late in the day.

I was giving a presentation on the new features available to you with 12c ASM. The presentation is below. What you don’t get from the ppt, though, is the various demos I did – in particular, seeing Flex ASM in action on my 4-node 12c RAC demo cluster (there’s a small sketch of the relevant commands at the end of this post).

I should confess, the above isn’t quite what I presented as I did pictures instead of text for the new features.

For clearest understanding, you probably want to download the ppt and actually read the notes attached to each slide.
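
If you want to poke at Flex ASM on a 12c cluster yourself, here is a minimal sketch of the sort of srvctl commands I used in the demo – the cardinality value of 3 is just an example:

srvctl status asm -detail
srvctl config asm
srvctl modify asm -count 3

The first two show whether ASM is running in Flex mode and how many ASM instances the cluster is configured for; the last changes that cardinality.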

Thoughts on UKOUG tech 13

So the annual UKOUG technology conference has come and gone for yet another year. This time it was in a new location, having moved away from its regular berth in Birmingham. Manchester is not a city I’m that familiar with; the only previous time I had been there was on a trip with my wife, when she drove me to a near-death experience with an oncoming tram.

Thankfully, the only near-death experience for me this year was a laptop failure during a demo, though I did hear of some folks getting into a scrape or two. 😉


Yes, there were fewer people about than in previous years, but that can be explained by the Apps folks having split off to their own conference. The venue was, I thought, pretty reasonable; while some rooms were pretty small, as lots of people commented, it is better presenting in a full small room than a fairly empty large room. So for me, the venue worked fine. Manchester is likely to be harder for more people to get to than Birmingham though, and I will be very interested to see which city the conference will be in next year – I understand it will be moving again.

So onto my experience at the conference. Certainly, my conference experience has changed a lot over the past few years, and the meeting-people aspect has come more to the fore. I had a strange conference in terms of the sessions I attended: I barely attended a database session! It was all storage/IO or operating systems. In some ways you could look at this as a bit alarming, as I’m not sure the value-add is necessarily at that level of the stack; however, for me it’s really where my career has been for most of the time.

The quality of the presentations I did see was outstanding, and that for me is a critical thing about the UKOUG Annual Conference: the quality of the presentations. I don’t think I saw a bad one. The highlights for me, though, were the following three:

Luca Canali: Storage Latency for Oracle DBAs:

Luca’s work is just awesome, and he has developed some latency visualisation tools which were very interesting to see.

RAC and GI Round Table:

Round tables can either be really eye-opening or fairly uninteresting. It all depends on who is contributing what. This roundtable was in the eye-opening camp. Joel Goodman chaired this excellently and it was great to see the only 12c Grid Infrastructure implementation in the room was one done by e-dba and my colleague Svetoslav Gyurov.

Alex Gorbachev: Hadoop for Oracle DBAs:

This was actually given at Oak Table World, running alongside the UKOUG conference. It was an excellent, eye-opening presentation on Hadoop.

The other thing that is important to me at conferences, compared to say 6-7 years ago, is presenting. I’m not sure I’d like to go to UKOUG and not present. Sadly, this year, due to other commitments from colleagues, I ended up picking up 4 presentations. This is too much for me to focus on at once. Thankfully, 3 of them were on Exadata, and one of those was a panel discussion. But 4 is quite a stressful amount.

So, some thoughts on my presentations.

Exadata Consolidations and Migrations: Getting it Right Up Front

This is a presentation I have delivered quite a few times and am very familiar with the content. Thankfully it went well and I was reasonably happy with the delivery, and I was able to add some material on the (now) newly available Exadata X4s to keep it fresh! You can grab this presentation if you are interested. Be aware there is lots of text to read in the notes field, even if the slides are fairly minimal.

Next were two on Tuesday:

Turbo-Charged Analytics from Exalytics with Exadata – The Reality Behind The Hype

I stepped in at the last minute for this one, and did 20 minutes on Exadata. I did not like this presentation – it was too much of a cut ‘n’ shut job. One attendee complained afterwards that there was not any content on actually using Exalytics with Exadata.

I couldn’t fail to agree.

Linux Containers

So, I had not heard of Linux Containers until about 2 weeks prior to this presentation, but it turned out this was the one I had most been looking forward to. I’d done the most work leading up to UKOUG on this presentation as I had to learn it from scratch in the 2 weeks (as well as the other presentations, and the day-to-day job!). This was meant to show what Containers were, why you might use them and then demo them in action. I really thought the slides looked cracking (someone please take me aside and have a word if I’m out to lunch on this), and though I was hesitant about the content at times, I thought I’d almost pulled it off.

Then the demos, which at first were going fine, started to kill my laptop. And I mean kill. Not just the demos not working, but the laptop being totally, utterly unresponsive, so I could not even get back to PowerPoint from my VM. It was horrible. The presentation, though about 80% done, just came to an abrupt car crash of an ending. It took about 15 minutes after this to even power off my laptop!

As Tom Kyte put it the next day:

demos

Image courtesy of Marc Fielding.

My final “presentation” was on Wednesday, but this time it was a panel discussion with Andy Colvin and Frits Hoogland.

Real World Exadata Configuration Tips

Both Andy and Frits are awesome and it was a privilege and a pleasure to share the stage with them. We had an excellent discussion and lots of interaction with the audience which I think made it a really worthwhile hour.

I’d love to do something like that again!

panel

Again, image courtesy of Marc Fielding.

It was a great conference, I really had a great time, and I feel lucky and indeed proud to be part of a great Oracle community and to know so many outstanding individuals.

VirtualBox 4.30 on OS X 10.9 Mavericks

I’ve been using OS X for a very long time, and one of the Applications that is invaluable in my day to day work is VirtualBox. It’s a great application allowing you to work with various Oracle versions and products, and I have quite a few linux VMs with differing DB versions inside them.

I’m also a bit rash when it comes to upgrading my OS X version, and with OS X Mavericks released on the 22nd of October and best of all FREE, it was a hard combination to resist. Well, resist I did for all of a morning! As I had recently upgraded to VirtualBox 4.30 I was quietly confident there would be no issues….

Oh boy, I often get into trouble with my sunny optimism.

The first issue you might hit with VirtualBox 4.30 on OS X 10.9 – a “kernel driver not loaded” error – has actually been well covered in this forum posting. The script there works quite well at loading the drivers.

However, when I did this, I was still getting this error pop up, and no VMs starting:

vbox

This was not good. I then reinstalled VirtualBox 4.30 and voilà, I could start a VM again. Happy days. Then, as this was on my desktop, I closed it down for the evening and came back the next day. Same issue as above: no VMs would start. Grrrr. Once again a reinstall fixed the issue.

I knew something was getting loaded differently at boot time than when VirtualBox was freshly installed. The forum posting above gave me a clue as to what it might be, so it was kextstat to the rescue!

kextstat will show you what kernel extensions you have running, just like the Linux lsmod command.

So I ran kextstat after a fresh reboot:

jarneil> kextstat
.
.
118    3 0xffffff7f8226b000 0x43000    0x43000    org.virtualbox.kext.VBoxDrv (4.2.18)
119    0 0xffffff7f822ae000 0x5000     0x5000     org.virtualbox.kext.VBoxNetFlt (4.2.18)
120    0 0xffffff7f822b3000 0x6000     0x6000     org.virtualbox.kext.VBoxNetAdp (4.2.18)
121    0 0xffffff7f822b9000 0x8000     0x8000     org.virtualbox.kext.VBoxUSB (4.2.18)
.
.

So, among a whole bunch of other kernel extensions, there were the 4 VirtualBox modules – all loaded from the previous version (4.2.18) I had been running before upgrading to 4.30, which was *days* before I upgraded to OS X 10.9!

Now it was just a case of making sure this lot did not load again; clearly they had not been removed properly upon upgrading from 4.2.18 to 4.30.

I used locate to find that these 4.2.18 kexts were being loaded from /Library/Extensions – there was a directory in there for each of the kexts listed above. If you read the Info.plist file for each of them, it was clear these were the 4.2.18 versions.
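
For what it’s worth, here is a minimal sketch of the checks and cleanup involved. The VBoxDrv.kext (and friends) directory names are an assumption based on the bundle identifiers kextstat reported; verify what is actually sitting in your /Library/Extensions before deleting anything:

grep -A1 CFBundleVersion /Library/Extensions/VBoxDrv.kext/Contents/Info.plist
sudo rm -rf /Library/Extensions/VBoxDrv.kext /Library/Extensions/VBoxNetFlt.kext
sudo rm -rf /Library/Extensions/VBoxNetAdp.kext /Library/Extensions/VBoxUSB.kext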

After doing rm -rf on these old VirtualBox directories in /Library/Extensions, I can now happily reboot and *still* have my VMs working! 😀

OpenWorld 2013

I’ve just come back from my 5th trip to Oracle OpenWorld. While there is something very special about your first trip to OpenWorld, this has been my favourite trip of the lot. In previous years I’ve blogged about what’s been going on every day, but these days Twitter and @jarneil is where I tend to be more active, so I thought I’d give a general overview of my OpenWorld experience this year.

I was quite lucky this year to get a presentation selected: “Exadata From Installation to Go Live: Experiences From Banking And Retail”. I was delighted by the number of attendees who turned up to it – well over 150, I reckon. While I have presented many times at various UKOUG events over the years, it feels slightly different presenting in the US: the attendees were very keen to ask questions and many came up to chat at the end. It certainly makes it all feel worthwhile as a presenter, and I’ll be submitting a presentation for next year for sure.

OakTable World

OakTable World has really become a fixed staple of my OpenWorld experience: a conference within a conference, where some of the most knowledgeable folks in the Oracle community present in a bit more technical detail than you get in the general conference. It really was tremendously well organised by @kylehailey and it was a real privilege to be able to attend this.

Announcements

The big announcement in the Larry Ellison keynote was of course the new Oracle Database In-Memory option. While it sounds like this is a long way from being production ready, it does sound like potentially an awesome feature. It did, however, make me wonder whether it might start to cannibalize Exadata sales: why worry about not having smart scans when you are querying your hot data in memory?

On the Exadata front, the conference was a bit of a damp squib: there was no new hardware announced, apparently due to the timeframe of the recent Intel chip release – it was too late for Oracle. There was no new Exadata Storage Server Software announced either. This really did take me by surprise; the current version has been around for 9 months or so, and I really expected an update. Now, there were sneak peeks regarding what is coming, and there is some good stuff in the offing. In particular, compression of data in flash cache, done in hardware on the card, should essentially ensure you get more bang for your flash cache. There is also QoS for networks.

To ensure offloading occurs for different versions of databases running on the compute nodes, there is going to be a bit of an architectural change in the software running on the cells, with essentially a separate offloading process for each database version. It sounds like this may take a while to appear, and it’s a bit disappointing, to say the least, that 12c has been released with no offloading support.

Interesting Presentations

I thought I’d mention a few of my favourite presentations that I saw, in addition to the OakTable World stuff above. First up is one from Maria Colgan and Jonathan Lewis called Optimizer Bootcamp. It was not only highly educational but also superbly entertaining – definitely the outstanding presentation of the week. I also enjoyed “Oracle Exadata: What’s New, What’s Coming” by Juan Loaiza. Juan is an excellent presenter and comes across as highly knowledgeable about the product. Finally, I’d like to mention “Solving the Toughest Oracle Exadata Problems by Using Tips from Oracle Development”, mostly presented by Scott Heisey. This really did give you a lot of tips on where to look and what to do when something goes wrong with your Exadata. If the slides become available, I’d highly recommend taking a look if you have to manage an Exadata box.

The Social Side

What truly made this year a memorable experience, though, was the social side. After 5 years, I now know quite a few people, and it really is superb being able to talk to some of the smartest folks in the Oracle community. It is this that really made it a great year: from the Oracle Ace Dinner and speaking to a whole bunch of Enkitec folks, to the Pythian party and one of the best stories I heard at OpenWorld from Karl Arao, to an e-dba dinner and some great stories about Oracle UK from James Anthony, and finally the great blogger meetup and speaking with Luca Canali – oh, and not forgetting dinner at the Franciscan Crab and the finale with Martin Nash in the W with a couple of mojitos.

It really is the people that make Oracle OpenWorld a great conference to attend! So many thanks to @e_dba and the Oracle Ace program; without their support, I could not attend.

Cloning Oracle 12c Pluggable Databases

The wait is over and Oracle 12c has finally arrived. Of course there are a lot of interesting and useful-looking new features, but one of the eye-catchers is the major architectural shift to multi-tenancy with the Container Database and Pluggable Database features.

To actually create a container database with pluggable databases is really straightforward, as shown here by Matthew Walden. I must say I’m liking the new dbca; it feels just that little bit nicer to use.

Of course, one of the big ideas behind multi-tenancy is to ramp up the amount of consolidation you can squeeze out of a given server: using a multi-tenant database you can consolidate more databases onto a given amount of server memory and CPU. I’m doing a lot of consolidation on Exadata at the moment, and I can see that this is going to be really useful in certain scenarios. However, I’m not expecting 12c Exadata support anytime soon!

The other big driver for multi-tenancy is in manageability – the abstraction of user data from the oracle supplied metadata enables a lot of possibilities that did not exist previously. What I wanted to show in this post was how easy it is to create a clone of a pluggable database.

I’ve worked a lot with developers in the past, and one of the things I learned during this time is that developers can’t get enough copies of the database they are working on. I don’t think this is a bad thing. In fact, I always took it as read that, as the DBA, it was my role to help developers write the best code they could. It certainly makes for less painful releases. So I was all in favour of giving developers the tools and resources they needed to develop good code, and this very much included copies of the databases they were working on. Testing is a great thing!


[oracle@localhost orcl]$ sq

SQL*Plus: Release 12.1.0.1.0 Production on Wed Jul 3 09:05:10 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select CON_ID, NAME, OPEN_MODE from V$PDBS;

    CON_ID NAME                           OPEN_MODE
---------- ------------------------------ ----------
         2 PDB$SEED                       READ ONLY
         3 PDBORCL                        READ ONLY

So I’ve connected to my container database and I’m looking at what pluggable databases already exist. It just happens I have two at the moment, and we can see that both of these are currently read only. We have the SEED database, which is always present in a container database, and we currently have one pluggable database, PDBORCL. That database just happens to be read only at the moment, but this is handy because:

Currently you can only create a clone of a pluggable database if that pluggable database is read only. This to me seems a bit of a limitation. I expect in a future release this limitation will be lifted.
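
For reference, getting the source pluggable database into read only mode first is itself just a couple of commands – a minimal sketch, assuming it is currently open read write:

SQL> alter pluggable database PDBORCL close immediate;

Pluggable database altered.

SQL> alter pluggable database PDBORCL open read only;

Pluggable database altered.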

However, apart from that I can’t emphasise enough how simple and trivial it is to create a clone of a pluggable database. It’s a simple one command:

SQL> create pluggable database PDB1 from PDBORCL FILE_NAME_CONVERT= ('/oradata/orcl/pdborcl', '/oradata/orcl/pdb1');

Pluggable database created.

Elapsed: 00:00:10.67

Simple. This clone took 10 seconds; the size of the database being cloned was 1.3G. The create pluggable database command has a lot of options, but this was the simplest syntax I could see for creating a clone of a pluggable database. You need the FILE_NAME_CONVERT clause so that the datafiles from the pluggable database you are cloning can be put somewhere else. And this is something to bear in mind: it really is copying the datafiles from the pluggable database you are cloning, and that could mean a lot of space is required.



SQL> select CON_ID, NAME, OPEN_MODE from V$PDBS;

    CON_ID NAME                           OPEN_MODE
---------- ------------------------------ ----------
         2 PDB$SEED                       READ ONLY
         3 PDBORCL                        READ ONLY
         4 PDB1                           MOUNTED

Elapsed: 00:00:00.00

We can see the pluggable database PDB1 above is created in the mounted state, but it’s easy to get it opened:


SQL> alter pluggable database PDB1 open;

Pluggable database altered.

Elapsed: 00:00:03.88
SQL> select CON_ID, NAME, OPEN_MODE from V$PDBS;

    CON_ID NAME                           OPEN_MODE
---------- ------------------------------ ----------
         2 PDB$SEED                       READ ONLY
         3 PDBORCL                        READ ONLY
         4 PDB1                           READ WRITE

Elapsed: 00:00:00.01

The way I can see this being used for developers is that you clone your production environment onto your test environment, with all masking etc. done. You keep this copy as your “Master Developer Copy” and make it read only; from this you can create as many copies for your developers as you wish. If a developer makes a mistake, it’s easy to drop the pluggable database and create another one:

SQL> drop pluggable database PDB1 including datafiles;
drop pluggable database PDB1 including datafiles
*
ERROR at line 1:
ORA-65025: Pluggable database PDB1 is not closed on all instances.


Elapsed: 00:00:00.05

SQL> alter pluggable database PDB1 close;

Pluggable database altered.

Elapsed: 00:00:00.31

SQL> drop pluggable database PDB1 including datafiles;

Pluggable database dropped.

Elapsed: 00:00:00.15

So we can see we have to have the pluggable database closed before we can drop it. To actually drop it took hardly any time at all, subsecond!

As I alluded to earlier, one of the big drawbacks of this method is that the space requirements can become vast if you start making lots of clones of large databases. What I’ve used in the past is storage snapshots to drastically reduce the space requirements when creating tens of copies of the same database: you essentially store one copy of the full datafiles, plus deltas of the changes for each copy.

It’s great to see there is a snapshot option on the create pluggable database command! However, you’ve got to be on the right storage – ACFS or Direct NFS. I’m looking forward to testing this soon.
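
As a hedged sketch of what that should look like on suitable storage (PDB2 and the target path are made up; the key bit is the SNAPSHOT COPY clause):

SQL> create pluggable database PDB2 from PDBORCL SNAPSHOT COPY FILE_NAME_CONVERT=('/oradata/orcl/pdborcl', '/oradata/orcl/pdb2');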

The new multi-tenancy architecture is a major shift for the Oracle database, and this is just one of the possibilities it opens up.

Recovering from rm -rf on a datafile

This blog posting is entirely the credit of Frits Hoogland, who pointed out to me that this was possible.

I managed to delete a datafile that, while it had been created in a completely stupid location, was very much in use. I’m sure we’ve all been there – I mean, who hasn’t run rm -rf somewhere they shouldn’t have? When I subsequently checked the database and saw flashback database was not enabled, I realised this was going to mean recreating the database. It was non-production of course, so no real big deal, but still a bit of work nonetheless. Thanks to Frits’ suggestion though, I could have saved myself some of that work!
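
For reference, the quick check for whether flashback database is even an option is simply the following (the output here is just illustrative of what I saw):

SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
NO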

First let’s create a tablespace in a really dumb location:

SQL> create tablespace temp_test datafile '/tmp/temp_test01.dbf' size 10M;

Tablespace created.

SQL> conn temp_test/temp_test
Connected.
SQL> create table test_table (key number(1));

Table created.

SQL> insert into test_table values (7);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from test_table;

       KEY
----------
         7


So we have a schema temp_test with a simple table that was created in the temp_test datafile (this was the default tablespace of the temp_test user). Now we are going to remove this temp_test datafile in the /tmp directory:

[oracle@localhost fd]$ rm -rf /tmp/temp_test01.dbf 

It’s really gone. Let’s test whether we can read from that table again:

[oracle@localhost fd]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 23 10:50:23 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning option

SQL> conn temp_test/temp_test
Connected.
SQL> select * from test_table;

       KEY
----------
         7

SQL> conn / as sysdba
Connected.
SQL> alter system flush buffer_cache;

System altered.

SQL> conn temp_test/temp_test
Connected.
SQL> select * from test_table;
select * from test_table
*
ERROR at line 1:
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/tmp/temp_test01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

For a simple table already buffered, you can still retrieve the data from the cache, but as soon as you are forced to attempt a read from disk you are going to get an error. Oops! That’s not good. So what are the options? Flashback database was my first thought when this happened for real, but that was not available. Backups? Well, Frits pointed out to me a really neat way:

First find the PID of a process we know will have a file descriptor open for the file we have just removed. The database writer is a good candidate:

[oracle@localhost tmp]$ ps -ef|grep dbw
oracle    3281     1  0 02:19 ?        00:00:04 ora_dbw0_TEST

We could do an lsof on this PID to confirm this is the case:

[oracle@localhost fd]$ /usr/sbin/lsof -p 3281
.
.
.
oracle  3281 oracle  mem    REG  253,0    144776 9919352 /lib64/ld-2.5.so
oracle  3281 oracle  mem    REG  253,0   1722328 9919353 /lib64/libc-2.5.so
oracle  3281 oracle  mem    REG  253,0    615136 9919361 /lib64/libm-2.5.so
oracle  3281 oracle  mem    REG  253,0     22368 8109231 /usr/lib64/libnuma.so.1
oracle  3281 oracle  mem    REG  253,0     23360 9919355 /lib64/libdl-2.5.so
oracle  3281 oracle  mem    REG  253,0    145872 9919356 /lib64/libpthread-2.5.so
oracle  3281 oracle  mem    REG  253,0     53448 9919357 /lib64/librt-2.5.so
oracle  3281 oracle  mem    REG  253,0    114352 9919367 /lib64/libnsl-2.5.so
oracle  3281 oracle  mem    REG  253,0     58949 6620956 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libnque11.so
oracle  3281 oracle  mem    CHR    1,5              4172 /dev/zero
oracle  3281 oracle  mem    REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle  mem    REG  253,0     53880 9919035 /lib64/libnss_files-2.5.so
oracle  3281 oracle  mem    REG  253,0      3768 8091976 /usr/lib64/libaio.so.1.0.1
oracle  3281 oracle  mem    REG  253,0    153574 6633914 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocrutl11.so
oracle  3281 oracle  mem    REG  253,0   3319072 6633913 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocrb11.so
oracle  3281 oracle  mem    REG  253,0   1590995 6633912 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocr11.so
oracle  3281 oracle  mem    REG  253,0     12755 6618701 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxn2.so
oracle  3281 oracle  mem    REG  253,0  17319952 6633880 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libhasgen11.so
oracle  3281 oracle  mem    REG  253,0    161764 6618402 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libdbcfg11.so
oracle  3281 oracle  mem    REG  253,0    228765 6633882 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libclsra11.so
oracle  3281 oracle  mem    REG  253,0   7955322 6634146 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libnnz11.so
oracle  3281 oracle  mem    REG  253,0   1010297 6620981 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxp11.so
oracle  3281 oracle  mem    REG  253,0    589359 6634326 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libcell11.so
oracle  3281 oracle  mem    REG  253,0     12259 6615927 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libodmd11.so
oracle  3281 oracle    0r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    1w   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    2w   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    3r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    4r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    5u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle    6r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    7r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    8r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    9r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle   10r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   11r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   12u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle   13r   REG  253,0   1101312 6627555 /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
oracle  3281 oracle   14r   DIR    0,3         0   14239 /proc/3281/fd
oracle  3281 oracle   15r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   16u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle   17uR  REG  253,0        24 9919350 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkTEST
oracle  3281 oracle   18r   REG  253,0   1101312 6627555 /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
oracle  3281 oracle   19u  IPv6  87410       0t0     UDP *:19457 
oracle  3281 oracle  256u   REG  253,0   9748480 6514798 /u01/app/oracle/oradata/TEST/control01.ctl
oracle  3281 oracle  257u   REG  253,0   9748480 6514799 /u01/app/oracle/fast_recovery_area/TEST/control02.ctl
oracle  3281 oracle  258uW  REG  253,0 754982912 6514790 /u01/app/oracle/oradata/TEST/system01.dbf
oracle  3281 oracle  259uW  REG  253,0 629153792 6514791 /u01/app/oracle/oradata/TEST/sysaux01.dbf
oracle  3281 oracle  260uW  REG  253,0 120594432 6514792 /u01/app/oracle/oradata/TEST/undotbs01.dbf
oracle  3281 oracle  261uW  REG  253,0   5251072 6514793 /u01/app/oracle/oradata/TEST/users01.dbf
oracle  3281 oracle  262uW  REG  253,0 362422272 6515122 /u01/app/oracle/oradata/TEST/example01.dbf
oracle  3281 oracle  263uW  REG  253,0  30416896 6514823 /u01/app/oracle/oradata/TEST/temp01.dbf
oracle  3281 oracle  264uW  REG  253,0  10493952 5761580 /tmp/temp_test01.dbf (deleted)

I’ve missed out lots of /dev/shm entries in the above. On the last line we can see the /tmp/temp_test01.dbf datafile, and we see it’s marked as deleted. We can also see it has file descriptor 264. You can also see this from the following directory:

[oracle@localhost fd]$ ls -ltar /proc/3281/fd/
total 0
dr-xr-xr-x 7 oracle oinstall  0 Mar  6 13:38 ..
dr-x------ 2 oracle oinstall  0 Mar  6 13:38 .
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 9 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 8 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 7 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 6 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 5 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 3 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 263 -> /u01/app/oracle/oradata/TEST/temp01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 262 -> /u01/app/oracle/oradata/TEST/example01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 261 -> /u01/app/oracle/oradata/TEST/users01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 260 -> /u01/app/oracle/oradata/TEST/undotbs01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 259 -> /u01/app/oracle/oradata/TEST/sysaux01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 258 -> /u01/app/oracle/oradata/TEST/system01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 257 -> /u01/app/oracle/fast_recovery_area/TEST/control02.ctl
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 256 -> /u01/app/oracle/oradata/TEST/control01.ctl
l-wx------ 1 oracle oinstall 64 Apr 23 10:45 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 19 -> socket:[87410]
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 18 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 17 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkTEST
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 16 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 15 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 14 -> /proc/3281/fd
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 13 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 12 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 11 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 10 -> /dev/zero
l-wx------ 1 oracle oinstall 64 Apr 23 10:45 1 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 0 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:48 264 -> /tmp/temp_test01.dbf (deleted)

Here we have used the PID of the dbw process. Again we see the file is marked as deleted, and again it’s FD 264. Now for the idea Frits mentioned: we can start actually recovering the data:

[oracle@localhost fd]$ cat 264 > /tmp/test01.dbf

From the /proc/3281/fd/ directory I ran cat on the file descriptor number and sent the output to another filename, test01.dbf. Now you have recovered the data, you can switch the tablespace to use this datafile:

SQL> alter database datafile '/tmp/temp_test01.dbf' offline;

Database altered.

SQL> alter database rename file '/tmp/temp_test01.dbf' to '/tmp/test01.dbf';

Database altered.

SQL> alter database datafile '/tmp/test01.dbf' online;
alter database datafile '/tmp/test01.dbf' online
*
ERROR at line 1:
ORA-01113: file 6 needs media recovery
ORA-01110: data file 6: '/tmp/test01.dbf'


SQL> recover datafile '/tmp/test01.dbf';
Media recovery complete.
SQL> alter database datafile '/tmp/test01.dbf' online;

Database altered.

SQL> select * from temp_test.test_table;

       KEY
----------
         7

This is such a cool trick. I would still consider it a very last option, particularly in a production environment. In fact, you don’t really want to be creating tablespaces in the /tmp directory, or any other stupid location, in the first place!

Online patching: The Good, the Bad, and the Ugly

I’ve worked on 24×7 systems for more than a decade, and I have a real dislike of downtime. For one, it can be a real pain to agree any downtime with the business, and while RAC can and does help when you do work in a rolling fashion, there is still risk.

The promise of online patching has been around for a long time, and it is only recently that I dipped my toe in the water. Unfortunately, online patches are not a panacea, and in this blog posting I’m going to share some of the downsides.

Of course, not all patches can be applied online. If a patch can, the README associated with it will have a section on how to apply the patch online, and when you uncompress the patch there will be a directory called online.
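
You can also ask OPatch itself whether a patch is online-capable. A minimal sketch, using a made-up unzip location – the exact option name may vary between OPatch versions, so check opatch query -help:

$ORACLE_HOME/OPatch/opatch query -is_online_patch /tmp/10219624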

The Good

So first for the good side: the actual application truly can be done online; in that sense it does what it says on the tin. Here I’m running from the unzipped patch directory, and in this example I’m using patch 10219624:

bash-3.2$ /u01/app/oracle/product/11.2.0/db_1/OPatch/opatch apply online -connectString TESTRAC1 -ocmrf /tmp/ocm.rsp 
Oracle Interim Patch Installer version 11.2.0.3.3
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.3
OUI version       : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/10219624_Jan_24_2013_08_54_08/apply2013-01-24_08-54-08AM_1.log

Applying interim patch '10219624' to OH '/u01/app/oracle/product/11.2.0/db_1'
Verifying environment and performing prerequisite checks...
All checks passed.
Backing up files...

Patching component oracle.rdbms, 11.2.0.3.0...
Installing and enabling the online patch 'bug10219624.pch', on database 'TESTRAC1'.


Verifying the update...

Patching in all-node mode.

Updating nodes 'rac2' 
   Apply-related files are:
     FP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt"
     DP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt"
     MP = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/make_cmds.txt"
     RC = "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/remote_cmds.txt"

Instantiating the file "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_files.txt" with actual path.
Propagating files to remote nodes...
Instantiating the file "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt.instantiated" by replacing $ORACLE_HOME in "/u01/app/oracle/product/11.2.0/db_1/.patch_storage/10219624_Dec_20_2012_02_13_54/rac/copy_dirs.txt" with actual path.
Propagating directories to remote nodes...
Patch 10219624 successfully applied
Log file location: /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/10219624_Jan_24_2013_08_54_08/apply2013-01-24_08-54-08AM_1.log

OPatch succeeded.

I’m applying this to a 2-node 11gR2 RAC cluster. You’ll notice that it is applied on ALL nodes: you can’t apply an online patch in RAC to just one node at a time, and you can’t roll back one node at a time either. Also be aware that while the patch is in the Oracle Home on all nodes in the cluster, it has only been applied to the local instance.

Now, I know you are meant to give connection string details like username/password, and can then apply to all instances in a cluster at the same time, but on some systems I work on, I do not have this information, and rely on OS authentication only. This can lead to some pain.
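
For completeness, here is a sketch of the fully specified connect string when you do have credentials. The SIDs, user, password and node names are all made up, and the exact format is described in the patch README:

/u01/app/oracle/product/11.2.0/db_1/OPatch/opatch apply online -connectString TESTRAC1:sys:syspwd:rac1,TESTRAC2:sys:syspwd:rac2 -ocmrf /tmp/ocm.rsp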

You can tell a patch is applied with the following:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  ENABLED

However, on the remote node:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
No patches currently installed

I accept this need not arise if you are able to authenticate properly at installation time. To fix this up you can do the following:

-bash-3.2$ /u01/app/oracle/product/11.2.0/db_1/OPatch/opatch util enableonlinepatch -connectString TESTRAC2 -id 10219624
Oracle Interim Patch Installer version 11.2.0.3.3
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.3
OUI version       : 11.2.0.3.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2013-01-24_09-47-08AM_1.log

Invoking utility "enableonlinepatch"
Installing and enabling the online patch 'bug10219624.pch', on database 'TESTRAC2'.


OPatch succeeded.

The Bad

I’ve found rolling back to be slightly more problematic on the remote node with OS authentication. The rollback always removed the patch from the Oracle Home across all nodes, and always removed it from the instance on the local node. While there is a documented opatch method to then disable the patch in an instance, very similar to the enableonlinepatch above (it’s Disableonlinepatch), I found it did not work with some patches: though opatch reported success, the patch was still enabled.
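
For reference, the disable invocation mirrors the enableonlinepatch one shown earlier; a sketch using my example patch and instance (the exact utility name and casing may vary by OPatch version):

/u01/app/oracle/product/11.2.0/db_1/OPatch/opatch util disableonlinepatch -connectString TESTRAC2 -id 10219624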

Another point to note: restarting an instance does not remove an online-applied patch. There is a directory under $ORACLE_HOME called hpatch that holds the online-applied patch libraries.

To get round this I had to resort to the following oradebug commands:

SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  ENABLED

SQL> oradebug patch disable bug10219624.pch
Statement processed.
SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  DISABLED

SQL> oradebug patch remove bug10219624.pch
Statement processed.
SQL> oradebug patch list

Patch File Name                                   State
================                                =========
bug10219624.pch                                  REMOVED

The oradebug patch list output showing REMOVED then reverts to “No patches currently installed” upon instance restart.

The Ugly

This really caught me out: patches applied online are completely incompatible with a subsequent run of opatch auto. I recently had the situation where I had applied a patch online and then later wanted to run opatch auto to apply further patches. Before running opatch auto I always run the check for conflicts, and this did not give me a clue that opatch auto would not work with the online-applied patch.

However when I ran opatch auto on Bundle Patch 11 the following occurred:

[Jan 16, 2013 9:19:16 AM]    OUI-67303:
                             Patches [   14632268   12880299   13734832 ] will be rolled back.
[Jan 16, 2013 9:19:16 AM]    Do you want to proceed? [y|n]
[Jan 16, 2013 9:19:19 AM]    Y (auto-answered by -silent)
[Jan 16, 2013 9:19:19 AM]    User Responded with: Y
[Jan 16, 2013 9:19:19 AM]    OPatch continues with these patches:   14474780
[Jan 16, 2013 9:19:19 AM]    OUI-67073:UtilSession failed:
                             OPatch cannot roll back an online patch while applying a regular patch.
                             Please rollback the online patch(es) " 14632268" manually, and then apply the regular patch(es) " 14474780".
[Jan 16, 2013 9:19:19 AM]    --------------------------------------------------------------------------------
[Jan 16, 2013 9:19:19 AM]    The following warnings have occurred during OPatch execution:
[Jan 16, 2013 9:19:19 AM]    1) OUI-67303:
                             Patches [   14632268   12880299   13734832 ] will be rolled back.
[Jan 16, 2013 9:19:19 AM]    --------------------------------------------------------------------------------
[Jan 16, 2013 9:19:19 AM]    Finishing UtilSession at Wed Jan 16 09:19:19 GMT 2013
[Jan 16, 2013 9:19:19 AM]    Log file location: /u01/app/ora/product/11.2.0.3/db_1/cfgtoollogs/opatch/opatch2013-01-16_09-19-08AM_1.log
[Jan 16, 2013 9:19:19 AM]    Stack Description: java.lang.RuntimeException:
                             OPatch cannot roll back an online patch while applying a regular patch.
                             Please rollback the online patch(es) " 14632268" manually, and then apply the regular patch(es) " 14474780"

Yes, it’s not that difficult to fix up; the frustrating thing here is that the prerequisite checks did not show any issues. It’s pretty clear that either the opatch auto developers have not given any thought to how to properly handle an online-applied patch, or the online patching developers have not considered the consequences of online patching for a subsequent opatch auto run.

Online patching is almost like the holy grail – nobody wants downtime – but I just don’t think the current online patching technique is quite there yet, and it really doesn’t play well at all with opatch auto.