Recovering from rm -rf on a datafile

This blog posting is entirely the credit of Frits Hoogland, who pointed out to me that this was possible.

I managed to delete a datafile, that while it had been created in a completely stupid location, was very much in use. I’m sure we’ve all been there, I mean who hasn’t run rm -rf somewhere they shouldn’t have? When I subsequently checked the database and saw no flashback database, I realised this was going to mean the database being recreated. It was non-production of course, so no real big deal, but still a bit of work none the less. Thanks to Frits’ suggestion though, I could have saved myself some of that work!

First lets create a tablespace, in a really dumb location:

SQL> create tablespace temp_test datafile '/tmp/temp_test01.dbf' size 10M;

Tablespace created.

SQL> conn temp_test/temp_test
Connected.
SQL> create table test_table (key number(1));

Table created.

SQL> insert into test_table values (7);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from test_table;

       KEY
----------
         7


So we have a schema temp_test with a simple table that was created in the temp_test datafile (this was default tablespace of temp_test user). Now we are going to remove this temp_test datafile that is in the /tmp directory:

[oracle@localhost fd]$ rm -rf /tmp/temp_test01.dbf 

It’s really gone. Lets test we can’t read from that table again:

[oracle@localhost fd]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 23 10:50:23 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning option

SQL> conn temp_test/temp_test
Connected.
SQL> select * from test_table;

       KEY
----------
         7

SQL> conn / as sysdba
Connected.
SQL> alter system flush buffer_cache;

System altered.

SQL> conn temp_test/temp_test
Connected.
SQL> select * from test_table;
select * from test_table
*
ERROR at line 1:
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/tmp/temp_test01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

For a simple table already buffered, you can still retrieve the data from cache, but as soon as you are forced to attempt a read from disk you are going to error. Oops! That’s not good. So what are the options? flashback database, was my first thought when this happened for real, but that was not available, backups? Well Frits pointed me out a really neat way:

First find a PID of a process we know would have a File Descriptor open for the file we have just removed. Database writer is a good candidate:

[oracle@localhost tmp]$ ps -ef|grep dbw
oracle    3281     1  0 02:19 ?        00:00:04 ora_dbw0_TEST

We could do an lsof on this PID to confirm this is the case:

[oracle@localhost fd]$ /usr/sbin/lsof -p 3281
.
.
.
oracle  3281 oracle  mem    REG  253,0    144776 9919352 /lib64/ld-2.5.so
oracle  3281 oracle  mem    REG  253,0   1722328 9919353 /lib64/libc-2.5.so
oracle  3281 oracle  mem    REG  253,0    615136 9919361 /lib64/libm-2.5.so
oracle  3281 oracle  mem    REG  253,0     22368 8109231 /usr/lib64/libnuma.so.1
oracle  3281 oracle  mem    REG  253,0     23360 9919355 /lib64/libdl-2.5.so
oracle  3281 oracle  mem    REG  253,0    145872 9919356 /lib64/libpthread-2.5.so
oracle  3281 oracle  mem    REG  253,0     53448 9919357 /lib64/librt-2.5.so
oracle  3281 oracle  mem    REG  253,0    114352 9919367 /lib64/libnsl-2.5.so
oracle  3281 oracle  mem    REG  253,0     58949 6620956 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libnque11.so
oracle  3281 oracle  mem    CHR    1,5              4172 /dev/zero
oracle  3281 oracle  mem    REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle  mem    REG  253,0     53880 9919035 /lib64/libnss_files-2.5.so
oracle  3281 oracle  mem    REG  253,0      3768 8091976 /usr/lib64/libaio.so.1.0.1
oracle  3281 oracle  mem    REG  253,0    153574 6633914 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocrutl11.so
oracle  3281 oracle  mem    REG  253,0   3319072 6633913 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocrb11.so
oracle  3281 oracle  mem    REG  253,0   1590995 6633912 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libocr11.so
oracle  3281 oracle  mem    REG  253,0     12755 6618701 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxn2.so
oracle  3281 oracle  mem    REG  253,0  17319952 6633880 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libhasgen11.so
oracle  3281 oracle  mem    REG  253,0    161764 6618402 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libdbcfg11.so
oracle  3281 oracle  mem    REG  253,0    228765 6633882 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libclsra11.so
oracle  3281 oracle  mem    REG  253,0   7955322 6634146 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libnnz11.so
oracle  3281 oracle  mem    REG  253,0   1010297 6620981 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxp11.so
oracle  3281 oracle  mem    REG  253,0    589359 6634326 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libcell11.so
oracle  3281 oracle  mem    REG  253,0     12259 6615927 /u01/app/oracle/product/11.2.0/dbhome_1/lib/libodmd11.so
oracle  3281 oracle    0r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    1w   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    2w   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    3r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    4r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    5u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle    6r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    7r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    8r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle    9r   CHR    1,3       0t0    4170 /dev/null
oracle  3281 oracle   10r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   11r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   12u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle   13r   REG  253,0   1101312 6627555 /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
oracle  3281 oracle   14r   DIR    0,3         0   14239 /proc/3281/fd
oracle  3281 oracle   15r   CHR    1,5       0t0    4172 /dev/zero
oracle  3281 oracle   16u   REG  253,0      1544 6621005 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
oracle  3281 oracle   17uR  REG  253,0        24 9919350 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkTEST
oracle  3281 oracle   18r   REG  253,0   1101312 6627555 /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
oracle  3281 oracle   19u  IPv6  87410       0t0     UDP *:19457 
oracle  3281 oracle  256u   REG  253,0   9748480 6514798 /u01/app/oracle/oradata/TEST/control01.ctl
oracle  3281 oracle  257u   REG  253,0   9748480 6514799 /u01/app/oracle/fast_recovery_area/TEST/control02.ctl
oracle  3281 oracle  258uW  REG  253,0 754982912 6514790 /u01/app/oracle/oradata/TEST/system01.dbf
oracle  3281 oracle  259uW  REG  253,0 629153792 6514791 /u01/app/oracle/oradata/TEST/sysaux01.dbf
oracle  3281 oracle  260uW  REG  253,0 120594432 6514792 /u01/app/oracle/oradata/TEST/undotbs01.dbf
oracle  3281 oracle  261uW  REG  253,0   5251072 6514793 /u01/app/oracle/oradata/TEST/users01.dbf
oracle  3281 oracle  262uW  REG  253,0 362422272 6515122 /u01/app/oracle/oradata/TEST/example01.dbf
oracle  3281 oracle  263uW  REG  253,0  30416896 6514823 /u01/app/oracle/oradata/TEST/temp01.dbf
oracle  3281 oracle  264uW  REG  253,0  10493952 5761580 /tmp/temp_test01.dbf (deleted)

Missed out lots of /dev/shm entries in the above. We can see on the last line, the /tmp/temp_test01.dbf datafile and we see it’s marked as deleted. We can also see it has a file descriptor 264. You can also see this from the following directory:

[oracle@localhost fd]$ ls -ltar /proc/3281/fd/
total 0
dr-xr-xr-x 7 oracle oinstall  0 Mar  6 13:38 ..
dr-x------ 2 oracle oinstall  0 Mar  6 13:38 .
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 9 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 8 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 7 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 6 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 5 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 3 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 263 -> /u01/app/oracle/oradata/TEST/temp01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 262 -> /u01/app/oracle/oradata/TEST/example01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 261 -> /u01/app/oracle/oradata/TEST/users01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 260 -> /u01/app/oracle/oradata/TEST/undotbs01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 259 -> /u01/app/oracle/oradata/TEST/sysaux01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 258 -> /u01/app/oracle/oradata/TEST/system01.dbf
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 257 -> /u01/app/oracle/fast_recovery_area/TEST/control02.ctl
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 256 -> /u01/app/oracle/oradata/TEST/control01.ctl
l-wx------ 1 oracle oinstall 64 Apr 23 10:45 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 19 -> socket:[87410]
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 18 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 17 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkTEST
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 16 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 15 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 14 -> /proc/3281/fd
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 13 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Apr 23 10:45 12 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_TEST.dat
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 11 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 10 -> /dev/zero
l-wx------ 1 oracle oinstall 64 Apr 23 10:45 1 -> /dev/null
lr-x------ 1 oracle oinstall 64 Apr 23 10:45 0 -> /dev/null
lrwx------ 1 oracle oinstall 64 Apr 23 10:48 264 -> /tmp/temp_test01.dbf (deleted)

Where we have used the PID of the dbw process. Again we see the file is marked as deleted and again it’s FD 264. Now this is the idea Frits mentioned and we can start actually recovering the data:

[oracle@localhost fd]$ cat 264 > /tmp/test01.dbf

From the /proc/3281/fd/ directory I ran a cat on the File Descriptor number and sent it to another filename. test01.dbf. Now you have recovered the data, you can switch the tablespace to use this datafile:

SQL> alter database datafile '/tmp/temp_test01.dbf' offline;

Database altered.

SQL> alter database rename file '/tmp/temp_test01.dbf' to '/tmp/test01.dbf';

Database altered.

SQL> alter database datafile '/tmp/test01.dbf' online;
alter database datafile '/tmp/test01.dbf' online
*
ERROR at line 1:
ORA-01113: file 6 needs media recovery
ORA-01110: data file 6: '/tmp/test01.dbf'


SQL> recover datafile '/tmp/test01.dbf';
Media recovery complete.
SQL> alter database datafile '/tmp/test01.dbf' online;

Database altered.

SQL> select * from temp_test.test_table;

       KEY
----------
         7

This is such a cool trick. I still would consider a very last option, particularly in a production environment. In fact, you don’t really want to be creating tablespaces in the /tmp directory or any other stupid location in the first place!

About these ads
Leave a comment

19 Comments

  1. Interesting one… Thanks for that
    Thomas

    Reply
  2. :-) Like I said in my tweet…it’s not deleted just a little more difficult to see… nice one.

    Reply
  3. The first thing I would do is find out who is carelessly using the -r flag to remove one file.

    Reply
    • jarneil

       /  April 25, 2013

      Hi Seth,

      Yeah, I take the point about the -r flag. Tsk, tsk!

      jason.

      Reply
  4. Thanks Jason for sharing this. :)

    Reply
  5. Hi, recovery may fail if you copy the file while it is written. You must put the tablespace in backup mode before. Or better: put the tablespace in read only so that you don’t need to offline/recover.

    Reply
  6. Venkata

     /  May 2, 2013

    Thanks for sharing…

    Reply
  7. GG

     /  May 4, 2013

    Nice artical, but have a question..If I have multiple DBW processes running for the database, which process do I need to look first? and is this scenario applicable only for file created in /mnt mount point?
    -GG

    Reply
    • GG

       /  May 4, 2013

      sorry*
      /tmp

      Reply
      • jarneil

         /  May 4, 2013

        Hi,

        With multiple dbwr you should be able to look at either for a particular instance.

        It will work for files other /tmp it is or specific to that.

  8. I don’t see how you could use flashback database to retrieve a lost datafile, even if it’s enabled? Or am I missing something?

    Reply
    • jarneil

       /  May 14, 2013

      Hi Matthias,

      You are absolutely correct! and it shows you how much of crutch I find flashback database.

      jason.

      Reply
  9. Hi,
    Nice artical, thanks for sharing…

    Reply
  10. doganay

     /  May 31, 2013

    thanks very much Jason, very useful article.
    I wrote a shell script runs infinitely using this knowledge,
    I hope it may help someone :)

    http://doganay.wordpress.com/2013/05/30/how-to-undelete-a-datafile/

    Reply
  11. I was going to make the same point as Matthias about flashback database not being of use and found the point in the online docs so I thought I might as well post

    “This command works by undoing changes made by Oracle Database to the datafiles that exist when you run the command. Flashback can fix logical failures, but not physical failures. Thus, you cannot use the command to recover from disk failures or the accidental deletion of datafiles.”

    Reply
  12. Nelson

     /  March 4, 2014

    Great post!
    However, you need to change the following line:
    create table test_table (key number(1));
    To read:
    create table test_table (key number(1)) tablespace temp_test;

    Otherwise, the table will be created in the default tablespace USERS, and people are going to believe everything works, but that is not the case, since the removed datafile will not contain the table.

    Also if the database is not in archivelog mode (mine is a developer db, so it is not), just add the word DROP to the offlining.

    Thanks again for the post!

    Reply
  1. Ever deleted a Data File From a Running Database? | Rants & Raves – The Blog! - My blogspace for rants and raves about anything I like! (or don't like!)
  2. Ever deleted a Data File From a Running Database? - Oracle - Oracle - Toad World
  3. Recovering after the loss of Redo Log files thanks to the file descriptor | bdt's oracle blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 56 other followers

%d bloggers like this: