Table of contents
Preamble
Have you ever sweat when having a restore/recover to perform ? If you are a production DBA surely yes, this is always a tricky situation and having something to support you during this phase is more than welcome. Oracle 11g brings a new tool for DBA called Data Recovery Advisor (DRA).
The tool is available in Database Control/Grid Control as well as command line. I will mainly use command line as who can do more can do less…
Data Recovery Advisor that can be seconded by health monitor to have proactive tests. It could avoid users calling you to notify the error which is comfortable for ones aiming to do things pro actively…
Test has be done on Red Hat Enterprise Linux Server release 5.5 (Tikanga) and Oracle database 11.2.0.2.0.
Missing datafile
To test deletion of a datafile let’s first create a test table and tablespace with something like:
SQL> CREATE smallfile TABLESPACE testtbs datafile '/oracle/data01/test/testtbs01.dbf' SIZE 100m reuse logging extent management local SEGMENT SPACE management auto; TABLESPACE created. SQL> CREATE TABLE yjaquier.test1 TABLESPACE testtbs AS SELECT object_id FROM dba_objects; TABLE created. SQL> SELECT * FROM yjaquier.test1 WHERE rownum<=5; OBJECT_ID ---------- 20 46 28 15 29 |
Then perform a full database backup with RMAN following script:
run { backup database include current controlfile tag 'FULL_BACKUP'; backup archivelog all delete all input tag 'FULL_BACKUP'; } |
Delete the datafile behind the TESTTBS tablespace we just created:
[oracle@server1 ~]# rm /oracle/data01/test/testtbs01.dbf |
And select your test table:
SQL> SELECT COUNT(*) FROM yjaquier.test1; COUNT(*) ---------- 60820 SQL> ALTER SYSTEM flush buffer_cache; SYSTEM altered. SQL> SELECT COUNT(*) FROM yjaquier.test1; SELECT COUNT(*) FROM yjaquier.test1 * ERROR AT line 1: ORA-01116: error IN opening DATABASE FILE 6 ORA-01110: data FILE 6: '/oracle/data01/test/testtbs01.dbf' ORA-27041: unable TO OPEN FILE Linux-x86_64 Error: 2: No such FILE OR DIRECTORY Additional information: 3 |
Remark:
You may still not see it at Oracle level but if you check alert log file you will see that Oracle with reactive health monitor has detected the error:
2011-07-22 10:20:03.754000 +02:00 Checker run found 1 new persistent data failures 2011-07-22 10:21:59.685000 +02:00 Errors in file /oracle/dump/test/diag/rdbms/test/test/trace/test_m000_9996.trc: ORA-01116: error in opening database file 6 ORA-01110: data file 6: '/oracle/data01/test/testtbs01.dbf' ORA-27041: unable to open file Linux-x86_64 Error: 2: No such file or directory Additional information: 3 |
Remark:
The flush of buffer cache may help you to see the error in your SQL session but nothing guaranteed. On Unix/Linux when you use a file the OS allocates inodes (stored in inode tables) and does not directly link to filename. Means that when you delete a file, inodes are still there and only the link to that file is broken. You can then recover data if file is still open using pseudo filesystem /proc (see below).
If you go to RMAN you see the failure listed:
RMAN> list failure; List of Database Failures ========================= Failure ID Priority Status Time Detected Summary ---------- -------- --------- -------------------- ------- 67327 HIGH OPEN 22-jul-2011 10:20:03 One or more non-system datafiles are missing |
Ask RMAN to suggest you what to do with:
RMAN> advise failure; List of Database Failures ========================= Failure ID Priority Status Time Detected Summary ---------- -------- --------- -------------------- ------- 67327 HIGH OPEN 22-jul-2011 10:20:03 One or more non-system datafiles are missing analyzing automatic repair options; this may take some time using channel ORA_DISK_1 analyzing automatic repair options complete Mandatory Manual Actions ======================== no manual actions available Optional Manual Actions ======================= 1. If file /oracle/data01/test/testtbs01.dbf was unintentionally renamed or moved, restore it Automated Repair Options ======================== Option Repair Description ------ ------------------ 1 Restore and recover datafile 6 Strategy: The repair includes complete media recovery with no data loss Repair script: /oracle/dump/test/diag/rdbms/test/test/hm/reco_2213659725.hm |
In our case the file was not renamed or moved so let’s have a look to the repair script:
RMAN> host 'cat /oracle/dump/test/diag/rdbms/test/test/hm/reco_2213659725.hm'; # restore and recover datafile sql 'alter database datafile 6 offline'; restore datafile 6; recover datafile 6; sql 'alter database datafile 6 online'; host command complete |
You can know use this script, as it is or not, to repair the error and manually close the failure or directly use REPAIR FAILURE that will perform exactly what’s suggested and automatically close the failure (in same RMAN session as LIST FAILURE command).
RMAN> repair failure; Strategy: The repair includes complete media recovery with no data loss Repair script: /oracle/dump/test/diag/rdbms/test/test/hm/reco_2213659725.hm contents of repair script: # restore and recover datafile sql 'alter database datafile 6 offline'; restore datafile 6; recover datafile 6; sql 'alter database datafile 6 online'; Do you really want to execute the above repair (enter YES or NO)? YES executing repair script sql statement: alter database datafile 6 offline Starting restore at 22-jul-2011 12:46:44 using channel ORA_DISK_1 channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: specifying datafile(s) to restore from backup set channel ORA_DISK_1: restoring datafile 00006 to /oracle/data01/test/testtbs01.dbf channel ORA_DISK_1: reading from backup piece /oracle/fast_recovery_area/TEST/backupset/2011_07_21/o1_mf_nnndf_20110721_175312_72jlw3vm_.bkp channel ORA_DISK_1: piece handle=/oracle/fast_recovery_area/TEST/backupset/2011_07_21/o1_mf_nnndf_20110721_175312_72jlw3vm_.bkp tag=20110721_175312 channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:07 Finished restore at 22-jul-2011 12:46:51 Starting recover at 22-jul-2011 12:46:51 using channel ORA_DISK_1 starting media recovery archived log for thread 1 with sequence 13 is already on disk as file /oracle/fast_recovery_area/TEST/archivelog/2011_07_21/o1_mf_1_13_72k1byoz_.arc archived log for thread 1 with sequence 14 is already on disk as file /oracle/fast_recovery_area/TEST/archivelog/2011_07_21/o1_mf_1_14_72k1f72m_.arc archived log for thread 1 with sequence 15 is already on disk as file /oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_15_72kbc96g_.arc archived log for thread 1 with sequence 16 is already on disk as file /oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_16_72kszc40_.arc archived log for thread 1 with sequence 17 is already on disk as file /oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_17_72lcjmm1_.arc channel ORA_DISK_1: starting archived log restore to default destination channel ORA_DISK_1: restoring archived log archived log thread=1 sequence=12 channel ORA_DISK_1: reading from backup piece /oracle/fast_recovery_area/TEST/backupset/2011_07_21/o1_mf_annnn_20110721_175312_72jmky9m_.bkp channel ORA_DISK_1: piece handle=/oracle/fast_recovery_area/TEST/backupset/2011_07_21/o1_mf_annnn_20110721_175312_72jmky9m_.bkp tag=20110721_175312 channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:03 archived log file name=/oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_12_72lo8xcv_.arc thread=1 sequence=12 channel default: deleting archived log(s) archived log file name=/oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_12_72lo8xcv_.arc RECID=966 STAMP=757169214 archived log file name=/oracle/fast_recovery_area/TEST/archivelog/2011_07_21/o1_mf_1_13_72k1byoz_.arc thread=1 sequence=13 archived log file name=/oracle/fast_recovery_area/TEST/archivelog/2011_07_21/o1_mf_1_14_72k1f72m_.arc thread=1 sequence=14 archived log file name=/oracle/fast_recovery_area/TEST/archivelog/2011_07_22/o1_mf_1_15_72kbc96g_.arc thread=1 sequence=15 media recovery complete, elapsed time: 00:00:04 Finished recover at 22-jul-2011 12:47:00 sql statement: alter database datafile 6 online repair failure complete |
And your test table is back to life:
SQL> SELECT * FROM yjaquier.test1 WHERE rownum<=5; OBJECT_ID ---------- 20 46 28 15 29 |
Missing control files
Should works the same as missing datafiles but here it’s a bit different. Delete all your controlfiles (unless a copy of a remaining one when database is down solve the issue) and issue a switch logfile to generate a write in controlfiles. Then you expect to see an error in RMAN but nothing even if VALIDATE DATABASE see that controlfiles are missing:
RMAN> list failure; no failures found that match specification RMAN> validate database; Starting validate at 25-JUL-11 using channel ORA_DISK_1 channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: specifying datafile(s) for validation input datafile file number=00004 name=/oracle/data01/test/users01.dbf input datafile file number=00003 name=/oracle/rbs/test/undotbs01.dbf input datafile file number=00001 name=/oracle/sys/test/system01.dbf input datafile file number=00002 name=/oracle/sys/test/sysaux01.dbf input datafile file number=00005 name=/oracle/data01/test/users02.dbf input datafile file number=00006 name=/oracle/data01/test/testtbs01.dbf channel ORA_DISK_1: validation complete, elapsed time: 00:05:05 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 1 OK 0 12841 72960 25053958 File Name: /oracle/sys/test/system01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 47093 Index 0 9943 Other 0 3083 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 2 OK 0 14661 69120 25053963 File Name: /oracle/sys/test/sysaux01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 19865 Index 0 21742 Other 0 12852 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 3 OK 0 1 167680 25053963 File Name: /oracle/rbs/test/undotbs01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 0 Index 0 0 Other 0 167679 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 4 OK 0 1 1024000 22976000 File Name: /oracle/data01/test/users01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 1013550 Index 0 493 Other 0 9956 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 5 OK 0 95769 1024000 22975999 File Name: /oracle/data01/test/users02.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 917472 Index 0 0 Other 0 10759 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 6 OK 0 4 12801 24040671 File Name: /oracle/data01/test/testtbs01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 186 Index 0 0 Other 0 12610 channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: specifying datafile(s) for validation RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of validate command on ORA_DISK_1 channel at 07/25/2011 16:20:20 ORA-19625: error identifying file /oracle/ctrl/test/control01.ctl ORA-27037: unable to obtain file status Linux-x86_64 Error: 2: No such file or directory Additional information: 3 RMAN> list failure; no failures found that match specification |
If you exit from RMAN session you cant then not initiate it again:
[oracle@server1 ~]$ rman target / catalog rman/rman@rmanrepository Recovery Manager: Release 11.2.0.2.0 - Production on Mon Jul 25 16:44:18 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-00554: initialization of internal recovery manager package failed RMAN-06003: ORACLE error from target database: ORA-00210: cannot open the specified control file ORA-00202: control file: '/oracle/ctrl/test/control01.ctl' ORA-27041: unable to open file Linux-x86_64 Error: 2: No such file or directory Additional information: 3 |
And you are obliged to abort your instance:
SQL> shutdown IMMEDIATE ORA-00210: cannot OPEN the specified control FILE ORA-00202: control FILE: '/oracle/ctrl/test/control01.ctl' ORA-27041: unable TO OPEN FILE Linux-x86_64 Error: 2: No such FILE OR DIRECTORY Additional information: 3 SQL> shutdown abort ORACLE instance shut down. |
Then you can re-issue your RMAN connection and start recovery:
RMAN> startup nomount Oracle instance started Total System Global Area 1068937216 bytes Fixed Size 2233336 bytes Variable Size 801115144 bytes Database Buffers 255852544 bytes Redo Buffers 9736192 bytes RMAN> list failure; no failures found that match specification RMAN> alter database mount; RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of alter db command at 07/25/2011 16:47:36 ORA-00205: error in identifying control file, check alert log for more info RMAN> list failure; List of Database Failures ========================= Failure ID Priority Status Time Detected Summary ---------- -------- --------- ------------- ------- 68646 CRITICAL OPEN 25-JUL-11 Control file /oracle/ctrl/test/control02.ctl is missing 68643 CRITICAL OPEN 25-JUL-11 Control file /oracle/ctrl/test/control01.ctl is missing |
Then complete recover is execution of:
RMAN> repair failure; RMAN> recover database; RMAN> sql 'alter database open resetlogs'; |
Remark:
As a new incarnation of your database has been created a new full backup is strongly suggested.
Health Monitor
Introduced with 11g health monitor checks various component of the database:
SQL> SET lines 200 SQL> SET pages 100 SQL> col description FOR a50 SQL> SELECT name, cls_name, description FROM v$hm_check; NAME CLS_NAME DESCRIPTION ---------------------------------------------------------------- --------------- -------------------------------------------------- HM Test CHECK GENERIC CHECK FOR health monitor functionality DB Structure Integrity CHECK PERSISTENT_DATA Checks integrity OF ALL DATABASE files CF Block Integrity CHECK PERSISTENT_DATA Checks integrity OF a control FILE block Data Block Integrity CHECK PERSISTENT_DATA Checks integrity OF a data FILE block Redo Integrity CHECK PERSISTENT_DATA Checks integrity OF redo LOG content Logical Block CHECK PERSISTENT_DATA Checks logical content OF a block TRANSACTION Integrity CHECK PERSISTENT_DATA Checks a TRANSACTION FOR corruptions Undo SEGMENT Integrity CHECK PERSISTENT_DATA Checks integrity OF an undo SEGMENT No Mount CF CHECK PERSISTENT_DATA Checks control FILE IN NOMOUNT MODE Mount CF CHECK PERSISTENT_DATA Checks control FILE IN mount MODE CF Member CHECK PERSISTENT_DATA Checks a multiplexed copy OF the control FILE ALL Datafiles CHECK PERSISTENT_DATA Checks ALL datafiles IN the DATABASE Single Datafile CHECK PERSISTENT_DATA Checks a data FILE TABLESPACE CHECK CHECK PERSISTENT_DATA Checks a TABLESPACE LOG GROUP CHECK PERSISTENT_DATA Checks ALL members OF a LOG GROUP LOG GROUP Member CHECK PERSISTENT_DATA Checks a particular member OF a LOG GROUP Archived LOG CHECK PERSISTENT_DATA Checks an archived LOG Redo Revalidation CHECK PERSISTENT_DATA Checks redo LOG content IO Revalidation CHECK PERSISTENT_DATA Checks FILE accessibility Block IO Revalidation CHECK PERSISTENT_DATA Checks FILE accessibility Txn Revalidation CHECK PERSISTENT_DATA Revalidate corrupted TRANSACTION Failure Simulation CHECK PERSISTENT_DATA Creates dummy failures Dictionary Integrity CHECK PERSISTENT_DATA Checks dictionary integrity ASM Mount CHECK ASM Diagnose mount failure ASM Allocation CHECK ASM Diagnose allocation failure ASM Disk Visibility CHECK ASM Diagnose ADD disk failure ASM FILE Busy CHECK ASM Diagnose FILE DROP failure 27 ROWS selected. |
If we take the example of the missing database chapter:
SQL> SET lines 200 SQL> SET pages 100 SQL> SELECT * FROM (SELECT name, run_mode, start_time, status FROM v$hm_run WHERE check_name='DB Structure Integrity Check' ORDER BY run_id DESC) WHERE rownum<=15; RUN_ID NAME RUN_MODE START_TIME STATUS ---------- -------------------------------- -------- --------------------------------------------------------------------------- ----------- 67506 HM_RUN_67506 REACTIVE 22-JUL-11 11.20.08.051228 AM COMPLETED 67486 HM_RUN_67486 REACTIVE 22-JUL-11 11.10.19.835735 AM COMPLETED 67466 HM_RUN_67466 REACTIVE 22-JUL-11 11.00.19.011142 AM COMPLETED 67446 HM_RUN_67446 REACTIVE 22-JUL-11 10.50.18.307013 AM COMPLETED 67426 HM_RUN_67426 REACTIVE 22-JUL-11 10.46.40.629067 AM COMPLETED 67406 HM_RUN_67406 REACTIVE 22-JUL-11 10.45.21.330512 AM COMPLETED 67386 HM_RUN_67386 REACTIVE 22-JUL-11 10.44.14.952355 AM COMPLETED 67366 HM_RUN_67366 REACTIVE 22-JUL-11 10.40.17.691822 AM COMPLETED 67346 HM_RUN_67346 REACTIVE 22-JUL-11 10.30.16.862629 AM COMPLETED 67326 HM_RUN_67326 REACTIVE 22-JUL-11 10.20.03.262136 AM COMPLETED 66966 HM_RUN_66966 REACTIVE 20-JUL-11 05.18.15.527833 PM COMPLETED 66946 HM_RUN_66946 REACTIVE 20-JUL-11 05.17.04.714570 PM COMPLETED 66926 HM_RUN_66926 REACTIVE 20-JUL-11 04.31.06.729551 PM COMPLETED 66906 HM_RUN_66906 REACTIVE 20-JUL-11 12.07.09.880817 PM COMPLETED 66866 HM_RUN_66866 REACTIVE 20-JUL-11 12.05.46.513194 PM COMPLETED 15 ROWS selected. |
You can have the details of one particular run with:
SQL> SELECT finding_id, name, time_detected, priority, status, TYPE, description, damage_description FROM v$hm_finding WHERE run_id=67326; FINDING_ID NAME TIME_DETECTED PRIORITY STATUS TYPE ---------- -------------------------------- --------------------------------------------------------------------------- -------- ------------ ------------- DESCRIPTION -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- DAMAGE_DESCRIPTION -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 67330 Missing datafile 22-JUL-11 10.20.03.700874 AM HIGH CLOSED FAILURE Datafile 6: '/oracle/data01/test/testtbs01.dbf' IS missing Some objects IN TABLESPACE TESTTBS might be unavailable |
As I generated recommendations with RMAN ADVISE FAILURE commands there is no link in below table but you can display results of ADVISE FAILURE RMAN commands with (run_id is equal to 0):
SQL> SELECT recommendation_id, name, TYPE, status, description, repair_script FROM v$hm_recommendation WHERE fdg_id=67746 ORDER BY recommendation_id DESC; RECOMMENDATION_ID NAME TYPE STATUS ----------------- -------------------------------- ------- ------- DESCRIPTION -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- REPAIR_SCRIPT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 67749 RESTORE FILE MANUAL NOT RUN IF FILE /oracle/data01/test/testtbs01.dbf was unintentionally renamed OR moved, restore it 67747 NO DATA LOSS REPAIR NOT RUN The repair includes complete media recovery WITH no data loss /oracle/DUMP/test/diag/rdbms/test/test/hm/reco_2213659725.hm |
You can generate manual check with Grid Control/Database Control and DBMS_HM PL/SQL package. Manual run are limited to below list of checks:
SQL> SET lines 200 SQL> col description FOR a50 SQL> SELECT name, cls_name, description FROM v$hm_check WHERE internal_check='N'; NAME CLS_NAME DESCRIPTION ---------------------------------------------------------------- --------------- -------------------------------------------------- DB Structure Integrity CHECK PERSISTENT_DATA Checks integrity OF ALL DATABASE files CF Block Integrity CHECK PERSISTENT_DATA Checks integrity OF a control FILE block Data Block Integrity CHECK PERSISTENT_DATA Checks integrity OF a data FILE block Redo Integrity CHECK PERSISTENT_DATA Checks integrity OF redo LOG content TRANSACTION Integrity CHECK PERSISTENT_DATA Checks a TRANSACTION FOR corruptions Undo SEGMENT Integrity CHECK PERSISTENT_DATA Checks integrity OF an undo SEGMENT Dictionary Integrity CHECK PERSISTENT_DATA Checks dictionary integrity ASM Allocation CHECK ASM Diagnose allocation failure 8 ROWS selected. |
Let’s execute the classical database structure check:
SQL> EXEC dbms_hm.run_check('DB Structure Integrity Check','Test_check_1'); PL/SQL PROCEDURE successfully completed. SQL> SET LONG 100000 SQL> SET pages 100 SQL> SELECT dbms_hm.GET_RUN_REPORT('Test_check_1','TEXT','DETAIL') FROM dual; DBMS_HM.GET_RUN_REPORT('TEST_CHECK_1','TEXT','DETAIL') -------------------------------------------------------------------------------- Basic Run Information Run Name : Test_check_1 Run Id : 68302 CHECK Name : DB Structure Integrity CHECK MODE : MANUAL Status : COMPLETED START TIME : 2011-07-25 15:29:15.959688 +02:00 END TIME : 2011-07-25 15:29:16.147916 +02:00 Error Encountered : 0 Source Incident Id : 0 NUMBER OF Incidents Created : 0 Input Paramters FOR the Run Run Findings AND Recommendations |
You can also see them with ADR Command Interpreter (ADRCI) tool:
adrci> show report HM_RUN Test_check_1 <?xml version="1.0" encoding="US-ASCII"?> <HM-REPORT REPORT_ID="Test_check_1"> <TITLE>HM Report: Test_check_1</TITLE> <RUN_INFO> <CHECK_NAME>DB Structure Integrity Check</CHECK_NAME> <RUN_ID>68302</RUN_ID> <RUN_NAME>Test_check_1</RUN_NAME> <RUN_MODE>MANUAL</RUN_MODE> <RUN_STATUS>COMPLETED</RUN_STATUS> <RUN_ERROR_NUM>0</RUN_ERROR_NUM> <SOURCE_INCIDENT_ID>0</SOURCE_INCIDENT_ID> <NUM_INCIDENTS_CREATED>0</NUM_INCIDENTS_CREATED> <RUN_START_TIME>2011-07-25 15:29:15.959688 +02:00</RUN_START_TIME> <RUN_END_TIME>2011-07-25 15:29:16.147916 +02:00</RUN_END_TIME> </RUN_INFO> <RUN_PARAMETERS/> <RUN-FINDINGS/> </HM-REPORT> |
Recover deleted files on Linux
This may work on other Unix/Linux than RedHat…
If your instance is not aborting for an internal error with something like:
2011-07-22 15:26:42.907000 +02:00 Errors in file /oracle/dump/test/diag/rdbms/test/test/trace/test_dbw1_3844.trc: ORA-01116: error in opening database file 6 ORA-01110: data file 6: '/oracle/data01/test/testtbs01.dbf' ORA-27041: unable to open file Linux-x86_64 Error: 2: No such file or directory Additional information: 3 System state dump requested by (instance=1, osid=3844 (DBW1)), summary=[abnormal instance termination]. System State dumped to trace file /oracle/dump/test/diag/rdbms/test/test/trace/test_diag_3834.trc DBW1 (ospid: 3844): terminating the instance due to error 1116 2011-07-22 15:26:44.983000 +02:00 Dumping diagnostic data in directory=[cdmp_20110722152642], requested by (instance=1, osid=3844 (DBW1)), summary=[abnormal instance termination]. Instance terminated by DBW1, pid = 3844 |
You may recover the missing datafile with following procedure (if database aborted or restarted you must recover the datafile !):
Locate the file descriptor of the database writer process (use lsof for Unix):
[root@server1 ~]# ps -ef | grep dbw | grep test oracle 25097 1 0 15:37 ? 00:00:00 ora_dbw0_test oracle 25099 1 0 15:37 ? 00:00:00 ora_dbw1_test oracle 25101 1 0 15:37 ? 00:00:00 ora_dbw2_test oracle 25103 1 0 15:37 ? 00:00:00 ora_dbw3_test [root@server1 ~]# ls -l /proc/25097/fd | grep testtbs01.dbf lrwx------ 1 oracle dba 64 Jul 22 16:01 258 -> /oracle/data01/test/testtbs01.dbf (deleted) [root@server1 ~]# lsof -p 25097 | grep testtbs01.dbf oracle 25097 oracle 258uW REG 253,12 104865792 147460 /oracle/data01/test/testtbs01.dbf (deleted) [root@server1 ~]# ls -l /proc/25097/fd/258 lrwx------ 1 oracle dba 64 Jul 22 16:01 /proc/25097/fd/258 -> /oracle/data01/test/testtbs01.dbf (deleted) |
Create a symbolic link to be able to put the tablespace in read only mode:
[oracle@server1 ~]$ ln -s /proc/25097/fd/258 /oracle/data01/test/testtbs01.dbf [oracle@server1 ~]$ ls -l /oracle/data01/test/testtbs01.dbf lrwxrwxrwx 1 oracle dba 18 Jul 25 11:42 /oracle/data01/test/testtbs01.dbf -> /proc/25097/fd/258 |
Put the tablespace read only:
SQL> ALTER TABLESPACE testtbs read only; TABLESPACE altered. |
Remove the symbolic lind and copy the file descriptor to original missing file:
[oracle@server1 ~]$ rm /oracle/data01/test/testtbs01.dbf [oracle@server1 ~]$ cp /proc/25097/fd/258 /oracle/data01/test/testtbs01.dbf [oracle@server1 ~]$ ll /oracle/data01/test/testtbs01.dbf -rw-r----- 1 oracle dba 104865792 Jul 22 16:13 /oracle/data01/test/testtbs01.dbf |
Put back again tablespace in read write mode (alert will even disappear from RMAN):
SQL> ALTER TABLESPACE testtbs read WRITE; TABLESPACE altered. |
At that stage all is working abck again, but the file descriptor still see the datafile deleted, to recover this:
SQL> ALTER TABLESPACE testtbs OFFLINE; TABLESPACE altered. SQL> recover datafile '/oracle/data01/test/testtbs01.dbf'; Media recovery complete. SQL> ALTER TABLESPACE testtbs ONLINE; TABLESPACE altered. |
References
- Data Recovery Advisor -Reference Guide. [ID 466682.1]
- Data Recovery Advisor – Corruption Reference Guide [ID 1317849.1]
- Data Recovery Advisor in Oracle Database 11g Release 1
- Retrieve deleted files on Unix / Linux using File Descriptors [ID 444749.1]
- How to Recover Deleted Oracle Datafiles with No Downtime
jim boles says:
great tutorial
thx