For DBAs running production databases on professional grade storage systems, corruptions are rare. Indeed EMC arrays include additional technology to actively prevent silent data corruption and bit rot.
However corruption does still occur for a number of different reasons, including hardware failure, software error and sometimes user error. When it does occur, knowing how to properly identify the affected objects, and how in some cases to rescue usable data, can be important time savers to the production Oracle DBA.
In the following post we look at a step by step approach to isolating corruption, verifying it, and then rescuing what data we can. The following was tested on Oracle 10.2.0.2 on a Windows platform.
Identify the Corruption (dbv)
The typical method to identify block corruption in an Oracle data file is to use the dbv utility to scan the data file for errors. The following example is taken from Windows:
C:\ >dbv file=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\MANA_DATA01.DBF blocksize=8192
DBVERIFY: Release 10.2.0.2.0 - Production on Mon Sep 13 14:29:22 2010
Copyright (c) 1982, 2005, Oracle. All rights reserved.
DBVERIFY - Verification starting : FILE = C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\MANA_DATA01.DBF
DBV-00200: Block, dba 17268885, already marked corrupted
DBVERIFY - Verification complete
Total Pages Examined : 1685504
Total Pages Processed (Data) : 1295633
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 84554
Total Pages Failing (Index): 0
Total Pages Processed (Other): 29475 Total Pages Processed (Other): 29475
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 275842
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Highest block SCN : 100371677 (0.100371677)
In the above example we have found corruption at data block address (DBA) 17268885.
Identify the Corruption (rman)
An alternative method to identify block corruption in an Oracle data file is to use the RMAN validate function. The following example is taken from Windows:
RMAN> connect target sys/****
connected to target database: TAXPROD (DBID=3492187718)
RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE FILESPERSET=10;
Starting backup at 13-SEP-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=485 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00004 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\MANA_DATA01.DBF
input datafile fno=00005 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\MANA_NDX01.DBF
input datafile fno=00002 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\UNDOTBS01.DBF
input datafile fno=00001 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\SYSTEM01.DBF
input datafile fno=00003 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\SYSAUX01.DBF
input datafile fno=00006 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\USERS01.DBF
channel ORA_DISK_1: backup set complete, elapsed time: 00:05:55
Finished backup at 13-SEP-10
RMAN>
Once the validate completes, the V$DATABASE_BLOCK_CORRUPTION can be used to check for corruption:
SQL> select * from V$DATABASE_BLOCK_CORRUPTION;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
4 491669 1 0 CORRUPT
Look up the Data Block Address (DBA)
Having identified where the corruption exists as a data block address, we can now determine which data file and block number that translates to. Using the dbms_utility.data_block_address PL/SQL package we can convert the DBA into something usable:
select
dbms_utility.data_block_address_file(17268885) RFN,
dbms_utility.data_block_address_block(17268885) BL
from dual;
RFN BL
---------- ----------
4 491669
We now know that DBA 17268885 is file 4, block 491669
Look up the Object
We can now look up the object(s) where the corruption is occurring, by using the data file and block numbers from the previous step:
select /*+ RULE CURSOR_SHARING_EXACT */
substrb(dbe.owner,1,15) DB_OWNER,
substrb(dbe.segment_name,1,30 ) OBJ_NAME,
substrb(dbe.partition_name,1,20 ) PART_NAME,
substrb(dbe.segment_type,1,3) TYP,
substrb(dbe.tablespace_name,1,10) TS_NAME
from dba_extents dbe
where dbe.file_id = 4
and 491669 between dbe.block_id and dbe.block_id+dbe.blocks-1
/
DB_OWNER OBJ_NAME PART_NAME TYP TS_NAME
--------------- ------------------------------ -------------------- --- ----------
MANA TAXINQUIRY_LOG TAB MANA_DATA
In the above example we have determined that file 4 and block 491669 is the table MANA.TAXINQUIRY_LOG.
Look up the Object – Extensive Corruption
If there are a great many corrupt blocks, looking them up one at a time is time consuming. In this case the script below generates a list of corrupt tables and indexes:
with mytable as (
select
dbe.owner DB_OWNER,
dbe.segment_name OBJ_NAME,
dbe.partition_name PART_NAME,
dbe.segment_type TYP,
dbe.tablespace_name TS_NAME
from
dba_extents dbe,
v$database_block_corruption vdbc
where 1=1
and dbe.file_id = vdbc.file#
and vdbc.block# between dbe.block_id and dbe.block_id+dbe.blocks-1
)
select distinct
db_owner,
obj_name,
typ
from mytable;
DB_OWNER OBJ_NAME TYP
--------------- ------------------------------ -----
DATAWARE TB_CUSTOM_AGING_SBL_CREDIT TABLE
DBMIRROR ASLOPINV TABLE
DBMIRROR BD100P01 TABLE
DBMIRROR BD100P01_PRE TABLE
DBMIRROR BD100P02 TABLE
DBMIRROR BD100P02_PRE TABLE
DBMIRROR BD100P03 TABLE
DBMIRROR BD100P03_PRE TABLE
DBMIRROR BD100P05_PRE TABLE
DBMIRROR BD100P06 TABLE
DBMIRROR BH100P01 TABLE
DBMIRROR BRCHDR TABLE
DBMIRROR EHRCVF TABLE
DBMIRROR FACARS TABLE
DBMIRROR FAUPDT TABLE
DBMIRROR IDCSOUT TABLE
DBMIRROR LODDTL TABLE
DBMIRROR LODHDR TABLE
DBMIRROR OFBLDT TABLE
DBMIRROR OFBLHD TABLE
DBMIRROR OFIBSGL TABLE
DBMIRROR PDTOGL TABLE
DBMIRROR PK_BRCHDR INDEX
DBMIRROR PK_LODDTL INDEX
DBMIRROR PK_LODHDR INDEX
DBMIRROR SYS_C001047026 INDEX
DBMIRROR T2FTPS TABLE
DBMIRROR WEEUTL_DWH TABLE
MIRRORD DRSESSIONHISTORY TABLE
Verify
We can now verify the information by doing a select count(*) from the identified tables:
SQL> select count(*) from MANA.TAXINQUIRY_LOG;
select count(*) from MANA.TAXINQUIRY_LOG
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 491669)
ORA-01110: data file 4:
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\TAXPROD\MANA_DATA01.DBF'
The statement fails with Oracle reporting corruption in the data file. This is expected.
Extract Metadata
Before we can extract the data from the corrupted object, we need to create a matching table. The best option to do this is to use the dbms_metadata package to fully define the corrupted table. This will include indexes and referential integrity constraints as well as any privilege data.
In the following example we see this is a simple table with no indexes or referential integrity.
set heading off;
set echo off;
set pages 999;
set long 90000;
SQL> select dbms_metadata.get_ddl('TABLE','TAXINQUIRY_LOG','MANA') from dual;
CREATE TABLE "MANA"."TAXINQUIRY_LOG"
( "LOG_ID" NUMBER(10,0) NOT NULL ENABLE,
"PROCESS_DATE" VARCHAR2(50) NOT NULL ENABLE,
"EVENT" VARCHAR2(100) NOT NULL ENABLE,
"LOG_INFO" VARCHAR2(300) NOT NULL ENABLE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 19922944 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MANA_DATA"
Create Rescue Table
Using the meta data extracted in the previous step we can now create a new table to hold the original data. The table name is changed to a new name to allow both tables to co-exist:
SQL> alter session set current_schema=MANA;
Session altered.
CREATE TABLE MANA.TAXINQUIRY_LOG_REPAIR
( "LOG_ID" NUMBER(10,0) NOT NULL ENABLE,
"PROCESS_DATE" VARCHAR2(50) NOT NULL ENABLE,
"EVENT" VARCHAR2(100) NOT NULL ENABLE,
"LOG_INFO" VARCHAR2(300) NOT NULL ENABLE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 19922944 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "MANA_DATA";
Extract Usable Data
We now need to move the data we can still extract to the new table. To do this we need to instruct Oracle to ignore the corrupt blocks and then copy the data to the new table. The old table is then dropped and the new table renamed to the old name.
SQL> exec DBMS_REPAIR.SKIP_CORRUPT_BLOCKS( 'MANA', 'TAXINQUIRY_LOG' );
PL/SQL procedure successfully completed.
SQL> insert into MANA.TAXINQUIRY_LOG_REPAIR ( select * from MANA.TAXINQUIRY_LOG );
68 rows created.
SQL> commit;
Commit complete.
SQL> drop table MANA.TAXINQUIRY_LOG;
Table dropped.
SQL> alter table TAXINQUIRY_LOG_REPAIR rename to TAXINQUIRY_LOG;
Table altered.
SQL> alter system flush shared_pool;
System altered.
If additional indexes or referential integrity existed these should now be added back to the new table. If the table is large it should also be analyzed.
The shared pool is then flushed to ensure no old SQL plans exist against the dropped table.
Skipping Corrupt Blocks
The event 10231 allows Oracle to skip corrupt blocks on a scanning operation. This can be useful when recovery data.
SQL> alter session set events '10231 trace name context forever, level 10';
SQL> execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SCOTT','EMP');
Now you should be able to issue a CREATE TABLE AS SELECT operation against the corrupt table to extract data from all non-corrupt blocks, or EXPORT the table.
Eg: CREATE TABLE salvage_emp AS SELECT * FROM corrupt_emp;
To clear the attribute for a table use:
SQL> exec DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SCOTT','EMP',flags=>dbms_repair.noskip_flag);
No comments:
Post a Comment