block corruption and rman – for the new dba





Metalink note 28814.1 "Handling Oracle Block Corruptions in Oracle7/8/8i/9i/10g".

http://luhartma.blogspot.com/2006/04/how-to-check-for-and-repair-block.html

Problem: the application encounters an ORA-01578 runtime error because there are one or more corrupt blocks in a table it is reading.

How can corrupt blocks be caused?
First of all we have two diffent kinds of block corruption:
– physical corruption (media corrupt)
– logical corruption (soft corrupt)
Physical corruption can be caused by defected memory boards, controllers or broken sectors on a hard disk;
Logical corrution can amoung other reasons be caused by an attempt to recover through a NOLOGGING action.
There are two initialization parameters for dealing with block corruption:
– DB_BOCK_CHECKSUM (calculates a checksum for each block before it is written to disk, every time)
causes 1-2% performance overhead
– DB_BLOCK_CHECKING (serverprocess checks block for internal consistency after every DML)
causes 1-10% performance overhead

If performance is not a big issue then you should use these!

Normally RMAN checks only for physically corrupt blocks
with every backup it takes and every image copy it makes.
This is a common misunderstanding amoung a lot of DBAs.
RMAN doesn not automatically detect logical corruption by default!
We have to tell it to do so by using CHECK LOGICAL!

I recommend you to check your database for corrupt blocks
with RMAN on a regular basis, proactively.
If you do so you RMAN finds out about block corruptions
before your application runs into an ORA-01578 and
before you find out that you have backed up the corrupt blocks again and again.

There have been incidents when DBAs found out
that they did not have a backup with the un-corruted block any more,
because you have deleted the last one with a not corrupted version.
They could not recover the block any more!

Starting with Oracle 9i we can use RMAN
to check a database for both physically and logically corrupt blocks.

RMAN> backup validate check logical database;

RMAN does not physically backup the database with this command
but it reads all blocks and checks for corruptions.
If it finds corrupted blocks it will place the information about the corruption into a view:

SYS @ orcl AS SYSDBA SQL > select * from v$database_block_corruption;

Now we can tell RMAN to recover all the blocks
which it has found as being corrupt:
RMAN> blockrecover corruption list;
# (all blocks from v$database_block_corruption)

You can also choose to do the following:

RMAN> BACKUP VALIDATE DATAFILE 1;

This will check that one file. You could also check all the files and archivelog files with:

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;

After the validate has finished, query V$DATABASE_BLOCK_CORRUPTION for records.

However, as davey pointed out, this is your system tablespace’s datafile, a TAR would be the best way to go.

set maxcorrupt for datafile 1 to …..

Handling Corrupt Datafile Blocks in RMAN Backup: MAXCORRUPT

When RMAN encounters a corrupt datafile block during a backup, the behavior depends upon whether RMAN has encountered this corrupt block during a previous backup. If the block is already identified as corrupt, then it is included in the backup. If the block is not previously identified as corrupt, then RMAN’s default behavior is to stop the backup.

You can override this behavior using the SET MAXCORRUPT command with BACKUP in a RUN block. Setting MAXCORRUPT allows a specified number of previously undetected block corruptions in datafiles during the execution of an RMAN BACKUP command. If RMAN detects more than this number of new corrupt blocks while taking the backup, then the backup job aborts, and no backup is created.

As RMAN finds corrupt blocks during the backup process, it writes the corrupt blocks to the backup with a special header indicating that the block has media corruption. If the backup completes without exceeding the specified MAXCORRUPT limit, then the database records the address of the corrupt blocks and the type of corruption found (logical or physical) in the control file. You can access these records through the V$DATABASE_BLOCK_CORRUPTION view.

If the backup job aborts because more than MAXCORRUPT corrupt blocks are found, theV$DATABASE_BLOCK_CORRUPTION view is not populated, because the information used to populate the view is only available if a backup is successfully created. In such a situation, you can run BACKUP VALIDATE on the datafiles to be backed up, to populate V$DATABASE_BLOCK_CORRUPTION and use block media recovery to repair the corrupt blocks. See Oracle Database Backup and Recovery Basics for details on using BACKUP… VALIDATE, and "Performing Block Media Recovery with RMAN" for more details on block media recovery.

Another way to persistently skip tablespaces across RMAN sessions is to issue the

CONFIGURE EXCLUDE command for each tablespace that you always want to skip.

For example, you may always want to skip the example tablespace, which has

been made read-only. You can then issue:

CONFIGURE EXCLUDE FOR TABLESPACE example;

Author: admin