Friday, May 31, 2013

RMAN Slow performance with "control file sequential read" wait event

We have a database which is about 40TB and the control file is about 900MB and we have about 1.5TB Archivelogs generation everyday.

So, was thinking it was because of all the above why "show all" takes about 45 minutes in RMAN...

But, for some reason I was not able to convince my self that it should take that long for just SHOW ALL command!!!.
Not only that, it takes way too long to just allocate channels and that was making our backups to take for ever...

Anyways, digging deeper found the following sql that just sits in there with "control file sequential read" wait event... This made me to hunt for the sql:

SELECT RECID, STAMP, TYPE OBJECT_TYPE, OBJECT_RECID, OBJECT_STAMP, OBJECT_DATA,
       TYPE, (CASE
                WHEN TYPE = 'DATAFILE RENAME ON RESTORE' THEN DF.NAME
WHEN TYPE = 'TEMPFILE RENAME' THEN TF.NAME
ELSE TO_CHAR (NULL)
 END)
  OBJECT_FNAME, (CASE
WHEN TYPE = 'DATAFILE RENAME ON RESTORE' THEN DF.CREATION_CHANGE#
WHEN TYPE = 'TEMPFILE RENAME' THEN TF.CREATION_CHANGE#
ELSE TO_NUMBER (NULL)
 END)
    OBJECT_CREATE_SCN, SET_STAMP, SET_COUNT
  FROM V$DELETED_OBJECT, V$DATAFILE DF, V$TEMPFILE TF
 WHERE OBJECT_DATA = DF.FILE#(+)
   AND OBJECT_DATA = TF.FILE#(+)
   AND RECID BETWEEN :b1 AND :b2
   AND (STAMP >= :b3 OR RECID = :b2)
   AND STAMP >= :b4
 ORDER BY RECID

That brought me into this metalink id# Rman Backup Very Slow After Mass Deletion Of Expired Backups (Disk, Tape) [ID 465378.1]

Work around solution made SHOW ALL to return the results in seconds from 45 minutes...

*** Please check with Oracle Support prior to deploying this workaround in your database to make user it is OK ***


Solution

Bug is fixed in 11g but until then, use the workaround of clearing the controlfile section which
houses v$deleted_object:
SQL> execute dbms_backup_restore.resetcfilesection(19);
Then clear the corresponding high water mark in the catalog :
SQL> select * from rc_database;
     --> note the 'db_key' and 'dbinc_key' of your target based on dbid

For pre-11G catalog schemas:

SQL> update dbinc set high_do_recid = 0 where db_key = '' and dbinc_key=;
SQL> commit;

For 11G+ catalog schemas:

SQL> update node set high_do_recid = 0 where db_key = '' and dbinc_key=;
SQL> commit;

ORA-00845: MEMORY_TARGET not supported on this system

We had one of our RAC Node crashed and all databases shutdown and as usual they should come back once the ASM is up.

But this one database would not start at all.

Grepping for PMON for that database returns nothing:

So, trying to start manually:

SQL*Plus: Release 11.2.0.2.0 Production on Mon May 27 04:55:22 2013

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORA-00845: MEMORY_TARGET not supported on this system

Now that is weird, why would I get that? /dev/shm has no space? Really? but why? database is not started yet to utilize that!!!!

Grepping for database name gives me:

db2:11.2.0:node2:/opt/oracle>ps -ef |grep db2
oracle     493     1  0 Apr28 ?        00:00:00 oracledb2 (LOCAL=NO)
oracle   24841     1  0 04:51 ?        00:00:00 oracledb2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   24846     1  0 04:51 ?        00:00:00 oracledb2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle   31322 26885  0 05:02 pts/1    00:00:00 grep db2
oracle   32553     1  0 Apr28 ?        00:00:00 oracleras2 (LOCAL=NO)

Hmm... I get nothing when I grep for PMON but there is still something running in the backgroud as hung process!!!

Time to kill those process...
db2:node2:/opt/oracle>kill -9 493 24841 24846 32553

Now do startup and database comes up without any complaints...





RMAN-20035: invalid high recid

While moving our database backups to use new version of HP Data Protector software, I got a hick up on one database with "RMAN-20035: invalid high recid" error.

RMAN-03014: implicit resync of recovery catalog failed
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20035: invalid high recid

Also, we were moving our RMAN catalog from one database to another database to resolve another performance issue. So, it might be due to sync issue.

Solution to that was to un-register the database from RMAN catalog and re-register.

More on this topic/issue in metalink id#273446.1

Note, you can unregister using dbms_rcvcat.unregisterdatabase procedure also but I am showing what I did.

To Unregister:
Connect to your RMAN catalog as how you generally do take your db backup and run:

RMAN> unregister database;

To Register:
RMAN> register database;

After this I was able to backup the database successfully...

Friday, May 10, 2013

Finding Bind variables in SQL's executed...

v$sql_bind_capture --> this view shows the bind variables passed to the sql at the run time
DBA_HIST_SQLBIND --> historical information

Ex:

select b.last_captured, a.sql_text, b.name, b.position, b.datatype_string, b.value_string
  from dba_hist_sqlbind b,
       --gv$sql_bind_capture b,
       v$sqlarea a
 where b.sql_id = '66j3kkqnv3m2f' --> sqlid that you are interested in looking for bind variable passed.
   and b.sql_id = a.sql_id

From Oracle Docs on DBA_HIST_SQLBIND:

DBA_HIST_SQLBIND displays historical information on bind variables used by SQL cursors. This view contains snapshots of V$SQL_BIND_CAPTURE.
LAST_CAPTUREDDATE Date when the bind value was captured. Bind values are captured when SQL statements are executed. To limit the overhead, binds are captured at most every 15 minutes for a given cursor.

Wednesday, May 8, 2013

RMAN Connect error with 11g

Was trying to connect to 11g database using RMAN and getting this error:

/opt/oracle/etc>rman target DP_BACKUP/pass@db_name


RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00554: initialization of internal recovery manager package failed
RMAN-04005: error from target database:
ORA-01017: invalid username/password; logon denied

The userid/password works fine if I connect using sqlplus as showed below!!!

/opt/oracle/etc>sqlplus DP_BACKUP/pass@db_name

SQL*Plus: Release 11.1.0.7.0 - Production on Wed May 8 09:59:03 2013

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning and Real Application Clusters options

SQL>

User DP_BACKUP has SYSDBA role assigned 
SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
DP_BACKUP                      TRUE  FALSE FALSE

But still errors out!!!

Reason is the password is not in sync with the password file. So, either re-create the password file with appropriate password or change the password for DP_BACKUP user.