Monday, April 28, 2014

ORA-00922: missing or invalid option

Ok, I guess I have to blog about this after facing it this many times.

Whenever your expdp throws an error like the one below, reduce the PARALLEL option in your export and try again; that should resolve it.

ORA-31693: Table data object "SCHEMA_NAME"."TABLE_NAME" failed to load/unload and is being skipped due to error:
ORA-00922: missing or invalid option
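
For example, if the job was running with something like the parfile below, cutting PARALLEL down (or dropping it entirely) is the workaround. All names here are placeholders, not an actual job:

expdp system parfile=exp_table.par

# exp_table.par (hypothetical values)
directory=DATA_PUMP_DIR
dumpfile=table_%U.dmp
logfile=exp_table.log
tables=SCHEMA_NAME.TABLE_NAME
# reduced from parallel=8; try parallel=1 if the error persists
parallel=2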

Thursday, April 24, 2014

expdp fails with ORA-31693 ORA-01555 snapshot too old

Was trying to export (expdp on a 10.2.0.4 database) a table of about 251 GB in size with a BLOB column.

The export runs and runs for 15+ hours and then finally fails with:

ORA-31693: Table data object "SCHEMA_OWNER"."LOB_TABLE" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-01555: snapshot too old: rollback segment number 384 with name "_SYSSMU384$" too small

Seriously??? It took 15+ hours to report this???

Reading MetaLink Doc ID 787004.1, it makes sense that we might have some corrupted blocks. So, I am running the PL/SQL code given in that article (it has been running for an hour now and I already have a bad feeling abt it :( ) to see if it reports any corrupted records, and whether I can bypass those and get this exported.
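
For the record, the scan in that note boils down to something like the block below (a rough sketch of the approach, not the note's exact code; the table and column names are placeholders): read every LOB and print the ROWID of any row that errors out while reading.

SET SERVEROUTPUT ON
DECLARE
  v_lob BLOB;
  v_buf RAW(32767);
  v_amt INTEGER;
  v_off INTEGER;
BEGIN
  FOR r IN (SELECT ROWID rid FROM schema_owner.lob_table) LOOP
    BEGIN
      SELECT lob_column INTO v_lob
        FROM schema_owner.lob_table WHERE ROWID = r.rid;
      IF v_lob IS NOT NULL THEN
        v_off := 1;
        LOOP
          v_amt := 32767;
          -- a corrupt chunk (or an ORA-01555) raises an exception here
          DBMS_LOB.READ(v_lob, v_amt, v_off, v_buf);
          v_off := v_off + v_amt;
        END LOOP;
      END IF;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        NULL; -- reached the end of this LOB, move on to the next row
      WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Bad row ' || r.rid || ': ' || SQLERRM);
    END;
  END LOOP;
END;
/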

Will update the findings shortly...

Wednesday, April 16, 2014

GoldenGate Splitting an existing table replicat group into multiple groups

Ok, the reason behind this would be a no-brainer, as you obviously want to add Replicat processes to catch up with heavy DMLs.

As mentioned in an earlier post, even after splitting the tables from one schema into 4 different groups, I still see that one table is a real killer (about 40K records every 30 seconds, go figure) and it's dying to catch up. So, how can I make this catch up faster and reduce the lag time!!!

Well, it's the very same approach as in my previous post, but one thing that is different in this setup is:
In the new group's prm file that you are adding, add the @RANGE function as shown below:

MAP schema_name.table_name, TARGET schema_name.table_name, FILTER (@RANGE (2, 3, Key_Column));

In the above, the @RANGE parameter 2 says this is the 2nd group, 3 says there are 3 groups in total for this table_name, and Key_Column is optional.

Note: This is best done at the Extract process instead of the Replicat (read below).

Excerpt from the Oracle docs (in case you, like me, went searching for the doc :) ):
RANGE
Use the @RANGE function to divide the rows of any table across two or more Oracle
GoldenGate processes. It can be used to increase the throughput of large and heavily
accessed tables and also can be used to divide data into sets for distribution to different
destinations. Specify each range in a FILTER clause in a TABLE or MAP statement.
@RANGE is safe and scalable. It preserves data integrity by guaranteeing that the same row
will always be processed by the same process group.
@RANGE computes a hash value of the columns specified in the input. If no columns are
specified, the KEYCOLS clause of the TABLE or MAP statement is used to determine the columns
to hash, if a KEYCOLS clause exists. Otherwise, the primary key columns are used.
Oracle GoldenGate adjusts the total number of ranges to optimize the even distribution
across the number of ranges specified.
Because any columns can be specified for this function, rows in tables with relational
constraints to one another must be grouped together into the same process or trail to
preserve referential integrity.
NOTE Using Extract to calculate the ranges is more efficient than using Replicat.
Calculating ranges on the target side requires Replicat to read through the entire
trail to find the data that meets each range specification.
Syntax @RANGE (<range>, <total ranges> [, <column>] [, <column>] [, ...])
Example 1 In the following example, the replication workload is split into three ranges (between three
Replicat processes) based on the ID column of the source acct table.
(Replicat group 1 parameter file)
MAP sales.acct, TARGET sales.acct, FILTER (@RANGE (1, 3, ID));
(Replicat group 2 parameter file)
MAP sales.acct, TARGET sales.acct, FILTER (@RANGE (2, 3, ID));
(Replicat group 3 parameter file)
MAP sales.acct, TARGET sales.acct, FILTER (@RANGE (3, 3, ID));
Arguments:
<range>: The range assigned to the specified process or trail. Valid values are 1, 2, 3, and so forth, with the maximum value being the value defined by <total ranges>.
<total ranges>: The total number of ranges allocated. For example, to divide data into three groups, use the value 3.
<column>: The name of a column on which to base the range allocation. This argument is optional. If not used, Oracle GoldenGate allocates ranges based on the table's primary key.
Example 2 In the following example, one Extract process splits the processing load into two trails.
Since no columns were defined on which to base the range calculation, Oracle GoldenGate
will use the primary key columns.
RMTTRAIL /ggs/dirdat/aa
TABLE fin.account, FILTER (@RANGE (1, 2));
RMTTRAIL /ggs/dirdat/bb
TABLE fin.account, FILTER (@RANGE (2, 2));
Example 3 In the following example, two tables have relative operations based on an order_ID column.
The order_master table has a key of order_ID, and the order_detail table has a key of order_ID and
item_number. Because the key order_ID establishes relativity, it is used in @RANGE filters for
both tables to preserve referential integrity. The load is split into two ranges.
(Parameter file #1)
MAP sales.order_master, TARGET sales.order_master,
FILTER (@RANGE (1, 2, order_ID));
MAP sales.order_detail, TARGET sales.order_detail,
FILTER (@RANGE (1, 2, order_ID));
(Parameter file #2)
MAP sales.order_master, TARGET sales.order_master,
FILTER (@RANGE (2, 2, order_ID));
MAP sales.order_detail, TARGET sales.order_detail,
FILTER (@RANGE (2, 2, order_ID));

GoldenGate Splitting an existing group

There are times when you realize that you should have done things in a different way than the way they were done.

This is the best example of it.

As mentioned in my earlier posts, I am working on setting up GoldenGate for a 40+ TB database for migration purposes, and recently defined the groups based on the schemas within the database, the same way it was done in other databases.

All was good, except I noticed that one group was taking way too long to catch up (too much lag time) ever since the Replicat started.

Looking more into that schema, I found that there are about 3 tables in it which have way too many DMLs going on compared to the others. This is when I felt that I should have split this schema into 4 groups instead of combining it all together... well well well, enough blogging. Let's get to the point of how to divide this existing group into 4.

First, stop the group and run info grp#, detail --> note the SEQ# and RBA for this group where it stopped applying the changes.

GGSCI () 8> !
info rrhp02, detail

REPLICAT   RRHP02    Last Started 2014-04-10 10:24   Status STOPPED
Checkpoint Lag       77:23:51 (updated 00:18:03 ago)
Log Read Checkpoint  File dirdat/rb000560 --> SEQ#
                     2014-04-11 15:16:39.009360  RBA 1878390122 --> RBA#

Now that you have those two values, go ahead and create the group's prm and oby files (if you maintain oby files for future reference).

Obey File Content:
Add Replicat RRHP02A, ExtTrail dirdat/rb, extseqno 560 , extrba 1878390122

As you see above, I am dividing this group into 2A, 2B, and 2C, hence the group name RRHP02A; note that the extseqno and extrba values come from the SEQ# and RBA noted earlier.

Edit the prm file for this group and add the table(s) that you want in this group, along with the other settings; see the sketch below.
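
As a sketch, the new group's prm file might look something like this; the user, table names, and other settings below are placeholders for whatever your original group uses:

REPLICAT rrhp02a
USERID ggadmin, PASSWORD ********
ASSUMETARGETDEFS
DISCARDFILE dirrpt/rrhp02a.dsc, APPEND
MAP schema_name.busy_table1, TARGET schema_name.busy_table1;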

Perform the above for groups 2B and 2C, and DO NOT forget to remove the table entries you are moving into these new groups from the original group 2 param file, as sketched below.
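
So the original rrhp02.prm ends up with those MAP entries gone (table names again are made up):

REPLICAT rrhp02
USERID ggadmin, PASSWORD ********
ASSUMETARGETDEFS
DISCARDFILE dirrpt/rrhp02.dsc, APPEND
-- busy tables moved out to RRHP02A, RRHP02B and RRHP02C
MAP schema_name.remaining_table1, TARGET schema_name.remaining_table1;
MAP schema_name.remaining_table2, TARGET schema_name.remaining_table2;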

Once the files are ready, add them to goldengate:

GGSCI () 9> obey diroby/add_rrhp02a.oby

GGSCI () 10> Add Replicat RRHP02a, ExtTrail dirdat/rb, extseqno 560 , extrba 1878390122

REPLICAT added.

GGSCI () 25> start rrhp02* --> starting all the groups matching rrhp02*

Sending START request to MANAGER ...
REPLICAT RRHP02A starting

GGSCI () 29> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     RRHP02      77:57:41      00:00:00
REPLICAT    RUNNING     RRHP02A     77:59:05      00:00:05
REPLICAT    RUNNING     RRHP02B     77:59:02      00:00:09
REPLICAT    RUNNING     RRHP02C     77:59:08      00:00:03


You just split your existing group into multiple groups and utilized the load balancing feature of GoldenGate.

Tuesday, April 8, 2014

GoldenGate Sequence Sync with "row cache lock"

Had real trouble with this one.
Setting up GoldenGate replication for a 40+ TB database in the process of migration, and the GG Replicat was just getting hung (well, not really, but it runs so slowly that it takes hours to move to the next transaction).

Checking on the background processes, the only thing that I notice is that the GGAdmin user's session is spending way too much time on the "row cache lock" wait event!!!
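
This is roughly the query I keep re-running to watch it (the username filter is an assumption; use whatever your GG apply user is):

SELECT sid, event, seconds_in_wait, sql_id
  FROM v$session
 WHERE username = 'GGADMIN'
   AND event = 'row cache lock';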

Kept on refreshing the current SQL being executed, and the only thing it is doing is:
select seq.nextval from dual;
Along with:
BEGIN
   ggadmin.replicateSequence (TO_NUMBER (702400001),
                              TO_NUMBER (20),
                              TO_NUMBER (1),
                              'SCHEMA_OWNER',
                              TO_NUMBER (100000),
                              'LOGID',
                              UPPER ('ggadmin'),
                              TO_NUMBER (2),
                              TO_NUMBER (0),
                              '');
END;

The above statements were seen repeatedly, which made me check the sequence cache size to see if it is big enough, because the Target database runs on a 2-node cluster. Well, the cache size for this one is set to 100000, which is quite normal in our databases (the source database runs on a 7-node cluster); these are RAC databases, and the higher the cache, the fewer the "row cache locks". So why am I still getting "row cache lock" on the Target even though the cache size is 100000???
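
Checking the cache is a quick look at dba_sequences (the schema name is a placeholder):

SELECT sequence_owner, sequence_name, cache_size, last_number
  FROM dba_sequences
 WHERE sequence_owner = 'SCHEMA_OWNER';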

Well, then I read this:
"When using higher cache values the problem of hard parses gets somewhat mitigated but there is another issue. When replicating such a sequence GoldenGate follows these steps: 
  1. Sets sequence to nocache
  2. Executes sequence.nextval until it reaches the target value
  3. Restores altered sequence properties
So if you have, say, a sequence with cache 1000 then each time a new value gets written into seq$.highwater on the source, GoldenGate is going to execute sequence.nextval one thousand times on the destination! As you can imagine this isn't particularly fast, as getting every next value will result in Oracle updating the actual row in the data dictionary. All of the above means that replicating sequences can sometimes put quite a bit of strain on the system. A much faster (and simpler) approach would be to use a step to reach the target sequence value instead of fetching it one-by-one in nocache mode. Last but not least, you can end up with a nocache sequence if the procedure crashes in the middle."

Well, that is a problem. So, the only solution that I can think of at this time is to remove the sequence syncing option for this group and restart the group. Now there are no more row cache locks, and I will take care of syncing all the sequences in one shot at cut-over time; a sketch of that one-shot sync follows below.
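
For what it's worth, the cut-over sync itself can be done with one big jump per sequence instead of stepping NEXTVAL thousands of times. Something along these lines (a rough sketch only: it assumes a DB link named source_db, a placeholder schema, and no concurrent users on the sequences during cut-over):

DECLARE
  v_dummy NUMBER;
BEGIN
  FOR s IN (SELECT src.sequence_owner, src.sequence_name,
                   src.last_number - tgt.last_number diff
              FROM dba_sequences@source_db src
              JOIN dba_sequences tgt
                ON tgt.sequence_owner = src.sequence_owner
               AND tgt.sequence_name  = src.sequence_name
             WHERE src.sequence_owner = 'SCHEMA_OWNER')
  LOOP
    IF s.diff > 0 THEN
      -- one big increment instead of thousands of nextval calls
      EXECUTE IMMEDIATE 'ALTER SEQUENCE ' || s.sequence_owner || '.' ||
                        s.sequence_name || ' INCREMENT BY ' || s.diff;
      EXECUTE IMMEDIATE 'SELECT ' || s.sequence_owner || '.' ||
                        s.sequence_name || '.NEXTVAL FROM dual' INTO v_dummy;
      EXECUTE IMMEDIATE 'ALTER SEQUENCE ' || s.sequence_owner || '.' ||
                        s.sequence_name || ' INCREMENT BY 1';
    END IF;
  END LOOP;
END;
/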