ORA-04031,ORA-12801

 ORA-04031: unable to allocate 65560 bytes of shared memory

("large pool","unknown object","large pool","PX msg pool")

Last week we got a call from application team and they said “one of the production jobs failed, we tried several times & it was failed again. Also they sent the error details to us.
Error details

ORA-12801: error signaled in parallel query server P055 ORA-12853: insufficient memory for PX buffers: current 66592K, max needed 178560K ORA-04031: unable to allocate 65560 bytes of shared memory ("large pool","unknown object","large pool","PX msg pool") 
We have check database load & check the parallel session details. There is no parallel session running on database.
We have few questions to application team and they answered
1.    This one is new job on production?
Ans: No. It’s running on production from last 2 years.
2.    Do you have faced the same issue before?
Ans: No.
3.    How longs it’s take to complete?
Ans: Less than 10 minutes
We asked to rerun the job & monitored the job session using Quest toad. That job spawned 97 parallel sessions and it’s failed again.
We have checked the PARALLEL_MAX_SERVERS & value of this parameter is 120.
We asked the job details and check the procedure which is involving on the job.
That procedure performs
1.    Drop the indexes
2.    Load the data
3.    Create the indexes
4.    Gather the statistics.
While gathering the statistics that job was failed and application team passed the arguments to procedure.
We have manually gathered the statistics successfully and ask application team to proceed the next step of the job.
But they don’t handling the exception for this job failure and the next job is depending upon the current job status.
we analyzed the procedure and it’s used below SQL code for stats gathering.
SQL Statement:
_ddl_line :=  'BEGIN DBMS_STATS.GATHER_TABLE_STATS('
        || 'ownname => '         || '''' || p_towner || ''''
       || ',tabname => '        || '''' || p_tname  || ''''
       || ',method_opt => '   || '''FOR ALL COLUMNS SIZE AUTO'''
       || ',granularity => '   || '''ALL'''
       || ',cascade => TRUE'
       || ',degree => DBMS_STATS.DEFAULT_DEGREE ); END;';

ISSUE Details:
Above stats gather statement caused the issue on, “degree => DBMS_STATS.DEFAULT_DEGREE” value spawned multiple parallel threads (around 97 parallel sessions) on database. Due to this job was failed. That table having 2, 00,000 rows & why oracle optimizer spawned 97 parallel sessions for a small table. Sometimes Oracle 11g R1 optimizer behavior was very worst :-) 
 
Why it was previously running successfully? – Application team asked the root cause.

we have checked the oracle memory components value.  LARGE_POOL init parameter was not configured on the instance. While stats gathering oracle processes used the SHARED_POOL.

We have 2 solutions for this issue.

1)      It is recommended that application team need to look for the change in parameter  “degree => DBMS_STATS.DEFAULT_DEGREE” to “degree =>8” of their stats gathering procedure.
Before this change, we need to deploy these changes on non production and move to production with business approval & CM ticket.  So we couldn’t proceed these changes immediately.


2)     We will plan set the init parameter LARGE_POOL=300M on instance.
Unfortunately there is no memory on SGA components (MEMORY_TARGET and MEMORY_MAX_TARGET is fully allocated to other memory components). We don’t want to shrink the other memory components.
So we opened Emergency CM ticket for “To bounce the database and changing the init parameter value” during tech window.

 While tech window we increased the MEMORY_TARGET, MEMORY_MAX_TARGET and LARGE_POOL value and recycled the database.

Now this job was running fine and this job don’t use any parallel session (its ran single thread).
Sometimes Oracle 11g R1 optimizer behavior was very worst :-)
  .
Again there is a chance to happen the same issue. So we strongly recommended below things to application team.

“We need to control the parallelism on the proc or jobs when it is being configured. For small tables parallelism of 97 is too much and it can eat up most of the memory and CPU. So we recommended changing the “degree => DBMS_STATS.DEFAULT_DEGREE” to “degree =>8” of their stats gathering procedure”.

No comments:

ORA-01552: cannot use system rollback segment for non-system tablespace 'TEMP'

 ORA-01552: cannot use system rollback segment for non-system tablespace "string" Cause: Used the system rollback segment for non...