Sujeet APPS DBA: Concurrent request running from a very long time

Issue: Concurrent request running from a very long time

Before confirming a concurrent Request to long running we need to conclude/observe some of the things

•Check the Statistics,History Runs of that program & Based on this timelines we need to decide the long running Request.
•Check The load on DB node to ensure that High Resource Usage is not the Main cause.i troubleshoot
•Check DB locks in The Database To Ensure that This Session is not blocked by any other session.’
•Check wether Same Iteration Of concurrent program is running on Instance. like if concurrent request is running twise/trice
If we are Happy/resolved the above things and find that we dont have any problem Then we need to Digg

long running concurrent Request

Step 1 : Get Concurrent Request ID of long running concurrent request from Application(fronted).

Navigation : Application >> System administrator>> concurrent>request>

Step 2 : Find session associated with concurrent request .i.e SID

By using the above Query we can get sid,serial#,spid of the concurrent Request..

SELECT a.request_id, d.sid, d.serial# , c.SPID
FROM apps.fnd_concurrent_requests a,
apps.fnd_concurrent_processes b,
v$process c,
v$session d
WHERE a.controlling_manager = b.concurrent_process_id
AND c.pid = b.oracle_process_id
AND b.session_id=d.audsid
AND a.request_id = &Request_ID
AND a.phase_code = ‘R’;

In My case The Concurrent Request id is 678901

Output of above query returned

REQUEST_ID SID SERIAL# SPID
———————————————————–
678901 1973 89 1100
Here Conurrent Request id is 678901 & Asoociated Session is 1973.

SPID is the process that Running on DB node Because of This Request..

oracle@acer $ ps -ef|grep -i 1100

oracle1100 1 0 3:30:43 0:03 oracle (LOCAL=NO)

Note : before Using Oradebug make sure that spid shoud exist on DB node..

Step 3 : Enable event 10046 trace with level 12 using oradebug .

Syntax : oradebug setospid SPID

Here SPID is the process id that we are getting from step 2

Coming to My Case
SQL> oradebug setospid 1100
—
Oracle pid: 79, Unix process pid: 1100, image: oracle@Acer
—

Step 3.3 : Enable trace for 10046 event with level 12
SQL> oradebug event 10046 trace name context forever, level 12

Step 3.4 : Locate Trace file as
SQL>oradebug tracefile_name

/ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.trc

Wait for 30 Min to get the Trace file to find the Root Cause

Step 4 : Disable trace if u r thinking that trace is enough to find the root cause..

SQL> oradebug event 10046 trace name context off

Step 5 : Convert raw trace To Understandable tracefile By using tkprof/Traceanalyzer

tkprof ‘ /ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.trc’ ’ /ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.txt ’ explain=apps/[apps_passwd] prsela.

Step 6 : Check TKPROF out file to find root cause of slow concurrent request.

********************************************************************************
.
Application version: 11.5.10.2 multi-node (4 nodes)
Database: 11g RAC

Scenario:

Business critical concurrent request is running from more than 2 hours which was supposed to complete within 1 hour.
The standard manager and the manager responsible to execute the critical concurrent request (custom manager) were overloaded with many requests. The server on which these managers run had very high server load and all CPU’s were utilized.

After diagnosis, we had to kill few requests which were running from more than 2 hours and were creating load on the server. These requests were in fact not performing anything and had gone zombie.

The server load came down and CPU’s available were set free to execute new requests/processes.

However the business critical request was still running and wasn’t moving ahead at all. The database session associated was also not executing anything.

Challenge:

Since this being a critical business request, we couldn’t terminate the request coz we had already lost few hours running it. At the same time the database session also wasn’t performing any execution and the whole request was not going anywhere even though it showed running normal status.

Trick:

Not recommended, however can be a life saver at times.

---- Forcibly change the status of the request to pending normal.
---- Kill the database session of the request.

Use the below scripts:

update applsys.fnd_concurrent_requests set phase_code = 'P',
status_code = 'I',
actual_start_date = null,
crm_release_date = null,
controlling_manager=null,
logfile_name = null,
logfile_node_name = null,
outfile_name = null,
outfile_node_name = null,
crm_tstmp = null
where request_id in ('request-id');

alter system kill session 'sid, serial#, @instance-id';

Hence a new database session will get created as soon as the old one is killed and the request starts running. Since the server on which the request is supposed to get processed also had no issues now, the request went through very fast without any issues ;)

********************************************************************************

Sujeet APPS DBA

Concurrent request running from a very long time

Issue: Concurrent request running from a very long time

No comments:

ERROR 1396 (HY000): Operation ALTER USER failed for 'Mysql'@'%'