Issue: Concurrent request running from a very long time
Before confirming a concurrent Request to long running we need to conclude/observe some of the things
•Check the Statistics,History Runs of that program & Based on this timelines we need to decide the long running Request.
•Check The load on DB node to ensure that High Resource Usage is not the Main cause.i troubleshoot
•Check DB locks in The Database To Ensure that This Session is not blocked by any other session.’
•Check wether Same Iteration Of concurrent program is running on Instance. like if concurrent request is running twise/trice
If we are Happy/resolved the above things and find that we dont have any problem Then we need to Digg
long running concurrent Request
Step 1 : Get Concurrent Request ID of long running concurrent request from Application(fronted).
Navigation : Application >> System administrator>> concurrent>request>
Step 2 : Find session associated with concurrent request .i.e SID
By using the above Query we can get sid,serial#,spid of the concurrent Request..
SELECT a.request_id, d.sid, d.serial# , c.SPID
FROM apps.fnd_concurrent_requests a,
apps.fnd_concurrent_processes b,
v$process c,
v$session d
WHERE a.controlling_manager = b.concurrent_process_id
AND c.pid = b.oracle_process_id
AND b.session_id=d.audsid
AND a.request_id = &Request_ID
AND a.phase_code = ‘R’;
In My case The Concurrent Request id is 678901
Output of above query returned
REQUEST_ID SID SERIAL# SPID
———————————————————–
678901 1973 89 1100
Here Conurrent Request id is 678901 & Asoociated Session is 1973.
SPID is the process that Running on DB node Because of This Request..
oracle@acer $ ps -ef|grep -i 1100
oracle1100 1 0 3:30:43 0:03 oracle (LOCAL=NO)
Note : before Using Oradebug make sure that spid shoud exist on DB node..
Step 3 : Enable event 10046 trace with level 12 using oradebug .
Syntax : oradebug setospid SPID
Here SPID is the process id that we are getting from step 2
Coming to My Case
SQL> oradebug setospid 1100
—
Oracle pid: 79, Unix process pid: 1100, image: oracle@Acer
—
Step 3.3 : Enable trace for 10046 event with level 12
SQL> oradebug event 10046 trace name context forever, level 12
Step 3.4 : Locate Trace file as
SQL>oradebug tracefile_name
/ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.trc
Wait for 30 Min to get the Trace file to find the Root Cause
Step 4 : Disable trace if u r thinking that trace is enough to find the root cause..
SQL> oradebug event 10046 trace name context off
Step 5 : Convert raw trace To Understandable tracefile By using tkprof/Traceanalyzer
tkprof ‘ /ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.trc’
’ /ORADB/acerdb/9.2.0/admin/DELL_ACER/udump/ACER_ora_1100.txt ’
explain=apps/[apps_passwd] prsela.
Step 6 : Check TKPROF out file to find root cause of slow concurrent request.
********************************************************************************
.
Application version: 11.5.10.2 multi-node (4 nodes)
Database: 11g RAC
Scenario:
Business critical concurrent request is running from more than 2 hours which was supposed to complete within 1 hour.
The standard manager and the manager responsible to execute the critical
concurrent request (custom manager) were overloaded with many requests.
The server on which these managers run had very high server load and
all CPU’s were utilized.
After diagnosis, we had to kill few requests which were running from
more than 2 hours and were creating load on the server. These requests
were in fact not performing anything and had gone zombie.
The server load came down and CPU’s available were set free to execute new requests/processes.
However the business critical request was still running and wasn’t
moving ahead at all. The database session associated was also not
executing anything.
Challenge:
Since this being a critical business request, we couldn’t terminate the
request coz we had already lost few hours running it. At the same time
the database session also wasn’t performing any execution and the whole
request was not going anywhere even though it showed running normal
status.
Trick:
Not recommended, however can be a life saver at times.
---- Forcibly change the status of the request to pending normal.
---- Kill the database session of the request.
Use the below scripts:
update applsys.fnd_concurrent_requests set phase_code = 'P',
status_code = 'I',
actual_start_date = null,
crm_release_date = null,
controlling_manager=null,
logfile_name = null,
logfile_node_name = null,
outfile_name = null,
outfile_node_name = null,
crm_tstmp = null
where request_id in ('request-id');
alter system kill session 'sid, serial#, @instance-id';
Hence a new database session will get created as soon as the old one is
killed and the request starts running. Since the server on which the
request is supposed to get processed also had no issues now, the request
went through very fast without any issues ;)
********************************************************************************