Friday, 18 May 2012

Oracle DBA Daily Checklist

A.  Verify all instances are up:-

Make sure the database is available.  Log into each instance and run daily reports or test scripts.  Some sites may wish to automate this.
Optional implementation: use Oracle Enterprise Manager's 'probe' event.

B.  Look for any new alert log entries:-

·        Connect to each managed system.
·        Use 'telnet' or comparable program.
·        For each managed instance, go to the background dump destination, usually $ORACLE_BASE/<SID>/bdump. Make sure to look under each managed instance's <SID>.
·        At the prompt, use the Unix ‘tail’ command to view alert_<SID>.log, or otherwise examine the most recent entries in the file.
·        If any ORA-errors have appeared since the previous time you looked, note them in the Database Recovery Log and investigate each one. The recovery log is in <file>.
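
The scan for new ORA- errors is easy to script. What follows is a minimal Python sketch, not part of the original checklist. The function name find_ora_errors, the start_line parameter (the first line you have not yet reviewed), and the sample log lines are all assumptions made for illustration.

```python
import re

# Oracle error codes appear as "ORA-" followed by five digits, e.g. ORA-01555.
ORA_PATTERN = re.compile(r"ORA-\d{5}")

def find_ora_errors(lines, start_line=1):
    """Return (line_number, text) pairs for lines at or after start_line
    that contain an ORA- error code."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if number >= start_line and ORA_PATTERN.search(text):
            hits.append((number, text.rstrip("\n")))
    return hits

# Hypothetical alert log excerpt, for illustration only.
sample_log = [
    "Fri May 18 02:00:01 2012\n",
    "Thread 1 advanced to log sequence 4711\n",
    "Fri May 18 03:15:42 2012\n",
    "ORA-01555: snapshot too old\n",
]

for number, text in find_ora_errors(sample_log):
    print(number, text)
```

In practice the lines would come from reading alert_<SID>.log, and start_line would be the position remembered from the previous day's check.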

C.  Verify DBSNMP is running:-

1.   Log on to each managed machine to check for the 'dbsnmp' process. 

For Unix:  at the command line, type ps -ef | grep dbsnmp.  There should be two dbsnmp processes running.  If not, restart DBSNMP. (Some sites have this disabled on purpose; if this is the case, remove this item from your list, or change it to "verify that DBSNMP is NOT running".)
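
If you automate this check, the counting can be done on captured ps output. The sketch below is one possible approach in Python. The function name count_dbsnmp and the sample process listing (including its paths and PIDs) are hypothetical.

```python
def count_dbsnmp(ps_output):
    """Count process lines that mention dbsnmp, ignoring the grep command itself."""
    count = 0
    for line in ps_output.splitlines():
        if "dbsnmp" in line and "grep" not in line:
            count += 1
    return count

# Hypothetical `ps -ef` output, for illustration only.
sample_ps = """\
oracle    1201     1  0 May17 ?        00:00:03 /u01/app/oracle/product/bin/dbsnmp
oracle    1215  1201  0 May17 ?        00:00:01 /u01/app/oracle/product/bin/dbsnmp
oracle    2044  2001  0 09:12 pts/0    00:00:00 grep dbsnmp
"""

running = count_dbsnmp(sample_ps)
print("dbsnmp processes:", running, "(expected 2)")
```

On a real machine the text could be captured with subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout instead of the sample string.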

D.  Verify success of database backup

E.  Verify success of database archiving to tape

F.  Verify enough resources for acceptable performance

1.   Verify free space in tablespaces. 

For each instance, verify that enough free space exists in each tablespace to handle the day’s expected growth.  As of <date>, the minimum free space for <repeat for each tablespace>: [ < tablespace > is < amount > ].  When incoming data is stable, and average daily growth can be calculated, then the minimum free space should be at least <time to order, get, and install more disks> days’ data growth.

a)  Go to each instance, run free.sql to check free MB in tablespaces.

Compare to the minimum free MB for that tablespace.  Note any low-space conditions and correct.

b)  Go to each instance, run space.sql to check percentage free in tablespaces.

Compare to the minimum percent free for that tablespace.  Note any low-space conditions and correct.
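
The threshold rule in this step (minimum free space = average daily growth times the number of days needed to get more disk) can be worked through in a few lines. This is only a sketch under stated assumptions: the function names, the tablespace names, and all the numbers are made up for illustration, and the free-space figures stand in for what free.sql would report.

```python
def average_daily_growth_mb(daily_used_mb):
    """Average growth per day from consecutive daily used-space readings."""
    if len(daily_used_mb) < 2:
        raise ValueError("need at least two daily readings")
    days = len(daily_used_mb) - 1
    return (daily_used_mb[-1] - daily_used_mb[0]) / days

def minimum_free_mb(daily_growth_mb, lead_time_days):
    """Free space needed to last through the disk procurement lead time."""
    return daily_growth_mb * lead_time_days

def low_space(free_mb, min_free_mb):
    """Return tablespaces whose free MB is below their documented minimum."""
    return sorted(ts for ts, free in free_mb.items()
                  if ts in min_free_mb and free < min_free_mb[ts])

# Hypothetical readings: used MB in one tablespace on four consecutive days.
growth = average_daily_growth_mb([1000, 1010, 1025, 1030])
needed = minimum_free_mb(growth, lead_time_days=14)

# Hypothetical free-space results and thresholds, keyed by tablespace name.
free_now = {"USERS": 120, "INDX": 900}
minimums = {"USERS": needed, "INDX": 200}
print(low_space(free_now, minimums))
```

Here the tablespace grows 30 MB over three days, so a 14-day lead time calls for 140 MB free, and the USERS tablespace with 120 MB free is flagged.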

2.   Verify rollback segments.

Status should be ONLINE, not OFFLINE or FULL.  The exception is a special rollback segment reserved for large batch jobs, whose normal status may be OFFLINE.

a)  Optional: each database may have a list of rollback segment names and their expected statuses.

b)  For current status of each ONLINE or FULL rollback segment (by ID not by name), query on V$ROLLSTAT.

c)   For storage parameters and names of ALL rollback segments, query on DBA_ROLLBACK_SEGS.  That view’s STATUS field is less accurate than the one in V$ROLLSTAT, however, as it lacks the PENDING OFFLINE and FULL statuses, showing these as OFFLINE and ONLINE respectively.
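
Comparing actual statuses against the expected list from item a) is mechanical. Below is a small Python sketch of that comparison. The function name, the segment names, and the statuses are hypothetical; in practice the actual statuses would come from V$ROLLSTAT, with names looked up separately since that view identifies segments by ID.

```python
def unexpected_rollback_status(actual, expected_overrides=None):
    """Return {segment: (actual, expected)} for segments whose status differs
    from what is expected. Segments are expected ONLINE unless overridden."""
    expected_overrides = expected_overrides or {}
    problems = {}
    for name, status in actual.items():
        expected = expected_overrides.get(name, "ONLINE")
        if status != expected:
            problems[name] = (status, expected)
    return problems

# Hypothetical segment names and statuses, for illustration only.
actual = {"RBS01": "ONLINE", "RBS02": "FULL", "RBS_BATCH": "OFFLINE"}
expected_overrides = {"RBS_BATCH": "OFFLINE"}  # batch segment is normally OFFLINE
print(unexpected_rollback_status(actual, expected_overrides))
```

Only RBS02 is reported here, because the batch segment's OFFLINE status matches its documented expectation.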

3.   Identify bad growth projections.

Look for segments in the database that are running out of resources (e.g. extents) or growing at an excessive rate.  The storage parameters of these segments may need to be adjusted.  For example, if an object has reached 200 extents AND it is an object that is supposed to get large, raise its max_extents to unlimited.

a)  To gather daily sizing information, run analyze5pct.sql.  If you are collecting nightly volumetrics, skip this step.

b)  To check current extents, run nr_extents.sql

c)   Query current table sizing information

d)  Query current index sizing information

e)  Query growth trends
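
The extent rule from this step can be written down as a tiny decision function. This is a sketch, not the author's script: the function name and the "investigate" outcome for objects that were not expected to grow are my assumptions, and 200 is simply the example limit quoted above.

```python
def extent_action(nr_extents, expected_large, limit=200):
    """Suggest what to do with one segment based on its current extent count."""
    if nr_extents < limit:
        return "ok"
    if expected_large:
        # Objects that are supposed to get big are allowed to keep growing.
        return "set max_extents to unlimited"
    # Assumption: an unexpectedly large object deserves a closer look.
    return "investigate"

# Hypothetical segments: (name, current extents, expected to get large?)
segments = [("ORDERS", 212, True), ("LOOKUP_CODES", 205, False), ("CUSTOMERS", 40, True)]
for name, nr_extents, expected_large in segments:
    print(name, extent_action(nr_extents, expected_large))
```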

4.   Identify space-bound objects.

A space-bound object is one whose next_extent is bigger than the largest free extent that its tablespace can offer.  Space-bound objects can harm database operation.  If one turns up, first investigate the situation.  Then use ALTER TABLESPACE <tablespace> COALESCE, or add another datafile.

a)  Run spacebound.sql.  If all is well, zero rows will be returned.
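
To make the space-bound test concrete, here is the same comparison sketched in Python on made-up numbers. The function name, table names, and sizes are hypothetical. Treating a tablespace with no free-space entry as having zero free bytes is also an assumption of this sketch.

```python
def space_bound(tables, largest_free_chunk):
    """Return names of tables whose next extent cannot fit in the largest
    free chunk of their tablespace."""
    bound = []
    for name, tablespace, next_extent in tables:
        # Assumption: a tablespace missing from the map has no free space.
        biggest = largest_free_chunk.get(tablespace, 0)
        if next_extent > biggest:
            bound.append(name)
    return bound

MB = 1024 * 1024

# Hypothetical tables: (table_name, tablespace_name, next_extent in bytes)
tables = [
    ("ORDERS", "USERS", 10 * MB),
    ("CUSTOMERS", "USERS", 1 * MB),
]
largest_free_chunk = {"USERS": 4 * MB}
print(space_bound(tables, largest_free_chunk))
```

ORDERS is space-bound in this example: it wants a 10 MB extent but the largest free chunk in its tablespace is 4 MB.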

5.   Review processes for contention over CPU, memory, network or disk resources.

a)  To check CPU utilization, go to x:\web\phase2\default.htm =>system metrics=>CPU utilization page.  400 is the maximum CPU utilization because there are 4 CPUs on each of the phxdev and phxprd machines.  Investigate if CPU utilization stays above 350 for a sustained period.
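
One way to pin down what counts as staying above 350 is to require several consecutive samples over the threshold. The Python sketch below does that. The function name, the choice of three consecutive samples, and the sample readings are all assumptions for illustration.

```python
def sustained_high_cpu(samples, threshold=350, min_consecutive=3):
    """True if at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

# Hypothetical CPU utilization readings on the 0-400 scale.
steady_load = [300, 360, 370, 380, 200]
spiky_load = [360, 200, 370, 340, 380]
print(sustained_high_cpu(steady_load), sustained_high_cpu(spiky_load))
```

Isolated spikes reset the counter, so only a continuous stretch above the threshold triggers an investigation.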

G.  Copy Archived Logs to Standby Database and Roll Forward

If you have a Standby Database, copy the appropriate Archived Logs to the expected location on the standby machine and apply those logs (roll forward the changes) to the standby database.  This keeps the standby database up-to-date.

The copying of logs, the applying of them, or both, can in some cases be automated.  If you have automated them, then your daily task should be to confirm that this happened correctly each day.
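
If the copy and apply steps are automated, the daily confirmation boils down to checking that every archived log sequence has also been applied. A minimal Python sketch of that check follows. The function name and the sequence numbers are hypothetical, and how you obtain the two lists depends on your site.

```python
def unapplied_sequences(archived, applied):
    """Return archived log sequence numbers not yet applied on the standby, in order."""
    return sorted(set(archived) - set(applied))

# Hypothetical log sequence numbers, for illustration only.
archived_on_primary = [101, 102, 103, 104]
applied_on_standby = [101, 102, 104]
print(unapplied_sequences(archived_on_primary, applied_on_standby))
```

An empty result means the standby is up to date with the logs archived so far.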

H.  Read DBA manuals for one hour

In the long run, nothing is more valuable than a DBA who is as widely experienced, and as widely read, as possible.  Readings should include DBA manuals, trade journals, and possibly newsgroups or mailing lists.

Scripts Daily Procedures:-

1.   free.sql

-- free.sql
--   To verify free space in tablespaces
--   Minimum amount of free space
--   document your thresholds:
--   <tablespace_name> = <amount> m

SELECT tablespace_name
     , sum(blocks) AS free_blk
     , trunc(sum(bytes) / (1024*1024)) AS free_m
     , max(bytes) / 1024 AS big_chunk_k
     , count(*) AS num_chunks
FROM dba_free_space
GROUP BY tablespace_name;

2.   space.sql

-- space.sql
-- To check free, pct_free, and allocated space within a tablespace
-- 05/18/2012

SELECT tablespace_name, largest_free_chunk
     , nr_free_chunks, sum_alloc_blocks, sum_free_blocks
     , to_char(100*sum_free_blocks/sum_alloc_blocks, '09.99') || '%'
       AS pct_free
FROM ( SELECT tablespace_name
            , sum(blocks) AS sum_alloc_blocks
       FROM dba_data_files
       GROUP BY tablespace_name )
   , ( SELECT tablespace_name AS fs_ts_name
            , max(blocks) AS largest_free_chunk
            , count(blocks) AS nr_free_chunks
            , sum(blocks) AS sum_free_blocks
       FROM dba_free_space
       GROUP BY tablespace_name )
WHERE tablespace_name = fs_ts_name;

3.   analyze5pct.sql

-- analyze5pct.sql
-- To analyze tables and indexes quickly, using a 5% sample size
-- (do not use this script if you are performing the overnight
-- collection of volumetric data)
-- 05/18/2012

BEGIN
   dbms_utility.analyze_schema ( '&OWNER', 'ESTIMATE', NULL, 5 ) ;
END ;
/

4.   nr_extents.sql

-- nr_extents.sql
-- To find out any object reaching <threshold>
-- extents, and manually upgrade it to allow unlimited
-- max_extents (thus only objects we *expect* to be big
-- are allowed to become big)
-- 05/18/2012

SELECT e.owner, e.segment_type , e.segment_name , count(*) as nr_extents , s.max_extents
, to_char ( sum ( e.bytes ) / ( 1024 * 1024 ) , '999,999.90') as MB
FROM dba_extents e , dba_segments s
WHERE e.owner = s.owner
  AND e.segment_type = s.segment_type
  AND e.segment_name = s.segment_name
GROUP BY e.owner, e.segment_type , e.segment_name , s.max_extents
HAVING count(*) > &&THRESHOLD
    OR ( ( s.max_extents - count(*) ) < &&THRESHOLD )
ORDER BY count(*) desc ;

5.   spacebound.sql

-- spacebound.sql
-- To identify space-bound objects.  If all is well, no rows are returned.
-- If any space-bound objects are found, look at value of NEXT extent
-- size to figure out what happened.
-- Then use coalesce (alter tablespace <foo> coalesce;). 
-- Lastly, add another datafile to the tablespace if needed.
-- 05/18/2012

SELECT a.table_name, a.next_extent, a.tablespace_name
FROM all_tables a,
    ( SELECT tablespace_name, max(bytes) as big_chunk
      FROM dba_free_space
      GROUP BY tablespace_name ) f
WHERE f.tablespace_name = a.tablespace_name
  AND a.next_extent > f.big_chunk ;

Nightly Procedures:-
Most production databases (and many development and test databases) will benefit from having certain nightly batch processes run.

A.  Collect volumetric data

This example collects table row counts.  This can easily be extended to other objects such as indexes, and other data such as average row sizes. 

1.   Analyze Schema and Collect Data.

The idea here is to use the more time-consuming and more accurate ANALYZE COMPUTE command and save the results, which show up in the data dictionary, to a more permanent store.

a)  If you haven't yet, create the volumetrics table with mk_volfact.sql

b)  To gather nightly sizing information, run analyze_comp.sql.

c)   To collect the resulting statistics, run pop_vol.sql

d)  Examine the data at your leisure, probably weekly or monthly.

I use MS Excel and an ODBC connection to examine and graph data growth.
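
Once a few nights of data have accumulated, growth per table is simple arithmetic on the stored rows. The Python sketch below shows one way to compute it from (table_name, num_rows, meas_dt) tuples shaped like the rows of utl_vol_facts. The function name, table names, and figures are hypothetical.

```python
from datetime import date

def row_growth(facts):
    """Given (table_name, num_rows, meas_dt) rows, return per-table average
    rows added per day between the earliest and latest measurement."""
    by_table = {}
    for table_name, num_rows, meas_dt in facts:
        by_table.setdefault(table_name, []).append((meas_dt, num_rows))
    growth = {}
    for table_name, points in by_table.items():
        points.sort()  # order by measurement date
        (first_dt, first_rows), (last_dt, last_rows) = points[0], points[-1]
        days = (last_dt - first_dt).days
        if days > 0:  # tables with a single measurement have no trend yet
            growth[table_name] = (last_rows - first_rows) / days
    return growth

# Hypothetical measurements, for illustration only.
facts = [
    ("ORDERS", 1000, date(2012, 5, 14)),
    ("ORDERS", 1400, date(2012, 5, 18)),
    ("CUSTOMERS", 50, date(2012, 5, 18)),
]
print(row_growth(facts))
```

ORDERS gained 400 rows over four days, so it averages 100 rows per day; CUSTOMERS is skipped until it has a second measurement.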

Scripts Nightly Procedures:-

1.   mk_volfact.sql

-- mk_volfact.sql (only run this once to set it up; do not run it nightly!)
-- -- Table UTL_VOL_FACTS

CREATE TABLE utl_vol_facts (
  table_name                 VARCHAR2(30),
  num_rows                   NUMBER,
  meas_dt                    DATE
)
STORAGE (
      INITIAL     128k
      NEXT        128k
      MAXEXTENTS  unlimited
) ;

-- Public Synonym

CREATE PUBLIC SYNONYM utl_vol_facts FOR &OWNER..utl_vol_facts ;

-- Grants for UTL_VOL_FACTS

GRANT SELECT ON utl_vol_facts TO public ;

2.   analyze_comp.sql

-- analyze_comp.sql
BEGIN
   sys.dbms_utility.analyze_schema ( '&OWNER','COMPUTE');
END ;
/

3.   pop_vol.sql

-- pop_vol.sql
insert into utl_vol_facts ( table_name, num_rows, meas_dt )
select table_name
     , NVL ( num_rows, 0) as num_rows
     , trunc ( last_analyzed ) as meas_dt
from all_tables           -- or just user_tables
where owner in ('&OWNER') -- or a comma-separated list of owners
;
