Health Checks
Daily
- Are the OS and system hardware functioning OK (SAN/RAID, CPU, Network, RAM)?
- Is the database up? Was it down for any reason in the last 24 hours?
- Can client apps connect to the database?
- When was the last RMAN incremental backup run and what were the results?
- Are there any database errors that you need to know about?
- Alert Log
SQL To determine Alert Log location:SELECT value FROM v$diag_info WHERE name='Diag Trace';
Common Path:alias alog='tail -f /u01/app/oracle/diag/rdbms/mydb/mydb/trace/alert_mydb.log'
%ORACLE_BASE%\diag\rdbms\{DBNAME}\{DBNAME}\trace\alert_{DBNAME}.log
- Listener Log
To determine location:Common Path:OS> lsnrctl status
%ORACLE_BASE%\diag\tnslsnr\{DBNAME}\listener\trace\listener.log
- Alert Log
- Is there ample disk space for the datafiles, archive logs, export files and RMAN files?
- Did the daily scheduled jobs run OK?
- Review any security provided monitoring reports for inappropriate changes (Tripwire etc.).
IMPORTANT: Never open a live listener log file from an editor!
Use tail -f or copy it first and view the copy.
Weekly
- When was the last RMAN full backup run and what were the results?
- When was the last RMAN database validation run and what were the results?
- When was the last full database export and what were the results?
- How often are redo log switches occurring?
- Are the number of connections increasing? If so by how much?
- Are all the blocks in your datafiles OK (not just the ones with data)?
- Is any partition maintenance required (extending ranges etc.)?
- Does any of your documentation need to be updated?
- Did the weekly scheduled jobs run OK?
Monthly
- Are all your database patches up-to-date?
- Are your indexes OK or do they need to be rebuilt?
- Which tablespaces are growing, and by how much?
- Is the size of any schema increasing? If so by how much?
- Is your SGA and PGA usage increasing? If so by how much?
- What SQL is being run most often and can it be made more efficient?
- Is there a significant increase in DML (INSERTS, UPDATES and DELETES)?
- Are the segments in your tablespaces OK or do they need to be optimized?
- Can the maintenance of any the databases be simplified?
- Did the monthly scheduled jobs run OK?
Appendix: Using SQL to Check Alert Log (11g and later)
SELECT inst_id, originating_timestamp, message_text FROM TABLE(gv$(cursor(SELECT inst_id, originating_timestamp, message_text FROM v$diag_alert_ext WHERE originating_timestamp > (sysdate - 1) AND message_text LIKE '%ORA-%'))) ORDER BY inst_id, originating_timestamp;