RAC Administration
Invoke installer: $GRID_HOME/crs/config/config.sh
Quick RAC Status Checks
As grid user:
- Cluster Services on Node: crsctl status res -t -init
- Cluster (brief): crsctl check cluster -all
- Nodes: olsnodes
- ASM: srvctl status asm
- Listener: ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
As oracle user:
- srvctl status database -d [DBName]
- SQL> SELECT inst_name FROM v$active_instances;
- SQL> SELECT instance_name, status, database_status, active_state FROM gv$instance ORDER BY instance_name;
Assumes you have installed the database product and created a database.
ASM
-- Display
- Log: $ORACLE_BASE/log/diag/asmcmd/user_grid/$HOSTNAME/alert/alert.log
- ps -ef|grep ASM
- srvctl config asm
- srvctl status asm
- srvctl status asm -n [NodeName]
- Space Usage
- grid> asmcmd -p lsdg
- Via SQL: asm.space_usage.sql
-- Show ASM Status
COL free_mb  FORMAT 999,999,999
COL total_mb FORMAT 999,999,999
SELECT name, state, total_mb, free_mb FROM v$asm_diskgroup;

NAME                           STATE           TOTAL_MB      FREE_MB
------------------------------ ----------- ------------ ------------
FRA                            CONNECTED      3,145,726    2,461,594
DATA                           CONNECTED      9,437,178    5,820,119
GRID                           MOUNTED          488,280      488,226
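A quick percentage-free view of the same data (a sketch against v$asm_diskgroup; dismounted groups are filtered out to avoid division by zero):
SELECT name, ROUND(free_mb/total_mb*100,1) AS pct_free
FROM v$asm_diskgroup
WHERE total_mb > 0
ORDER BY name;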
Cluster (AKA Cluster Ready Services: CRS)
--Display
- Log: /u01/app/grid/diag/crs/$HOSTNAME/crs/trace/alert.log
- Name: cemutlo -n
- OCR: $ORACLE_HOME/bin/ocrcheck
- Services: crsctl status resource
- Services (On Node): crsctl status res -t -init
- Services (deprecated): crs_stat -t
- Status - All Nodes: crsctl check cluster -all
- Status - Current Node: crsctl check crs
- VIP: crs_stat -t | grep vip
Any operation that starts or stops the CRS must be run as root with the GRID environment set.
19.x Example:
export GRID_HOME=/u01/app/19.3.0.0.0/grid
export PATH=$GRID_HOME/bin:$PATH
cd $GRID_HOME/bin
12.x Example:
export GRID_HOME=/u01/app/12.1.0.2/grid
export PATH=$GRID_HOME/bin:$PATH
cd $GRID_HOME/bin
CRS: Start|Stop
All Nodes
crsctl stop cluster -all
crsctl start cluster -all
To force: crsctl stop cluster -all -f
Single Node Method 1
crsctl stop crs
crsctl start crs
- Must be logged into the node as root with GRID_HOME set.
- To force:
crsctl stop crs -f
Single Node Method 2
crsctl stop cluster -n hostname
crsctl start cluster -n hostname
Alternatively: crsctl stop|start cluster -n hostname1 hostname2
Automatic Re-start: CRS
To enable/disable CRS autostart on boot, perform the following as root on the corresponding node/system.
These commands only affect the local server, so run them on every node (see the loop sketch after the Status example).
-- Enable
crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
-- Disable
crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
-- Status
crsctl config crs
CRS-4621: Oracle High Availability Services autostart is disabled.
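Because autostart is a per-node setting, one way to apply it cluster-wide is a loop over the nodes as root (a sketch: the node names and grid home path are placeholders, and root SSH equivalence between nodes is assumed):
for node in rac01 rac02; do
  ssh root@$node /u01/app/19.3.0.0.0/grid/bin/crsctl enable crs
done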
Automatic Re-start: Database
Status: srvctl config database -d oradb | grep Management
Disabled (MANUAL): srvctl modify database -d oradb -y manual
Enabled (AUTOMATIC): srvctl modify database -d oradb -y automatic
Grid Services Example: Everything OK
crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       lnx01                    Started,STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.crf
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.crsd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.cssd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.ctssd
      1        ONLINE  ONLINE       lnx01                    OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.evmd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.gipcd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.gpnpd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.mdnsd
      1        ONLINE  ONLINE       lnx01                    STABLE
ora.storage
      1        ONLINE  ONLINE       lnx01                    STABLE
--------------------------------------------------------------------------------
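To spot problems quickly, the resource listing can also be filtered; a sketch using the -w attribute filter of crsctl status resource:
crsctl status resource -t -w "STATE != ONLINE"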
Database Instances
--Display
oracle> srvctl config database -d MyDBName
oracle> srvctl status database -d MyDBName
SQL> SELECT name FROM v$database;
SQL> SELECT inst_name FROM v$active_instances;
SQL> SELECT instance_name, status, database_status, active_state FROM gv$instance ORDER BY instance_name;
-- Start|Stop: All Instances
Run these as the oracle user.
srvctl start database -d MyDBName
srvctl stop database -d MyDBName
-- Start|Stop: Single Instance
srvctl start instance -d MyDBName -i MyInstName
srvctl stop instance -d MyDBName -i MyInstName
To force: srvctl stop instance -d MyDBName -i MyInstName -f
Interface Configuration: oifcfg
su - grid
oifcfg iflist -p -n
ens160  10.4.0.0       PRIVATE  255.255.255.0
ens192  10.15.224.0    PRIVATE  255.255.255.0
ens192  169.254.0.0    UNKNOWN  255.255.0.0
virbr0  192.168.122.0  PRIVATE  255.255.255.0

su -
export GRID_HOME=/u01/app/12.1.0.2/grid
export PATH=$GRID_HOME/bin:$PATH
oifcfg getif
ens160  10.4.0.0     global  public
ens192  10.15.224.0  global  cluster_interconnect,asm
Deleting a Stored Interface
oifcfg delif -global qfe0/204.152.65.0
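The counterpart for registering an interface is oifcfg setif; a sketch (the interface name, subnet, and role are examples only):
oifcfg setif -global ens224/10.4.1.0:cluster_interconnect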
Listener
--Display
- srvctl status scan_listener
- srvctl status listener
- ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
- lsnrctl status
--Start|Stop
srvctl [start|stop] listener -n [NodeName]
srvctl start listener -n rac02
Logs
- Cluster: <GRID_BASE>/diag/crs/<Hostname>/crs/trace/alert.log
- Database: $ORACLE_BASE/diag/rdbms/<DBName>/<SID>/trace/alert_<SID>.log
- Listener: $ORACLE_BASE/diag/tnslsnr/<Hostname>/listener/alert/log.xml
- Tools: /u01/app/<gi_version>/grid/cfgtoollogs
Aliases
ORACLE_DB=oradb
alias clog='tail -f /u01/app/grid/diag/crs/$HOSTNAME/crs/trace/alert.log'
alias alog='tail -f $ORACLE_BASE/diag/rdbms/$ORACLE_DB/$ORACLE_SID/trace/alert_$ORACLE_SID.log'
alias llog='tail -f $ORACLE_BASE/diag/tnslsnr/$HOSTNAME/listener/alert/log.xml'
NTP
-- Display Status
cluvfy comp clocksync -n all -verbose
-- Install/Configure
Configure NTP.
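On recent Linux releases this usually means chrony; a minimal sketch as root on each node (the package manager and NTP server name are assumptions for your environment):
yum install -y chrony
echo "server ntp1.example.com iburst" >> /etc/chrony.conf
systemctl enable --now chronyd
chronyc sources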
OCR
-- Location
cat /etc/oracle/ocr.loc
-- Voting Disk Status
grid> crsctl query css votedisk
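-- Integrity and Backups
For the OCR itself, integrity and backups can be checked as root (a sketch; ocrconfig -manualbackup writes an on-demand backup, by default under the grid home's cdata directory):
ocrcheck
ocrconfig -showbackup
ocrconfig -manualbackup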
Passwords, Changing
Phase 1: Linux OS
- passwd root
- passwd grid
- passwd oracle
Make sure to update any Linux mount points that are mounted using the root account.
According to MOS, these OS passwords can be changed at any time, even while the CRS and database are up. Per Doc ID 888315.1, no additional steps are required for RAC or SSH: SSH does not depend on the oracle password, it depends on the authentication keys created during SSH configuration.
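A quick sanity check that password-less SSH (user equivalence) still works after the change, run as the oracle and grid users from each node (hostnames are examples):
ssh rac01 date
ssh rac02 date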
Phase 2a: Oracle SYS, SYSTEM, DBSNMP...
1. Back up the old password file in ASM.
Ex: +DATA/MYDB/PASSWORD/pwdmydb.256.905963595
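One way to take that backup is asmcmd pwcopy from the grid environment (a sketch; the destination path is an example):
grid> asmcmd pwcopy +DATA/MYDB/PASSWORD/pwdmydb.256.905963595 /tmp/pwdmydb.bak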
2. If Data Guard (DG) is used, defer log_archive_dest_state_n on the primary.
alter system set log_archive_dest_state_2=DEFER scope=both;
3. Change passwords on node 1.
ALTER USER sys IDENTIFIED BY "NewPass";
ALTER USER system IDENTIFIED BY "NewPass";
ALTER USER dbsnmp IDENTIFIED BY "NewPass";
4. Test by connecting to node 1 (not the RAC SCAN).
connect sys/NewPass@rac01 as sysdba
select instance_name from v$instance;
5. Test the other nodes, then the RAC SCAN.
connect sys/NewPass@rac02 as sysdba
...
sqlplus.exe sys/NewPass@MYDB as sysdba
Phase 2b: Data Guard Changes (if used)
-- Set Password File on Standby Server(s)
orapwd file=$ORACLE_HOME/dbs/orapwMyDBName password=NewPass

-- Enable log_archive_dest_state_n and Perform a Log Switch
alter system set log_archive_dest_state_2=ENABLE scope=both;
alter system switch logfile;

-- Ensure Logs are Shipping
Primary: select max(sequence#) from v$archived_log;
Standby: select max(sequence#) from v$archived_log;
Phase 3: Oracle Scripts
Update any scripts that embed the changed passwords: RMAN scripts, export jobs, monitoring scripts, etc.
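If scripts should not embed the new password at all, a secure external password store is one alternative; a rough sketch (the wallet directory and TNS alias are examples, and sqlnet.ora must point WALLET_LOCATION at that directory with SQLNET.WALLET_OVERRIDE=TRUE):
mkstore -wrl /u01/app/oracle/wallet -create
mkstore -wrl /u01/app/oracle/wallet -createCredential MYDB system "NewPass"
sqlplus /@MYDB    # connects with the stored credential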
Phase 4: Application Schema
- For each app schema account to change: ALTER USER <UserName> IDENTIFIED BY "NewPassword";
- Recreate any database links bound to the changed schema accounts (see the sketch below).
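A sketch of recreating a private link after its target account's password change (the link, user, and TNS alias names are examples):
DROP DATABASE LINK mylink;
CREATE DATABASE LINK mylink CONNECT TO appuser IDENTIFIED BY "NewPassword" USING 'MYDB';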
Server Pool
--Display
- srvctl config srvpool
Sessions
-- Instance Your Session is Logged Into
SELECT host_name FROM gv$instance WHERE instance_number=userenv('instance');
-- Instances All Sessions Logged Into
SELECT i.host_name, s.username FROM gv$session s JOIN gv$instance i on (i.inst_id=s.inst_id) WHERE username is not null;
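A variation that shows how the logged-in sessions are spread across the instances:
SELECT i.instance_name, COUNT(*) AS sessions
FROM gv$session s JOIN gv$instance i ON (i.inst_id = s.inst_id)
WHERE s.username IS NOT NULL
GROUP BY i.instance_name
ORDER BY i.instance_name;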
Services - Display
- systemctl
VIP - Display
- crs_stat -t |grep vip
- srvctl status scan
APPENDIX
Aliases
Add to .bashrc on each node. Change as needed.
# Aliases - Common
alias cl='crontab -l'
alias l9='ls -alt | head -9'
alias l20='ls -alt | head -20'
alias l50='ls -alt | head -50'
alias tf='date;ls -l|wc -l'

# DBA Aliases
alias cdbin='cd $ORACLE_HOME/bin;pwd'
alias cdtrace='cd $ORACLE_BASE/diag/rdbms/$ORACLE_SID/${ORACLE_SID}_1/trace;pwd'
alias cdnet='cd $ORACLE_HOME/network/admin;pwd'
alias alog='tail -f $ORACLE_BASE/diag/rdbms/$ORACLE_SID/${ORACLE_SID}_1/trace/alert_${ORACLE_SID}_1.log'
alias clog='tail -f /u01/app/grid/diag/crs/$HOSTNAME/crs/trace/alert.log'
alias llog='tail -f /u01/app/grid/diag/tnslsnr/$HOSTNAME/listener/alert/log.xml'
alias sqp='sqlplus sys/mypw@DB1 as sysdba'
alias rmn='$ORACLE_HOME/bin/rman target / nocatalog'
SQLPLUS Usage
Get Service Name
cat $ORACLE_HOME/network/admin/tnsnames.ora

DB1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DB1)
    )
  )
Connect
export ORACLE_SID=DB1
sqlplus sys/MyPassword@DB1 as sysdba

SQL> SELECT inst_name FROM v$active_instances;

INST_NAME
------------------------------------------------------------
rac02:DB1_1
rac01:DB1_2
Database Express
https://<your_rac-scan OR localhost>:5500/em
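If the page does not respond, the HTTPS port may not be configured; a sketch for checking and setting it from SQL*Plus as SYS (5500 is the conventional port):
SQL> SELECT dbms_xdb_config.gethttpsport() FROM dual;
SQL> EXEC dbms_xdb_config.sethttpsport(5500);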
Shutdown Log Snippets
Issued from rac01: srvctl stop database -d db1
rac01
--Alert Log
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Mon Nov 21 16:01:57 2016
NOTE: force a map free for map id 2
Mon Nov 21 16:01:58 2016
Stopping background process VKTM
Mon Nov 21 16:01:58 2016
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Mon Nov 21 16:01:58 2016
NOTE: Shutting down MARK background process
Mon Nov 21 16:02:01 2016
NOTE: ASMB releasing group resources due to shutdown
NOTE: ASMB clearing idle groups before exit
Stopping background process RBAL
Mon Nov 21 16:02:02 2016
freeing rdom 0
Mon Nov 21 16:02:04 2016
Instance shutdown complete
--Cluster Log
2016-11-21 10:57:28.172 [OCSSD(2555)]CRS-1601: CSSD Reconfiguration complete. Active nodes are rac01 rac02 .
2016-11-21 10:58:03.740 [CRSD(5177)]CRS-2772: Server 'rac02' has been assigned to pool 'ora.db_pool'.
2016-11-21 11:08:20.732 [CRSD(5177)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.db1.db'. Details at (:CRSPE00163:) {2:8614:2} in /u01/app/grid/diag/crs/rac01/crs/trace/crsd.trc.
rac02
--Alert Log
Mon Nov 21 16:01:08 2016
minact-scn: Inst 2 is now the master inc#:4 mmon proc-id:3099 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.000b60c5 gcalc-scn:0x0000.000b60c5
minact-scn: Master returning as live inst:1 has inc# mismatch instinc:0 cur:4 errcnt:0
Mon Nov 21 16:02:03 2016
Reconfiguration started (old inc 4, new inc 6)
List of instances (total 1) : 2
Dead instances (total 1) : 1
My inst 2
publish big name space - dead or down/up instance detected, invalidate domain 0
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Mon Nov 21 16:02:03 2016
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Mon Nov 21 16:02:03 2016
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:4 new-inc#:4
Mon Nov 21 16:02:03 2016
Post SMON to start 1st pass IR
Mon Nov 21 16:02:03 2016
Instance recovery: looking for dead threads
Mon Nov 21 16:02:03 2016
Submitted all GCS remote-cache requests
Fix write in gcs resources
Mon Nov 21 16:02:03 2016
Reconfiguration complete (total time 0.1 secs)
Mon Nov 21 16:02:03 2016
Instance recovery: lock domain invalid but no dead threads
* validated domain 0, flags = 0x0
Mon Nov 21 16:02:06 2016
minact-scn: master continuing after IR
Mon Nov 21 16:03:03 2016
Decreasing number of real time LMS from 1 to 0
--Cluster Log
TBD
Socket File Admin (/var/tmp/.oracle)
Cleaning
su -
cd /var/tmp/.oracle
find ./ -name "s#*" -mtime +30 -exec rm {} \;
find ./ -type s -mtime +30 -exec rm {} \;
Recreating
/bin/mkdir -p /var/tmp/.oracle
/bin/chmod 01777 /var/tmp/
/bin/chown root /var/tmp/
/bin/chmod 01777 /var/tmp/.oracle
/bin/chown root:oinstall /var/tmp/.oracle
For further info on this see:
- Doc ID 1322234.1
- Doc ID 399482.1
APPENDIX: SCAN and Node Apps
SCAN IP
The IP addresses must be on the same subnet as your default public network in the cluster. The name must be 15 characters or less in length, not including the domain, and it must be resolvable without the domain suffix (for example, "sales1-scan" must be resolvable, as opposed to "sales1-scan.example.com"). The IPs must not be assigned to a network interface, since Oracle Clusterware manages them.
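A sketch for verifying the SCAN configuration (the name rac-scan is an example):
nslookup rac-scan
srvctl config scan
srvctl config scan_listener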
srvctl status nodeapps
srvctl config nodeapps