2-Node RAC (12.1.0.2)
TOC
Overview
The following instructions cover the creation of a 2-node RAC for learning and testing in a VirtualBox environment. This has been tested with RHEL 7, CentOS 7 and Oracle Linux 7. dnsmasq is used to emulate a DNS server. A minimum of 8 GB RAM per node is recommended for this test environment.
I wrote this script to do many of the manual changes if you prefer.
VirtualBox Specs
- hostname: rac01
- pubnet: 192.168.56.71/255.255.255.0
- privnet: 192.168.10.1/255.255.255.0
- hostname: rac02
- pubnet: 192.168.56.72/255.255.255.0
- privnet: 192.168.10.2/255.255.255.0
ASM Disk Groups
VirtualBox shared disks and the corresponding ASM disk groups:
- GRID: 8 GB. OCR, voting files, and the container database for the GIMR.
- FRA: 8 GB. Fast Recovery Area (spfile, control files, redo, archive logs...).
- DATA: 8 GB. Oracle database files.
Some find that with modern, well-performing storage all that is needed is:
- GRID (8 GB minimum)
- DATABASE (16 GB minimum)
Procedure
VirtualBox: rac01
- Create 2 VMs using the above specs. Make sure to create the network interfaces required for RAC.
- Set /etc/hosts to include all RAC network points for this 2-node RAC environment.
- Start VM for rac01 and set the hostname to rac01 and restart.
- Create OS users and directories (a sketch of typical commands follows this list).
- Make the OS changes required for the Grid Infrastructure (GI).
- Configure DNS.
- Configure dnsmasq.
- Configure NTP
- Restart OS and test networking.
- shutdown -r now
- Test via nslookup and ping.
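Below is a minimal sketch of the OS user, directory, and NTP prep referenced in this list. The group/user names, numeric IDs, and /u01 paths follow common Oracle conventions (and match the GRID_HOME used later in this document), but they are only an assumption here; adjust to your own standard.
# OS groups and users (IDs and group names are illustrative)
groupadd -g 54321 oinstall
groupadd -g 54322 dba
groupadd -g 54327 asmdba
groupadd -g 54328 asmoper
groupadd -g 54329 asmadmin
useradd -u 54321 -g oinstall -G dba,asmdba,asmoper,asmadmin grid
useradd -u 54322 -g oinstall -G dba,asmdba oracle
passwd grid
passwd oracle
# Base directories (GRID_HOME matches /u01/app/12.1.0.2/grid used elsewhere in this doc)
mkdir -p /u01/app/12.1.0.2/grid /u01/app/grid /u01/app/oracle
chown -R grid:oinstall /u01
chown oracle:oinstall /u01/app/oracle
chmod -R 775 /u01
# NTP: Oracle Clusterware expects slewing (-x)
sed -i 's/^OPTIONS=.*/OPTIONS="-x -u ntp:ntp -p \/var\/run\/ntpd.pid"/' /etc/sysconfig/ntpd
systemctl enable ntpd
systemctl restart ntpd
# Quick sanity checks after the restart
nslookup rac01.localdomain
ping -c 2 rac01-priv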
VirtualBox: rac02
- Clone rac01 as rac02.
- [x] Reinitialize the MAC address of all network cards.
- Start VM for rac02 and set the hostname to rac02.
- From Linux change the public IP (192.168.56.72) and private IP (192.168.10.2).
Use the Linux GUI: Applications -> System Tools -> Settings -> Network.
- Configure DNS.
- Configure dnsmasq.
- Restart OS and test networking.
- shutdown -r now
- Test via nslookup, ping and ntp (ntpq -p).
Storage
- Shutdown both nodes.
- In VirtualBox create shared disk(s) on rac01 and attach them to rac02.
- Start rac01 and create disk partition(s) at the OS level (fdisk).
- Configure Udev on rac01 (see the sketch after this list).
- Start rac02 and refresh the partition table (via partprobe), then configure Udev on rac02. Use the same file/values as on rac01.
- Ensure disks can be seen with the correct privs from rac02: ls -rlt /dev/sd?1
- Restart rac02.
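A sketch of the fdisk and Udev steps above, assuming the first shared disk appears as /dev/sdb, that grid:asmadmin should own the ASM devices, and that the rule file is named 99-oracle-asmdevices.rules; the RESULT value must be replaced with the scsi_id output from your own system.
# Partition the shared disk on rac01 (inside fdisk: n, p, 1, Enter, Enter, w)
fdisk /dev/sdb
# Capture the disk's unique ID for the Udev rule
/usr/lib/udev/scsi_id -g -u -d /dev/sdb
# /etc/udev/rules.d/99-oracle-asmdevices.rules -- one line per shared disk
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VBxxxxxxxx-xxxxxxxx", SYMLINK+="oracleasm/asm-grid", OWNER="grid", GROUP="asmadmin", MODE="0660"
# Reload and trigger Udev, then confirm ownership
udevadm control --reload-rules
udevadm trigger --type=devices --action=change
ls -rlt /dev/sd?1 /dev/oracleasm/*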
GI Installation
From rac01 perform the GI Installation.
Database
- On rac01 run the Database Installation.
- Create a BASH profile for the oracle user (an example profile follows this list).
- Create the Database.
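An example ~/.bash_profile addition for the oracle user, a sketch only: the database home path is an assumption, and the SID shown (DB1_1) follows the instance naming seen later in this document. In a policy-managed database the instance running on a given node can differ, so set ORACLE_SID to whatever instance actually runs locally.
# Append to /home/oracle/.bash_profile
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/12.1.0.2/dbhome_1
export ORACLE_SID=DB1_1
export PATH=$ORACLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/lib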
Check Status
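A few typical checks once GI and the database are up (run as the grid user; db1 is the database name used elsewhere in this document):
crsctl check cluster -all
crsctl stat res -t
srvctl status asm
srvctl status database -d db1
srvctl config scan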
APPENDIX
Set Hostname (rac02 example)
- hostnamectl set-hostname rac02
- vi /etc/sysconfig/network
HOSTNAME=rac02
Configure DNS
Set the IP to that of the current system.
- chattr -i /etc/resolv.conf
- vi /etc/resolv.conf
Example entry line: nameserver 192.168.56.71
- chattr +i /etc/resolv.conf
Configure dnsmasq
Set the IP to that of the current system.
- cp /etc/dnsmasq.conf /etc/dnsmasq.conf.orig
- vi /etc/dnsmasq.conf
expand-hosts
local=/localdomain/
listen-address=127.0.0.1
listen-address=192.168.56.71
bind-interfaces
Make dnsmasq Active
- systemctl enable dnsmasq.service
- service dnsmasq restart
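To confirm dnsmasq is actually answering, a quick check against the local resolver (names match the /etc/hosts entries shown below):
nslookup rac-scan.localdomain 127.0.0.1
nslookup rac01-vip.localdomain
ping -c 2 rac01.localdomain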
Cluster Verification Script
/u01/orasw/grid/runcluvfy.sh stage -pre crsinst -n rac01,rac02 -fixup -verbose
/etc/hosts
#public:
192.168.56.71 rac01.localdomain rac01
192.168.56.72 rac02.localdomain rac02
#private:
192.168.10.1 rac01-priv.localdomain rac01-priv
192.168.10.2 rac02-priv.localdomain rac02-priv
#virtual:
192.168.56.81 rac01-vip.localdomain rac01-vip
192.168.56.82 rac02-vip.localdomain rac02-vip
#scan:
192.168.56.91 rac-scan.localdomain rac-scan
192.168.56.92 rac-scan.localdomain rac-scan
192.168.56.93 rac-scan.localdomain rac-scan
Restart Networking:
- /etc/init.d/network stop
- /etc/init.d/network start
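On RHEL/CentOS/Oracle Linux 7 the systemd equivalent also works:
systemctl restart network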
Check If All Services Started OK
- systemctl
Add a Node (rac03)
Configure Base System
- Clone or create a VM with the following specs:
- Linux OS prep changes made.
- [x] Reinitialize the MAC address of all network cards.
- Add new IPs to /etc/hosts
- Entries for: public, private and virtual (vip).
- Ensure other nodes have updated /etc/hosts.
- Start VM for rac03 and set the hostname to rac03.
- Change the public IP (192.168.56.73) and private IP (192.168.10.3) (using the Linux GUI is fine).
- Configure DNS (and/or dnsmasq if used).
- Restart OS and test networking.
- shutdown -r now
- Test via nslookup, ping and ntp (ntpq -p).
Configure Storage
- Shutdown all nodes.
- In VirtualBox attach the shared disks to rac03.
- Start rac01 then rac02. Ensure the cluster and database instances are all working OK.
- Start rac03 and configure Udev on rac03 (see the sketch after this list).
- Ensure disks can be seen with the correct privs from rac03: ls -rlt /dev/sd?1
- Restart rac03.
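A sketch of the rac03 storage refresh, assuming the same /dev/sdb device name and the same Udev rule file used on rac01 and rac02:
# Copy the rule file from rac01, re-read the partition table, re-trigger Udev
scp rac01:/etc/udev/rules.d/99-oracle-asmdevices.rules /etc/udev/rules.d/
partprobe /dev/sdb
udevadm control --reload-rules
udevadm trigger --type=devices --action=change
ls -rlt /dev/sd?1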
Run Cluster Add Node Tool
- Ensure the new node does not already have a /u01/app/oraInventory directory.
- Run the below from rac01 as the grid user.
- Run addnode.sh
- cd /u01/app/12.1.0.2/grid/addnode
- ./addnode.sh "CLUSTER_NEW_NODES={rac03}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={rac03-vip.localdomain}"
- GUI Cluster Add Node app will launch...
- Cluster Add Node Information
Notice the [x] Reuse private and public keys option should be selected.
- Confirm the displayed values look OK for the node you are adding.
- Select SSH connectivity.
- Enter the grid user OS password, then select Setup.
- Prerequisite Checks
Ensure everything looks OK. Correct anything listed that could be an issue.
- Summary
Select Install.
- Install Product
- Process runs that copies files, updates Cluster Inventory and installs GI on new node...
- Run root scripts when prompted.
- You can monitor details via: tail -f /u01/app/grid/diag/crs/rac03/crs/trace/alert.log
- Finish
Select Close.
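A couple of sanity checks (run as the grid user from rac01) that the new node joined the cluster:
olsnodes -s -t
crsctl stat res -t
cluvfy stage -post nodeadd -n rac03 -verbose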
Post GI Steps
Add Database Instance
Run as the oracle user.
- On rac01 run the Database Installation.
- On the new node create a BASH profile for the oracle user.
- Run dbca to Add Instance (a silent-mode sketch follows this list).
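If you prefer the command line to the dbca GUI, a silent-mode sketch for an admin-managed database; the instance name DB1_3 and the sys password are placeholders (for a policy-managed database, increase the server pool cardinality instead, as in the Change Cardinality section below).
dbca -silent -addInstance -nodeList rac03 -gdbName db1 -instanceName DB1_3 -sysDBAUserName sys -sysDBAPassword change_me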
Delete a Node
- Run dbca to Delete Instance.
- Delete Cluster Entries
- Ensure the node to delete is Unpinned. Run from rac01:
olsnodes -s -t
rac01 Active Unpinned
rac02 Active Unpinned
rac03 Active Unpinned
If the node is pinned, then run the crsctl unpin css commands.
- Actions from rac03 as the grid user:
-- Set GRID_HOME
export GRID_HOME=/u01/app/12.1.0.2/grid
export PATH=$GRID_HOME/bin:$PATH
cd $GRID_HOME/bin
-- updateNodeList
cd $GRID_HOME/oui/bin
./runInstaller -updateNodeList ORACLE_HOME=$GRID_HOME "CLUSTER_NODES={rac03}" CRS=TRUE -silent -local
-- detachHome
./runInstaller -detachHome ORACLE_HOME=$GRID_HOME -silent -local
Manually delete any configuration files, as prompted by the installation utility.
-- deinstall
cd /tmp
$GRID_HOME/deinstall/deinstall -local
Caution: If you do not specify the -local flag, the command removes the Oracle Grid Infrastructure home from every node in the cluster.
- On all other nodes (rac01, rac02) as the grid user:
Format: ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={remaining_nodes_list}" CRS=TRUE -silent
olsnodes
export GRID_HOME=/u01/app/12.1.0.2/grid
export PATH=$GRID_HOME/bin:$PATH
cd $GRID_HOME/oui/bin
-- updateNodeList
./runInstaller -updateNodeList ORACLE_HOME=$GRID_HOME "CLUSTER_NODES={rac01,rac02}" CRS=TRUE -silent
-- updateNodeList again
You must run this command a second time from the Oracle RAC home, where ORACLE_HOME is the RAC database home and CRS=TRUE -silent is omitted from the syntax, as follows:
Format: ./runInstaller -updateNodeList ORACLE_HOME=ORACLE_HOME "CLUSTER_NODES={remaining_nodes_list}"
./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={rac01,rac02}"
-- Delete Node
As root:
export GRID_HOME=/u01/app/12.1.0.2/grid
export PATH=$GRID_HOME/bin:$PATH
Format: crsctl delete node -n node_to_be_deleted
crsctl delete node -n rac03
If you get CRS-4658: The clusterware stack on node rac03 is not completely down. Try powering down the node to delete.
If CRS-4660: Could not find node rac03 to delete. CRS-4000: Command Delete failed, or completed with errors.
Check if the node is still visible: ./cluvfy stage -post nodedel -n rac03 -verbose
If so:
srvctl stop vip -node rac03
srvctl remove vip -node rac03
- Verify Cluster Nodes
olsnodes
- Post Removal Tasks
-- Remove rac03 entries from /etc/hosts.
-- Update DNS (bounce dnsmasq if used).
Notes
- I noticed after removing a node that the database instances came up slowly, but OK (no errors in the alert log), the very first time I started them afterwards. I think ASM might have been busy... TBD.
Oracle's docs on deleting a node.
Under Development (notes)
Change Cardinality
If your RAC only starts/runs one node, the reason may be the cardinality given in the DBCA. To change the cardinality, run the below. This will output the server pools configured (db_pool):
srvctl config srvpool
Next, set all nodes to use the server pool (db_pool):
srvctl modify srvpool -g db_pool -u 2
This can take several minutes to run.
To test, run the below to see which instance runs on which node:
srvctl status database -d db1
10:25 AM 11/1/2016
[grid@rac01 ~]$ srvctl status database -d db1
Instance DB1_2 is running on node rac01
Instance DB1_1 is not running on node rac02
10:30 AM 11/1/2016
[grid@rac01 ~]$ srvctl status database -d db1
Instance DB1_1 is running on node rac02
Instance DB1_2 is running on node rac01
Common Production Values
Linux OS Additional Disk Space (mount points, i.e. on each Linux system/VM):
• /u01 (app, orasw…): 50 GB
• /u02 (rman, exports…): 1 TB
Oracle ASM Disk Groups (1 disk for each) [Shared Disks]:
• GRID: 25 GB
• FRA: 25 GB
• DATA: 512 GB
We're using a real SAN here folks, with Enterprise performance, i.e. not my first SAN that needed dozens of itty-bitty, non-scalable disk partitions.
SCAN IP
The IP addresses must be on the same subnet as your default public network in the cluster. The name must be 15 characters or less in length, not including the domain, and it must be resolvable without the domain suffix (for example, "sales1-scan" must be resolvable, as opposed to "sales1-scan.example.com"). The IPs must not be assigned to a network interface, since Oracle Clusterware will take care of it.
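With the dnsmasq setup used here, you can confirm the SCAN name resolves to all three addresses before the GI installation:
nslookup rac-scan.localdomain
# Expect 192.168.56.91, 192.168.56.92 and 192.168.56.93 to be returned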