Disaster Recovery in the cloud, Part 2
In Part 1 we talked about some general requirements for setting up a simple-to-use disaster recovery solution for Oracle databases.
Today, I built the same setup inside the Amazon cloud, spreading it across different data centers to get a robust DR solution.
I used Amazon’s EC2 (Elastic Compute Cloud) for that. If you are new to Amazon EC2, have a look at this earlier series of blog posts covering EC2 basics.
Specifically, I used two different availability zones inside Europe, called eu-west-1a and eu-west-1b. Think of these as two independent data centers run by Amazon, both located in Europe. The same scenario would work with e.g. eu-west-1a and us-east-1a, another Amazon data center located somewhere in Virginia, USA. Or we could even use different cloud providers to guard against a major outage at a single provider.
While I could have set up my own Oracle Home installation, for ease of use I preferred one of Oracle’s pre-built Amazon Machine Images. A list of these is available at http://aws.amazon.com/amis/Oracle, and I used the one with AMI ID ami-8d97bcf9, containing an Oracle Database 11g Release 2 (188.8.131.52) Standard Edition – 64 Bit on Oracle Linux 5.4.
Setup is as follows:
- Created an EC2 security group (firewall ruleset) opening up ports 22 (SSH), 1521 (Oracle listener) and 8081 (HTTP for Dbvisit’s Web Console).
- Fired up two instances of AMI ID ami-8d97bcf9: one called PrimaryDB in availability zone eu-west-1a, one called StandbyDB in availability zone eu-west-1b.
- Reserved two Elastic IP addresses and assigned one of them to each instance.
- Logged in as root to the primary instance (the only login possible initially). I just followed the wizard to build up a database. Alternatively, it’s possible to just cancel the wizard and use Oracle’s DBCA to build a database.
- Logged in as root to the standby instance, followed the wizard, but answered “N” when it asked “Would you like to create a database now?” The standby database will be built later as a copy of the primary.
- Adjusted the listener.ora file on both instances to listen on the Public DNS name of the Elastic IP address, e.g. ec2-176-34-178-144.eu-west-1.compute.amazonaws.com for the Elastic IP address 176.34.178.144.
- Downloaded and installed the Dbvisit Standby software for Red Hat Linux on both instances, under /u01/app/dbvisit.
- Established SSH public/private keys so that the oracle OS user can connect from each server to the other without any interaction.
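The listener step above can be sketched in shell. This is only an illustration, not the exact file I used: the function names eip_to_public_dns and print_listener_ora are made up for this post, and the naming pattern ec2-A-B-C-D.<region>.compute.amazonaws.com is assumed to hold for the region in question (us-east-1 uses a different suffix, which is not handled here).

```shell
#!/bin/sh
# Derive the Public DNS name that EC2 assigns to a public or Elastic IP.
# Assumption: regional pattern ec2-A-B-C-D.<region>.compute.amazonaws.com.
eip_to_public_dns() {
    ip="$1"
    region="$2"
    # Replace the dots of the IP address with dashes and add the suffix.
    printf 'ec2-%s.%s.compute.amazonaws.com\n' \
        "$(printf '%s' "$ip" | tr '.' '-')" "$region"
}

# Print a minimal listener.ora stanza that listens on the given host name.
# The listener name LISTENER and port 1521 are the usual defaults.
print_listener_ora() {
    host="$1"
    cat <<EOF
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = $host)(PORT = 1521))
    )
  )
EOF
}
```

Calling print_listener_ora "$(eip_to_public_dns 176.34.178.144 eu-west-1)" prints a stanza for the example host name above. The output would still need to be merged by hand with whatever else the existing listener.ora contains.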
Then I browsed to Dbvisit’s Web console on the primary server, logged in and started the creation process for a standby database. This is basically four steps:
- Set up and configure the standby environment: The wizard asks for everything around the planned standby environment. After answering all the questions, nothing happens on the database yet, but a Dbvisit database configuration file (DDC file) is built.
- A few manual modifications in this DDC file were needed because of some peculiarities of the EC2 cloud:
As the host name of the EC2 instance is not fixed, I instructed Dbvisit to use the Public DNS name instead of the regular host name. This is done by setting:
HOSTNAME_CMD = /u01/app/dbvisit/return_eip_hostname.sh
with return_eip_hostname.sh being a very small shell script containing:
echo <Public DNS name of this server>
As this Public DNS Name is only valid with all its components (Fully Qualified Domain Name), we need to set one more parameter:
USE_LONG_SERVER_NAME = Yes
- Then I created the standby database: This is one of the reasons I really like Dbvisit. It’s just a click on a button, and it builds the standby database!
- Schedule the transfer and apply jobs, e.g. let it transfer and apply archived logs every 5 minutes.
At the time of writing I wasn’t able to get the scheduling to work in EC2, as the Web GUI got confused by the non-fixed host names.
I reported this issue to Dbvisit and already got feedback that they are looking into it and will come back with a solution. As soon as this works, I will update this post!
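Coming back to the HOSTNAME_CMD script from the DDC modifications above: instead of hard-coding the name, the script could ask the EC2 instance metadata service for it. The following is a rough sketch, not what I actually deployed. The override variable EIP_HOSTNAME is hypothetical, and the metadata call assumes the instance still answers plain requests (instances that enforce the token-based IMDSv2 would need an extra step not shown here).

```shell
#!/bin/sh
# Sketch of a return_eip_hostname.sh variant: print this server's
# Public DNS name.
# EIP_HOSTNAME (hypothetical override variable) wins if it is set;
# otherwise the EC2 instance metadata service is queried, which is
# only reachable from inside an EC2 instance.
return_eip_hostname() {
    if [ -n "${EIP_HOSTNAME:-}" ]; then
        echo "$EIP_HOSTNAME"
        return 0
    fi
    # --fail: non-zero exit status on HTTP errors
    # --max-time: do not hang when run outside of EC2
    curl --silent --fail --max-time 2 \
        http://169.254.169.254/latest/meta-data/public-hostname
}
```

Adding a final line that calls return_eip_hostname would turn this into a script that HOSTNAME_CMD can point to. The plain echo version shown above stays the simpler choice as long as the Elastic IP never changes.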
After setting up the service and the startup trigger as described in Part 1, let’s try to connect using this TNS entry, which contains the Public DNS names of both instances:
CLOUDDB =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE=OFF)
      (FAILOVER=ON)
      (ADDRESS=(PROTOCOL=TCP)(HOST=ec2-xxx.com)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=ec2-yyy.com)(PORT=1521))
    )
    (CONNECT_DATA=
      (SERVICE_NAME=MYSERVICE)
    )
  )
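If you rebuild the instances often, typing such an entry by hand gets tedious. As a small sketch, a shell function can generate it from the two host names. The function name make_tns_entry and its argument order are invented for this post, and port 1521 is simply taken from the entry above.

```shell
#!/bin/sh
# Print a tnsnames.ora entry with connect-time failover across two hosts.
# Usage: make_tns_entry ALIAS SERVICE PRIMARY_HOST STANDBY_HOST
make_tns_entry() {
    alias_name="$1"
    service="$2"
    primary="$3"
    standby="$4"
    # With LOAD_BALANCE=OFF the addresses are tried in the order listed,
    # so the primary host comes first.
    cat <<EOF
$alias_name =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE=OFF)
      (FAILOVER=ON)
      (ADDRESS=(PROTOCOL=TCP)(HOST=$primary)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=$standby)(PORT=1521))
    )
    (CONNECT_DATA=
      (SERVICE_NAME=$service)
    )
  )
EOF
}
```

For example, make_tns_entry CLOUDDB MYSERVICE ec2-xxx.com ec2-yyy.com prints an entry equivalent to the one above, ready to be appended to tnsnames.ora. Oracle's tnsping utility (tnsping CLOUDDB) should then tell you whether the alias resolves and a listener answers.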
Have fun trying it out!