Scaling Oracle Databases on Amazon RDS
Easy and short-term scalability is one of the main arguments for cloud-based solutions: if you can’t easily scale up and down, it’s not a cloud.
RDS (Relational Database Service) is Amazon Web Services’ DBaaS (Database as a Service) offering, currently available for MySQL and Oracle databases.
I blogged about RDS in the past here and here.
Now, how does scaling work for an RDS based Oracle database?
Let’s start with scaling vertically: CPU and memory capacity are determined by the so-called “DB Instance Class”. E.g. a Large instance (instance class db.m1.large) provides 2 cores with approx. 4.5 GHz in total and 7.5 GB of memory. Here is an overview of instance classes available for RDS (ECU is defined by Amazon as the equivalent CPU capacity of a 1.0–1.2 GHz 2007 Opteron or 2007 Xeon processor):
| |Small|Large|High-Memory Extra Large|High-Memory Double Extra Large|High-Memory Quadruple Extra Large|
|---|---|---|---|---|---|
|Memory|1.7 GB|7.5 GB|17.1 GB|34 GB|68 GB|
|CPU|1 ECU|4 ECU|6.5 ECU|13 ECU|26 ECU|
|Platform|Linux 64-bit|Linux 64-bit|Linux 64-bit|Linux 64-bit|Linux 64-bit|
An official and up-to-date list of available instance classes can be found here.
If you look at your RDS instance in the AWS Management Console, it shows the current instance class as well as a “Modify” button at the top:
After clicking the Modify button it looks like this:
There are actually three ways available here for scaling up and down:
- Changing the “Allocated Storage” allows us to provide more storage to the database. This is a scale-up only: currently there seems to be no way to scale down, i.e. to free up storage from deleted rows, deleted tables etc.
- Changing the “DB Parameter Group”. This is not so obvious, but a DB Parameter Group is just a set of init.ora parameters, including db_cache_size, shared_pool_size, pga_aggregate_target or – if you like automatic memory management – 10g’s sga_target and 11g’s memory_target. So you could effectively resize the database memory by changing the DB Parameter Group. In practice, however, there is not much reason to do so: your DB instance is the only thing running on the underlying EC2 instance, and Amazon presets the memory-related init.ora parameters to defaults depending on the instance class, using nearly all available memory anyway.
- Changing the DB Instance Class. This changes the available CPU and memory resources according to the table above. BUT: be aware that this change requires a short downtime. Therefore we have to pick one of two alternatives: applying the change later or applying it immediately. Later means during the next maintenance window, which is also displayed here and could be up to one week away. Applying the change immediately requires ticking the checkbox “Apply Immediately”; the instance then reboots right away, taking a couple of minutes until our database is available again with its new memory and CPU resources.
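As a sketch of the DB Parameter Group route, a memory parameter could be changed through a custom group with the RDS command-line tools. All names and values below are hypothetical examples, and the exact flag names (in particular for creating the group) vary between CLI versions, so check the reference for your installation:

```shell
# Hypothetical example: create a custom parameter group for an
# Oracle SE1 11.2 instance (group name and family are illustrative)
rds-create-db-parameter-group mygroup \
    --db-parameter-group-family oracle-se1-11.2 \
    --description "custom memory settings"

# Lower memory_target in the group; "pending-reboot" means the new
# value takes effect after the next instance restart
rds-modify-db-parameter-group mygroup \
    --parameters "name=memory_target, value=4000000000, method=pending-reboot"

# Attach the group to the instance
rds-modify-db-instance myrds1 --db-parameter-group-name mygroup
```

As argued above, though, the pre-set defaults already use nearly all available memory, so this is rarely worth the effort.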
I’m not quite sure why Amazon always needs a reboot here. It’s probably due to restrictions in the underlying EC2 infrastructure, which is based on Xen virtual machines. In the VMware world it’s definitely possible to scale up CPU and memory resources online, at least for newer versions of Windows and Linux. See here for a good description of how to do that with Linux VMs.
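For comparison, this is roughly what online CPU scale-up looks like on a Linux VM whose hypervisor supports hot-add: the new vCPU appears under sysfs and can be brought online without a reboot. A sketch only – the CPU number is hypothetical, and the sysfs write needs root, so it is shown commented out:

```shell
# Count the CPUs Linux currently has online
grep -c ^processor /proc/cpuinfo

# After the hypervisor hot-adds a vCPU, it appears offline under sysfs;
# bringing it online requires root (cpu2 is a hypothetical example):
# echo 1 > /sys/devices/system/cpu/cpu2/online

# List the CPU entries the kernel knows about
ls /sys/devices/system/cpu/ | grep -c '^cpu[0-9]'
```

No equivalent is exposed for RDS instances – hence the reboot.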
I would definitely consider it an enhancement request to Amazon RDS to enable true online scalability for databases.
Also, there is no way to scale I/O capacity which is one of the most important resources for databases. That would be another enhancement request from my point of view.
Scaling via command-line tools
All actions described above can also be scripted and issued from the command line. For this we first have to download the RDS command-line tools, which are available in their latest version here. And here you will find a good starting point for setting up the tools.
Once installed, just give it a try:
$ # Scale-up to High-Memory Extra Large Instance during next maintenance interval
$ rds-modify-db-instance myrds1 --db-instance-class db.m2.xlarge
$ # Scale-up to High-Memory Extra Large Instance immediately (reboot)
$ rds-modify-db-instance myrds1 --db-instance-class db.m2.xlarge --apply-immediately
$ # Immediately scale-down again
$ rds-modify-db-instance myrds1 --db-instance-class db.m1.large --apply-immediately
$ # Change storage to 150GB immediately
$ rds-modify-db-instance myrds1 --allocated-storage=150 --apply-immediately
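Since an immediate change reboots the instance, a script will usually want to wait until the database is reachable again. A minimal sketch, assuming the RDS CLI tools are installed; the output format of rds-describe-db-instances differs between CLI versions, so the grep pattern may need adjusting:

```shell
# Command used to query the instance status; kept in a variable so it
# can be stubbed out for testing
DESCRIBE_CMD=${DESCRIBE_CMD:-rds-describe-db-instances}

wait_for_available() {
  # $1 = DB instance identifier, e.g. myrds1
  # Poll until the status output contains "available" again
  until $DESCRIBE_CMD "$1" | grep -q "available"; do
    sleep 30
  done
  echo "$1 is available"
}
```

Calling `wait_for_available myrds1` right after an `--apply-immediately` modification blocks until the reboot has finished.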
When it comes to scaling horizontally, we are talking about starting up additional or removing unneeded instances. This is a well-known technique for web servers and application servers where load balancers can handle the addition and removal of instances and make it transparent to the clients.
For databases, it’s a bit more challenging because databases are full of state. Just starting up a couple of database instances doesn’t provide any benefit as long as they are independent of each other.
One solution to this would of course be Oracle RAC. Currently, RAC is not available on Amazon Web Services – neither on EC2 nor on RDS – as there is no way to share storage between instances. Even if it were available, it would provide scalability of CPU and memory resources only: there is no I/O scalability with RAC, since there is always one single database with its storage underneath the different RAC instances.
Another way to scale CPU, memory AND I/O resources is already available for MySQL on RDS and may become available for Oracle on RDS as well. This is called “Asynchronous Read Replicas”. With these, there is still one single instance for read-write access, which is replicated to a number of read-only replicas. These replicas can be used for database-level load balancing, at least for read-only access. Currently, when trying this with an Oracle instance, we see an error message like the one in the screenshot below.
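For MySQL on RDS, such a replica can already be created with the command-line tools. The instance identifiers below are hypothetical examples, and the flag names should be checked against your CLI version:

```shell
# Create an asynchronous read replica of the MySQL instance mymysql1
# (both identifiers are illustrative, not from the original post)
rds-create-db-instance-read-replica myreplica1 \
    --source-db-instance-identifier mymysql1 \
    --db-instance-class db.m1.large
```

Running the same against an Oracle instance is what currently produces the error shown below.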
Nevertheless, there are hints that Oracle read replicas may be available in the future, possibly using Oracle Active Data Guard in the background.
Time will tell…