Monitoring Oracle databases running on Amazon RDS
An Oracle RDS instance differs from an Amazon EC2 instance running Oracle Database in several respects, including:
- RDS allows SQL*Net access only. While it is probably backed by an EC2 instance internally, that backend is not visible; in particular, there is no SSH access. This means no agents or other server-side installations.
- An RDS instance has restricted access: logging in as SYSTEM or SYS is not possible. One login with the DBA role is provided, but even this role is a reduced version of the standard DBA role; for example, it lacks ALTER SYSTEM and a couple of other privileges.
- A couple of predefined licensing options are available, currently Standard Edition One, Standard Edition, and Enterprise Edition with the Partitioning, Oracle Label Security, OLAP, Data Mining, Oracle Database Vault, and Real Application Testing options.
So RDS is more of a challenge than EC2 for all kinds of tools, especially monitoring software. At the time of writing, no Oracle Enterprise Manager functionality is offered for RDS. Quest Spotlight and Foglight are not an option, as they require SYS access (at least for installation). Quest Foglight Performance Analysis requires SYS access and a server-side installation of a collector agent.
Load testing is less of an issue. As stated above, RAT (Real Application Testing) is available. Quest Benchmark Factory works as well, providing, among other things, trace file replay and various TPC tests.
Amazon CloudWatch
The good news is that Amazon offers its own interface for some important metrics, called CloudWatch. For RDS-based Oracle instances, this is already included in the price, delivering eleven different metrics at one-minute granularity:
- CPUUtilization: Percentage of CPU utilization on the underlying instance
- DatabaseConnections: Number of database connections in use
- FreeableMemory: Available RAM (bytes) on the underlying instance
- FreeStorageSpace: Available storage space (bytes)
- SwapUsage: Amount of swap space (bytes) used on the underlying instance
- ReadIOPS: Average number of disk read I/O operations per second
- WriteIOPS: Average number of disk write I/O operations per second
- ReadLatency: Average time taken per disk read I/O operation
- WriteLatency: Average time taken per disk write I/O operation
- ReadThroughput: Average number of bytes read from disk per second
- WriteThroughput: Average number of bytes written to disk per second
The easiest way to access CloudWatch metrics is the AWS Management Console: after selecting a running RDS instance, everything is on its Monitoring tab.
As soon as you have more than one RDS instance, want to monitor other parts of the Amazon infrastructure as well, or want to define alarms, the “Amazon CloudWatch” tab at the top of the console is your friend.
Alarms are threshold-based and send out emails when a threshold is breached for a specified amount of time.
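Alarms can also be created from a script with the same command line tools via mon-put-metric-alarm. The sketch below only echoes the command rather than running it; the alarm name, threshold, and SNS topic ARN are made-up examples, and the flags mirror the PutMetricAlarm API, so verify them against the help output of your CLI version.

```shell
# Sketch: alarm when average CPUUtilization of RDS instance "myrds1"
# stays above 80% for two consecutive 5-minute periods.
# Alarm name, threshold, and SNS topic ARN below are illustrative only.
cmd="mon-put-metric-alarm HighCPU-myrds1 \
  --metric-name CPUUtilization \
  --namespace AWS/RDS \
  --dimensions DBInstanceIdentifier=myrds1 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --comparison-operator GreaterThanThreshold \
  --threshold 80 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:dba-alerts"
echo "$cmd"   # drop the echo to actually create the alarm
```

The --alarm-actions topic is where the notification emails come from: subscribe the DBA mailing list to that SNS topic and the alarm state changes land in the inbox.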
An alternative is to pull metrics out of CloudWatch and integrate them into your own monitoring solution. There are a couple of APIs for that, including command line tools that can be downloaded at http://ec2-downloads.s3.amazonaws.com/CloudWatch-2010-08-01.zip
Also have a look at this link, which is a good starting point for the CloudWatch command line tools as well as for the other CloudWatch APIs: http://docs.amazonwebservices.com/AmazonCloudWatch/latest/DeveloperGuide/choosing_your_cloudwatch_interface.html#UsingTheCLI
I installed the command line tools on a small Linux VM. Getting a list of the available RDS metrics is as easy as this:
$ mon-list-metrics --namespace "AWS/RDS"
CPUUtilization       AWS/RDS  {DBInstanceIdentifier=myrds1}
DatabaseConnections  AWS/RDS  {DBInstanceIdentifier=myrds1}
FreeableMemory       AWS/RDS  {DBInstanceIdentifier=myrds1}
...
Getting the average, minimum, and maximum CPU utilization for RDS instance “myrds1” between 11:00am and 11:20am in five-minute intervals is done like this:
$ mon-get-stats CPUUtilization --namespace "AWS/RDS" --statistics "Average,Minimum,Maximum" --dimensions "DBInstanceIdentifier=myrds1" --start-time 2011-09-13T11:00:00 --end-time 2011-09-13T11:20:00 --period 300
2011-09-13 11:00:00  1.196  0.94  1.29  Percent
2011-09-13 11:05:00  1.266  1.17  1.41  Percent
2011-09-13 11:10:00  1.08   0.7   1.53  Percent
2011-09-13 11:15:00  0.844  0.7   1.06  Percent
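Before feeding numbers like these into another tool, a little parsing is needed. Here is a minimal sketch that averages the “Average” column over the returned intervals; it runs against captured sample output, since a live mon-get-stats call needs the CloudWatch tools and AWS credentials configured.

```shell
# Average the "Average" statistic (field 3) of mon-get-stats output.
# The sample lines below are captured output; swap in a real call once
# the CloudWatch command line tools and credentials are set up.
sample='2011-09-13 11:00:00  1.196  0.94  1.29  Percent
2011-09-13 11:05:00  1.266  1.17  1.41  Percent
2011-09-13 11:10:00  1.08   0.7   1.53  Percent
2011-09-13 11:15:00  0.844  0.7   1.06  Percent'

avg=$(echo "$sample" | awk '{sum += $3; n++} END {printf "%.4f", sum/n}')
echo "average CPU over the period: $avg percent"
```

The same awk pattern works for any of the eleven metrics, since mon-get-stats always puts the requested statistics in fixed columns.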
Wrap some awk or sed handling around this, put it into a scheduler of your choice, and you are ready to integrate with your own monitoring. I did this for Quest Foglight, which allows defining custom dashboards.
I want to mention two challenges I faced:
- Each call of mon-get-stats has to be parameterized with a concrete statistic (e.g. CPUUtilization) and a dimension (e.g. DBInstanceIdentifier=myrds1). It is therefore not possible to read out all metrics with one single call. Instead, one call has to be issued per RDS instance and per statistic. I implemented this with a loop that essentially runs through the output of the first sample statement above (the mon-list-metrics call).
- Each mon-get-stats call takes approximately 10 to 20 seconds, so running them synchronously in the foreground is not a good idea. I got around this by running each mon-get-stats command inside the loop in the background and having it write to an output file. While the calls are running, a loop checks for running jobs and only moves on after all background calls have finished.
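That fan-out-and-wait pattern can be sketched in isolation like this, with sleep standing in for a slow mon-get-stats call and results.txt as a placeholder output file:

```shell
# Fan out slow calls in the background, then block until all are done.
# "sleep 1" stands in for a 10-20 second mon-get-stats call.
: > results.txt
for metric in CPUUtilization ReadIOPS WriteIOPS
do
  ( sleep 1; echo "$metric fetched" >> results.txt ) &
done

# Poll until no background jobs remain.
while [ "$(jobs -r | wc -l)" -ne 0 ]; do
  sleep 1
done

echo "$(wc -l < results.txt) results collected"
```

Bash's built-in wait would be a simpler alternative here, as it blocks until all child processes have exited; the polling loop has the advantage that you could add a timeout or progress output between checks.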
See below for my sample code (it outputs data in the format Quest Foglight expects, but should be easy to adjust for other output formats):
#!/bin/bash
rdsinstancelist=`rds-describe-db-instances | awk '/DBINSTANCE/ {print $2}'`
starttime=`date -u --date='5 minutes ago' +%Y-%m-%dT%H:%M:%SZ`
endtime=`date -u +%Y-%m-%dT%H:%M:%SZ`

for instancename in $rdsinstancelist
do
  echo instancename.String.id=$instancename > results_$instancename.txt
  for metric in CPUUtilization DatabaseConnections FreeableMemory FreeStorageSpace ReadIOPS ReadLatency ReadThroughput SwapUsage WriteIOPS WriteLatency WriteThroughput
  do
    dimension="DBInstanceIdentifier=$instancename"
    echo $metric `$AWS_CLOUDWATCH_HOME/bin/mon-get-stats $metric --namespace "AWS/RDS" --statistics "Average" --dimensions "$dimension" --start-time $starttime --end-time $endtime -p 300` | awk '{print $1":"$5"="$4}' >> results_$instancename.txt &
  done
done

# wait until all background mon-get-stats calls have finished
numjobs=`jobs -r | wc -l`
while [ $numjobs -ne 0 ]; do
  sleep 5
  numjobs=`jobs -r | wc -l`
done

echo TABLE rdsperfdata
echo START_SAMPLE_PERIOD
for instancename in $rdsinstancelist
do
  while read oneline
  do
    # map CloudWatch unit names to the unit names Foglight expects
    oneline=${oneline//:Seconds=/:second=}
    oneline=${oneline//:Count=/:count=}
    oneline=${oneline//:Bytes=/:byte=}
    oneline=${oneline//:Percent=/:percent=}
    oneline=${oneline//:Bytes\/Second=/:byte\/second=}
    oneline=${oneline//:Count\/Second=/:count\/second=}
    echo $oneline
  done < results_$instancename.txt
  echo NEXT_SAMPLE
done
echo END_SAMPLE_PERIOD
echo END_TABLE
Have fun trying it out!