Calculating IOPS needed in AWS RDS

One of the most vicious and hard-to-detect causes of database performance deterioration is I/O. When the I/O of a database is lagging, multiple unpredictable issues occur.

Some of the most common are:

  1. An increased number of slow queries
  2. Write operations become inexplicably slow
  3. Because of the two issues above, queries start piling up and the database eventually comes to a halt

The immediate reaction of anyone troubleshooting a growing list of pending queries is to check the slow query log. If the slow log contains queries (it probably will), one starts investigating which of them caused the problem.

However, when machine I/O is the problem, it is likely that none of the queries is actually problematic.

This is the reason that I/O issues are very difficult to detect – infrastructure is the last thing to come to mind as the root of the problems.

Detecting I/O issues using AWS metrics

When using AWS RDS, one does not have traditional OS tools such as sysstat, iostat, dstat or sar. The only tool for understanding what is happening in RDS is CloudWatch metrics and the graphs they provide.

Read and Write IOPS metrics

The IOPS CloudWatch metrics provide great insight into how many IOPS occur in your database.
You can view them by opening CloudWatch, selecting RDS and then finding the ReadIOPS and WriteIOPS metrics for your database.

Once the graph shows up, select the 1 minute granularity and “average” from the dropdown.

By summing the ReadIOPS and WriteIOPS values you will see how many IOPS your operations consume.
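The summing step can be sketched in a few lines of Python. This is only an illustration: the function name `total_iops` and the sample values are hypothetical, and the numbers are assumed to be 1-minute averages read off the two CloudWatch graphs for the same timestamps.

```python
def total_iops(read_iops, write_iops):
    """Add ReadIOPS and WriteIOPS datapoint by datapoint."""
    if len(read_iops) != len(write_iops):
        raise ValueError("both series must cover the same timestamps")
    return [r + w for r, w in zip(read_iops, write_iops)]


# Hypothetical 1-minute averages for three consecutive minutes.
read_samples = [380.0, 410.0, 400.0]
write_samples = [520.0, 490.0, 500.0]

combined = total_iops(read_samples, write_samples)
print(combined)  # [900.0, 900.0, 900.0]
```

Each value in the result is the total number of I/O operations per second the disk served during that minute, reads and writes together.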

DiskQueueDepth Metric

The DiskQueueDepth metric provides the number of outstanding I/Os (read/write requests) waiting to access the disk. If this metric is frequently above 2, you should expect to face performance issues sooner or later.

With this metric you can immediately see how many requests are queued at your disk.

Do I have an I/O problem?

Using the above two graphs it is easy to identify if you are under-provisioned or over-provisioned in IOPS.

  1. If your DiskQueueDepth is consistently between 0 and 0.5, you are over-provisioned
  2. If your DiskQueueDepth is consistently above 2, you are under-provisioned
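The two rules above can be written as a small check. This is a rough sketch: the function `classify_provisioning` is hypothetical, and the `share` parameter (the fraction of samples that must match a rule) is my own assumption for what "consistently" means, not a threshold defined by AWS.

```python
def classify_provisioning(queue_depths, share=0.9):
    """Classify IOPS provisioning from DiskQueueDepth samples.

    `share` is an assumed definition of "consistently": the fraction of
    samples that must satisfy a rule before it is considered to hold.
    """
    if not queue_depths:
        raise ValueError("need at least one DiskQueueDepth sample")
    n = len(queue_depths)
    # Rule 1: queue depth between 0 and 0.5.
    low = sum(1 for d in queue_depths if 0 <= d <= 0.5) / n
    # Rule 2: queue depth above 2.
    high = sum(1 for d in queue_depths if d > 2) / n
    if high >= share:
        return "under-provisioned"
    if low >= share:
        return "over-provisioned"
    return "inconclusive"


print(classify_provisioning([0.1, 0.2, 0.0, 0.3]))  # over-provisioned
print(classify_provisioning([3.0, 4.5, 2.5, 6.0]))  # under-provisioned
```

Anything that falls between the two rules is reported as "inconclusive", since the post gives no guidance for that range.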

How many IOPS do I need and how do I acquire them?

To see how many IOPS are needed for steady performance, sum the values of the ReadIOPS and WriteIOPS metrics. Choose a decent time interval, such as a day that is typical from a performance point of view, and remove outliers. Compare the result with the IOPS you have provisioned.
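One possible way to do this calculation is a trimmed average, sketched below. The post does not say how outliers should be removed, so dropping a fixed fraction of the lowest and highest totals is an assumption, as are the function name `typical_iops` and the `trim` parameter.

```python
from statistics import mean


def typical_iops(read_iops, write_iops, trim=0.05):
    """Average total IOPS after dropping the most extreme samples.

    `trim` is the assumed fraction of samples discarded at EACH end of
    the sorted totals; it stands in for "remove outliers".
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    totals = sorted(r + w for r, w in zip(read_iops, write_iops))
    if not totals:
        raise ValueError("need at least one sample")
    k = int(len(totals) * trim)  # samples to drop per side
    return mean(totals[k:len(totals) - k])


# Hypothetical day: steady load plus one spike and one idle minute.
reads = [400] * 18 + [5000, 0]
writes = [500] * 18 + [6000, 0]

needed = typical_iops(reads, writes)
provisioned = 1000  # hypothetical value for this instance
print(needed, "needed vs", provisioned, "provisioned")
```

A percentile (for example the 95th) would be an equally reasonable choice if you prefer to size for busy periods rather than for the average.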

Once you have calculated how many IOPS you need, there are two ways to acquire them.

The first is to purchase Provisioned IOPS (PIOPS), which is more reliable but a lot more costly. The second is to use a gp2 disk for your RDS instance, which provides 3 IOPS per GB of storage.

Let's take an example.

Assuming that on a typical day we have an average of 400 ReadIOPS and 500 WriteIOPS, our disk is consuming 900 IOPS. It therefore makes sense to acquire approximately that amount of IOPS.

With the two approaches above, the options are:

  1. Purchase 900 PIOPS
  2. Use a 300 GB SSD (gp2) disk that comes with 3 IOPS per GB (300 GB * 3 IOPS per GB = 900 IOPS).
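The arithmetic behind the second option can be sketched as follows. The helper `gp2_size_gb_for` is hypothetical, and it only applies the 3 IOPS per GB ratio quoted in this post; any other limits of the storage type are not modelled.

```python
import math

GP2_IOPS_PER_GB = 3  # ratio quoted in the post


def gp2_size_gb_for(target_iops, iops_per_gb=GP2_IOPS_PER_GB):
    """Smallest disk size in GB whose baseline covers target_iops."""
    if target_iops <= 0:
        raise ValueError("target_iops must be positive")
    # Round up: a fractional GB cannot be provisioned.
    return math.ceil(target_iops / iops_per_gb)


print(gp2_size_gb_for(400 + 500))  # 300
print(gp2_size_gb_for(1000))       # 334
```

Rounding up matters when the target is not a multiple of 3: 1000 IOPS needs 334 GB, because 333 GB would only give 999 IOPS.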

PIOPS is considered more reliable; however, it is more costly.

I hope this guide provided a good understanding of how IOPS work in RDS.

For more information please also check http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Storage.html

6 Comments

  • anup

    Thanks. This explains not only AWS IOPS but the overall concept, if you have I/O metrics in place.

  • Nod

    What about the queue depth? How would you factor in the gp2 requirement if the queue depth is over 30 during your most intensive operations?

    • spyros

      Hi Gaurav,

      Can you explain your question a little more clearly? Are you asking what happens if the queue depth is 30 for a long period of time?

  • niraj vara

    A good explanation of RDS IOPS, really helpful.

  • Bharath

    *By summing up the ReadIOPS and WriteIOPS you will see how much IOPS your operations consume.*

    Can you explain this statement more?

  • Thomas Anderson

    This doesn’t explain how the read/write IOPS translate into provisioned IOPS.
    If you already have provisioned IOPS and you are maxing out, the read/write IOPS will total no more than the current provisioned IOPS.
    If you are provisioned and being throttled, give yourself more and then view these charts; if your total is less, then you can reduce.
