Monitoring  Underutilized Storage Resources on AWS

Monitoring Underutilized Storage Resources on AWS

When cloud professionals embark on a journey to fish out underutilized resources that may be driving costs up, they rarely pay attention to doing some cost optimization in the direction of storage resources and often focus solely on optimizing their compute resources. In this article, we will go through some tools and strategies you can leverage in monitoring your AWS storage resources. Before moving on to that main event, let's start by talking briefly about the different storage types available on AWS. If you are ready to roll, let's go!!

Storage Types

When it comes storage, AWS has a wide array of services you can choose from. You will agree with me that having these many options can add some level of confusion to your decision making process especially when you don't have an understanding of what the options are and which ones are suitable for what use case. To provide you with guidance for when you have to pick a storage service on AWS, let's talk about some of the storage types available.

On AWS, storage is primarily divided into three categories depending on the type of data you intend to store. These categories are: Block storage, Object storage and File storage. We will go over them one after the other, exploring examples of each as we go.

Block Storage

To put it simply, a block storage device is a type of storage device that stores data in fixed-size chunks called blocks. The size of each block is dependent on the amount of data the device can read or write in a single input/output (IO) request. So when you want to store data on a block storage device and the size of the data surpasses the size of a single block of data that the device can read or write in single I/O request, the data is broken down into equal-size chunks before it is stored on the underlying storage device. As it is always important to understand the Why behind actions, let me tell you the performance benefit of block storage devices handling data in the manner that they do.

When data is broken down into blocks, it allows for fast access and retrieval of the data. In addition to fast access, when data is on a block storage device and changes are made to the data, only the blocks affected by the change are re-written. All other blocks remain unchanged which helps to further enhance performance and speed. In AWS, the block storage options include Elastic Block Storage (EBS) volumes and Instance Store volumes. Check out this article I wrote to learn more about EBS and Instance Store Volumes.

Object Storage

With object storage, data is not broken down into fixed-sized chunks as is the case with block storage. In object storage, data (files) are stored as single objects no matter their size. This kind of storage is suitable for huge amounts of unstructured data. The object storage service of AWS is S3. With all data being stored as single objects, when some part of that object is updated, the entire object has to be rewritten. You can access data stored in S3 via HTTP, HTTPS or APIs through the AWS CLI or SDK. Some pros that come with using S3 are: it is highly available, tremendously durable, low cost and can scale infinitely not forgetting the fact that you can replicate your data in the same or across regions for disaster recovery purposes. Check out this article I wrote on S3 to learn more about object storage.

File Storage

File Storage is fundamentally an abstraction of block storage using a file system such as NFS (Network File System) and SMB (Server Message Block). With File storage, the hierarchy of files is maintained with the use of folders and subfolders. The main file storage services of AWS are Amazon EFS and Amazon FSx. File storage is the most commonly used storage type for network shared file systems.

Monitoring Underutilized Storage Resources

The opening sentence of this paragraph is a lamentation so to speak on how storage resources are seldom considered when organizations and individuals take cost optimization actions. Even though they often fail to do this, it is just as important to pick the right storage option for your use case and also provision them appropriately. You can right size your storage resources by monitoring, modifying and even deleting those that are underutilized. Let's examine some of the ways in which you can monitor your storage resources.

Amazon Cloudwatch

Cloudwatch provides out-of-box metrics for monitoring storage services such as S3, DynamoDB, EBS and more. For EBS volumes you can use a metric such as, VolumeIdleTime which specifies the number of seconds there are no read or write requests to the volume within a given time period. With the information that Cloudwatch provides through this metric, you can decide on the action you want to take to manage the underutilized volume. In addition to the metrics that CloudWatch ships with for EBS volumes for example, you can create custom metrics to do things like find under provisioned or over provisioned volumes.

For S3 buckets, you can use the BucketSizeByte CloudWatch metric which gives you the size of your bucket in bytes. This comes in handy if you have stray S3 buckets that aren’t holding much data. Using this metric, you can quickly find and clean up those buckets.

S3 Object Logs & S3 Analytics

With S3 you can use S3 object access logs as well. These will help you track requests that are made to your bucket. Using this, you can find buckets that aren’t accessed frequently, and then determine if you still need the data in that bucket, or if you can move it to a lower cost storage tier or delete it. This is a manual process of determining access patterns. You make use of S3 Analytics if you are interested in a service that provides an automated procedure.

S3 Analytics can help you determine when to transition data to a different storage class. Using the analytics provided by this service, you can then leverage S3 lifecycle configurations to move data to lower cost storage tiers or delete it, ultimately reducing your spend over time. You can also optionally use the S3 Intelligent-tiering class to analyze when to move your data and automate the movement of the data for you. This is best for data that has unpredictable storage patterns.

Compute Optimizer and Trusted Advisor

To monitor for situations such as under-provisioned or over-provisioned EBS volumes, you can also make use of Compute Optimizer and Trusted Advisor for an easier and more automated experience. Compute Optimizer will make throughput and IOPS recommendations for General Purpose SSD volumes and IOPs recommendations for Provisioned IOPs volumes. However, it will identify a list of optimal EBS volume configurations that provide cost savings and potentially better performance. With Trusted Advisor, you can identify a list of underutilized EBS volumes. Trusted Advisor also ingests data from Compute Optimizer to identify volumes that may be over-provisioned as well.

Conclusion

As a self appointed disciple preaching the gospel of optimizing AWS resources for better cost saving and performance, I hope you have taken a lesson or two from this article to implement in your resources monitoring and optimizing strategies. There are services such as CloudWatch, Trusted Advisor, Compute Optimizer, S3 Analytics and much more for you to add to your bag of tools. To make sure you don't overwhelm yourself, learn more about each service you intend to make use of, start small and then move up from there. Good luck in your cloud endeavors.