How to Tune Elastic Resource Utilisation in AWS

Elastic resource utilisation in AWS helps businesses save money and maintain performance by automatically adjusting computing resources based on demand. Here's what you need to know:

  • Cost Savings: AWS Graviton instances deliver up to 40% better price performance, while EC2 Spot Instances can provide discounts of up to 90% for fault-tolerant workloads. Reserved Instances and Savings Plans offer discounts of up to 72% for predictable workloads.
  • Key Services: AWS Auto Scaling, ECS Auto Scaling, Elastic Load Balancing (ELB), and DynamoDB Auto Scaling dynamically manage resources across applications, databases, and storage.
  • Metrics to Monitor: Keep an eye on CPU utilisation, memory usage, network performance, and disk I/O to make informed scaling decisions.
  • Scaling Methods: Use target tracking policies for simplicity, step scaling for precision, and predictive scaling to prepare for recurring traffic spikes.
  • Cost Management: Combine Spot Instances with Reserved Instances, use S3 lifecycle policies to manage storage costs, and leverage AWS Compute Optimiser to right-size resources.
  • Real-World Applications: From handling e-commerce traffic surges to optimising media processing, elastic scaling supports diverse workloads efficiently.

Quick Overview of Benefits

| Feature | Benefit |
| --- | --- |
| Graviton Instances | Up to 40% better price performance |
| Spot Instances | Up to 90% savings for flexible workloads |
| Auto Scaling | Dynamic resource adjustment |
| Predictive Scaling | Anticipates demand spikes |
| S3 Intelligent-Tiering | Automatic storage cost optimisation |

By monitoring performance, setting up appropriate scaling policies, and using the right pricing models, you can save up to 70% on cloud costs while ensuring smooth operations. AWS's elastic tools are key to balancing performance and cost-effectiveness.

AWS Elastic Resource Basics

Learn the essentials of AWS elastic scaling services to create a responsive and cost-efficient infrastructure. Below, we’ll dive into the main AWS services, key metrics, and real-world examples that showcase elastic scaling in action.

Main AWS Services for Elastic Scaling

AWS offers a variety of tools to help you scale resources automatically based on your application's needs. AWS Auto Scaling is a centralised service that adjusts computing power and storage across resources like EC2, Spot Fleet, ECS, DynamoDB, and Aurora through a unified console [2]. For example, EC2 Auto Scaling ensures your application always has the right number of EC2 instances. It launches new instances during traffic spikes and terminates them when demand drops [2].

If you're working with containerised applications, ECS Auto Scaling uses CloudWatch metrics, such as CPU and memory usage, to manage scaling. Meanwhile, AWS Fargate eliminates the hassle of server management, letting you focus entirely on developing your application. As AWS Fargate puts it:

Focus on building your applications, letting me worry about managing the infrastructure for you [4].

Elastic Load Balancing (ELB) plays a key role by distributing incoming traffic evenly across instances to maintain performance [3]. For databases, RDS and DynamoDB Auto Scaling dynamically adjust storage and capacity to match demand [2].

Important Metrics for Resource Management

Elastic scaling hinges on monitoring specific metrics that guide scaling decisions. For instance:

  • CPU utilisation measures the percentage of CPU time used by EC2 instances, helping identify when to scale up or down [6].
  • Memory usage points to potential bottlenecks in memory-intensive workloads [5].
  • Network performance metrics like NetworkIn and NetworkOut track data transfer patterns by measuring bytes sent and received [6].
  • Disk I/O metrics - including DiskReadOps, DiskWriteOps, DiskReadBytes, and DiskWriteBytes - help monitor storage activity and plan capacity [5].

The table below highlights some of these key metrics:

| Metric | Description | Unit | Key Statistics |
| --- | --- | --- | --- |
| CPUUtilization | Percentage of physical CPU time used by instances | Percent | Average, Minimum, Maximum |
| DiskReadOps | Count of completed read operations from volumes | Count | Sum, Average, Maximum |
| DiskWriteOps | Count of completed write operations to volumes | Count | Sum, Average, Maximum |
| DiskReadBytes | Total bytes read from volumes | Bytes | Sum, Average, Maximum |
| DiskWriteBytes | Total bytes written to volumes | Bytes | Sum, Average, Maximum |
| NetworkIn | Bytes received on all network interfaces | Bytes | Sum, Average, Maximum |
| NetworkOut | Bytes sent on all network interfaces | Bytes | Sum, Average, Maximum |

Amazon RDS sends metric data to CloudWatch every minute, retaining it for 15 days. Enhanced Monitoring metrics are stored in CloudWatch logs for 30 days by default, but this can be extended [5].
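
To ground this, here's a minimal boto3 sketch that pulls 24 hours of CPU utilisation for a single instance from CloudWatch. The instance ID is a placeholder; substitute your own.

```python
import datetime
import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical instance ID

cloudwatch = boto3.client("cloudwatch")

# Fetch average and maximum CPU utilisation over the last 24 hours,
# in 5-minute periods (the default metric resolution).
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=24),
    EndTime=datetime.datetime.utcnow(),
    Period=300,
    Statistics=["Average", "Maximum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f"avg={point['Average']:.1f}%", f"max={point['Maximum']:.1f}%")
```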

Common Elastic Scaling Scenarios

AWS elastic scaling services shine in a variety of real-world scenarios:

  • E-commerce platforms: Amazon SQS helps decouple processes like payments, inventory, emails, and shipping, allowing each to scale independently during traffic surges [9].
  • Web applications: Combining EC2 Auto Scaling, load balancing, dynamic scaling policies, caching, and database scaling ensures smooth performance and high availability during fluctuating demand [8].
  • Media streaming services: Amazon CloudFront caches content at edge locations, while EC2 Auto Scaling adjusts the capacity of origin servers to handle spikes in demand [8].
  • IoT applications: Services such as EC2 Auto Scaling, DynamoDB auto scaling, and container orchestration (via ECS or EKS) efficiently manage workloads that vary significantly over time [8].

In all these cases, AWS Auto Scaling adjusts resources automatically, ensuring steady performance while keeping costs low by charging only for the resources you use [7].

How to Configure Elastic Resources

Setting up elastic resources involves configuring scaling policies, selecting the right instances, and creating launch templates to meet demand while managing costs effectively.

Creating Auto Scaling Policies

Auto Scaling policies are the backbone of how your infrastructure adapts to varying demand. Among the available options, target tracking scaling policies are often the simplest and most effective for many scenarios. AWS highlights their benefits, saying:

We strongly recommend that you use target tracking scaling policies to scale on metrics like average CPU utilization or average request count per target. Metrics that decrease when capacity increases and increase when capacity decreases can be used to proportionally scale out or in the number of instances using target tracking. This helps ensure that Amazon EC2 Auto Scaling follows the demand curve for your applications closely. [10]

For example, you might set a target of 70% CPU utilisation. AWS will then automatically adjust the capacity to maintain that target, removing much of the guesswork from traditional scaling methods.
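
As a sketch of what that looks like in practice, the boto3 call below creates a target tracking policy pinned to 70% average CPU. The Auto Scaling group name is hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Keep average CPU utilisation across the group near 70%;
# Auto Scaling adds or removes instances to hold the target.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="cpu-target-70",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0,
    },
)
```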

Step scaling policies offer more granular control, using predefined adjustments triggered by CloudWatch alarms. These adjustments fall into three categories:

| Scaling Adjustment Type | Description | Example |
| --- | --- | --- |
| ChangeInCapacity | Adjust capacity by a specific number | If current capacity is 3 and the adjustment is 5, new capacity becomes 8 |
| ExactCapacity | Set capacity to an exact number | If current capacity is 3 and the adjustment is 5, new capacity becomes 5 |
| PercentChangeInCapacity | Adjust capacity by a percentage | If current capacity is 10 and the adjustment is 10%, new capacity becomes 11 |

To prevent unnecessary scaling, cooldown periods delay further changes until previous scaling actions take effect. Similarly, instance warmup periods ensure new instances are fully operational before being included in metrics. For most web applications, a 300-second cooldown is typical, but workloads with longer start-up times may need a longer period.
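
Here's a hedged boto3 sketch of a step scaling policy with a 300-second instance warmup; the group name and step boundaries are illustrative. The returned policy ARN would then be attached as the action of a CloudWatch alarm.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Steps are relative to the alarm threshold (e.g. CPU > 70%):
# breaches up to 15 points above it add 1 instance, larger breaches add 3.
response = autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="cpu-step-scale-out",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    EstimatedInstanceWarmup=300,  # seconds before new instances count toward metrics
    StepAdjustments=[
        {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 15, "ScalingAdjustment": 1},
        {"MetricIntervalLowerBound": 15, "ScalingAdjustment": 3},
    ],
)
print(response["PolicyARN"])  # attach this ARN as a CloudWatch alarm action
```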

If your workload follows predictable patterns, predictive scaling can be a game-changer. It uses machine learning to forecast demand and adjust capacity ahead of time.
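
A minimal sketch of a predictive scaling policy follows, again with a hypothetical group name. Starting in ForecastOnly mode lets you review the forecasts before allowing them to change capacity.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="predictive-cpu-70",
    PolicyType="PredictiveScaling",
    PredictiveScalingConfiguration={
        "MetricSpecifications": [
            {
                "TargetValue": 70.0,
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": "ASGCPUUtilization"
                },
            }
        ],
        # Switch to "ForecastAndScale" once the forecasts look trustworthy.
        "Mode": "ForecastOnly",
        "SchedulingBufferTime": 300,  # launch instances 5 minutes ahead of forecast need
    },
)
```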

Selecting the Right Instance Types

Once scaling policies are in place, the next step is choosing instance types that balance performance and cost. Research shows that many instances are over-provisioned, and simply downgrading to a smaller instance can cut costs significantly [12].

Here are some common instance types to consider:

  • Burstable T-series instances: Perfect for applications with fluctuating CPU needs. They provide baseline performance but can handle occasional spikes, making them ideal for web servers, dev environments, or apps with sporadic high-CPU demands.
  • General-purpose M-series instances: These offer a balanced mix of compute, memory, and networking resources, making them suitable for most production workloads without specialised optimisation needs.
  • AWS Graviton-based instances: These instances are known for their cost and energy efficiency, making them a great choice for compatible applications.

When selecting an instance, think about your application's specific requirements. For single-threaded tasks, prioritise instances with high per-core performance. Multi-threaded workloads, on the other hand, benefit from instances with more cores. If your app is memory-intensive, look for instances with a high RAM-to-CPU ratio, and for storage-heavy tasks, opt for instances optimised for I/O performance.

To make the best choice, test your application on different instance types. Real-world testing often provides a clearer picture than theoretical specifications.

Setting Up Launch Templates

Launch templates simplify deployment and ensure consistency. They define key parameters for your instances and support versioning, making them a better option than launch configurations [11].

When creating launch templates, consider the following (a boto3 sketch pulling several of these settings together follows the list):

  • User data scripts: Use these to automate tasks like installing software, configuring services, or joining instances to your application cluster. Keep these scripts lightweight and use configuration management tools for more complex setups.
  • Security groups: Define separate security groups for different application tiers (e.g., web servers, app servers, databases) and follow the principle of least privilege to enhance security.
  • IAM roles: Attach roles to instances via the launch template to grant necessary AWS permissions without embedding credentials. Create specific roles for each instance type to avoid overly broad access.
  • Detailed monitoring: Enable this feature to get CloudWatch metrics at one-minute intervals instead of the default five minutes. While it costs more, the improved visibility is often worth it for production environments.
  • Instance metadata options: Adjust these settings to control how instances access metadata, enhancing security for cases where metadata access is unnecessary or should be restricted.
  • Versioning: Launch template versioning allows you to maintain multiple configurations. This makes it easy to roll back changes if needed. Always test new versions thoroughly before applying them to your Auto Scaling groups.
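
Here's a minimal sketch combining several of these settings; every name (template, AMI, security group, IAM profile, service) is hypothetical.

```python
import base64
import boto3

ec2 = boto3.client("ec2")

# Lightweight user-data script; heavier setup belongs in configuration management.
user_data = """#!/bin/bash
yum -y update
systemctl enable --now my-app   # hypothetical service name
"""

ec2.create_launch_template(
    LaunchTemplateName="web-tier",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",             # substitute a real AMI
        "InstanceType": "m7g.large",                    # Graviton general-purpose
        "SecurityGroupIds": ["sg-0123456789abcdef0"],   # tier-specific security group
        "IamInstanceProfile": {"Name": "web-tier-role"},
        "Monitoring": {"Enabled": True},                # 1-minute detailed monitoring
        "MetadataOptions": {"HttpTokens": "required"},  # require IMDSv2 for metadata access
        "UserData": base64.b64encode(user_data.encode()).decode(),
    },
)
```

Creating a new template version rather than a new template keeps the rollback path simple: point the Auto Scaling group back at the previous version number.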

Monitoring and Adjusting Elastic Resources

After setting up elastic resources, keeping a close eye on their performance is essential to ensure both efficiency and cost management. AWS CloudWatch is your go-to tool for gaining visibility into your resources and making necessary tweaks to your scaling configurations.

Monitoring with AWS CloudWatch

AWS CloudWatch gathers metrics, logs, and events from various sources, including AWS services, on-premises servers, and applications using OpenTelemetry [13]. By default, it reports metrics every five minutes, but you can enable detailed monitoring for updates every minute.

AWS CloudWatch is a tool for monitoring, analysing, and acting on key metrics within your AWS environment. [13]

Setting Up Alarms for Proactive Monitoring

CloudWatch alarms let you stay ahead of issues by notifying you when specific thresholds are crossed [13]. For instance, anomaly detection can create dynamic alarms that adjust to normal workload patterns, helping reduce false alerts [13]. These alarms provide actionable insights for fine-tuning scaling configurations and managing costs.
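
As a sketch, the boto3 call below creates a static-threshold alarm on group-average CPU; the group name and SNS topic ARN are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the group's average CPU stays above 80% for two
# consecutive 5-minute periods, then notify an SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,  # two consecutive breaches before alarming
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:eu-west-2:123456789012:ops-alerts"],  # hypothetical ARN
)
```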

Custom Dashboards for Better Insights

Custom dashboards allow you to visualise key metrics in one place, offering a consolidated view of your service performance [13]. Grouping dashboards by application tier or business function makes it easier for teams to monitor and interpret the data.

Next, let’s look at how to identify and address performance bottlenecks effectively.

Finding and Fixing Performance Issues

Simulating virtual user loads is an effective way to test performance and identify bottlenecks [14]. CloudWatch metrics can highlight problems such as resource overuse, memory leaks, or backend delays [14].

When performance testing, testers, developers, or QA simulate a load of virtual users on the tested application. Our goal when performance testing is to observe the application's performance, assess whether we achieved the predefined SLAs, and identify performance bottlenecks that may be limiting the application's scalability and functionality. - Blazemeter [14]

Pinpointing Root Causes

When performance metrics show signs of trouble, deeper investigation is necessary. AWS X-Ray, a distributed tracing service, helps track individual requests to uncover the root cause of issues [14]. For ECS workloads, scaling decisions should focus on the resource that gets maxed out first during load testing - often CPU [15]. It’s best to avoid scaling solely based on response times, as this might overlook the actual resource constraint [15].

Automating Responses

CloudWatch integrates seamlessly with AWS services like Auto Scaling and Lambda, enabling automated actions when pre-set metric thresholds are reached [13]. For example, you can use CloudWatch with AWS Lambda or Systems Manager to automate fixes for performance issues [13]. Additionally, CloudWatch Logs Insights allows you to run SQL-like queries on log data, making it easier to spot patterns and troubleshoot problems during incidents [13].

Creating Custom Application Metrics

Standard metrics are helpful, but custom application metrics can offer more tailored insights that align closely with your business objectives. These metrics allow you to monitor data specific to your application, which CloudWatch doesn’t track by default [16]. Using the AWS SDK, CLI, or CloudWatch Agent, you can send custom metrics to CloudWatch for analysis [16].
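
A minimal boto3 sketch of publishing custom metrics follows; the namespace, metric names, and values are illustrative.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Use a consistent, application-specific namespace and keep dimensions few.
cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",  # hypothetical namespace
    MetricData=[
        {
            "MetricName": "OrdersPlaced",
            "Dimensions": [{"Name": "Environment", "Value": "production"}],
            "Value": 42,
            "Unit": "Count",
        },
        {
            "MetricName": "ApiResponseTime",
            "Dimensions": [{"Name": "Environment", "Value": "production"}],
            "Value": 182.5,
            "Unit": "Milliseconds",
        },
    ],
)
```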

Organising Custom Metrics

To keep things manageable, use consistent namespaces and limit dimensions when organising custom metrics [16]. Keep in mind that each destination can hold up to 2,000 metric definitions [17].

| Metric Category | Example Metrics |
| --- | --- |
| User Activity | Daily active users, logins, customer sessions [16] |
| Business KPIs | Total orders, revenue per hour, abandoned carts [16] |
| Application Performance | API response time, error rates, real-time session counts [16] |
| Log-Based Metrics | Failed login attempts, HTTP 500 errors, database errors [16] |

Best Practices for Implementation

To streamline operations, aggregate data and filter logs to send only the most essential metrics [16]. Automating this process reduces manual errors and ensures consistency [16]. For instance, web applications can benefit from tracking metrics like active users, API response times, and failed login attempts. Meanwhile, e-commerce platforms might focus on metrics like hourly order totals, inventory levels, and customer support request volumes [16].

Cost-Saving Methods with Elastic Resources

Keeping expenses in check starts with smart resource management. Combining pricing models, trimming storage costs, and avoiding waste through right-sizing can make a big difference.

Managing Instance Lifecycles

Using Spot Instances for Big Savings

Spot Instances are a game-changer for cutting costs on AWS, offering discounts of up to 90% compared to On-Demand pricing [20]. Take the NFL as an example - they save around £1.6 million each season by running 4,000 EC2 Spot Instances across more than 20 instance types [20].

Spot Instances let you take advantage of unused EC2 capacity in the AWS cloud and are available at up to a 90% discount compared to On-Demand prices.

- Pranaya Anshu, EC2 PMM, and Sid Ambatipudi, EC2 Compute GTM Specialist [20]

To make the most of Spot Instances, flexibility is key. By allowing multiple instance types and Availability Zones, you can optimise your chances of securing affordable capacity [20]. Tools like attribute-based instance selection simplify this process by picking instances that meet your specific CPU, memory, and storage needs. The price-capacity-optimised strategy also ensures you get the best price while maintaining availability [20].

| Allocation Strategy | Best For | Key Benefit |
| --- | --- | --- |
| priceCapacityOptimized | Stateless applications, microservices, web apps | Balances cost and availability |
| capacityOptimized | CI/CD, image rendering, deep learning | Reduces interruptions for critical tasks |
| lowestPrice | Basic batch processing | Offers the lowest cost (with higher interruption risk) |

Combining Reserved Instances with Spot Instances

For a balanced approach, pair Reserved Instances with Spot Instances in Auto Scaling groups. This creates a cost-efficient and reliable capacity base.
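
One way to express this is an Auto Scaling group with a mixed instances policy, sketched below with hypothetical names. The On-Demand base (covered by Reserved Instances or a Savings Plan) provides a stable floor; everything above it runs on Spot.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    MinSize=2,
    MaxSize=20,
    VPCZoneIdentifier="subnet-aaa,subnet-bbb,subnet-ccc",  # spread across AZs
    MixedInstancesPolicy={
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "web-tier",  # hypothetical template
                "Version": "$Latest",
            },
            # Several instance types widen the Spot pools available.
            "Overrides": [
                {"InstanceType": "m7g.large"},
                {"InstanceType": "m6g.large"},
                {"InstanceType": "c7g.large"},
            ],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 2,                 # covered by RIs/Savings Plans
            "OnDemandPercentageAboveBaseCapacity": 0,  # everything above base on Spot
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    },
)
```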

Scheduled Scaling for Test Environments

Switching off development and testing instances during downtime can save up to 65% [21]. AWS Instance Scheduler automates this process, ensuring non-production instances are powered down during nights, weekends, and holidays.
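
If you'd rather script this directly than use Instance Scheduler, a pair of scheduled actions on a dev Auto Scaling group does the same job; the group name, times, and sizes below are illustrative.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Scale the dev group to zero each weekday evening...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="dev-asg",  # hypothetical group name
    ScheduledActionName="stop-evenings",
    Recurrence="0 19 * * MON-FRI",   # 19:00 each weekday
    TimeZone="Europe/London",
    MinSize=0, MaxSize=0, DesiredCapacity=0,
)

# ...and bring it back each weekday morning.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="dev-asg",
    ScheduledActionName="start-mornings",
    Recurrence="0 8 * * MON-FRI",    # 08:00 each weekday
    TimeZone="Europe/London",
    MinSize=1, MaxSize=4, DesiredCapacity=2,
)
```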

Reducing Storage Costs

Smarter Storage with Intelligent Tiering

Amazon S3 Intelligent-Tiering automatically moves your data between storage tiers based on how often it’s accessed, taking the guesswork out of managing storage costs [1].

Automated Lifecycle Policies

Set up S3 lifecycle policies to shift data through different storage classes as it ages. For instance, you can move files from Standard to Standard-Infrequent Access after 30 days, then to Glacier after 90 days, and finally to Glacier Deep Archive after a year. This ensures you’re not paying top-tier prices for rarely accessed data.
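
A boto3 sketch of that exact schedule follows; the bucket name is hypothetical.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-archive-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "age-out-old-data",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```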

Clearing Out Unused Storage

Unused Elastic Block Store (EBS) volumes, often referred to as zombie storage, can quietly rack up charges. Regularly audit your EBS volumes and delete any obsolete snapshots or unattached storage [18].
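
A short audit sketch: list every volume in the "available" state, i.e. attached to nothing. Deletion is left commented out deliberately.

```python
import boto3

ec2 = boto3.client("ec2")

# "available" status means the volume is not attached to any instance.
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}]):
    for volume in page["Volumes"]:
        print(volume["VolumeId"], volume["Size"], "GiB, created", volume["CreateTime"])
        # Review (and snapshot if needed) before deleting; this is irreversible:
        # ec2.delete_volume(VolumeId=volume["VolumeId"])
```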

AWS Compute Optimiser for Right-Sizing

AWS Compute Optimiser

To complement lifecycle and storage strategies, AWS Compute Optimiser offers insights for fine-tuning your resources.

Cutting Costs by Right-Sizing

AWS Compute Optimiser analyses your actual usage patterns - like CPU, memory, and network activity - to recommend better instance sizes [18][23]. Right-sizing can reduce costs by up to 25% by eliminating "just in case" overprovisioning [23].
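
The recommendations are also available programmatically. A minimal sketch, assuming Compute Optimiser is already enabled for the account:

```python
import boto3

optimizer = boto3.client("compute-optimizer")

# Print each instance's finding and the top recommended instance type.
response = optimizer.get_ec2_instance_recommendations()
for rec in response["instanceRecommendations"]:
    options = rec.get("recommendationOptions", [])
    if options:
        print(
            rec["instanceArn"],
            rec["finding"],                  # e.g. OVER_PROVISIONED
            rec["currentInstanceType"],
            "->",
            options[0]["instanceType"],      # highest-ranked option
        )
```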

Moving to Graviton Instances

AWS Graviton-powered instances can deliver up to 40% better price performance compared to x86-based processors [1]. With AWS Compute Optimiser, you can identify workloads that would benefit most from Graviton migration, making the switch a calculated decision.

Keeping Optimisation Ongoing

Use AWS Cost Explorer [19] and AWS Budgets [21] to monitor spending and spot idle resources like unused load balancers or unassociated Elastic IP addresses. Compute Savings Plans can further cut costs by up to 66% compared to On-Demand pricing [18]. Regularly reviewing your AWS setup ensures you’re always finding new ways to save [22].
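
For the monitoring side, a small Cost Explorer sketch that breaks a month's unblended cost down by service; the date range is illustrative.

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-06-01", "End": "2024-06-30"},  # hypothetical dates
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount > 0:
        print(f"{group['Keys'][0]}: ${amount:,.2f}")
```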

Case Studies: Elastic Resource Tuning Examples

After discussing various tuning techniques, let’s dive into some real-world scenarios that highlight how elastic resource tuning can effectively cut costs and improve performance.

Retail: Managing Seasonal Traffic Surges

A specialty retailer faced a daunting challenge during Black Friday: traffic spikes that could overwhelm their systems within minutes. To tackle this, they utilised a combination of AWS elastic services to handle the surge seamlessly.

On Black Friday, AWS Auto Scaling expanded their capacity from 50 to 200 instances in just 15 minutes. They also implemented read replicas to boost database capacity by 5x, deployed CloudFront to reduce origin requests by 85%, and used Lambda functions to process inventory updates in parallel [26]. The results? Sub-second response times even during a staggering 1,200% spike in traffic, enabling the company to achieve record-breaking sales.

This strategy was critical, as research shows that 40% of online shoppers abandon a site if it takes more than three seconds to load [26]. With peak shopping seasons often driving traffic to levels 100% higher than usual [24], elastic tuning proved essential for maintaining a smooth customer experience.

Other industries also use elastic tuning to address their unique challenges, tailoring solutions to meet specific workload demands.

Media Processing: Cutting Costs for Batch Jobs

Media companies frequently deal with video encoding and processing workloads that require significant computing power for short bursts. To manage this efficiently, many turn to AWS Spot Instances, which offer temporary compute capacity at a fraction of On-Demand pricing.

Because video encoding tasks can tolerate interruptions and resume from checkpoints, these companies design workflows that automatically recover from Spot Instance interruptions. This approach not only preserves processing quality but also delivers significant cost savings. By building fault-tolerant architectures that break large jobs into smaller chunks, using automatic checkpointing, and spreading workloads across varied instance types and Availability Zones, media organisations optimise their costs. Many also combine Spot Instances with Reserved Instances to ensure they meet critical deadlines while maintaining a cost-effective balance.

Hokstad Consulting: A Case of Smart Elasticity

Hokstad Consulting provides a compelling example of how targeted resource tuning can lead to both cost savings and performance improvements. Working with a client, they achieved a 42% reduction in costs by refining Auto Scaling policies and resource configurations [27].

Their approach began with a detailed analysis of the client’s existing scaling triggers. By replacing generic CPU-based thresholds with custom metrics tailored to actual business demand, they avoided over-provisioning during minor traffic spikes. Additionally, they right-sized instances based on real usage patterns, introduced predictive scaling for recurring traffic trends, and combined different pricing models to balance cost and availability. This strategic alignment between resource allocation and business needs highlights the importance of scaling intelligently, not just automatically.

Hokstad Consulting’s methods typically result in a 30–50% reduction in cloud spending while simultaneously improving performance through smarter resource allocation and automation. These examples underscore how elastic tuning can transform resource management into a highly efficient and cost-effective process.

Conclusion: Getting the Most from AWS Elastic Resources

Fine-tuning elastic resources in AWS isn’t just about saving money - it’s about driving efficiency, boosting performance, and ensuring your systems run smoothly. By adopting the right strategies, businesses can cut cloud costs by as much as 70% while maintaining reliability and enhancing overall performance.

The methods outlined in this guide - like auto-scaling policies and predictive scaling - can lead to significant savings. For instance, predictable workloads can see cost reductions of up to 72%, Spot Instances can offer discounts of up to 90%, and AWS Graviton-powered instances deliver up to 40% better price performance [28][1].

The secret to success lies in staying proactive. AWS cost optimisation is an ongoing process that demands regular monitoring, auditing instance types, and implementing automated scaling based on real-world demand rather than assumptions. Every pound spent should deliver measurable value [30].

Combining multiple strategies often yields the best results. Techniques like right-sizing instances, consolidating underused resources, and managing data storage lifecycles effectively can lead to around 25% cost reductions while improving operations [29]. For organisations looking to take it further, partnering with experts can make a big difference.

Take Hokstad Consulting, for example. They specialise in cloud cost engineering and DevOps transformation, helping businesses achieve 30–50% reductions in cloud expenses through resource optimisation and automation. Their approach aligns perfectly with the principles of strategic resource tuning discussed here.

Whether you're navigating seasonal traffic surges, running batch jobs, or managing steady-state applications, elastic resource tuning is the key to balancing cost-efficiency with performance. By configuring resources smartly and keeping a close eye on usage, you can cut expenses, improve reliability, and create a more efficient cloud environment. Keep refining your approach, and you’ll uncover lasting benefits in both cost and performance.

FAQs

How do I choose the right AWS instance type for my application's needs?

To choose the right AWS instance type for your application, begin by evaluating its specific needs - think CPU, memory, storage, and networking requirements. AWS offers a variety of instance families designed for different workload types. For instance, compute-optimised instances are perfect for tasks that demand high processing power, while memory-optimised instances are better suited for applications with heavy RAM usage. If storage is your primary concern, storage-optimised instances might be the way to go.

Testing is a crucial step in this process. Running load tests across different instance types can help you gauge both performance and cost-effectiveness. AWS provides tools like the EC2 Instance Selector, which simplifies the process of narrowing down the best instance type based on your workload's unique characteristics.

By aligning your application's requirements with the appropriate instance type, you can strike the perfect balance between performance and cost.

What are the best practices for configuring AWS Auto Scaling to manage unexpected traffic surges effectively?

To manage unexpected traffic spikes effectively with AWS Auto Scaling, begin by implementing target tracking scaling policies. These policies automatically adjust the number of instances based on metrics like CPU usage or request count, ensuring your resources scale up or down smoothly as demand changes. For situations requiring more specific control, you can opt for step scaling policies. These let you set exact thresholds for scaling actions, making them particularly useful for handling sudden traffic surges.

Monitoring plays a vital role in this process. Keep track of metrics such as CPU utilisation, memory usage, and network traffic to determine when scaling is necessary. By setting up CloudWatch alarms, you can automate scaling actions whenever predefined thresholds are breached, allowing your application to respond dynamically to varying loads. Pair this with an Elastic Load Balancer to distribute incoming traffic evenly, boosting reliability and maintaining performance during high-demand periods.

How can I use AWS Spot Instances effectively while ensuring my applications remain reliable?

To make the most of AWS Spot Instances while keeping your applications reliable, it's essential to design them to handle interruptions effectively. Since Spot Instances can be terminated with just a two-minute warning, your workloads should either be stateless or have the ability to save their state and continue running elsewhere.

Incorporate Auto Scaling groups to adjust capacity dynamically. This helps maintain consistent performance, even when instances are interrupted. For better availability, consider using the capacity-optimised allocation strategy, which selects Spot Instances from less volatile pools.

Take advantage of AWS tools to streamline Spot Instance management. These tools can help distribute workloads efficiently and manage interruptions seamlessly. By focusing on building fault-tolerant architectures, you can enjoy significant cost savings without compromising the stability of your applications.