How to Measure IOPS and Throughput in Cloud Storage

IOPS (Input/Output Operations Per Second) and throughput are key metrics for assessing cloud storage performance. IOPS measures the number of read/write operations per second, while throughput measures data transfer speed, typically in MB/s or GB/s. These metrics help you optimise storage for specific workloads, such as databases requiring high IOPS or media applications needing high throughput.

To measure these effectively:

  • Use tools like fio (Linux) or DiskSpd (Windows).
  • Match test parameters (e.g., block size, queue depth) to your application’s workload.
  • Monitor latency and performance during peak and low-usage periods.

Continuous monitoring and realistic benchmarks are critical for identifying bottlenecks and ensuring cost-effective storage configurations.


Prerequisites and Setup for Benchmarking

Getting accurate results for IOPS and throughput testing starts with having the right tools and a clear plan. Proper preparation ensures that your benchmarking reflects actual performance, helping you avoid decisions based on misleading data.

Measurement Tools and Software

For reliable storage benchmarking, command-line tools are a solid choice. In Linux environments, fio (Flexible I/O Tester) stands out for its ability to fine-tune test parameters like block sizes, queue depths, and read/write patterns. On Windows, DiskSpd from Microsoft offers similar functionality with seamless integration into the operating system.

To complement these tools, native cloud monitoring services provide a useful performance baseline. These services continuously collect data, offering a snapshot of storage performance before running intensive benchmarks.

Make sure you have elevated privileges to access storage directly and bypass operating system caches. Verify that your account has the necessary permissions to avoid interruptions during testing.

Real-time monitoring tools like iostat and iotop can help identify bottlenecks in your system while the tests are running, giving you deeper insights into performance issues.
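For a quick live view (assuming the sysstat and iotop packages are installed), something like the following shows extended per-device statistics every five seconds alongside the processes currently generating I/O:

    # Extended per-device statistics in MB/s, refreshed every 5 seconds
    iostat -dxm 5

    # Show only the processes and threads actively performing I/O
    sudo iotop -o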

Once your tools are ready, ensure your tests mimic actual workload patterns for results that truly reflect your system's capabilities.

Know Your Workload Patterns

Understanding how your application behaves is essential for meaningful benchmarking. For example, database applications often involve many small, random read/write operations with high concurrency. On the other hand, media streaming services typically deal with larger, sequential data reads, which put different demands on storage systems.

Analyse how your storage is used, paying attention to factors like average file sizes, peak usage periods, and the balance between reads and writes. Documenting peak loads and concurrency levels will help you design tests that mirror your application's real-world demands.
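One lightweight way to capture these characteristics is to log device statistics across a full working day and review the read/write split and average request sizes afterwards. A minimal sketch, again assuming the sysstat package:

    # Sample extended device statistics every 60 seconds for 24 hours (1,440 samples)
    iostat -dxm 60 1440 > ~/iostat-$(date +%F).log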

Avoid relying on arbitrary figures when setting test parameters. Instead, use data that reflects your application's actual operational characteristics. For instance, if your application experiences traffic surges - such as a midday spike or overnight batch processing - your benchmarks should account for these variations. Testing only average loads might hide potential performance issues during peak times.

With this groundwork in place, you can tailor your benchmarking to UK-specific needs.

UK-Specific Setup Notes

When interpreting benchmark results, stick to metric units for consistency. Express storage throughput in MB/s or GB/s, and measure data volumes in gigabytes or terabytes. This approach aligns with UK technical documentation standards and simplifies comparisons across different tools and platforms.

Schedule your tests in GMT/BST to capture peak usage patterns relevant to UK customers. This is particularly important for applications serving a UK audience, as performance can vary depending on the time of day.

Include UK pricing in £, factoring in VAT, to accurately evaluate cost-performance trade-offs. This ensures your analysis reflects the true financial implications of your storage solutions.

Finally, make sure your testing environment adheres to UK data residency requirements and considers geographical latency. For organisations using hybrid cloud setups with UK-based infrastructure, remember that network latency between on-premises systems and cloud storage can significantly impact performance. Addressing these factors is key to obtaining meaningful and actionable benchmarking results.
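A simple round-trip check from each location against your storage endpoint gives a rough feel for that latency; the hostname below is a placeholder for your own provider's address:

    # Measure round-trip latency to a storage endpoint (replace with your provider's address)
    ping -c 50 storage-endpoint.example.com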

Step-by-Step Measurement Guide

Once your tools are set up and workload patterns are identified, it's time to run benchmarks that reflect real-world scenarios. Adjust the test parameters to align with your application's specific requirements.

Setting Up Benchmark Tests

Start by configuring the test parameters in a fio job file to replicate your application's behaviour. Pay special attention to the following settings:

  • Block size: Match this to your application's I/O operations. For instance, database applications often use 4 KB or 8 KB blocks, while media-heavy applications might need 1 MB or larger blocks.
  • Queue depth: This determines concurrency. A queue depth of 1 reflects single-threaded operations, while values between 8 and 32 are typical for multi-threaded applications. However, very high queue depths might not reflect real-world performance.
  • Read/write mix: Define this based on your workload. For example, many web applications use a 70% read and 30% write pattern, while backup systems may lean towards nearly 100% write operations. Also, consider whether your workload involves random or sequential access. Databases usually generate random I/O, while log files are often sequential.

Allow a 30-second ramp-up period before running tests for 5–10 minutes to capture steady-state performance data.
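As a concrete starting point, here is a minimal fio job file reflecting the settings above: a 70/30 random read/write mix at 4 KB blocks, a queue depth of 16, a 30-second ramp-up and a 10-minute steady-state run. The values and the test directory are placeholders to be replaced with your own workload's characteristics:

    ; random-rw.fio - illustrative job file; adjust the values to match your workload
    [global]
    ioengine=libaio
    ; direct=1 bypasses the page cache so you measure the device rather than RAM
    direct=1
    ; discard the first 30 seconds, then measure a 10-minute steady state
    ramp_time=30
    runtime=600
    time_based=1
    group_reporting=1

    [random-rw]
    ; 70% random reads, 30% random writes at a 4 KB block size
    rw=randrw
    rwmixread=70
    bs=4k
    ; queue depth of 16 across 4 workers; keep total file size well above available RAM
    iodepth=16
    numjobs=4
    size=10g
    directory=/mnt/testdata

Run it with fio random-rw.fio, and keep the job file under version control so later runs remain directly comparable.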

If you're using DiskSpd in Windows environments, the same principles apply. Be sure to set a target file size that exceeds your system's cache (typically 2–3 times the available RAM) to ensure you're measuring actual storage performance rather than memory speed.
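For reference, an equivalent DiskSpd run might look like the sketch below (shown with PowerShell-style comments): a 64 GB test file sized well above RAM, 4 KB random I/O with a 30% write share, caching disabled, a 30-second warm-up and a 10-minute test. The drive and file path are placeholders, and the flags are worth double-checking against Microsoft's DiskSpd documentation for your version:

    # 4 KB random I/O, 30% writes, 8 outstanding I/Os x 4 threads,
    # software and hardware caching disabled, 30 s warm-up, 600 s test, latency stats
    diskspd.exe -c64G -b4K -r -w30 -o8 -t4 -Sh -W30 -d600 -L D:\testfile.dat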

Running Tests and Gathering Data

Run your benchmarks systematically, starting with baseline tests during low-activity periods. Repeat each configuration multiple times to ensure consistency. Large variations between runs can signal issues like system bottlenecks or resource contention.
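A short shell loop makes this repetition easy to automate; this sketch assumes the job file from the previous section and simply timestamps each run's JSON output for later comparison:

    # Run the same job three times and keep each result for comparison
    for run in 1 2 3; do
        fio random-rw.fio --output-format=json --output="fio-run-${run}-$(date +%F-%H%M).json"
        sleep 60    # let the system settle between runs
    done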

Use tools like iostat or iotop to monitor system activity during tests. Look out for CPU spikes or network saturation, as these could indicate non-storage-related bottlenecks.

Focus on recording both IOPS (Input/Output Operations Per Second) and throughput. These metrics often reveal the nature of your operations:

  • High IOPS with low throughput suggests small block operations.
  • High throughput with moderate IOPS points to larger block sizes.

The fio output will display these metrics under read and write sections, helping you interpret the results.
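The two metrics are linked by block size: throughput is roughly IOPS multiplied by the block size. For example, 20,000 IOPS at 4 KB blocks works out at only around 80 MB/s, whereas 2,000 IOPS at 1 MB blocks is roughly 2,000 MB/s - which is why a high IOPS figure on its own says little about large-file performance.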

Storage benchmarking measures how your system handles different I/O patterns - reads, writes, random, sequential - to simulate real application behavior.
Simplyblock [1]

Don't overlook latency percentiles in your results. While average latency might seem fine, the 95th or 99th percentile values can expose performance issues during peak loads. Applications that rely on quick response times need consistently low latency, not just favourable averages.
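fio prints completion-latency percentiles by default; if you want specific cut-offs in the report, the percentile_list option can be added to the job file (worth confirming the syntax against your fio version):

    ; report the 50th, 95th, 99th and 99.9th completion-latency percentiles
    percentile_list=50:95:99:99.9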

It's also important to document your cloud provider's performance baselines. Many cloud storage solutions offer burst performance that exceeds sustained rates. Running longer tests will help you understand the true steady-state performance your application can depend on.

Practical Testing Considerations

Once you have initial test results, refine your benchmarks to ensure they reflect ongoing, real-world conditions.

  • Test at different times: Storage performance can vary depending on factors like provider load, network conditions, and regional demand. For UK-based deployments, test during peak business hours (9:00–17:00 GMT) and quieter periods to get a full picture - a scheduling sketch follows this list.
  • Account for burst performance: Cloud storage services often allow brief periods of high performance that taper off. Extended tests can reveal the sustained performance levels needed for continuous workloads.
  • Simulate failures or stress scenarios: Run tests during maintenance windows or periods of high load to understand how performance holds up under pressure.
  • Consider geographical factors: For UK deployments, test from the regions where your applications run. Network latency between compute instances and storage can impact performance. For example, a solution performing well in London might behave differently when accessed from Edinburgh or Belfast.
  • Compare with vendor guidelines: Cloud providers often publish performance benchmarks for their services. Comparing your results with these can help identify misconfigurations or unrealistic expectations.
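One low-effort way to cover both busy and quiet windows is to schedule the same benchmark twice a day and compare the logs over a week or two. The script path below is a placeholder for whatever wrapper you use to launch fio:

    # crontab entries: run the benchmark at 13:00 and 03:00 local time
    0 13 * * * /usr/local/bin/storage-benchmark.sh >> /var/log/storage-benchmark.log 2>&1
    0 3 * * * /usr/local/bin/storage-benchmark.sh >> /var/log/storage-benchmark.log 2>&1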

Finally, test with multiple concurrent workloads whenever possible. Applications rarely operate in isolation, so measuring performance while handling primary workloads alongside backups or monitoring systems gives a more realistic view of your system's capabilities. This approach ensures you're prepared for the demands of real-world operations.

Reading and Understanding Results

To make sense of your benchmark data, it's essential to connect IOPS, throughput, and latency to the specific demands of your applications. Let’s break down these metrics and see how they shape your performance approach.

Key Metrics Explained

IOPS (Input/Output Operations Per Second) measures how many read and write operations a device can handle in a single second[2][3][4]. This metric is particularly useful for understanding how well a system handles frequent, small operations.

On the other hand, throughput focuses on the volume of data transferred over time. Typically measured in megabytes per second (MB/s)[2][3][4], it reflects how efficiently your storage system manages larger data transfers.

Together, IOPS and throughput offer a snapshot of storage performance. For instance:

  • High IOPS with lower throughput usually suggests workloads involving many small operations.
  • Higher throughput with moderate IOPS often points to workloads dominated by larger block transfers.

Latency is the third piece of the puzzle, representing the delay before a data transfer begins. It plays a critical role in system responsiveness and helps interpret how IOPS and throughput impact the user experience. By analysing these metrics together, you can fine-tune your cloud storage setup to meet your specific performance goals.

Best Practices for Continuous Monitoring

Cloud storage performance is always changing, making it crucial to keep a close eye on IOPS and throughput. Applications evolve, user demands shift, and cloud providers frequently update their infrastructure. To stay ahead of any performance issues and ensure your storage runs efficiently, a robust monitoring strategy is essential.

Testing with Production-Like Workloads

When testing, it’s important to mimic actual workload patterns as closely as possible. While synthetic benchmarks can offer baseline measurements, they often miss the complexity of real-world applications.

Set up test environments that reflect your production workload's unique characteristics. For example, if your e-commerce platform experiences peak traffic during evenings with heavy database activity, your tests should simulate this behaviour. Similarly, seasonal variations can significantly impact workloads. A financial services app may require far more IOPS during month-end reporting compared to normal operations. Scheduling quarterly benchmarks that account for such fluctuations ensures your storage setup remains effective throughout the year.

Avoid relying on unrealistically high queue depths (such as 32 or 64) if your application never generates that level of concurrency in practice. Once your production-like tests are in place, continuous monitoring will help track these metrics in real time.

Monitoring and Regular Performance Reviews

Realistic testing is just the beginning. Ongoing, automated monitoring of IOPS, throughput, and latency ensures you can spot performance dips before users are affected.

Start by establishing baseline performance metrics during initial tests. Any deviations beyond 15–20% of these baselines should be flagged for investigation. Monthly performance reviews can help identify trends, such as consistent latency spikes at specific times or gradually increasing IOPS demands - both of which could point to growing data volumes or user activity.
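As a minimal illustration of this kind of check - assuming fio's JSON output and the jq utility, with field names that can vary between fio versions - a small script can compare the latest run against a stored baseline and flag anything outside a 20% band:

    #!/usr/bin/env bash
    # Compare measured read IOPS against a stored baseline and flag large deviations
    BASELINE_IOPS=15000                      # value recorded during initial benchmarking
    MEASURED_IOPS=$(jq '.jobs[0].read.iops' fio-latest.json)

    DEVIATION=$(echo "scale=4; ($MEASURED_IOPS - $BASELINE_IOPS) / $BASELINE_IOPS * 100" | bc -l)
    echo "Read IOPS: ${MEASURED_IOPS} (baseline ${BASELINE_IOPS}, deviation ${DEVIATION}%)"

    # Flag deviations larger than 20% in either direction for investigation
    if (( $(echo "${DEVIATION#-} > 20" | bc -l) )); then
        echo "WARNING: read IOPS outside the expected range - investigate"
    fi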

Document your findings and connect performance changes to specific events, such as application updates or infrastructure changes. This historical record is invaluable for diagnosing problems and planning future upgrades.

Set up alert thresholds based on your application’s performance needs. For instance, if sub-10ms latency is critical for a smooth user experience, configure alerts to trigger when latency consistently exceeds 8ms. These early warnings allow you to address issues before they escalate.

Working with Expert Consultants

Even with a solid testing and monitoring framework, expert advice can take your strategy further. Consultants bring a wealth of experience from working across various environments and can uncover optimisation opportunities that might not be obvious to your internal team. For example, Hokstad Consulting has a track record of reducing cloud costs by 30–50% while maintaining or improving performance.

Consultants can provide an objective assessment of your setup, offering tailored recommendations based on your workload patterns and budget. They’re particularly valuable when planning major changes, such as migrating to a new storage tier or adopting a hybrid cloud model. By conducting comprehensive performance tests and cost analyses, they ensure your decisions align with both technical needs and financial goals.

Engaging consultants for quarterly performance audits can be a smart move. These audits combine technical evaluations with cost-saving strategies, helping your storage configuration adapt to your business's evolving needs. Many consulting firms operate on a results-based model, meaning their fees are tied to the savings or improvements they deliver - minimising financial risk for your organisation.

Conclusion

Key Points Summary

Understanding and measuring IOPS and throughput is essential for improving performance and managing costs effectively. These metrics impact everything from database efficiency to file transfer speeds.

Accurate benchmarking requires careful preparation and realistic testing. Variables like queue depth, block size, and concurrent operations can heavily influence the results, so considering these factors is critical.

Interpreting benchmark data correctly is just as important. Issues like latency spikes, inconsistent throughput, or IOPS limitations may highlight underlying problems. Establishing baseline metrics and monitoring for deviations will help you detect and address potential concerns early.

Ongoing monitoring ensures your storage solutions align with evolving business demands. By staying proactive, you can identify opportunities for improvement and act accordingly.

Next Steps for Your Business

Start by running baseline tests using reliable tools and set up monitoring systems with clear alert thresholds to track performance.

Take action based on your findings. Poor storage performance can lead to higher expenses, delayed projects, and dissatisfied users. If you notice bottlenecks or inefficiencies, it may be time to consult an expert.

Hokstad Consulting offers expertise in optimising cloud infrastructure and lowering hosting costs while maintaining or enhancing performance. Whether you're planning a cloud migration, need a DevOps overhaul, or require tailored automation solutions, their team can help you turn IOPS and throughput data into meaningful improvements.

Accurate measurement leads to better user experiences, reduced costs, and more stable performance. Establish your baseline today, and use continuous monitoring and expert advice to build a stronger, more efficient foundation for your business.

FAQs

What is the difference between IOPS and throughput, and how do they impact cloud storage performance?

IOPS, or Input/Output Operations Per Second, measures the number of individual read or write operations a storage system can perform in a single second. This makes it particularly useful for evaluating performance in scenarios involving small, random data requests. On the other hand, throughput gauges the amount of data transferred per second, highlighting how well the system handles large, sequential data transfers.

Both metrics play a key role in understanding the performance of cloud storage. High IOPS is essential for applications like databases, which depend on frequent, small data operations. Meanwhile, high throughput is better suited for tasks such as video streaming or transferring large files. By examining these metrics together, you can optimise your storage configuration to align with the specific demands of your application.

How can I make sure my benchmarking tests reflect real-world application workloads accurately?

To make sure your benchmarking tests truly reflect the demands of real-world cloud storage usage, start by creating workload models that mirror the data access and query patterns of your actual production environment. This approach ensures your results are both meaningful and applicable to your specific needs.

Prioritise assessing key performance indicators (KPIs) that matter most to your organisation, such as latency, IOPS, or throughput. Incorporating mixed workloads - blending operations like reads, writes, and updates - can offer a more accurate view of performance under varied conditions. It's also crucial to regularly update these workload models to match shifts in usage patterns, keeping your benchmarks relevant and useful.

By taking these steps, you’ll be better positioned to fine-tune your cloud storage setup to meet your application's performance demands.

Why should geographical factors and time zones be considered when testing cloud storage performance in the UK?

Geographical factors and time zones are critical when evaluating cloud storage performance in the UK. They directly impact latency, data transfer speeds, and the reliability of services. Choosing data centre locations near your users can noticeably cut down on latency, leading to quicker response times for applications. Spreading resources across several zones further boosts fault tolerance and strengthens disaster recovery plans.

Conducting performance tests from different locations, both within the UK and internationally, can uncover regional differences in service quality. This ensures users across various time zones enjoy a consistent experience. For businesses with a global reach or a diverse customer base, this strategy is essential to maintain reliable performance no matter where users are located or what time it is.