How to Benchmark Service Mesh Throughput

Service mesh throughput measures how much data moves between microservices in Kubernetes environments. Benchmarking it helps organisations in the UK understand the trade-offs between performance and features such as security (e.g., mTLS) or observability. For instance, enabling mTLS in Istio can reduce throughput by up to 95% and add noticeable latency. This article explains how to set up a Kubernetes benchmarking environment, test service mesh configurations (such as Istio, Linkerd, and Cilium), and interpret the results to improve performance and reduce cloud costs.

Key Points:

  • Why it matters: Service meshes add overhead but offer features like encryption and traffic management. Benchmarking helps balance these benefits against performance impacts.
  • Setup: Use tools like Fortio and Meshery to simulate traffic and collect metrics (e.g., latency, CPU usage, error rates).
  • Testing: Compare baseline Kubernetes networking with sidecar-based (e.g., Istio, Linkerd) and sidecarless (e.g., Cilium) architectures.
  • Results: Istio often has the highest overhead, while sidecarless options like Cilium are more efficient.
  • Cost impact: High resource usage can double cloud expenses. Optimising configurations can save £50,000+ annually.

This guide includes step-by-step instructions for benchmarking and tips for improving service mesh performance while managing costs.

Video: A Service Mesh Benchmark You Can Trust - Denis Jannot, solo.io

Setting Up the Benchmarking Environment

Creating a reliable benchmarking environment requires a consistent setup using standardised tools and infrastructure. The key to accurate service mesh performance testing is replicating production-like conditions while minimising variables that could distort results.

Preparing a Kubernetes Cluster


A Kubernetes cluster serves as the foundation for dependable service mesh benchmarking. To ensure accurate results, nodes should have at least 4 vCPUs and 8–16 GB of RAM, though 8 vCPUs and 32 GB of RAM are recommended for more precise testing. Use dedicated network interfaces or SR-IOV, isolate benchmark nodes with namespaces and node selectors, and keep clocks synchronised across nodes. Pinning virtual machine threads to specific CPU cores and keeping each node within a single NUMA node also helps minimise cross-NUMA memory access, which is crucial for reducing tail latency during tests [3].

To maintain consistency, tools like Terraform or Ansible are invaluable. They eliminate manual configuration discrepancies that could introduce errors. By standardising node images and automating network configurations, you ensure that every test starts from an identical baseline.

Node isolation is equally important. Disable autoscaling and background jobs, and use dedicated namespaces and node selectors to prevent interference from external workloads. These steps help create a controlled environment for benchmarking.
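As a concrete illustration, here is a minimal sketch of that isolation step, assuming a hypothetical node name (bench-node-1) and a dedicated mesh-bench namespace that the later examples reuse:

```bash
# Dedicated namespace for all benchmark workloads.
kubectl create namespace mesh-bench

# Reserve a node for benchmark pods only (node name is hypothetical).
kubectl label node bench-node-1 workload=mesh-bench
kubectl taint node bench-node-1 dedicated=mesh-bench:NoSchedule

# Benchmark pods then opt in via their pod spec, e.g.:
#   nodeSelector:
#     workload: mesh-bench
#   tolerations:
#     - key: dedicated
#       value: mesh-bench
#       effect: NoSchedule
```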

Installing and Configuring Benchmarking Tools

Deploy Fortio either as a pod or a standalone binary on dedicated nodes [4]. Configure it to specify target endpoints, traffic parameters, and measurement intervals. Fortio is particularly effective at generating HTTP and gRPC traffic, providing detailed latency percentiles which are critical for performance analysis.
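The exact deployment depends on your cluster, but a minimal sketch might look like the following, assuming the mesh-bench namespace from the earlier example and the fortio/fortio image, whose entrypoint exposes the fortio binary:

```bash
# Deploy a Fortio echo server and expose it inside the benchmark namespace.
kubectl create deployment fortio-server --image=fortio/fortio -n mesh-bench -- fortio server
kubectl expose deployment fortio-server --port=8080 -n mesh-bench

# Quick smoke test from a one-off client pod:
# 1,000 requests/second, 16 connections, 60 seconds, HTTP.
kubectl run fortio-client --image=fortio/fortio -n mesh-bench --restart=Never -- \
  load -qps 1000 -c 16 -t 60s http://fortio-server:8080/

# Once the run finishes, the latency percentiles appear in the pod log.
kubectl logs fortio-client -n mesh-bench
```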

Meshery is another excellent tool, offering orchestration capabilities via Helm charts or Kubernetes operators [3]. Its user-friendly web interface simplifies the creation of test scenarios and supports cross-mesh performance comparisons. Meshery integrates seamlessly with various service mesh implementations, making it a versatile option for benchmarking workflows.
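Installation details vary between Meshery releases; a hedged sketch using the project's documented Helm repository might look like this:

```bash
# Chart location as documented by the Meshery project; verify against current docs.
helm repo add meshery https://meshery.io/charts/
helm repo update
helm install meshery meshery/meshery --namespace meshery --create-namespace

# The mesheryctl CLI can then drive SMP-compliant performance tests;
# see 'mesheryctl perf --help' for the profile options in your version.
```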

The Service Mesh Performance (SMP) standard provides a unified approach to metrics collection and reporting across different service meshes. When integrated with Meshery, SMP automates result collection, while its standalone CLI offers flexibility for custom tests [5]. SMP also provides a universal performance index, making it easier to compare different service mesh configurations.

For UK-specific deployments, configure tools to use appropriate time zones, date formats, and measurement units. Enable persistent storage for test results to facilitate historical analysis and identify performance trends over time.

With benchmarking tools in place, the next step is to deploy and test the service mesh solutions.

Deploying Service Mesh Solutions

After setting up the benchmarking tools, deploy service mesh solutions to evaluate their impact on traffic management. Document all versions and configuration flags to ensure reproducibility across tests.

For Istio, use the official Helm charts, configuring sidecar injection (automatic or manual) as needed. Note that enabling mTLS can significantly affect performance, with studies indicating throughput reductions of up to 95% [4].
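A minimal installation sketch, assuming the official Istio Helm repository and the mesh-bench namespace used in the earlier examples, could look like this:

```bash
# Official Istio Helm charts: base CRDs first, then the istiod control plane.
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
helm install istio-base istio/base -n istio-system --create-namespace
helm install istiod istio/istiod -n istio-system --wait

# Enable automatic sidecar injection for the benchmark namespace,
# then restart workloads so the proxies are injected.
kubectl label namespace mesh-bench istio-injection=enabled
kubectl rollout restart deployment -n mesh-bench
```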

Linkerd focuses on simplicity, with its CLI tool offering built-in validation checks and automatic proxy injection, requiring less configuration effort compared to Istio.
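A typical installation sequence for the Linkerd scenario (again assuming the mesh-bench namespace) might be:

```bash
# Pre-flight validation, CRDs, control plane, then post-install checks.
linkerd check --pre
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
linkerd check

# Annotate the benchmark namespace for automatic proxy injection and restart workloads.
kubectl annotate namespace mesh-bench linkerd.io/inject=enabled
kubectl rollout restart deployment -n mesh-bench
```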

Cilium provides both traditional sidecar-based and modern sidecarless architectures. In sidecarless mode, node-level agents handle service mesh functionality, eliminating the need for individual container proxies [4]. Configuration options include WireGuard or IPSec encryption for secure communication benchmarking. Cilium’s eBPF-based approach often results in lower resource usage compared to sidecar-based setups.
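As a rough sketch, the Cilium CLI can pass these options as Helm values; value names change between releases, so treat this as indicative rather than definitive:

```bash
# kubeProxyReplacement=true applies to recent Cilium releases (older ones used "strict");
# encryption.type can be wireguard or ipsec depending on the scenario under test.
cilium install \
  --set kubeProxyReplacement=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard

# Wait until the agent and operator report healthy before running any tests.
cilium status --wait
```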

For comparative testing, deploy the same workloads across different service mesh configurations. Include baseline tests without a service mesh, tests with sidecar-based setups, and tests using sidecarless implementations where applicable.

Hokstad Consulting’s automation expertise can help streamline this process. Their DevOps services simplify repetitive infrastructure tasks, ensuring consistent benchmarking environments and reducing setup time. This approach enables UK organisations to establish robust testing frameworks that reflect real-world performance characteristics.

Designing and Running Benchmark Tests

When setting up benchmark tests for service mesh implementations, it's essential to isolate variables to accurately measure their impact compared to a baseline.

Baseline Testing Without a Service Mesh

Start by creating a baseline using Kubernetes' native networking via kube-proxy. This baseline acts as your reference point to understand the overhead introduced by service mesh components [4].

Deploy a simple client-server application without any service mesh components. Use identical pod specifications and resource allocations that will remain consistent across all tests. The client should generate HTTP or gRPC traffic to the server using tools like Fortio, which provides detailed latency percentiles crucial for analysis.

Run each test at a constant request rate for 5 minutes, repeating the process to ensure the results are stable and show minimal variance [4]. Gather key metrics such as throughput (requests per second), latency percentiles (50th, 99th), CPU usage, memory consumption, and error rates. These metrics will serve as your performance benchmark.
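A hedged sketch of such a baseline run, reusing the Fortio server and mesh-bench namespace from the setup section, might look like this:

```bash
# Three repeated 5-minute baseline runs at a constant 1,000 req/s over 16 connections.
# Each run prints Fortio's latency percentiles and the actual achieved QPS.
for run in 1 2 3; do
  kubectl run "fortio-baseline-${run}" -i --rm --restart=Never \
    --image=fortio/fortio -n mesh-bench -- \
    load -qps 1000 -c 16 -t 300s -labels "baseline run${run}" \
    http://fortio-server:8080/ | tee "baseline-run-${run}.txt"
done
```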

To ensure accurate results, keep the test environment free from external interference. Use dedicated namespaces and node selectors to maintain isolation during testing.

Once you've established your baseline, you can apply these principles to service mesh configurations.

Configuring Service Mesh Test Scenarios

Using the baseline as your reference, test service mesh scenarios under identical conditions. Maintain consistent workloads, durations, and infrastructure to isolate the impact of the service mesh architecture [4].

For sidecar-based architectures like Istio and Linkerd, inject sidecar proxies into your application pods. Test two configurations: one with default settings (no mTLS) and another with mTLS enabled. When enabling mTLS, document the cryptographic algorithms, certificate management processes, and whether enforcement applies universally or to specific traffic routes.
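For Istio, namespace-wide mTLS is typically toggled with a PeerAuthentication resource; a minimal sketch for the benchmark namespace might be:

```bash
# Enforce mutual TLS for every workload in the benchmark namespace (Istio).
# Apply it for the mTLS scenario, then delete it for the default (no-mTLS) scenario.
kubectl apply -f - <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: mesh-bench-strict
  namespace: mesh-bench
spec:
  mtls:
    mode: STRICT
EOF
```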

It's worth noting that mTLS can have a noticeable impact on performance. For example, Istio's sidecar configuration often results in reduced throughput and increased latency [2][4].

For sidecarless architectures like Cilium or Istio Ambient, deploy per-node agents and configure mTLS as needed. These setups tend to have lower resource overhead compared to traditional sidecar-based meshes, making them a popular choice for environments where performance is critical [4].

Record every configuration parameter for each test scenario, including mesh version, injection methods, security policies, and node placement. This documentation ensures that tests are reproducible and facilitates meaningful comparisons between different implementations.

Generating Workloads and Traffic Patterns

With your test scenarios ready, the next step is to simulate traffic patterns that mimic real-world usage. Use Fortio to generate HTTP and gRPC traffic at varying request rates, concurrency levels, and payload sizes that reflect your production environment [4].

Design traffic patterns that include both steady and bursty loads. Begin with lower request rates and gradually increase them to observe how each service mesh configuration scales. Test at rates such as 100, 500, 1,000, and 2,000 requests per second with 8, 16, and 32 concurrent connections. This approach helps build a comprehensive performance profile.

Ensure consistency in request rates, connection counts, and durations across all scenarios. Monitor resource usage during traffic generation, as high CPU consumption on client nodes can become a bottleneck, potentially skewing results. If needed, deploy multiple client instances or use dedicated load generation nodes to avoid these issues.
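A simple sweep over those rates and connection counts could be scripted as follows; the scenario label is purely illustrative:

```bash
# Sweep the request rates and connection counts from the test matrix above,
# tagging each run so results can be matched to their scenario later.
SCENARIO="istio-mtls"   # change per configuration under test
for qps in 100 500 1000 2000; do
  for conns in 8 16 32; do
    kubectl run "fortio-${SCENARIO}-${qps}-${conns}" -i --rm --restart=Never \
      --image=fortio/fortio -n mesh-bench -- \
      load -qps "${qps}" -c "${conns}" -t 300s -labels "${SCENARIO} ${qps}qps ${conns}c" \
      http://fortio-server:8080/ | tee "${SCENARIO}-${qps}qps-${conns}c.txt"
  done
done
```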

During each test, capture detailed metrics like requests per second, latency percentiles, error rates, and resource usage on both client and server nodes. The Service Mesh Performance (SMP) standard provides a uniform framework for collecting and reporting these metrics, making it easier to compare results across different service mesh implementations [3][5].

Store all test results with timestamps and configuration details for future analysis. This data is invaluable for identifying trends and validating performance improvements over time, helping you optimise both performance and costs in cloud environments.
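If each run is also saved as a Fortio JSON report (for example by adding the -json flag and redirecting the output to a file), the headline numbers can be pulled out for comparison. The field names below reflect recent Fortio releases, so verify them against your own output:

```bash
# Extract throughput and the 50th/99th percentile latencies (converted from
# seconds to milliseconds) from a saved Fortio JSON report.
jq -r '[
  .ActualQPS,
  (.DurationHistogram.Percentiles[] | select(.Percentile == 50) | .Value * 1000),
  (.DurationHistogram.Percentiles[] | select(.Percentile == 99) | .Value * 1000)
] | @csv' baseline-run-1.json
```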


Analysing Benchmark Results

When evaluating service mesh configurations, analysing benchmark data is essential to balance performance needs with operational costs. By interpreting the metrics properly, you can make decisions that align with both user expectations and budget constraints.

Key Metrics for Throughput Analysis

Certain metrics are pivotal in assessing infrastructure performance and its impact on user experience. Latency percentiles are particularly useful. The 50th percentile shows typical response times, while the 99th percentile highlights worst-case scenarios, often referred to as tail latency. For example, Istio's sidecar proxy adds around 3ms latency at the 50th percentile and 10ms at the 99th percentile under a load of 1,000 RPS across 16 connections [2]. Even small increases in latency can have a significant effect under heavy loads.

Resource consumption, specifically CPU and memory usage, is another critical factor. These directly affect monthly costs, as higher usage often requires larger and more expensive instances. Additionally, error rates provide insight into reliability issues that could impact service level objectives (SLOs).

Pay attention to the gap between latency percentiles. A large disparity between the 50th and 99th percentiles often signals bottlenecks or instability under load, which can negatively affect latency-sensitive applications. Identifying these variations is key to understanding performance issues and making meaningful comparisons.

Presenting and Comparing Results

Clear presentation of benchmark results is essential for identifying trends and making decisions. A simple markdown table can effectively summarise key metrics:

| Configuration | RPS | 50th Latency (ms) | 99th Latency (ms) | CPU Usage (%) | Memory Usage (MB) | Error Rate (%) |
|---------------|-----|-------------------|-------------------|---------------|-------------------|----------------|
| Baseline | 10,000 | 2 | 5 | 30 | 200 | 0 |
| Istio | 2,000 | 5 | 15 | 60 | 400 | 0.5 |
| Linkerd | 4,000 | 3 | 8 | 40 | 250 | 0.2 |

This table illustrates the performance trade-offs clearly. Istio introduces the most overhead, cutting throughput by 80% compared to the baseline, while also doubling resource usage. Linkerd, on the other hand, retains 40% of baseline throughput with more moderate resource demands.

It's important to account for variability across multiple test runs. High variance may indicate environmental or configuration issues that could skew results. Using the Service Mesh Performance (SMP) specification ensures consistency in how metrics are collected and reported, making comparisons more reliable [3][5].

Always include a baseline for reference. Without it, you can’t accurately measure the impact of introducing a service mesh.

Impact on Cloud Costs and DevOps

Benchmark results have implications beyond technical performance - they directly influence costs and operational strategies. Increased CPU and memory usage translates to higher monthly cloud expenses. For instance, if your baseline setup runs on 4-core instances costing £200 per month, doubling CPU usage due to a service mesh could require 8-core instances at £400 per month. Across a large-scale deployment, these costs can add up quickly.

Performance degradation also affects operational complexity. Reduced throughput may require additional replicas to maintain service levels, increasing resource usage and complicating scaling efforts. Higher latency might trigger false alerts in monitoring systems or necessitate changes to health checks and circuit breaker settings.

However, these challenges also present opportunities for optimisation. For example, if your benchmarks show minimal benefit from universal mTLS in your environment, disabling it could significantly improve performance and reduce resource demands.

Hokstad Consulting has demonstrated that UK businesses can achieve 30-50% reductions in cloud infrastructure costs by applying optimisation strategies informed by performance analysis [1]. Their approach combines benchmark insights with practical cost-saving techniques, helping organisations fine-tune their service mesh deployments for both performance and budget efficiency.

Use your benchmark data to weigh the trade-offs between features and performance. Not every service mesh capability, such as detailed telemetry or advanced routing rules, will be worth the performance cost in every scenario. By tailoring configurations to your specific needs and budget, you can strike the right balance and optimise your setup for both performance and cost management.

Best Practices for Service Mesh Benchmarking

Benchmarking a service mesh demands a disciplined and consistent approach to ensure the results are reliable, repeatable, and useful for enhancing performance over time.

Ensuring Consistent Test Results

To get dependable results, it's crucial to isolate your benchmarking environment. Dedicate specific nodes or entire clusters, and keep configurations identical. This prevents external factors, like background workloads, from skewing your measurements and avoids the noisy neighbour issues often seen in shared environments.

Running benchmarks multiple times is key to statistical reliability. Aim to execute each test scenario at least three to five times. Use averages and variance metrics to spot inconsistencies. If you notice high variance, it could signal environmental issues or configuration drift that need fixing before you can trust the results.

Timing also plays a role. Cloud environments often experience fluctuating loads throughout the day, which can impact network latency and resource availability. Schedule your benchmarks during off-peak hours and steer clear of maintenance windows or major deployments that might interfere with performance.

For high-fidelity benchmarks, pin processes to specific CPU cores. This keeps the scheduler from moving processes between cores, reducing context switching and improving consistency. Disabling CPU frequency scaling during tests also ensures stable processing power throughout the benchmarking process.
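On Linux nodes, a hedged sketch of those two steps might look like this; the cpupower tool and the kubelet configuration options depend on your distribution and Kubernetes version:

```bash
# Lock CPU frequency governors to 'performance' on each benchmark node
# (requires the cpupower utility, typically from the linux-tools package).
sudo cpupower frequency-set -g performance

# For exclusive cores, enable the kubelet's static CPU manager and give
# benchmark pods Guaranteed QoS with integer CPU requests so they are pinned
# to dedicated cores. Kubelet config snippet:
#   cpuManagerPolicy: static
#   reservedSystemCPUs: "0,1"
```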

Once you've established consistent testing methods, detailed documentation becomes essential for translating raw data into actionable insights.

Documenting and Reporting Results

Good documentation turns raw test data into something useful. Record all test parameters - such as the date (in DD/MM/YYYY format), cluster specifications, and software versions - using standardised templates. Version-controlled scripts make it easier to replicate tests and maintain consistency.

Save both raw and processed data. This allows team members to verify calculations or dive deeper into the analysis if needed. When reporting, present cost-related data in pounds sterling (£) to maintain clarity across teams and test runs.

Visual aids can help, but they shouldn't replace detailed numbers. For example, charts showing latency distributions or throughput trends can highlight patterns that might not stand out in tables. Just ensure these visuals complement, rather than overshadow, the underlying data.

If you encounter anomalies or unexpected results, document them thoroughly. These observations often reveal valuable insights into system behaviour. Include any troubleshooting steps you took and their outcomes so others can learn from your experience.

Improving Service Mesh Performance

Improving service mesh performance starts with identifying where overhead occurs and applying targeted solutions. You can tweak proxy settings, such as connection pools, or explore alternatives like sidecarless architectures to cut down on resource consumption. Other strategies include enabling mutual TLS (mTLS) selectively, disabling unused features, and fine-tuning resource limits based on actual usage.

Sidecarless architectures are a promising way to reduce overhead. Traditional sidecar models, like those used in Istio, add extra containers per pod, consuming CPU and memory while introducing additional network hops. Alternatives like Cilium or Istio Ambient centralise proxy functions, which can significantly lower resource usage per pod. Benchmarks show these models often deliver better throughput and lower latency, especially at scale.

When configuring mTLS, balance security needs with performance. While mTLS provides robust security, it can introduce cryptographic overhead that slows things down. Instead of applying it universally, consider enabling it only for sensitive services. This approach maintains security where it matters most without unnecessary performance hits.
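Using Istio as an example, a hedged sketch of selective enforcement keeps the mesh-wide default permissive and applies STRICT mode only to a hypothetical payments namespace:

```bash
# Mesh-wide default stays permissive; strict mTLS is enforced only where
# sensitive traffic flows (the 'payments' namespace is illustrative).
kubectl apply -f - <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: PERMISSIVE
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: payments-strict
  namespace: payments
spec:
  mtls:
    mode: STRICT
EOF
```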

Simplifying the data path can also improve performance. Each proxy a request passes through adds latency and consumes resources. Review your service mesh configuration to eliminate unnecessary routing complexity while maintaining functionality.

Finally, make benchmarking a regular part of your workflow. Integrate performance tests into your CI/CD pipelines to catch issues early and monitor the impact of configuration changes over time. This proactive approach helps you stay ahead of performance problems rather than reacting to them after they occur.

By refining these practices, you can ensure your benchmarks provide meaningful insights while aligning with operational and cost goals. These steps not only improve benchmark accuracy but also support broader DevOps efforts and cost efficiency.

Hokstad Consulting's expertise in DevOps transformation and cloud cost engineering is particularly helpful for UK businesses looking to implement these strategies. Their methodical approach to performance analysis helps organisations identify impactful changes while staying within budget and meeting operational needs.

Conclusion

Drawing from the detailed testing and analysis discussed earlier, this section highlights the key insights needed to fine-tune service mesh performance. Benchmarking service mesh throughput plays a crucial role in making informed Kubernetes decisions. The results shed light on performance trade-offs that directly influence efficiency and cloud costs, offering a clearer understanding of the balance between performance and expenditure in service mesh solutions.

Key Takeaways

The benchmarking process underscores how mesh selection and configuration significantly affect performance. For example, enabling mTLS (mutual TLS) introduces noticeable latency, with the extent varying across different service mesh implementations. At higher traffic levels, such as 1,000 requests per second, these performance impacts become particularly evident, as shown in previous tests[2].

Baseline testing - conducted without a service mesh - provides a critical reference point. It allows organisations to measure the additional overhead introduced by a service mesh, helping them evaluate whether the benefits of enhanced security and observability outweigh the performance costs.

Emerging approaches, such as sidecarless architectures, present compelling alternatives to traditional sidecar models. Among the sidecar-based solutions tested, Linkerd consistently delivered better performance while using fewer resources[8].

Reliable benchmarking depends on consistent measurement practices. By running multiple test iterations, isolating environments, and adhering to standards such as the Service Mesh Performance (SMP) specification, organisations can achieve credible results. These practices enable meaningful comparisons across different service mesh solutions[3][5].

The financial impact of these performance differences is substantial. Benchmarks reveal that some service meshes not only reduce latency but also consume fewer resources, leading to lower cloud infrastructure costs and more efficient utilisation of compute resources[6]. With 70% of organisations already using service meshes in production or development, as reported by CNCF surveys[7], these differences can have a significant effect on operational budgets.

How Hokstad Consulting Can Help


Turning these performance insights into actionable strategies often requires expert guidance. Hokstad Consulting provides invaluable support for UK businesses looking to optimise service mesh implementations. Their expertise in DevOps transformation and cloud cost engineering helps companies identify impactful changes that balance performance with cost-efficiency.

Through their Cloud Cost Engineering service, Hokstad Consulting has enabled businesses to reduce infrastructure costs by 30-50%, all while improving system performance.

Additionally, their custom development and automation services simplify the benchmarking process by embedding performance tests directly into CI/CD pipelines. This integration ensures that performance issues are identified early and that the effects of configuration changes are continuously monitored, supporting long-term improvement in service mesh management.

Hokstad Consulting also offers a free assessment to help businesses uncover specific optimisation opportunities within their current setups. By tailoring their approach to align service mesh solutions with both technical needs and business goals, they ensure that benchmarking insights lead to measurable, long-term success.

FAQs

What impact does enabling mTLS in a service mesh like Istio have on performance, and how can these effects be minimised?

Enabling mTLS (mutual Transport Layer Security) in a service mesh like Istio is a smart move for boosting security. It encrypts communication between services, ensuring data stays protected. But there’s a catch - this added layer of security can come with a performance cost. You might notice reduced throughput and increased latency because encryption and decryption require extra computational power.

To keep these effects in check, you can tweak the service mesh configuration. This could mean fine-tuning resource limits, cutting down on unnecessary service-to-service traffic, or even upgrading your infrastructure to better handle the workload. The key is to run thorough performance tests and benchmarks to figure out what works best for your setup.

Hokstad Consulting offers expertise in optimising DevOps workflows and cloud infrastructure, helping businesses strike the right balance between performance and cost efficiency.

What are the performance and resource usage benefits of using sidecarless architectures like Cilium compared to traditional sidecar-based service meshes?

Sidecarless architectures, like Cilium, bring notable benefits in terms of performance and resource efficiency compared to traditional sidecar-based service meshes. By doing away with the need for sidecar proxies, these architectures eliminate the extra overhead of running additional containers alongside your applications. The result? Lower CPU and memory consumption, which means more resources are available for other workloads.

Another advantage is the more direct and streamlined data paths that sidecarless designs provide. This can boost throughput and minimise latency in service-to-service communication. These qualities make sidecarless architectures an attractive choice for environments where performance and resource efficiency are top priorities - think high-traffic Kubernetes clusters or deployments where controlling costs is crucial.

How does benchmarking service mesh performance help reduce cloud costs, and what steps can organisations take to optimise their setup?

Benchmarking the performance of a service mesh plays a key role in spotting inefficiencies in how resources are used and boosting overall system efficiency. By carefully analysing throughput and fine-tuning configurations, organisations can ensure their resources are appropriately allocated, which can significantly cut down on cloud infrastructure expenses.

Hokstad Consulting offers specialised expertise in cloud cost engineering and DevOps optimisation. They help businesses achieve savings of 30–50% while also improving performance. Their customised strategies are designed to enhance resource allocation and streamline deployment processes, ensuring organisations maximise the value of their cloud investments.