Flagger is a Kubernetes tool that automates progressive delivery strategies like canary and blue-green deployments. This approach reduces deployment risks by gradually rolling out updates to a small subset of users, monitoring performance, and rolling back automatically if issues arise. It integrates with service meshes (e.g., Istio, Linkerd) and monitoring tools like Prometheus to ensure smooth and reliable updates.
Key Benefits for UK Businesses:
- Reduced Risk: Gradual rollouts catch issues early, minimising disruptions.
- Faster Release Cycles: Companies report 20–30% quicker deployment times.
- Cost Savings: Automating processes reduces manual intervention and failed rollouts.
What You’ll Need:
- A Kubernetes cluster (v1.16+).
- A service mesh or ingress controller (e.g., Istio, Linkerd, NGINX).
- A metrics provider like Prometheus for monitoring.
- Optional: Horizontal Pod Autoscaler (HPA) for scaling.
Setup Steps:
- Prepare the environment by verifying Kubernetes, service mesh, and Prometheus configurations.
- Install Flagger using Helm or FluxCD, depending on your preference.
- Configure Flagger to integrate with your service mesh and metrics provider.
- Define a Canary resource to manage traffic shifts and monitor key metrics like error rates and latency.
Why It Works: By automating traffic management and monitoring, Flagger ensures that only stable updates are promoted. For UK organisations, this means fewer incidents, better service reliability, and compliance with local standards.
If you're ready to improve your deployment process, Flagger offers a reliable way to implement progressive delivery with minimal manual effort.
Flagger on Kubernetes: Progressive Delivery and Canary Deployments

Prerequisites and Environment Setup
Before diving into Flagger, it's crucial to ensure your Kubernetes environment is set up correctly. This helps avoid deployment hiccups and ensures smooth operation.
Required Tools and Components
To get started, you'll need the following:
- Kubernetes cluster (v1.16 or newer): Flagger depends on this minimum version to access APIs needed for progressive delivery automation [1].
- Service mesh or ingress controller: These handle traffic routing between stable and canary deployments. Flagger works with several options like Istio, Linkerd, AWS App Mesh, Open Service Mesh (OSM), and ingress controllers such as NGINX Ingress [1]. Each has its strengths - Istio, for example, offers advanced traffic management, while Linkerd is a simpler, lightweight option.
- Prometheus metrics provider: Metrics are key for Flagger's decision-making. Prometheus (or alternatives like Datadog or Tetrate OAP) collects data on error rates and latency, enabling Flagger to decide whether to promote or roll back deployments [1].
- Horizontal Pod Autoscaler (HPA) (optional): This works alongside Flagger to manage performance during traffic shifts, ensuring resource scaling policies are upheld [1].
| Component | UK-Specific Notes |
|---|---|
| Kubernetes v1.16+ | Use metric units like millicores (CPU) and MiB (memory). |
| Service Mesh/Ingress | Customise dashboards to display dates in DD/MM/YYYY format. |
| Prometheus | Configure cost metrics in GBP (£). |
| HPA (Optional) | Use Celsius for hardware temperature alerts. |
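If you plan to pair Flagger with the HPA, a minimal autoscaler manifest for a hypothetical webapp deployment might look like the sketch below (the name, namespace, replica counts, and CPU target are all illustrative, not Flagger defaults):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp          # the deployment Flagger will manage
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```

Flagger can reference an autoscaler like this from its Canary resource, so scaling policies carry over to both the primary and canary workloads.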
Environment Configuration
Namespace isolation is a best practice for managing Flagger's resources. Assign Flagger to a dedicated namespace, such as flagger-system, istio-system, or linkerd, to keep its operations separate from your applications [1]. For application namespaces, ensure they are labelled correctly. For example, enabling Istio's automatic sidecar injection requires this command:
kubectl label namespace <namespace> istio-injection=enabled
This ensures pods in the namespace get the necessary proxy containers [1].
Resource limits are crucial for controlling Flagger's resource usage. Define CPU limits in millicores (e.g., 500m) and memory in mebibytes (e.g., 512Mi), adhering to UK metric conventions. This approach also helps UK organisations manage cloud costs effectively, especially when budgets are calculated in pounds sterling.
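As a sketch, a container resources block using millicore and mebibyte units might look like this (the specific values are illustrative starting points to tune for your workload):

```yaml
resources:
  requests:
    cpu: 100m      # 0.1 CPU core
    memory: 128Mi
  limits:
    cpu: 500m      # 0.5 CPU core
    memory: 512Mi
```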
Traffic management policies guide how traffic is distributed between deployments. Start with small traffic percentages (e.g., 5–10%) and adjust based on your organisation's tolerance for risk. This gradual approach is particularly suited to sectors like UK financial services, which often prefer conservative strategies.
Monitoring dashboards should be configured with UK localisation in mind. For example, ensure Prometheus and Grafana display dates in DD/MM/YYYY format, use commas for thousand separators, and show currency values in GBP (£) for cost-related metrics [1].
Checking Prerequisites
Before proceeding, verify that your environment meets these requirements:
- Cluster health: Check that all nodes are Ready by running:
  kubectl get nodes
  Confirm the cluster version with:
  kubectl version --short
- Service mesh deployment: Ensure the service mesh pods are running, typically in namespaces like istio-system or linkerd:
  kubectl get pods -n <namespace>
- Prometheus connectivity: Locate the Prometheus service and confirm it's accessible on port 9090:
  kubectl get svc -n <namespace>
- Custom Resource Definitions (CRDs): Verify the Canary CRD is installed:
  kubectl get crd
  If it's missing, install it before deploying Flagger.
- HPA availability (if used): Check scaling policies across application namespaces:
  kubectl get hpa
For troubleshooting, use kubectl describe to inspect resources or check pod logs with kubectl logs <pod-name> -n <namespace>. Common issues include missing CRDs, incorrect service mesh injection labels, or Prometheus connection problems.
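The checks above can be gathered into a small script for repeated use; the mesh namespace here is an assumption to adjust for your cluster, and the script requires a working kubeconfig:

```shell
#!/usr/bin/env bash
set -uo pipefail

MESH_NS=istio-system   # or linkerd, appmesh-system

echo "== Cluster nodes =="
kubectl get nodes

echo "== Service mesh pods =="
kubectl get pods -n "$MESH_NS"

echo "== Prometheus service =="
kubectl get svc -n "$MESH_NS" | grep -i prometheus || echo "No Prometheus service found in $MESH_NS"

echo "== Canary CRD =="
kubectl get crd canaries.flagger.app || echo "Canary CRD missing - install it before deploying Flagger"

echo "== HPAs =="
kubectl get hpa -A
```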
Once your environment is ready, you're all set to move on to installing and configuring Flagger.
Installing and Configuring Flagger
Once your environment is set up, you can move on to installing Flagger and connecting it to your chosen service mesh. The process is straightforward and can be tailored to your needs.
Installing Flagger with Helm or FluxCD

Helm installation is the fastest way to get Flagger up and running. To start, add the Flagger Helm repository and install the required Custom Resource Definition (CRD):
helm repo add flagger https://flagger.app
kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml
This two-step process ensures the Canary CRD is installed before proceeding with Flagger.
FluxCD installation, on the other hand, takes a GitOps approach. It offers full traceability by scanning Flagger OCI artifacts and automatically deploying manifests. This method integrates progressive delivery directly into your GitOps workflow.
| Installation Method | Best For | Key Benefits |
|---|---|---|
| Helm | Smaller teams, quick deployments | Quick setup, easy to script, immediate results |
| FluxCD | Larger teams, regulated industries | Version control, automated rollbacks, compliance-ready |
For UK organisations in highly regulated sectors like finance or healthcare, FluxCD's ability to track changes through Git repositories can be particularly appealing. While it has a steeper learning curve, it’s ideal for meeting compliance and change management requirements.
Regional considerations are crucial when deploying in UK cloud environments. Ensure your Prometheus instance is accessible within your virtual private cloud (VPC) or private cloud setup.
Once Flagger is installed, you can move on to integrating it with your service mesh.
Service Mesh Integration Setup
Flagger supports several service meshes, and the meshProvider parameter determines which one it connects to during installation.
Istio integration is one of the most common setups. To deploy Flagger in the istio-system namespace, use the following command:
helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus:9090
For Linkerd integration, the process is similar but uses the linkerd namespace:
helm upgrade -i flagger flagger/flagger \
--namespace=linkerd \
--set crd.create=false \
--set meshProvider=linkerd \
--set metricsServer=http://linkerd-prometheus:9090
If you’re working with AWS App Mesh, deploy Flagger in the appmesh-system namespace with the following configuration:
helm upgrade -i flagger flagger/flagger \
--namespace=appmesh-system \
--set crd.create=false \
--set meshProvider=appmesh \
--set metricsServer=http://appmesh-prometheus:9090
The crd.create=false parameter ensures Helm doesn’t try to recreate the Canary CRD you’ve already installed.
For organisations in the UK with distributed infrastructures, multi-cluster setups might be necessary. In multi-cluster Istio environments, Flagger must be installed on each remote cluster. It should also be configured to use the control plane's kubeconfig, which is stored in a Kubernetes secret.
To maintain security in UK cloud environments, ensure network policies allow Flagger to communicate with service mesh components and metrics providers. All communications should remain within your security boundaries and comply with data protection standards.
Connecting to Metrics Providers
Prometheus is Flagger’s primary tool for analysing metrics, helping it decide whether to promote or roll back deployments. The metricsServer parameter connects Flagger to your Prometheus instance.
To configure Prometheus, verify that it is reachable by Flagger on port 9090. Adjust your network policies if necessary to allow this communication.
Adding a pod annotation enables Prometheus to scrape metrics from your application pods. Use the following annotation:
prometheus.io/scrape: "true"
This ensures Prometheus can collect the data Flagger needs for automated decision-making. Adjust Prometheus’ retention settings to meet compliance requirements while maintaining enough historical data for analysis.
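In a Deployment manifest, these annotations belong on the pod template rather than the Deployment itself. The port annotation below is an assumption based on a typical application exposing metrics on 8080; adjust it to your metrics endpoint:

```yaml
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
```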
While Prometheus is the most commonly used option, Flagger also supports alternatives like Datadog and Tetrate OAP. Your choice will depend on your existing monitoring setup and specific needs.
For UK deployments, security is a priority. Make sure metrics data stays within UK-based cloud regions, and enforce strict access controls to protect sensitive performance metrics. Adding custom labels during installation can also help meet compliance or monitoring requirements. These labels can identify resources by region, compliance status, or other organisational criteria.
With these configurations in place, you’re ready to begin reliable canary deployments.
Implementing Progressive Delivery with Flagger
With Flagger installed and configured, you’re ready to dive into progressive delivery strategies. At its core, this process revolves around canary deployments, which gradually shift traffic to new versions while monitoring performance metrics to ensure a safe and smooth rollout.
Setting Up Canary Deployments
To create a canary deployment, you’ll define a Canary custom resource. This resource acts as a blueprint, linking your main deployment and service while outlining your progressive delivery strategy.
Here’s an example of a Canary resource for a typical web application using NGINX ingress and Prometheus:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: webapp
  namespace: production
spec:
  provider: nginx
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    metrics:
    - name: request-success-rate
      threshold: 99
    - name: request-duration
      threshold: 500
This configuration instructs Flagger to monitor the webapp deployment, manage traffic through NGINX, and evaluate metrics like request success rate and response time. Applying this resource triggers Flagger to generate additional Kubernetes objects, including a primary deployment (the stable version) and a canary deployment (the new version being tested). Canary deployments can be initiated by changes to the deployment’s PodSpec, ConfigMaps, or Secrets, offering flexibility in how updates are rolled out.
For UK organisations handling sensitive data, it’s crucial to include proper labels and annotations for compliance tracking. Additionally, ensure the provider field matches your service mesh or ingress controller, such as istio, linkerd, nginx, or appmesh.
Traffic Shifting and Metrics Analysis
Flagger automates traffic shifting by gradually increasing the percentage of traffic routed to the canary deployment. It continuously monitors metrics like HTTP success rates, average request durations, and pod health to ensure the new version performs as expected.
The traffic control process uses the APIs of your service mesh or ingress controller. For example, Flagger manages virtual services and destination rules when integrated with Istio, while with NGINX, it adjusts ingress annotations to regulate traffic flow.
During each analysis interval, Flagger’s control loop checks metrics against the thresholds you’ve set. In one example from 2023, a UK-based team using Azure Kubernetes Service configured Flagger to shift traffic in 10% increments while monitoring HTTP 5xx error rates and latency. When error rates exceeded the threshold, Flagger automatically rolled back the canary. Over six months, this approach reportedly reduced deployment-related outages by 40% [5].
For UK organisations, it’s critical to ensure metrics data remains within UK-based cloud regions and that audit trails are captured for compliance. Setting realistic thresholds is essential - strict thresholds may cause unnecessary rollbacks, while lenient ones could allow flawed deployments to proceed.
The interval and threshold settings in your Canary resource determine how quickly Flagger reacts to issues. For instance, with a one-minute interval and a threshold of five, Flagger promotes the canary after five consecutive successful checks. This typically results in full rollout completion within 5–10 minutes.
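The pacing described above is controlled by fields in the Canary's analysis block. This fragment (values illustrative) shifts traffic in 10% steps up to a 50% maximum before promotion, checking metrics every minute and tolerating up to five failed checks:

```yaml
analysis:
  interval: 1m     # how often metrics are evaluated
  threshold: 5     # failed checks before rollback
  maxWeight: 50    # stop shifting once canary receives 50% of traffic
  stepWeight: 10   # increase canary traffic by 10% per successful check
```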
Automated Rollback and Promotion
Flagger’s automated decision-making takes the uncertainty out of deployment management. If the canary doesn’t meet success criteria, Flagger immediately rolls back the deployment by redirecting traffic to the stable primary version and scaling down the canary.
If metrics like error rates or response times breach thresholds, Flagger reverts to the primary deployment to maintain uninterrupted service. On the other hand, if the canary passes all checks, Flagger promotes it by updating the primary deployment and service to use the new version. This ensures a seamless transition with zero downtime, including updates to container images and service routing.
For organisations requiring strict change control, Flagger’s decisions are logged with timestamps, metric values, and the reasoning behind each action. These logs are invaluable for post-incident reviews and compliance audits.
UK businesses prioritising reliability should customise success and failure thresholds based on their needs. Regularly revisiting and refining these thresholds, using historical data, can strike the right balance between speed and safety.
Finally, integrating Flagger with your alerting systems is a smart move. Notifications about rollback events keep your team informed, enabling them to investigate issues and prevent future problems effectively.
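Flagger's Helm chart can post promotion and rollback notifications to Slack; the webhook URL and channel below are placeholders to replace with your own:

```shell
helm upgrade -i flagger flagger/flagger \
  --namespace=istio-system \
  --reuse-values \
  --set slack.url=https://hooks.slack.com/services/YOUR/WEBHOOK/URL \
  --set slack.channel=deployments \
  --set slack.user=flagger
```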
Monitoring and Managing Deployments
Keeping a close eye on deployments is key to ensuring their smooth operation and addressing potential problems before they escalate. Flagger offers a range of metrics and logging tools that, when effectively visualised and interpreted, provide a clear picture of your deployment performance.
Creating Monitoring Dashboards
To get started, combine Prometheus and Grafana to build dashboards that highlight critical metrics like request success rates, average latency (in milliseconds), error rates, and traffic splits. During canary analysis, Flagger automatically makes these metrics available to Prometheus [1][2]. Include panels in your dashboards to track these indicators for a comprehensive view.
For Prometheus to collect the necessary data, configure it to scrape metrics from pods annotated with prometheus.io/scrape: "true". If your organisation requires detailed records, consider adding panels that log deployment events, including timestamps for promotions and rollbacks.
These dashboards serve as a foundation for assessing deployment performance and making informed decisions when adjustments are needed.
Deployment Performance Analysis
To evaluate how well your deployments are performing, it’s crucial to understand the metrics and logs Flagger generates. Through its canary analysis, Flagger provides status updates, evaluates metrics, and logs significant events. Aim to keep error rates below 1% and latency under 500ms for optimal results.
Pay close attention to traffic split percentages as Flagger gradually shifts traffic from the primary release to the canary version. Detailed event logs capture triggers for promotions, reasons for rollbacks, and breaches of metrics, complete with timestamps. This information helps pinpoint exactly when and why a deployment might have encountered issues [4][5]. By studying patterns across multiple deployments, you can identify recurring problems - such as consistently high latency during canary phases - and make necessary adjustments. For example, you might modify traffic shifting intervals or address underlying performance bottlenecks.
This kind of analysis helps you refine your deployment strategy and stay ahead of potential challenges.
Monitoring Best Practices
Once your dashboards are in place and you’ve gathered performance insights, follow best practices to maintain deployment reliability. Configure Prometheus alert rules for critical situations, such as error rates exceeding 1%, latency going over 500ms, or failed canary promotions. Use Grafana notifications to ensure your team receives timely alerts via email, Slack, or other channels [1][3]. For UK-based operations, ensure alert timestamps are adjusted for GMT or BST, and include specific metric values along with actionable recommendations in the alert messages.
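If you run the Prometheus Operator, a rule along these lines can alert on failed canary analyses. It assumes Flagger's flagger_canary_status gauge (exposed by Flagger itself); label names may vary between versions, so treat this as a sketch:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: flagger-alerts
  namespace: istio-system
spec:
  groups:
  - name: flagger
    rules:
    - alert: CanaryRollback
      # status values above 1 indicate a failed canary analysis
      expr: flagger_canary_status > 1
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "Canary analysis failed for {{ $labels.name }} in {{ $labels.namespace }}"
```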
To avoid unnecessary alerts and ensure critical issues are flagged, review your alert thresholds monthly. As your environment evolves, update your dashboards to reflect changes in application architecture or metric naming, and schedule regular reviews to keep everything accurate [1][2].
Test rollback procedures in non-production environments to confirm that Flagger’s automation works as expected. Maintain runbooks documenting common deployment patterns and their typical metric behaviours. These can help team members distinguish between normal variations and incidents that require immediate attention.
If you’re looking for expert advice to optimise your monitoring setup, Hokstad Consulting can assist in customising dashboards, alerting strategies, and automation to meet your organisation’s specific needs while adhering to UK operational standards.
Uninstalling Flagger and Resource Cleanup
When it's time to remove Flagger from your Kubernetes cluster, it's important to follow a clear process to ensure no leftover resources clutter your environment. This involves uninstalling the application, deleting CRDs, and verifying that all associated resources are completely removed.
Removing Flagger with Helm or Kubectl
If Flagger was installed using Helm, you can uninstall it by running the following command in the appropriate namespace:
helm delete flagger
This command removes all Kubernetes components tied to the Flagger Helm chart and deletes the release. However, it does not automatically remove the Canary Custom Resource Definition (CRD), meaning some custom resources and objects created by Flagger might remain.
For installations done with kubectl (e.g., via Kustomize or direct YAML manifests), you can remove the Flagger deployment and associated resources with:
kubectl delete -f <manifest.yaml>
Again, while this removes Flagger itself, the Canary CRD and any related custom resources will still be present.
To fully clean up, make sure to delete the Canary CRD by running:
kubectl delete crd canaries.flagger.app
This step ensures all resources tied to the CRD are removed.
Cleaning Up Remaining Resources
After uninstalling Flagger, it's critical to check for and remove any remaining resources to keep your cluster tidy. Use the following kubectl commands to identify leftover Canary resources, deployments, or services:
- To list all Canary resources:
  kubectl get canaries --all-namespaces
- To find lingering deployments:
  kubectl get deployments -A | grep flagger
- To check for Flagger-related services:
  kubectl get svc -A | grep flagger
If any resources are found, delete them manually. For example, to remove a deployment:
kubectl delete deployment <name> -n <namespace>
To delete a service:
kubectl delete svc <name> -n <namespace>
Additionally, review and remove any Flagger-specific namespace labels or annotations that are no longer needed. This step ensures your cluster remains well-organised and compliant with operational standards.
For larger or more complex environments, consider automating this process with a shell script. The script can sequentially execute the necessary commands - like helm delete, kubectl delete crd, and resource cleanup commands - to provide a consistent and repeatable cleanup routine.
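A minimal cleanup script along those lines might look like this (the namespace is an assumption; run with care, as it deletes resources):

```shell
#!/usr/bin/env bash
set -uo pipefail

FLAGGER_NS=istio-system   # namespace where Flagger was installed

# Remove the Helm release (ignore errors if already gone)
helm uninstall flagger -n "$FLAGGER_NS" || true

# Delete any remaining Canary resources across all namespaces
kubectl delete canaries --all --all-namespaces || true

# Remove the Canary CRD and its associated objects
kubectl delete crd canaries.flagger.app || true

# Verify nothing is left behind
kubectl get deployments -A | grep flagger || echo "No Flagger deployments remain"
kubectl get svc -A | grep flagger || echo "No Flagger services remain"
```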
If Flagger was deployed across multiple namespaces or integrated with service meshes such as Istio, Linkerd, or App Mesh, repeat the cleanup process in each relevant namespace. Check for any mesh-specific resources and ensure they are removed.
For UK organisations managing intricate Kubernetes environments, Hokstad Consulting offers expert DevOps and cloud infrastructure services. They can help simplify your Kubernetes operations, establish effective cleanup procedures, and prevent resource sprawl during tool decommissioning, ensuring your infrastructure remains streamlined and compliant.
Summary and Next Steps
Key Takeaways
Flagger simplifies progressive delivery, cutting deployment issues by up to 80% while ensuring services remain uninterrupted. Its compatibility with service meshes like Istio and Linkerd, as well as metrics tools such as Prometheus and Datadog, makes it an excellent choice for businesses navigating complex Kubernetes setups. By gradually shifting traffic and tracking performance indicators in real time, organisations can maintain high-quality services and speed up their release cycles.
For example, a DigitalOcean Kubernetes user saw deployment failures drop from 18% to 4% within three months by leveraging Prometheus integration and automatic rollback features [6].
For businesses in the UK, Flagger is well-suited to meet operational requirements while aligning with compliance standards - critical for industries like finance and healthcare.
To fully leverage these advantages, expert support can help streamline Flagger's integration into your systems.
Getting Expert Help from Hokstad Consulting
To unlock Flagger's full potential, expert assistance is key. Hokstad Consulting helps UK businesses achieve deployment speeds up to 75% faster while reducing errors by 90% through tailored DevOps transformations [7].
Their approach to cloud cost engineering can cut infrastructure expenses by up to 50% while improving reliability [7]. For instance, one tech startup reduced its deployment time from six hours to just 20 minutes, and other clients have reported up to a 95% drop in downtime caused by infrastructure issues [7].
Hokstad Consulting specialises in implementing automated CI/CD pipelines, Infrastructure as Code, and monitoring systems to remove manual inefficiencies and reduce human error.
They also provide a free assessment of your current Kubernetes and CI/CD setup, identifying areas for optimisation to support Flagger integration. For organisations ready to adopt Flagger, partnering with Hokstad Consulting ensures seamless integration with existing systems, cost-effective scaling, and compliance with UK operational standards. Plus, their "no savings, no fee" model ensures you only pay based on measurable improvements, making it a low-risk way to enhance your progressive delivery capabilities [7].
FAQs
What should UK businesses consider when using Flagger for progressive delivery?
When using Flagger for progressive delivery, businesses in the UK should pay attention to a few important factors to ensure everything runs smoothly.
First, make sure your Kubernetes cluster is correctly set up and has the necessary resources to support Flagger. Since progressive delivery depends heavily on monitoring and managing traffic, a solid infrastructure is a must. If your business operates in the UK, don’t forget to factor in compliance with data protection laws like GDPR when handling application metrics and logs.
Next, customise your Flagger configuration to match your business objectives. For example, you could implement canary releases to roll out updates gradually. This method allows you to monitor performance and gather user feedback, which can help minimise risks like downtime and enhance user experience. If you need expert advice, services such as Hokstad Consulting can assist in fine-tuning deployment strategies and tailoring cloud infrastructure to meet UK-specific requirements.
How does Flagger work with service meshes and monitoring tools to ensure smooth and reliable updates?
Flagger works effortlessly with popular service meshes such as Istio, Linkerd, and Consul, streamlining traffic routing during progressive delivery. By using these service meshes, Flagger can gradually redirect traffic between different application versions, reducing disruptions and maintaining a reliable user experience.
To keep updates running smoothly, Flagger integrates with monitoring tools like Prometheus and Datadog. These tools track key metrics like error rates and latency, enabling Flagger to make data-driven decisions - whether to proceed with, pause, or roll back an update. This approach ensures that updates remain stable while safeguarding the performance of your services.
What challenges might arise when setting up Flagger for progressive delivery, and how can they be resolved?
Setting up Flagger for progressive delivery might feel a bit tricky at first, but with the right steps, you can navigate these challenges smoothly.
One of the primary hurdles is ensuring your Kubernetes setup is configured correctly to support Flagger. This includes verifying that all prerequisites are in place - like a compatible service mesh such as Istio or Linkerd. Missteps or misconfigurations in these areas can lead to deployment hiccups or delays.
Another potential challenge lies in tailoring Flagger's configuration to align with your delivery objectives. It's a good idea to start with small, gradual traffic shifts during canary deployments. This approach reduces risk and allows you to monitor performance closely using Flagger's built-in metrics. Be ready to tweak thresholds and rollback settings to maintain system stability.
With a well-prepared environment and by taking full advantage of Flagger's capabilities, you can simplify the setup process and establish a reliable progressive delivery pipeline.