Managing multiple Kubernetes clusters can be complex, but combining GitOps with workflow orchestration simplifies deployments, ensures compliance, and controls costs. Here's what you need to know:
- Multi-Cluster Kubernetes: Organisations use multiple clusters (e.g., by region or environment) for isolation, compliance, and resilience. In the UK, data protection rules such as UK GDPR often drive regional setups.
- GitOps: Treats Git as the single source of truth for configurations, with tools like Argo CD or Flux ensuring clusters match the desired state defined in Git.
- Workflow Orchestration: Tools like Argo Workflows and Tekton handle deployment steps - testing, validation, and environment promotion - to maintain consistency and reduce chaos.
- Key Patterns: The hub-and-spoke model centralises management, while repository strategies (monorepo, polyrepo, hybrid) impact scalability and team autonomy.
- UK Context: Compliance with UK GDPR and cost control are priorities. Combining GitOps with FinOps principles helps manage cloud expenses billed in pounds (£).
For UK organisations, this approach offers structured solutions to manage clusters efficiently while meeting regulatory and financial goals. Tools like Argo CD, Flux, and Tekton streamline operations, enabling better governance and reduced operational risks.
Core Concepts and Patterns in Multi-Cluster GitOps
Multi-Cluster Management Patterns
The hub-and-spoke model stands out as the go-to approach for managing multiple clusters. At its core, a central management cluster, or the hub, acts as the control plane. This hub oversees deployments and sends configurations to the workload clusters, or spokes, creating a clear division of responsibilities.
An important architectural decision within this model revolves around the GitOps controller. A centralised controller offers a unified perspective across clusters but introduces a potential single point of failure. On the other hand, decentralised agents empower spoke clusters to reconcile changes independently, providing resilience [6][8].
When it comes to applying changes, there are two main methods: the push model and the pull model. In the push model, the hub directly connects to spoke clusters, while the pull model relies on lightweight agents in each spoke cluster to fetch configurations from Git or the hub. The pull model is often preferred in private or highly regulated environments because it eliminates the need for open inbound ports and reduces the credentials stored in the hub [10].
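The pull model can be pictured as a small comparison that each spoke's agent runs on a schedule: fetch the desired state from Git, compare it with what is live, and work out what to change. The following is a minimal sketch under simplifying assumptions, not how Argo CD or Flux are implemented. State is reduced to plain dictionaries, and `plan_reconcile` and the sample resources are hypothetical names chosen for illustration.

```python
def plan_reconcile(desired: dict[str, str], live: dict[str, str]) -> dict[str, list[str]]:
    """Compare desired state (from Git) with live state and list the actions needed."""
    create = sorted(name for name in desired if name not in live)
    update = sorted(
        name for name in desired if name in live and desired[name] != live[name]
    )
    prune = sorted(name for name in live if name not in desired)
    return {"create": create, "update": update, "prune": prune}


# Hypothetical example: each value stands in for a manifest hash or image tag.
desired_state = {"api": "v2", "worker": "v1", "ingress": "v1"}
live_state = {"api": "v1", "worker": "v1", "legacy-job": "v1"}

plan = plan_reconcile(desired_state, live_state)
print(plan)
```

Because the agent initiates every connection itself, nothing in this loop requires the hub to reach into the spoke, which is the property that makes the pull model attractive in restricted networks.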
These decisions around management patterns directly influence how repositories are structured and how GitOps strategies are implemented, as explored below.
GitOps Strategies for Multi-Cluster Management
When working with the hub-and-spoke model, repository structure largely determines how easily you can scale operations. Three main approaches are commonly used: monorepo, polyrepo, and hybrid.
- Monorepo: All cluster and application manifests are stored in a single repository. This simplifies dependency tracking but can lead to bottlenecks as the number of teams grows.
- Polyrepo: Each team or service has its own repository, which allows for greater autonomy but adds to operational complexity.
- Hybrid: Combines the benefits of both. A shared platform repository handles infrastructure while individual team repositories manage application configurations. This approach is often the most scalable.
Spotify's journey provides an excellent example. To manage over 1,800 microservices, they transitioned from a monorepo to a hybrid model. They kept platform infrastructure in a shared repository while team-specific configurations were stored in separate repositories. Spotify leveraged Argo CD ApplicationSets to dynamically generate application resources from both sources [5]. Without automation like this, managing resources for hundreds of microservices across multiple environments would quickly become overwhelming [5].
"The biggest challenge in scaling GitOps is not technology - it's organisational design." - Wasil Zafar, Modern DevOps & Platform Engineering [5]
No matter which strategy you adopt, establishing a standard directory structure early on is crucial. A clear separation between clusters/ (locations for deployment), apps/ (application specifics), and infra/ (shared tools) helps avoid structural issues that can be difficult to resolve later [10].
Workflow Orchestration Basics
After defining your repository strategy, the next step is orchestrating workflows. This involves coordinating how changes in Git are applied to clusters. Workflow orchestrators use Directed Acyclic Graphs (DAGs) to map out tasks and their dependencies, ensuring steps are executed in the right order while allowing for parallel execution where possible.
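To make the DAG idea concrete, here is a minimal sketch using Python's standard-library `graphlib`. The pipeline and its task names are hypothetical. Real orchestrators such as Argo Workflows or Tekton express the same dependencies in their own resource definitions rather than in Python.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "build": set(),
    "unit-tests": {"build"},
    "image-scan": {"build"},
    "deploy-dev": {"unit-tests", "image-scan"},
}


def execution_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; tasks in the same batch can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(pipeline)
print(batches)
```

The second batch contains both `image-scan` and `unit-tests`, since neither depends on the other, while `deploy-dev` waits for both.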
In a multi-cluster CI/CD pipeline, tools like Argo Workflows or Tekton can manage tasks such as testing, policy checks, and staged rollouts. A typical sequence might involve deploying changes to Development first, then Staging, and finally Production. This staged approach limits the blast radius of a change, that is, the number of clusters a single bad commit can affect.
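The staged sequence can be sketched as a loop that stops at the first environment where verification fails. This is an illustration only: `promote`, `fake_verify`, and the commit IDs are hypothetical, and in practice the verification step would be a pipeline stage running tests or policy checks.

```python
from typing import Callable

STAGES = ["development", "staging", "production"]


def promote(commit: str, verify: Callable[[str, str], bool]) -> list[str]:
    """Roll a commit through the stages in order and return the stages it passed.

    Promotion stops at the first stage whose verification fails, so later
    stages are never touched by a bad commit.
    """
    passed = []
    for stage in STAGES:
        if not verify(stage, commit):
            break
        passed.append(stage)
    return passed


def fake_verify(stage: str, commit: str) -> bool:
    """Hypothetical check: the commit 'bad123' fails verification in staging."""
    return not (commit == "bad123" and stage == "staging")


good = promote("abc999", fake_verify)
bad = promote("bad123", fake_verify)
print(good, bad)
```

The bad commit passes development, fails in staging, and never reaches production.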
Within individual clusters, tools like Argo CD offer features like sync waves and hooks, which ensure resources are deployed in the correct order. For instance, a database migration might need to complete before a new application version is rolled out. Similarly, Flux provides ordering capabilities using its dependsOn feature [9].
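Argo CD reads the wave from the `argocd.argoproj.io/sync-wave` annotation and applies lower waves first. The sketch below shows only that grouping idea on hypothetical manifests held as dictionaries. It is a simplification, since Argo CD also takes sync phases, resource kinds, and health checks into account.

```python
from itertools import groupby

WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def wave_of(manifest: dict) -> int:
    """Read the sync-wave annotation; resources without one default to wave 0."""
    annotations = manifest.get("metadata", {}).get("annotations", {})
    return int(annotations.get(WAVE_ANNOTATION, "0"))


def group_by_wave(manifests: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (wave, resource names) pairs, lowest wave first."""
    ordered = sorted(manifests, key=wave_of)
    return [
        (wave, [m["metadata"]["name"] for m in group])
        for wave, group in groupby(ordered, key=wave_of)
    ]


# Hypothetical manifests, trimmed to the fields the sketch needs.
manifests = [
    {"kind": "Deployment", "metadata": {"name": "api", "annotations": {WAVE_ANNOTATION: "2"}}},
    {"kind": "Job", "metadata": {"name": "db-migrate", "annotations": {WAVE_ANNOTATION: "1"}}},
    {"kind": "ConfigMap", "metadata": {"name": "app-config"}},
]

waves = group_by_wave(manifests)
print(waves)
```

Here the migration Job sits in wave 1 and the Deployment in wave 2, so the migration is applied before the new application version.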
These orchestration principles are essential for managing coordinated deployments across multiple clusters, laying the groundwork for the next steps in integrating GitOps into multi-cluster CI/CD workflows.
Integrating Workflow Orchestration with Multi-Cluster GitOps
Architecture for Combining Workflow Orchestration and GitOps
The integration between workflow orchestration and GitOps relies on a division of responsibilities. The workflow orchestrator handles operational tasks like building images, running tests, and updating configurations. Meanwhile, the GitOps controller ensures the declared state is enforced.
Here’s how it works: a CI pipeline clones the manifest repository, updates the image tag, and pushes the changes back to Git. From there, the GitOps controller detects the changes and reconciles the cluster state automatically. This setup ensures that the CI pipeline does not need direct access to the cluster:
"The CI pipeline should not directly deploy... we essentially maintain separate repositories for the application code, and the cluster configuration." - Stakater Playbook
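The manifest-update step of such a pipeline can be sketched as a small text transformation. This is a minimal example, assuming the image is referenced on an `image:` line in a plain manifest. The function `set_image_tag`, the registry name, and the tags are hypothetical.

```python
import re


def set_image_tag(manifest_text: str, image: str, new_tag: str) -> str:
    """Replace the tag of `image` wherever it appears as `image: <image>:<tag>`."""
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):[\w.\-]+")
    updated, count = pattern.subn(lambda m: f"{m.group(1)}:{new_tag}", manifest_text)
    if count == 0:
        raise ValueError(f"image {image!r} not found in manifest")
    return updated


# Hypothetical Deployment manifest as it might sit in the manifest repository.
MANIFEST = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  template:
    spec:
      containers:
        - name: api
          image: registry.example.com/shop/api:1.4.2
"""

updated = set_image_tag(MANIFEST, "registry.example.com/shop/api", "1.5.0")
print(updated)
```

In a real pipeline the updated file would then be committed and pushed to the manifest repository, and the GitOps controller would pick up the change from there. Teams using Kustomize often achieve the same result with `kustomize edit set image` rather than a hand-written substitution.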
To further enhance reliability, deploy individual GitOps controller instances for each workload cluster instead of relying on a single centralised instance. This approach reduces the blast radius of misconfigurations and ensures that spoke clusters remain operational, even if the hub experiences downtime [6][1].
Coordinating Multi-Cluster Deployments
When managing multi-cluster deployments, coordination is key. Tools like Argo CD’s Sync Waves and Hooks allow for precise sequencing of tasks. For example, PreSync hooks can handle database migrations, while PostSync hooks can run smoke tests. If an issue arises, a SyncFail hook can trigger an automated rollback [9]. Similarly, Flux’s dependsOn feature ensures that infrastructure components, such as service meshes and CRDs, are properly reconciled before application workloads are deployed [7][12].
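The hook sequence can be modelled as ordered phases with a failure handler. The sketch below is a deliberately simplified model, not Argo CD's implementation: `run_sync` and the callables are hypothetical, and the assumption that any failing phase triggers the `SyncFail` handler is a simplification of the real behaviour.

```python
from typing import Callable

PHASES = ("PreSync", "Sync", "PostSync")


def run_sync(hooks: dict[str, Callable[[], None]]) -> list[str]:
    """Run the phases in order; if one raises, run the SyncFail handler instead."""
    log = []
    try:
        for phase in PHASES:
            hooks[phase]()
            log.append(phase)
    except RuntimeError:
        hooks["SyncFail"]()  # e.g. clean up or roll back
        log.append("SyncFail")
    return log


def noop() -> None:
    """Stand-in for a hook that succeeds (migration, smoke test, and so on)."""


def fail() -> None:
    """Stand-in for a phase that fails."""
    raise RuntimeError("apply failed")


healthy_run = run_sync({"PreSync": noop, "Sync": noop, "PostSync": noop, "SyncFail": noop})
failed_run = run_sync({"PreSync": noop, "Sync": fail, "PostSync": noop, "SyncFail": noop})
print(healthy_run, failed_run)
```

In the failed run the migration phase completes, the apply step fails, and the smoke tests are skipped in favour of the failure handler.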
For large-scale operations, Argo CD ApplicationSets with Matrix Generators simplify deployments. This feature enables multiple services to be deployed across multiple clusters using a single templated definition, eliminating the need to manage hundreds of individual Application resources. For instance, DigitalOcean managed to cut its multi-cluster deployment time from over 30 minutes to just 5 minutes - an impressive 83% reduction - by leveraging this approach.
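Conceptually, a matrix generator takes the Cartesian product of two parameter lists and renders one Application per combination. The sketch below illustrates that expansion in plain Python. The cluster names, service names, and the `expand_matrix` helper are hypothetical, and a real ApplicationSet would express this declaratively in its own resource definition.

```python
from itertools import product

# Hypothetical inputs: in Argo CD these would come from generators such as
# a cluster list and a list of services or Git directories.
clusters = ["uk-south-prod", "uk-west-prod", "eu-west-staging"]
services = ["checkout", "catalogue"]


def expand_matrix(clusters: list[str], services: list[str]) -> list[dict[str, str]]:
    """Produce one application definition per (cluster, service) pair."""
    return [
        {"name": f"{service}-{cluster}", "cluster": cluster, "service": service}
        for cluster, service in product(clusters, services)
    ]


apps = expand_matrix(clusters, services)
for app in apps:
    print(app["name"])
```

Three clusters and two services yield six applications from one template. Adding a cluster or a service grows the set without any new hand-written definitions.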
These methods provide a solid foundation for practical, real-world implementations.
Use Cases with Argo Workflows, Tekton, and Flux

Real-world examples highlight how these principles improve multi-cluster operations. Take OneUptime’s March 2026 case study, for instance. They integrated Argo Workflows with Flux CD, using Argo Events to trigger CI pipelines. These pipelines built container images with Kaniko, updated manifests in a Flux-managed repository, and allowed Flux to reconcile the new state across Kubernetes v1.26+ clusters. Notably, this setup maintained the principle of keeping pipelines free from direct cluster access [11].
For teams using Tekton, the workflow is similar. A Tekton Pipeline manages the build and test stages, with a final task committing the updated image tag to the GitOps repository. Alternatively, Flux’s native image automation can detect new image tags in a registry and commit the change to Git itself, either directly to the tracked branch or to a separate branch from which a pull request can be raised, eliminating the need for an additional CI step. In contrast, Argo CD requires its Image Updater add-on to achieve the same functionality [9].
A standout example of GitOps at scale comes from Capital One. Their implementation requires every production change to pass 14 automated policy checks, including image scanning and network policy validation, and to receive two approved reviews before Argo CD reconciles it to production clusters. This rigorous process reduced their audit preparation time from six weeks to just three days [5].
Related video: GitOps Me Some of That! Managing Hundreds of Clusters with Argo CD - Mike Tougeron, Adobe, Inc.

Tools for Multi-Cluster GitOps and Workflow Orchestration
::: @figure
{Argo CD vs Flux CD vs Argo Workflows vs Tekton: Multi-Cluster GitOps Tools Compared}
:::
When it comes to managing multi-cluster GitOps environments, choosing the right tools can make all the difference. Below, we'll dive into the GitOps controllers and workflow orchestrators that drive efficient and secure CI/CD pipelines.
GitOps Controllers for Multi-Cluster Environments
Two major players dominate the GitOps controller space: Argo CD and Flux CD. Each offers a distinct approach to managing multi-cluster setups.
Argo CD is typically run in a centralised hub-and-spoke model, where a single instance oversees multiple remote clusters and provides a unified view of your infrastructure. Its detailed web interface and visual diffs let developers work independently without needing deep Kubernetes knowledge [14].
On the other hand, Flux CD takes a decentralised approach, running an instance of Flux on each cluster. This setup localises potential misconfigurations to individual clusters, reducing the risk of cascading failures. This model has proven effective at scale. For instance, Deutsche Telekom uses Flux to manage approximately 200 Kubernetes clusters for its 5G core infrastructure with a small team of just 10 engineers [16]:
"Using Flux, DT now manages some 200 Kubernetes clusters with just 10 full-time engineers and plans to scale to thousands of clusters without adding more than one or two more members to the infrastructure team." - Vuk Gojnic, Deutsche Telekom [16]
Similarly, Mettle, a London-based finance company, experienced a 50% boost in production speed and a 75% increase in deployments after adopting Flux. They also achieved a Mean Time to Recovery (MTTR) of just 20 minutes across all clusters [16].
"GitOps is not a tool choice - it is a discipline. ArgoCD and Flux are both excellent implementations; the decision between them usually comes down to whether you prefer a unified UI-first experience or a composable, CRD-first architecture." - Zak Hassan, Staff SRE [15]
Now that we've covered GitOps controllers, let's look at the workflow orchestrators that enhance multi-cluster GitOps by managing CI/CD pipelines.
Workflow Orchestrators for CI/CD
Workflow orchestrators complement GitOps controllers by handling the procedural aspects of CI/CD pipelines. Three tools come up most often in this context:
- Argo Workflows: A Kubernetes-native engine designed for complex pipelines. It supports parallel execution, artifact passing between stages, and Directed Acyclic Graph (DAG) workflows. Paired with Argo Events, it can automatically trigger pipelines via Git webhooks [11].
- Tekton: A modular framework for building CI/CD systems on Kubernetes. Its reusable Tasks and Pipelines are ideal for scaling across teams and clusters [17].
- Apache Airflow: A versatile workflow orchestrator, often used in data engineering. While powerful, it is less suited to Kubernetes-native CI/CD pipelines in multi-cluster environments [11].
Tool Comparison Table
| Feature | Argo CD | Flux CD |
|---|---|---|
| Architecture | Centralised hub-and-spoke [14] | Composable toolkit; decentralised or CAPI-driven [14] |
| User Interface | Comprehensive web UI with visual diffs [14] | CLI-first; relies on external dashboards [14] |
| Multi-Cluster Model | Central management via ApplicationSet [14] | Decentralised or CAPI-driven [14] |
| Security Model | Project-based tenancy (AppProject CRD) [14] | Kubernetes-native RBAC and impersonation [14] |
| Scalability | High, but central hub is a single point of failure [14] | Extremely high due to the autonomous, decentralised model [14] |
| Ecosystem | Argo Workflows, Argo Events, Argo Rollouts [14] | Flagger, Cluster API (CAPI), Terraform Controller [14] |

| Workflow Orchestrator | Best Fit in Multi-Cluster GitOps |
|---|---|
| Argo Workflows | Kubernetes-native DAG-based CI pipelines and complex automation [11] |
| Tekton | Reusable, modular CI/CD components across teams and clusters [17] |
| Apache Airflow | General-purpose orchestration; less suited to Kubernetes-native CI/CD [11] |
Best Practices for Multi-Cluster GitOps and Orchestration
Building on the earlier discussion of orchestration and GitOps, here are some tried-and-tested strategies for managing a multi-cluster setup efficiently and cost-effectively.
Standardisation and Governance
Before scaling across teams, focus on standardising your platform. Scaling prematurely often leads to inconsistencies that are harder to fix later [3]. A great way to maintain consistency is by using the App of Apps pattern, which helps bootstrap new clusters with a uniform set of tools. To ensure deployments happen in the right order, make use of Sync Waves and Hooks (PreSync/PostSync) [9][8]. For promoting changes between environments, establish clear approval gates - automated tests and manual reviews in staging should be mandatory before changes make it to production [4][18].
Governance plays a critical role in GitOps workflows, particularly for compliance. Features like immutable audit trails and separation of duties align well with standards such as SOC 2, HIPAA, and ISO 27001 [18]. For emergencies, consider implementing break-glass controls. These provide temporary, time-limited access while recording all actions, ensuring that governance remains intact even under urgent circumstances [18].
Cost Optimisation in Multi-Cluster Environments
Managing costs in a multi-cluster environment can be challenging, but there are ways to keep expenses under control. For non-production clusters, use Kustomize overlays to set leaner resource limits and reduce replica counts. You can also tune reconciliation intervals in tools like Flux CD - for example, setting stable platform components to reconcile every 30 minutes and active applications every 5 minutes. This helps reduce unnecessary controller activity [13][19][20].
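A quick back-of-the-envelope calculation shows why tiered intervals matter. The sketch below assumes each object reconciles exactly once per interval and ignores reconciliations triggered by events or webhooks. The object counts (20 platform components and 50 applications per cluster) are hypothetical.

```python
from datetime import timedelta

DAY = timedelta(days=1)


def reconciliations_per_day(interval: timedelta, objects: int = 1) -> int:
    """Scheduled reconciliations per day for `objects` resources at a fixed interval."""
    return (DAY // interval) * objects


# Hypothetical cluster: 20 stable platform components and 50 active applications.
everything_at_5m = reconciliations_per_day(timedelta(minutes=5), objects=70)
tiered = (
    reconciliations_per_day(timedelta(minutes=30), objects=20)
    + reconciliations_per_day(timedelta(minutes=5), objects=50)
)

print(everything_at_5m, tiered)
```

Under these assumptions, moving the platform components to a 30-minute interval cuts scheduled reconciliations from 20,160 to 15,360 per cluster per day, a saving that multiplies across every cluster in the fleet.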
For organisations managing over 100 clusters, Git mirrors can help avoid API rate limits and cut down on external traffic costs. Additionally, using spot or preemptible instances for non-critical workloads can significantly lower compute expenses [19].
If you're struggling with rising cloud costs, Hokstad Consulting offers cloud cost engineering services. They claim to deliver savings of 30–50% through audits, rightsizing, and ongoing optimisations, operating on a no-savings, no-fee basis.
Beyond technical measures, defining team roles clearly is essential to maintaining efficiency and governance as you scale.
Team Structures and Responsibilities
In multi-cluster GitOps, organisational design often proves more challenging than the technical setup. Most organisations eventually adopt a structure where responsibilities are clearly divided:
- A platform team manages the central cluster, handles provisioning and bootstrapping, and oversees shared tools like ingress controllers and monitoring agents.
- Application teams focus on their service manifests and environment-specific configurations, making changes via pull requests using self-service GitOps [1][3].
Spotify’s approach to this is a hybrid model. They separate platform infrastructure from team-specific configurations, with ApplicationSets in Argo CD dynamically generating Applications from both sources [5]. Similarly, Capital One has a structured workflow where every production change undergoes 14 automated checks, including PR reviews and Kyverno policy validation. This setup has reduced their audit preparation time from 6 weeks to just 3 days [5].
| Role | Primary Responsibility | Key Tools |
|---|---|---|
| Platform Team | Cluster provisioning, bootstrapping, shared add-ons | Flux, Crossplane, Terraform |
| Application Team | Service manifests, environment-specific overlays | Kustomize, Helm, Argo CD |
| SRE Team | Management cluster health, global RBAC, NetworkPolicies | Argo CD, Prometheus, Thanos |
| Security/Governance | Policy enforcement, audit trails, secret management | Kyverno, OPA, External Secrets |
Conclusion
Managing Kubernetes environments becomes far less daunting with multi-cluster GitOps and workflow orchestration. By combining declarative configurations, Git-based management, and automated reconciliation, these practices remove much of the manual effort typically involved in such tasks.
The results speak for themselves. UKi cut deployment times by an impressive 97% and achieved a proof of concept in just 2.5 months [21]. Dr. Scott Wells, Co-founder of UKi, summed it up perfectly:
"I initially thought we were 3 years out of being able to build this ourselves because of the complexity. I was amazed when I realized how quickly things were moving along." [21]
Hong Wang, CEO of Akuity, also highlighted the growing importance of GitOps in today’s tech landscape:
"In the AI era, GitOps becomes more foundational. Everything will be code, everything will be declarative, and everything will be managed that way." [2]
To fully realise the potential of multi-cluster GitOps, organisations should focus on a few key practices: standardising patterns before scaling, embedding governance directly into pipelines, and prioritising secrets and policy enforcement early on. The most successful teams invest in building strong platform foundations before attempting to scale, ensuring they can sustain these benefits over time.
For those ready to adopt or refine their GitOps multi-cluster approach, Hokstad Consulting provides a range of DevOps transformation services. From CI/CD pipeline design to cloud cost optimisation and custom automation, they offer a straightforward, results-driven approach to help organisations succeed.
FAQs
How do I choose push vs pull GitOps for multiple clusters?
Choosing between push and pull GitOps largely depends on whether you prioritise centralised control or resilience.
In a push model, a central orchestrator takes charge, ensuring compliance and enforcing policies across the board. While this approach provides a strong grip on operations, it can also introduce a single point of failure and may face challenges in environments with edge devices or restricted networks.
On the other hand, a pull model relies on cluster-local controllers. This setup enhances fault isolation, reduces latency, and simplifies network requirements, making it a solid choice for distributed systems or constrained networks.
Many organisations opt for a hybrid approach, combining the strengths of both models to maintain a balance between global oversight and localised execution.
When should I use Argo CD vs Flux in a multi-cluster setup?
Pick Argo CD if you're looking for a centralised control plane with a straightforward web interface and built-in tools such as Argo Rollouts. It's a great choice for overseeing deployments across multiple clusters.
Choose Flux if you prefer a decentralised, Kubernetes-native setup. It's well-suited for edge deployments, smaller failure zones, and workflows centred around CLI or APIs. Some teams even use a mix of both, leveraging Flux for infrastructure management and Argo CD for deploying applications.
What’s the best way to promote changes safely across clusters?
To promote changes across clusters safely, adopt a GitOps workflow. In this setup, a Git repository serves as the single source of truth. This approach ensures that all changes are version-controlled, peer-reviewed through pull requests, and auditable before deployment.
When it comes to progressive delivery, tools like Argo Rollouts or Flagger come in handy. These tools support methods such as canary deployments or blue-green deployments, where updates are gradually tested on a small portion of traffic. This allows for close performance monitoring and enables automatic rollbacks if any issues arise.
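The canary logic can be sketched as a loop over traffic weights with a health check at each step. This is an illustration of the idea only, not the API of Argo Rollouts or Flagger. The `run_canary` helper, the weights, and the health function are hypothetical.

```python
from typing import Callable


def run_canary(steps: list[int], healthy: Callable[[int], bool]) -> tuple[str, int]:
    """Shift traffic to the new version step by step.

    Returns ("promoted", 100) if every step stays healthy, otherwise
    ("rolled_back", 0) as soon as one step fails its health check.
    """
    for weight in steps:
        if not healthy(weight):
            return ("rolled_back", 0)
    return ("promoted", 100)


steps = [10, 25, 50, 100]  # hypothetical percentage of traffic sent to the canary

# Hypothetical health checks: one always passes, one degrades from 50% traffic.
always_ok = run_canary(steps, lambda weight: True)
fails_at_half = run_canary(steps, lambda weight: weight < 50)
print(always_ok, fails_at_half)
```

In the second run the problem surfaces at 50% of traffic and the rollout returns to 0%, so the remaining users never see the faulty version.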