Best Practices for Vulnerability Scanning in Hybrid Clouds | Hokstad Consulting

Hybrid cloud environments combine on-premises systems with public cloud platforms, offering flexibility but also introducing security challenges. Vulnerability scanning is critical to identify weaknesses, misconfigurations, and threats in these complex setups. Here's what you need to know:

  • Hybrid Clouds Are Riskier: They face 23% higher breach risks, with 99% of failures caused by misconfigurations.
  • Key Challenges: Fragmented visibility, inconsistent security policies, legacy system integration, and identity sprawl make management difficult.
  • Effective Scanning Tools: Use a mix of agent-based (deep insights) and agentless (quick, broad scans) tools. For containers, tools like Trivy and Grype integrate well into CI/CD pipelines.
  • Best Practices:
    • Automate continuous discovery and scanning.
    • Prioritise vulnerabilities using CVSS and EPSS scores.
    • Integrate scanning into development pipelines.
    • Enforce strict access controls (least privilege and IAM).
  • Compliance: Follow frameworks like Cyber Essentials, NIST, or ISO 27001 to meet UK and global standards.
  • AI and Automation: AI-driven tools cut false positives and predict exploitation likelihood, reducing remediation times from weeks to days.

Hybrid cloud security requires a structured, proactive approach to stay ahead of threats, reduce costs, and maintain compliance.

Challenges and Risks in Hybrid Cloud Vulnerabilities

Figure: Cloud vs On-Premises Vulnerability Comparison for Hybrid Environments

Hybrid cloud environments bring a unique set of security challenges that aren't present in purely on-premises or cloud-only setups. One of the biggest issues is fragmented visibility. Security teams often find themselves juggling separate logs from cloud providers and on-premises tools, leading to blind spots. A staggering 67% of organisations report facing significant network blind spots in their hybrid infrastructure. On top of that, 82% of security breaches are tied to human error, often due to the complexity of managing these mixed environments.

The mix of legacy systems and modern cloud solutions only adds to the difficulty. These older systems often lack contemporary authentication methods, forcing the use of insecure workarounds to connect with cloud services. The ease of rapid, one-click cloud provisioning can bypass the stricter change management processes typical of on-premises systems. This opens the door to risks like exposed S3 buckets, over-permissioned APIs, and misconfigured resources - prime targets for attackers.

This highlights the crucial need for robust monitoring:

"The best time to take hybrid cloud monitoring seriously was yesterday; the second-best time is now. Ignoring or delaying a solid monitoring foundation will only lead to performance issues, outages, downtime, and security risks." - Ifeanyi Benedict Iheagwara, Data Analyst and Machine Learning Engineer

Inconsistent Security Policies Across Cloud and On-Premises

Hybrid environments often suffer from mismatched security policies. For example, on-premises networks may rely on IP-based access controls, while cloud services use credential-based authentication, and the gap between the two models can leave resources unprotected. On-premises security tools also frequently fail to track events across cloud platforms, leaving temporary assets like test instances unscanned. Attackers can exploit these inconsistencies, such as differing encryption standards or authentication methods.

The shared responsibility model adds further confusion. Many organisations mistakenly believe that cloud providers secure everything. In reality, providers are responsible only for security of the cloud, meaning the underlying infrastructure. Organisations remain responsible for security in the cloud: their data, applications, and identity management. This misunderstanding can lead to policy gaps, with 52% of UK organisations struggling to maintain consistent regulatory compliance across hybrid setups.

Identity and credential sprawl is another significant issue. Managing user identities across multiple platforms often results in orphaned accounts - accounts active in one system but deactivated in another - or reused credentials. Alarmingly, 28% of unauthorised cloud access stems from compromised credentials. A notable example occurred in September 2024, when Microsoft’s security team uncovered the Storm-0501 threat group exploiting weak credentials and excessive privileges in on-premises systems. They used these to move laterally into cloud environments, stealing credentials and deploying ransomware.
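The orphaned-account problem described above can be checked with a simple comparison between directories. The following is a minimal Python sketch, assuming each identity system can export a mapping of username to enabled status. The function name `find_mismatched_accounts` and the sample data are hypothetical and are not tied to any particular directory product.

```python
from __future__ import annotations


def find_mismatched_accounts(on_prem: dict[str, bool], cloud: dict[str, bool]) -> list[str]:
    """Return usernames whose enabled status differs between the two systems.

    An account missing from one system is treated as disabled there, so an
    account that is active in only one place is flagged for review.
    """
    mismatched = []
    for user in sorted(set(on_prem) | set(cloud)):
        if on_prem.get(user, False) != cloud.get(user, False):
            mismatched.append(user)
    return mismatched


# Hypothetical exports: True means the account is enabled.
on_prem_accounts = {"alice": True, "bob": False, "carol": True}
cloud_accounts = {"alice": True, "bob": True, "dave": True}

print(find_mismatched_accounts(on_prem_accounts, cloud_accounts))
```

Not every mismatch is a problem (some users legitimately exist in only one system), so the output is best treated as a review list rather than an automatic disable list.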

Complexity of Hybrid Cloud Infrastructure

Hybrid cloud environments create a tangled web of entry points that traditional security models struggle to handle. Cloud assets can exist for only seconds or days, while on-premises infrastructure often persists for months or even years.

Here’s a quick comparison of vulnerabilities between cloud and on-premises setups:

| Factor | Cloud Vulnerabilities | On-Premises Vulnerabilities |
| --- | --- | --- |
| Primary Threat | Misconfigurations (S3 buckets, APIs) | Unpatched software and outdated firmware |
| Access Point | API exposure and identity sprawl | Open network ports and legacy protocols |
| Asset Lifecycle | Short-lived (seconds to days) | Persistent (months to years) |
| Security Model | Shared Responsibility | Full organisational control |

The complexity of managing these environments is further compounded by a lack of expertise. 73% of UK organisations report a significant skills gap in cloud management, making it harder to secure hybrid setups effectively. Without the right skills, teams struggle to implement critical measures like micro-segmentation, automated discovery, or centralised logging. This leaves hybrid environments vulnerable to lateral movement by attackers.

Third-Party and Supply Chain Risks

Integrating third-party tools and services into hybrid cloud environments introduces additional risks. Connecting legacy systems to modern cloud services often involves middleware or custom integrations. These connections can create security gaps that neither the on-premises team nor the cloud provider fully monitors.

"Security is always going to cost you more if you delay things and try to do it later. The cost is not only from the money perspective but also from time and resource perspective." - Ayman Elsawah, vCISO, Sprinto

A practical example of tackling these risks comes from a major UK financial services provider. In 2023, they implemented automated network monitoring using Prometheus and conducted quarterly audits with Ansible to secure their hybrid cloud. Led by their Head of IT Security, this effort reduced unauthorised access incidents by 42% and improved their audit compliance score from 78% to 96% in just 12 months. This case shows that while hybrid cloud environments are complex, systematic approaches to visibility and policy enforcement can significantly reduce risks.

Tools for Vulnerability Scanning

Hybrid cloud environments demand a mix of agent-based and agentless tools to ensure thorough vulnerability scanning while avoiding unnecessary strain on systems.

Agentless scanners operate via cloud provider APIs and disk snapshots, examining resources externally without requiring software installation on target systems. These tools are particularly useful for short-lived workloads where traditional agents aren't practical. On the other hand, agent-based scanners require software installation on each host, offering deep, real-time insights into processes, file integrity, and network activity - essential for scenarios needing immediate threat detection.

"Every approach has its pros and cons. I will never advocate a single approach for all use cases because it really comes down to 'What is the best approach for acquiring vulnerability data within a given use case?'" - Ryan Bragg, Senior Cloud Security Engineer, Tenable [1]

Performance Considerations

Agentless scanning has a major advantage: it adds no resource overhead to production systems, since all processing happens externally. Agent-based scanning, by contrast, consumes host resources and can affect performance, so Tenable suggests limiting scans to 1,000 hosts at a time to avoid overloading systems [2].

Agent-Based vs Agentless Scanners

The differences between these two approaches become clearer when looking at factors like speed, resource usage, and compatibility:

| Criteria | Agent-Based Scanners | Agentless Scanners |
| --- | --- | --- |
| Deployment Speed | Takes days to weeks for a full rollout | Ready in minutes via API/IAM roles |
| Coverage Depth | Provides detailed runtime visibility | Focuses on configurations and at-rest files |
| Resource Overhead | Consumes host CPU and memory | No impact on running workloads |
| Hybrid Compatibility | Ideal for in-depth runtime insights | Best suited for cloud-native and ephemeral resources |
| Blocking Capability | Can terminate processes | Limited to read-only access |

A good strategy involves starting with agentless tools to quickly scan all cloud resources. Then, deploy agents selectively on critical systems - like production databases or sensitive data platforms - where detailed monitoring is essential. This layered approach ensures a balance between broad coverage and focused inspection.

Agentless tools also shine when monitoring stopped resources, such as inactive virtual machines or static container images that agent-based scanners can't reach. Additionally, they reduce risks tied to managing OS-level credentials (e.g., SSH or WinRM), but they can't detect in-memory attacks or live behavioural anomalies like unexpected shell spawning from a database pod.
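The layered strategy described in this section could be encoded as a small rule set. The Python sketch below is illustrative only: the `Asset` fields and the `choose_scan_modes` function are hypothetical names introduced here, not part of any scanner's API, and the rules simply restate the guidance above.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    running: bool    # False for stopped VMs or images at rest
    ephemeral: bool  # True for short-lived workloads
    critical: bool   # True for production databases or sensitive data platforms


def choose_scan_modes(asset: Asset) -> set:
    """Return the scan modes to apply to an asset under the layered approach."""
    # Agentless scanning is the baseline for every asset, including stopped ones.
    modes = {"agentless"}
    # Agents need a running, long-lived host, and are reserved here for
    # critical systems where runtime visibility justifies the overhead.
    if asset.critical and asset.running and not asset.ephemeral:
        modes.add("agent")
    return modes


# Hypothetical inventory entries.
inventory = [
    Asset("prod-db-01", running=True, ephemeral=False, critical=True),
    Asset("ci-runner-7f3", running=True, ephemeral=True, critical=False),
    Asset("archived-vm", running=False, ephemeral=False, critical=False),
]

for asset in inventory:
    print(asset.name, sorted(choose_scan_modes(asset)))
```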

For containerised environments, specialised tools are available to address the challenges of dynamic workloads in CI/CD pipelines.

Container and Image Scanners

Containers in hybrid clouds bring unique security challenges, requiring tools that integrate directly into CI/CD pipelines. Trivy, developed by Aqua Security, is a standout option, scanning container images, Kubernetes clusters, and cloud resources for CVEs, misconfigurations, and exposed secrets - all in one tool.

"Trivy was a clear leader in the market as far as features, functionality, and capabilities" - Sam White, GitLab [3]

Other tools, such as Grype and Syft from Anchore, take a pipeline-focused approach. Syft generates Software Bills of Materials (SBOMs), which Grype uses to identify vulnerabilities. This allows for efficient auditing of images at deployment without needing repeated scans. Clair, designed for registry-native analysis, continuously evaluates images stored in registries like Quay [4][5]. For runtime protection, Falco monitors system calls, catching threats like unexpected network connections or shell spawning that static scans might miss [4][6].

To cover all bases, it's wise to combine tools. Running both Trivy and Grype in CI pipelines is effective because their vulnerability databases differ, so each can surface findings the other misses [4]. Meanwhile, Checkov can assess Kubernetes manifests and Dockerfiles against more than a thousand built-in policies [4]. In production, agentless scanning ensures broad registry coverage, while runtime tools like Falco detect drift - when binaries are executed that weren't part of the build.
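To see how much two scanners actually disagree, their JSON reports can be compared by vulnerability ID. The sketch below assumes the commonly seen report layouts: a top-level `Results` list with `Vulnerabilities` entries for Trivy, and a top-level `matches` list for Grype. Field names may vary between tool versions, so check them against your own output. The sample reports and their IDs are placeholders, not real findings.

```python
def trivy_ids(report: dict) -> set:
    """Collect vulnerability IDs from a Trivy-style JSON report."""
    ids = set()
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be absent or null for targets with no findings.
        for vuln in result.get("Vulnerabilities") or []:
            ids.add(vuln["VulnerabilityID"])
    return ids


def grype_ids(report: dict) -> set:
    """Collect vulnerability IDs from a Grype-style JSON report."""
    return {match["vulnerability"]["id"] for match in report.get("matches") or []}


def compare_reports(trivy_report: dict, grype_report: dict) -> dict:
    """Split findings into those seen by both tools and those seen by one."""
    t, g = trivy_ids(trivy_report), grype_ids(grype_report)
    return {
        "both": sorted(t & g),
        "trivy_only": sorted(t - g),
        "grype_only": sorted(g - t),
    }


# Placeholder reports with made-up IDs.
sample_trivy = {
    "Results": [
        {"Target": "app:1.0", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
        ]},
        {"Target": "config", "Vulnerabilities": None},
    ]
}
sample_grype = {
    "matches": [
        {"vulnerability": {"id": "CVE-0000-0002", "severity": "Low"}},
        {"vulnerability": {"id": "CVE-0000-0003", "severity": "Critical"}},
    ]
}

comparison = compare_reports(sample_trivy, sample_grype)
print(comparison)
```

One caveat: the two tools can label the same issue with different identifiers (for example a CVE ID in one and a GHSA ID in the other), so a plain ID comparison may overstate the differences.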

Since container lifecycles are often measured in minutes, vulnerabilities can spread quickly if not addressed early [6]. By integrating these tools into your pipeline, you can block non-compliant builds before they reach production, shifting security earlier in the process where fixes are faster and less costly.

Best Practices for Vulnerability Scanning in Hybrid Clouds

Managing vulnerabilities in hybrid cloud environments requires a well-organised approach that blends automation, prioritisation, and seamless integration across diverse systems.

Automate Continuous Discovery and Scanning

Automated discovery tools continuously map your infrastructure and identify new assets as they appear. Because nothing has to be registered by hand, short-lived and newly provisioned resources are picked up for scanning soon after they are created, which is crucial for spotting vulnerabilities before attackers do.
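One way to confirm that discovery and scanning stay in step is to compare the discovered inventory against scan records. The Python sketch below assumes you can export a set of asset IDs from discovery and a mapping of asset ID to last scan time from your scanner. The function name, the 24-hour threshold, and the sample asset names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_scan_gaps(discovered, last_scanned, now, max_age=timedelta(hours=24)):
    """Return (never_scanned, stale) asset IDs.

    discovered: set of asset IDs reported by discovery
    last_scanned: dict mapping asset ID to the datetime of its last scan
    """
    never_scanned = sorted(a for a in discovered if a not in last_scanned)
    stale = sorted(
        a for a in discovered
        if a in last_scanned and now - last_scanned[a] > max_age
    )
    return never_scanned, stale


# Hypothetical snapshot of discovery and scan data.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
discovered = {"vm-web-01", "vm-db-01", "container-test-9"}
last_scanned = {
    "vm-web-01": now - timedelta(hours=2),
    "vm-db-01": now - timedelta(days=3),
}

never, stale = find_scan_gaps(discovered, last_scanned, now)
print("never scanned:", never)
print("stale:", stale)
```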

Incorporate SAST (Static Application Security Testing), SCA (Software Composition Analysis), DAST (Dynamic Application Security Testing), and container scanning at the stage where each fits: SAST and SCA when code is committed or built, container scanning when images are built, and DAST against a running test or staging deployment. This approach not only catches vulnerabilities early but also lowers the cost and effort of fixing them.

Prioritise Vulnerabilities with EPSS and CVSS

Once you've established continuous discovery, the next step is to prioritise vulnerabilities so your team can focus on the most pressing threats.

Not every vulnerability demands immediate attention. For example, a critical-severity CVE in an isolated test environment is often less urgent than a medium-severity flaw in an internet-facing production database. Effective prioritisation involves combining several factors to ensure resources are directed where they’re needed most.

CVSS (Common Vulnerability Scoring System) provides a consistent way to measure the severity of vulnerabilities. However, it doesn’t indicate whether a vulnerability is actively being exploited. That’s where EPSS (Exploit Prediction Scoring System) comes in - it estimates the likelihood of exploitation within 30 days, helping teams address immediate, real-world risks. Additionally, evaluating the criticality of assets ensures that high-value systems, such as payment platforms or customer databases, receive top priority. Attack exposure scores further refine prioritisation by highlighting complex risks, such as vulnerabilities that could be chained together into an attack path that static scans do not reveal.

| Prioritisation Factor | Description | Impact on Remediation |
| --- | --- | --- |
| CVSS Score | Measures raw vulnerability severity | Establishes a baseline for technical risk |
| Asset Criticality | Assesses the importance or sensitivity of the resource | Ensures key systems are prioritised |
| Exploitability (EPSS) | Predicts the likelihood of exploitation in 30 days | Focuses efforts on immediate threats |
| Attack Exposure Score | Identifies risks from attack paths and combinations | Reveals complex risks not caught by static scans |
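The factors in the table above can be combined into a single ranking score. The following Python sketch shows one possible weighting. It is not a standard formula: the weights, the 1 to 5 criticality scale, the exposure multiplier, and the placeholder finding IDs are all assumptions chosen for illustration and should be tuned to your own risk appetite.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    vuln_id: str
    cvss: float              # base score, 0.0 to 10.0
    epss: float              # probability of exploitation in 30 days, 0.0 to 1.0
    asset_criticality: int   # hypothetical scale: 1 (low) to 5 (business critical)
    internet_facing: bool


def priority_score(finding: Finding) -> float:
    """Blend severity, exploit likelihood and asset value into a 0-100 score."""
    severity = finding.cvss / 10.0
    criticality = finding.asset_criticality / 5.0
    # Assumed weights: exploit likelihood counts slightly more than the others.
    blended = 0.3 * severity + 0.4 * finding.epss + 0.3 * criticality
    # Assumed exposure multiplier: internal-only assets are discounted by half.
    exposure = 1.0 if finding.internet_facing else 0.5
    return round(100 * blended * exposure, 1)


# The two cases contrasted in the text: a critical CVE on a test system
# versus a medium-severity flaw on an internet-facing production database.
test_env = Finding("EXAMPLE-0001", cvss=9.8, epss=0.02,
                   asset_criticality=1, internet_facing=False)
prod_db = Finding("EXAMPLE-0002", cvss=6.5, epss=0.60,
                  asset_criticality=5, internet_facing=True)

for f in sorted([test_env, prod_db], key=priority_score, reverse=True):
    print(f.vuln_id, priority_score(f))
```

With these assumed weights the production finding ranks first despite its lower CVSS score, which matches the reasoning in the text.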

After prioritising vulnerabilities, integrating these processes into your CI/CD pipeline ensures issues are addressed promptly and efficiently.

Integrate Vulnerability Scanning into CI/CD Pipelines

Delaying vulnerability detection until production is both risky and expensive. By embedding scans into your CI/CD pipelines, you can shift security efforts to earlier stages of development, where problems are easier to fix. This approach also allows you to automatically block builds that don’t meet security standards, preventing vulnerable code from reaching production.

Tools like Trivy and Grype are particularly effective for scanning container images within CI/CD workflows. Extending this practice to all code types ensures that security checks are consistently applied across your entire development process. If a critical vulnerability is flagged, the pipeline can fail automatically, notifying developers immediately while the issue is still fresh in their minds.
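A pipeline gate of this kind can be a short script that reads the scanner's report and returns a non-zero status when blocking findings are present. The sketch below assumes a Trivy-style JSON report (a `Results` list whose entries carry `Vulnerabilities` with a `Severity` field). The blocking policy and the sample report are hypothetical.

```python
# Assumed policy: only CRITICAL findings block the build.
BLOCKING_SEVERITIES = {"CRITICAL"}


def blocking_findings(report: dict, blocking=BLOCKING_SEVERITIES) -> list:
    """Return (target, vulnerability ID) pairs that should fail the build."""
    found = []
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocking:
                found.append((result.get("Target", "unknown"),
                              vuln.get("VulnerabilityID", "unknown")))
    return found


def gate(report: dict) -> int:
    """Print blocking findings and return a process exit code (0 = pass)."""
    findings = blocking_findings(report)
    for target, vuln_id in findings:
        print(f"BLOCKING: {vuln_id} in {target}")
    return 1 if findings else 0


# Placeholder report with made-up IDs.
sample_report = {
    "Results": [
        {"Target": "app:1.0", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0010", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-0000-0011", "Severity": "MEDIUM"},
        ]}
    ]
}

exit_code = gate(sample_report)
print("exit code:", exit_code)
```

In a real pipeline the report would be loaded from the scanner's output file with `json.load`, and the return value of `gate` passed to `sys.exit` so that the CI job fails.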

Early detection is essential, but limiting access is another key step in mitigating risks from any vulnerabilities that remain.

Enforce Least Privilege and IAM Controls

While vulnerability scanning uncovers weaknesses, strong access controls can limit the damage if those weaknesses are exploited. Implementing least privilege principles ensures that users and services only have the permissions they absolutely need, reducing the potential impact of a breach. In hybrid cloud environments, it’s important to synchronise access controls across both cloud and on-premises systems to avoid inconsistencies.

Role-based access control (RBAC) adds another layer of security by limiting who can deploy resources, modify configurations, or access sensitive data. For instance, even if a containerised application has a vulnerability, properly scoped IAM (Identity and Access Management) policies can prevent attackers from moving laterally within your environment. Regular automated audits are also vital for preventing privilege creep, where accounts gradually accumulate unnecessary permissions over time.
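An automated privilege-creep audit can be as simple as comparing what each principal is granted with what it has actually used. The Python sketch below assumes both sets can be exported, for example from IAM policy listings and access logs. The function name and the sample principals are hypothetical, and the permission strings are only examples.

```python
def unused_permissions(granted: dict, used: dict) -> dict:
    """Map each principal to the permissions it holds but has not used.

    granted and used both map a principal name to a set of permission strings.
    Principals with no unused permissions are omitted from the result.
    """
    report = {}
    for principal, permissions in granted.items():
        extra = permissions - used.get(principal, set())
        if extra:
            report[principal] = sorted(extra)
    return report


# Hypothetical export covering a review period.
granted = {
    "svc-reporting": {"s3:GetObject", "s3:PutObject", "s3:DeleteBucket"},
    "svc-deploy": {"ec2:RunInstances"},
}
used = {
    "svc-reporting": {"s3:GetObject"},
    "svc-deploy": {"ec2:RunInstances"},
}

print(unused_permissions(granted, used))
```

Permissions flagged this way are candidates for removal, but rarely used rights (such as those needed only for disaster recovery) may still be legitimate, so a human review step is advisable.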

Compliance Considerations for Vulnerability Management

To meet compliance requirements, organisations need a structured approach to vulnerability management that not only protects their systems but also shows clear accountability. For UK businesses operating in hybrid cloud environments, several frameworks provide guidance on handling vulnerabilities effectively.

Cyber Essentials is particularly relevant for UK organisations, especially those working with the public sector. This government-backed programme requires that all critical or high-risk security updates be applied within 14 days of their release [7]. The urgency of this requirement is underscored by the sharp drop in the time-to-exploit for vulnerabilities - from 32 days in 2021/22 to just 5 days in 2023 [7]. To meet this tight window, organisations must implement robust processes and automation.
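Tracking the 14-day window lends itself to simple date arithmetic. The following Python sketch computes a patch deadline from an update's release date and flags overdue items. It assumes a patch applied on day 14 still counts as on time, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# The 14-day window described above.
PATCH_WINDOW = timedelta(days=14)


def patch_deadline(released: date) -> date:
    """Return the last day on which the update can be applied on time."""
    return released + PATCH_WINDOW


def is_overdue(released: date, today: date, patched_on: Optional[date] = None) -> bool:
    """True if the update was applied late, or is still missing past the deadline."""
    # A patched update is judged by its patch date, an unpatched one by today.
    reference = patched_on if patched_on is not None else today
    return reference > patch_deadline(released)


released = date(2024, 3, 1)
print(patch_deadline(released))
print(is_overdue(released, today=date(2024, 3, 20)))
print(is_overdue(released, today=date(2024, 3, 20), patched_on=date(2024, 3, 10)))
```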

To highlight the importance of timely patching, consider the 2017 Equifax breach. The company identified a critical vulnerability but delayed applying the necessary patch. The result? A breach that impacted an initially reported 143 million individuals (later revised to around 147 million) and cost over £1.1 billion in recovery and reputational damage. UK compliance experts often reference this case to stress the importance of adhering to NIST-aligned Respond functions [7].

NIST, CIS Benchmarks, and ISO 27001

Different compliance frameworks focus on various aspects of vulnerability management. Understanding these differences allows organisations to adopt the approach that best suits their needs.

| Framework | Focus Areas | Audit Frequency | Remediation Timelines |
| --- | --- | --- | --- |
| NIST SP 800-53 | Extensive catalogue of controls (1,000+), ideal for federal and enterprise systems | Periodic or continuous monitoring | Risk-based; prioritises Flaw Remediation (SI-2) |
| CIS Controls v8 | 18 Controls broken down into actionable Safeguards, including Continuous Vulnerability Management (Control 7) | Continuous scanning and tracking | Tailored to Implementation Groups; focuses on critical assets |
| Cyber Essentials | Basic cyber hygiene and perimeter security specific to the UK | Annual certification | 14 days for critical and high-risk updates |
| ISO 27001 | Information Security Management System (ISMS) framework | Annual surveillance audits, with recertification every three years | Risk-based, defined by the organisation |

These frameworks highlight the importance of continuous vulnerability scanning as a cornerstone of hybrid cloud security. For instance, CIS Controls v8 calls for automated scanning and remediation tracking under Control 7 [8], so building both in from the outset simplifies later audits.

AI and Automation for Vulnerability Management

Building on earlier discussions about improving scanning processes, AI and automation are reshaping how organisations handle vulnerability management. Traditional scanning methods struggle to keep pace with the complexity of hybrid cloud environments. Security teams are often overwhelmed by thousands of alerts each day, many of which turn out to be false positives or low-priority issues. This noise makes it harder to focus on genuine threats. AI-driven tools help tackle this by analysing patterns across massive datasets, learning to identify actual risks in a specific environment, and cutting false positives by up to 90% through contextual analysis [10]. This evolution towards AI-powered approaches enables more proactive and effective vulnerability detection.

AI-Driven Scanning and Predictive Analysis

Machine learning takes vulnerability scanning beyond the limitations of rule-based systems. By analysing data from sources like security feeds, configuration databases, and network logs, AI can detect anomalies across both on-premises and cloud environments. It also predicts which vulnerabilities are most likely to be exploited by studying historical breach data [9].

For example, the Exploit Prediction Scoring System (EPSS), when paired with AI, is reported to predict exploit likelihood with an accuracy of 82%, an insight that CVSS scores alone do not provide [13]. According to the Ponemon Institute's 2024 research, organisations using AI-driven vulnerability management tools report an average remediation time of just 48 hours, compared to 18 days for teams relying on manual processes [11]. Gartner anticipates that by 2026, 75% of enterprises will adopt AI for predictive vulnerability analytics, a significant jump from 20% in 2023 [12].

AI also learns the specific characteristics of your infrastructure. It can assess which systems are critical, which are exposed to external risks, and which vulnerabilities pose the most significant threats in context. For instance, a vulnerability with a high EPSS score on a non-critical development system might be deprioritised, while a lower-scored issue on a production database server could be flagged as urgent. These distinctions are handled automatically, ensuring resources are focused where they matter most.

In addition to identification and prediction, automation plays a crucial role in speeding up the remediation process.

Automating Remediation with Custom Solutions

Automation shifts vulnerability management from a reactive to a proactive approach, particularly in the complex landscape of hybrid clouds. Coordinating remediation across diverse environments can be challenging - whether it’s patching cloud instances, managing on-premises change processes, or applying fixes to containers and virtual machines simultaneously.

Hokstad Consulting offers tailored automation solutions to address these challenges. Their custom workflows streamline remediation across entire infrastructures, integrating seamlessly with existing CI/CD pipelines and patch management systems. For example, automation can simultaneously create tickets for on-premises systems and trigger cloud-native remediation APIs, ensuring consistent tracking and compliance. These workflows are designed to align with your current processes, whether that involves fully automated patching for low-risk vulnerabilities or staged updates tested in non-production environments first.

The success of automated remediation depends on clear policies that define which vulnerabilities can be addressed automatically and which require human oversight. Routine, low-risk fixes can be handled by automation, allowing security teams to focus on more complex decisions. Comprehensive logging ensures audit readiness, while automatic rollback capabilities add a layer of safety. By combining automation with expertise in DevOps practices, organisations can significantly reduce remediation times while maintaining the oversight needed for compliance and effective risk management.
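A remediation policy of this kind can be written down as explicit rules so that every decision is repeatable and logged. The Python sketch below is one hypothetical policy, not a recommended standard: the CVSS threshold of 4.0, the route names, and the audit-log format are all assumptions made for illustration.

```python
def remediation_route(cvss: float, is_production: bool, has_rollback: bool):
    """Return (route, reason) for a finding under an assumed example policy."""
    if not has_rollback:
        # Without automatic rollback there is no safety net, so a human decides.
        return "human_review", "no automatic rollback available"
    if cvss < 4.0:
        return "auto_patch", "low-risk fix with rollback available"
    if is_production:
        return "staged_rollout", "test in non-production before production"
    return "auto_patch", "non-production asset with rollback available"


audit_log = []


def decide(asset: str, cvss: float, is_production: bool, has_rollback: bool) -> str:
    """Route a finding and record the decision for audit purposes."""
    route, reason = remediation_route(cvss, is_production, has_rollback)
    audit_log.append({"asset": asset, "cvss": cvss, "route": route, "reason": reason})
    return route


# Hypothetical assets.
print(decide("dev-web-02", cvss=3.1, is_production=False, has_rollback=True))
print(decide("prod-db-01", cvss=8.8, is_production=True, has_rollback=True))
print(decide("legacy-erp", cvss=7.5, is_production=True, has_rollback=False))
```

Keeping the rules in one function makes the boundary between automated and human-approved fixes easy to review and change.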

Conclusion

The discussion above highlights the pressing need for a strong, forward-thinking scanning strategy in hybrid cloud environments. Managing assets across on-premises data centres, public clouds, and containers presents challenges that traditional security methods simply can't handle. By adopting proactive scanning, organisations can shift from reactive responses to continuous, real-time threat detection. This approach helps uncover vulnerabilities across networks, endpoints, applications, and serverless functions before attackers can take advantage.

The numbers paint a stark picture: global cyberattack costs are projected to hit £7.6 trillion in 2024. With figures like these, automation and AI-powered scanning are no longer optional - they’re critical for scaling security measures while easing the strain on already stretched security teams.

But the advantages go beyond just spotting threats. Continuous scanning also supports compliance efforts by generating audit-ready reports for standards like UK GDPR, ISO 27001, and PCI DSS. Addressing vulnerabilities early in the development cycle - using shift-left security practices - proves far more cost-effective than dealing with issues post-deployment or, worse, after a breach. Delayed action only leads to higher costs in time, resources, and financial losses.

To succeed, organisations need to embed vulnerability scanning into their CI/CD pipelines, automate asset discovery, and prioritise fixes based on context rather than just raw CVSS scores. Focusing on vulnerabilities that are likely to be exploited (as estimated by EPSS) and that affect critical production systems is key. It's equally important to confirm fixes with immediate re-scanning and track metrics like Mean Time to Remediate (MTTR) to gauge progress. These steps align with earlier discussions on integrating AI tools and automation into workflows.
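MTTR itself is straightforward to compute once detection and fix times are recorded. The Python sketch below assumes each record is a pair of detection time and fix time, with `None` for findings that are still open. The function name and the sample data are hypothetical.

```python
from datetime import datetime
from statistics import mean


def mttr_days(records):
    """Mean time to remediate in days, ignoring findings that are still open.

    records: iterable of (detected_at, fixed_at) pairs; fixed_at may be None.
    Returns None when no finding has been remediated yet.
    """
    durations = [
        (fixed - detected).total_seconds() / 86400
        for detected, fixed in records
        if fixed is not None
    ]
    return mean(durations) if durations else None


# Hypothetical remediation history.
history = [
    (datetime(2024, 1, 1), datetime(2024, 1, 4)),   # fixed after 3 days
    (datetime(2024, 1, 2), datetime(2024, 1, 7)),   # fixed after 5 days
    (datetime(2024, 1, 5), None),                   # still open
]

print(mttr_days(history))
```

Because open findings are excluded, it helps to report the count of unresolved items alongside MTTR so that a growing backlog does not hide behind a good average.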

Hokstad Consulting offers bespoke automation solutions designed to streamline these processes across various infrastructures. Their approach ensures consistent security enforcement and reduces remediation times from weeks to days. These strategies not only protect critical systems but also allow organisations to maintain the agility that hybrid cloud environments promise.

As hybrid cloud security continues to evolve, identity management is becoming the cornerstone of defence, replacing traditional network boundaries. By embracing automation, AI-driven detection, and risk-based prioritisation, organisations can stay ahead of emerging threats in this ever-changing landscape.

FAQs

How do I choose between agent-based and agentless scanning in a hybrid cloud?

When deciding between agent-based and agentless scanning, it largely comes down to your specific security requirements and the structure of your cloud environment.

  • Agentless scanning relies on APIs or remote methods to perform quick, non-intrusive scans. This approach works well for broad coverage across a mix of systems, for short-lived or stopped workloads, and for checks before deployment.

  • Agent-based scanning, on the other hand, uses lightweight agents installed on systems to provide continuous monitoring and more detailed runtime insights. This makes it a better choice for long-lived, critical systems that handle sensitive data.

Many organisations find that combining both approaches offers the best results, especially for hybrid cloud setups where comprehensive coverage is essential.

How often should I scan when workloads are short-lived (containers and cloud VMs)?

For workloads with short lifespans, such as containers and cloud VMs, run vulnerability scans continuously or at least once a day. An asset that lives for minutes or hours can be created and destroyed between weekly or monthly scans and never be assessed at all. Scanning images at build time and re-scanning on a frequent schedule helps catch and address potential threats promptly.

What’s the best way to prioritise fixes using CVSS, EPSS and asset criticality?

To prioritise fixes effectively, start with CVSS as a severity baseline, use EPSS to estimate how likely exploitation is in the next 30 days, and weight both by the importance of the affected asset. Factor in threat intelligence and business risks to ensure efforts are directed towards the most pressing issues. This approach helps align remediation efforts with broader risk management goals.