What Are Spot Instances and Preemptible VMs?

Spot Instances and Preemptible VMs are temporary cloud computing resources offered at significantly reduced costs. These options are ideal for tasks that can handle interruptions, such as batch processing, data analysis, or development environments. However, they are unsuitable for applications requiring guaranteed uptime. Here's a quick breakdown:

Spot Instances (AWS): Run on spare EC2 capacity, with savings of up to 90%. Interruptions occur unpredictably based on demand, with a 2-minute warning.
Preemptible VMs (Google Cloud): Maximum 24-hour runtime at discounts of 60-80%. They can be reclaimed earlier, but the hard 24-hour cap gives scheduling a known upper bound.

Both options reduce costs but require workloads to tolerate interruptions. Spot Instances offer greater savings but demand more complex management, while Preemptible VMs provide predictable planning. Choose based on your workload's flexibility.

1. Spot Instances

Spot instances offer a way to access unused server capacity at heavily discounted rates. Instead of paying standard prices for guaranteed resources, organisations run workloads on idle computing power that the cloud provider can reclaim when it needs the capacity back.

Pricing Model

The cost of spot instances is driven by market demand. When demand for regular instances is low, spot prices can drop by 60–90% compared to on-demand rates. However, these prices aren't fixed - they fluctuate based on supply and demand in each availability zone.
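To make the discount concrete, the arithmetic can be sketched as below. This is a minimal illustration only: the function name `spot_savings` and the hourly prices in the usage note are hypothetical, not real quotes from any provider.

```python
def spot_savings(on_demand_hourly: float, spot_hourly: float, hours: float) -> dict:
    """Compare the cost of running for `hours` at on-demand and spot rates."""
    if on_demand_hourly <= 0 or spot_hourly < 0 or hours < 0:
        raise ValueError("prices and hours must be non-negative, on-demand price positive")

    on_demand_cost = on_demand_hourly * hours
    spot_cost = spot_hourly * hours
    return {
        "on_demand_cost": on_demand_cost,
        "spot_cost": spot_cost,
        "saved": on_demand_cost - spot_cost,
        # Discount relative to the on-demand rate, as a percentage.
        "discount_pct": 100 * (1 - spot_hourly / on_demand_hourly),
    }
```

For example, `spot_savings(0.10, 0.03, 100)` compares 100 hours at an assumed on-demand rate of 0.10 per hour with an assumed spot rate of 0.03 per hour, which falls inside the 60–90% range described above. Because spot prices move, a real estimate would use an average over the period rather than a single rate.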

Earlier versions of this model worked as an auction in which users bid against each other. Today, users pay the current spot price and can optionally set a maximum hourly price they are willing to pay. This makes spot instances an appealing option for cost-sensitive workloads that can handle some uncertainty in pricing.

That said, the variability in pricing ties directly to challenges with availability.

Availability and Duration Limits

Spot instances come with availability limitations. Their existence depends on surplus capacity in the provider's infrastructure, which means their availability can vary widely depending on overall demand.

The duration of spot instances is unpredictable. They might run for just a few minutes or continue for several days, depending on market conditions. Since there's no guaranteed runtime, applications using spot instances need to be designed to handle frequent interruptions.

Some instance types and regions offer better availability than others. For instance, newer, high-performance instance types often have limited spot availability due to higher demand for standard workloads.

Eviction Policies

Eviction is a major consideration when using spot instances. Providers can terminate these instances when demand rises. Typically, you'll receive a short warning - around two minutes - before the instance is shut down.

During this brief warning period, applications can attempt to save progress, transfer data to persistent storage, or shut down gracefully. However, two minutes is often insufficient for complex operations.
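On AWS, the warning is exposed through the instance metadata service, so a small watcher process can poll for it and trigger a shutdown routine. The sketch below assumes it runs on an EC2 instance with IMDSv2 available; the function names, the polling interval, and the `on_notice` callback are illustrative choices rather than a prescribed implementation.

```python
import json
import time
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"  # EC2 instance metadata service


def fetch_instance_action(timeout: float = 1.0):
    """Return the interruption notice as a dict, or None if none is scheduled."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        action_req = urllib.request.Request(
            f"{IMDS}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(action_req, timeout=timeout) as resp:
            return json.loads(resp.read().decode())
    except urllib.error.HTTPError as err:
        if err.code == 404:  # 404 means no interruption is currently scheduled
            return None
        raise


def wait_for_interruption(on_notice, fetch=fetch_instance_action,
                          poll_seconds: float = 5.0, max_polls=None,
                          sleep=time.sleep):
    """Poll until a notice appears, call on_notice(notice), and return it.

    Returns None if max_polls is reached without seeing a notice.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        notice = fetch()
        if notice is not None:
            on_notice(notice)  # e.g. flush state, drain connections
            return notice
        polls += 1
        sleep(poll_seconds)
    return None
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised off-cloud with a stub. Whatever `on_notice` does has to finish well inside the two-minute window, which is why long-running saves are better done incrementally than at eviction time.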

Evictions occur for several reasons: when the spot price rises above your maximum price, when the provider reallocates capacity to on-demand customers, or during infrastructure maintenance. Once an eviction begins, it cannot be delayed or reversed.

Workload Suitability

Spot instances are best suited for stateless applications that can restart without issues. Ideal use cases include batch processing jobs, data analysis tasks, and rendering workflows - these can either resume from checkpoints or restart entirely with minimal impact.
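Resuming from a checkpoint can be as simple as recording the index of the next unprocessed item after each step. The following is a minimal sketch, assuming the work is an ordered list of independent items and that the checkpoint file lives on storage that survives the instance; `run_batch`, `should_stop`, and the JSON layout are hypothetical names chosen for the example.

```python
import json
import os


def _save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint atomically so an eviction cannot leave it half-written."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump({"next_index": next_index}, fh)
    os.replace(tmp_path, path)


def run_batch(items, process, checkpoint_path, should_stop=lambda: False) -> int:
    """Process items in order, resuming from the last checkpoint if one exists.

    Returns the index of the next unprocessed item (len(items) when finished).
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            start = json.load(fh)["next_index"]

    for i in range(start, len(items)):
        if should_stop():  # e.g. set when an interruption notice arrives
            return i
        process(items[i])
        _save_checkpoint(checkpoint_path, i + 1)
    return len(items)
```

If the instance is evicted between `process` and the checkpoint write, that one item is processed again on restart, so `process` should be idempotent. A replacement instance calling `run_batch` with the same checkpoint path picks up where the previous one stopped.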

Development and testing environments are another great fit for spot instances. Since these workloads don't require continuous uptime, the cost savings allow teams to scale their environments without overspending.

Similarly, CI/CD pipelines benefit from spot instances, especially for build and test stages that have predictable durations. The key is ensuring that interruptions don't disrupt the build process or result in lost data.

On the other hand, database servers, live web applications, and real-time processing systems should avoid spot instances. These workloads rely on consistent availability, and unexpected shutdowns could lead to downtime, data loss, or user dissatisfaction.

Selecting the right workloads for spot instances is essential to achieving cost savings without compromising performance. For organisations aiming to optimise their cloud spending, Hokstad Consulting provides expert guidance on identifying suitable workloads and integrating spot instances into a resilient cloud strategy. Proper architectural planning is critical to maximising the benefits of spot instances while maintaining operational stability.

2. Preemptible VMs

Unlike spot instances, whose price and availability follow market demand, preemptible VMs come with a known maximum runtime, making it easier to plan workloads. These VMs take advantage of surplus capacity, offering significantly reduced costs compared to standard VMs [1][3]. This pricing model allows cloud providers to pass savings from unused resources directly to users.

Pricing Model

Descartes Labs CEO Mark Johnson highlights the impact of this pricing approach:

The Preemptible VM pricing model is a game changer for a seed-funded startup like ours, because of the significant cost reduction. We're excited to continue using them in the future as we increase the amount of data we process to identify and determine the health of global crops. [4]

While charges for premium operating systems remain unchanged, discounts apply to local SSDs and GPUs [1]. The capped runtime of preemptible VMs further simplifies task scheduling, distinguishing them from spot instances.

Availability and Duration Limits

Preemptible VMs come with a maximum runtime of 24 hours, after which they are automatically terminated. They can also be reclaimed earlier if the provider needs the capacity, so 24 hours is an upper bound rather than a guarantee. This limitation makes them suitable for specific tasks like batch jobs and data processing, where uninterrupted availability isn't critical. By contrast, standard VMs provide continuous uptime supported by Service Level Agreements (SLAs) [1][2][3].
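A scheduler can use the 24-hour cap to decide whether a job is worth starting on a VM that has already been running for a while. The sketch below is one way to express that check; the function names and the 15-minute safety margin are assumptions for illustration, and the check says nothing about earlier preemption.

```python
from datetime import datetime, timedelta, timezone

MAX_RUNTIME = timedelta(hours=24)  # hard cap on a preemptible VM's lifetime


def remaining_runtime(started_at: datetime, now: datetime) -> timedelta:
    """Time left before the 24-hour cap, never negative."""
    return max(MAX_RUNTIME - (now - started_at), timedelta(0))


def fits_before_cutoff(started_at: datetime, now: datetime,
                       estimated_duration: timedelta,
                       safety_margin: timedelta = timedelta(minutes=15)) -> bool:
    """True if the job, plus a safety margin, should finish before the cap."""
    return estimated_duration + safety_margin <= remaining_runtime(started_at, now)


# Usage: a VM started 20 hours ago has 4 hours left.
start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
now = start + timedelta(hours=20)
can_run_three_hour_job = fits_before_cutoff(start, now, timedelta(hours=3))
can_run_four_hour_job = fits_before_cutoff(start, now, timedelta(hours=4))
```

Jobs that do not fit can be queued for a fresh VM instead of being started and lost at the cutoff.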

Workload Suitability

The 24-hour limit suits workloads that can be completed within this timeframe. These VMs are particularly well-suited for batch processing, data analysis, and other tasks where minor interruptions are manageable. Their known runtime ceiling aligns with cost-saving strategies similar to those used for spot instances.

For organisations aiming to reduce cloud infrastructure costs without sacrificing operational efficiency, preemptible VMs offer a practical solution. Hokstad Consulting provides expertise in cloud cost engineering, helping businesses create tailored strategies that maximise savings while meeting operational needs.

Benefits and Drawbacks

Spot instances and preemptible VMs can significantly reduce costs, but they come with distinct limitations. Knowing these trade-offs is essential for shaping an effective cloud infrastructure strategy.

The standout advantage is cost savings. Spot instances can slash costs by up to 90% compared to on-demand pricing, though the actual savings depend on market demand. Preemptible VMs offer a more predictable discount of 60-80% and come with a 24-hour runtime cap, which simplifies scheduling and planning.

Interruption risk is a key difference. Spot instances are subject to market-driven interruptions, which can disrupt workflows unexpectedly. Preemptible VMs can also be reclaimed early, but their 24-hour maximum runtime gives teams a known upper bound to plan workloads around.

Integration complexity also varies. Spot instances demand more advanced strategies, such as price monitoring and automated interruption handling, making their implementation more challenging. Preemptible VMs are simpler to integrate, requiring a focus on scheduling workloads within the 24-hour constraint rather than navigating fluctuating market dynamics.

Here’s a quick comparison of the two:

Aspect	Spot Instances	Preemptible VMs
Cost Savings	Up to 90% (variable)	60-80% (consistent)
Interruption Risk	Market-driven, unpredictable	Predictable 24-hour limit
Integration Complexity	High (price monitoring, interruption handling)	Moderate (scheduling focus)
Workload Suitability	Fault-tolerant, flexible timing	Batch jobs, time-bounded tasks
Planning Difficulty	Complex price forecasting	Straightforward time management

Spot instances are ideal for applications that can handle unexpected interruptions, such as web crawling, image processing, or development environments. On the other hand, preemptible VMs work well for batch-processing tasks, data analysis, or computational workloads that can be completed within the 24-hour timeframe.

The operational overhead also differs. Spot instances require continuous market monitoring and automated failover systems to handle interruptions. Preemptible VMs, meanwhile, depend on efficient scheduling to maximise their fixed runtime.

Choosing between these two options depends on your workload’s flexibility and tolerance for interruptions. Hokstad Consulting can help refine this decision with tailored strategies to optimise cloud costs and performance.

Conclusion

Spot instances and preemptible VMs present an opportunity to tap into unused cloud capacity at a fraction of the usual cost. The decision between the two hinges on how well your workloads can handle interruptions and your specific operational needs.

Spot instances are ideal for organisations aiming to push cost savings to the limit. However, they demand a solid setup, including constant monitoring and automated failover systems, to handle unexpected terminations effectively. On the other hand, preemptible VMs, while also prone to termination, offer a more predictable option for workloads that can manage planned interruptions, often simplifying their management.

For UK businesses, the key lies in finding the right balance between cost efficiency and operational stability. Companies with highly resilient systems can benefit from the greater savings of spot instances, while those with steadier workflows may lean towards the reliability of preemptible VMs.

FAQs

How do I know if Spot Instances or Preemptible VMs are right for my workload?

Spot Instances and Preemptible VMs work best for tasks that are interruption-friendly and can adjust to flexible timing. They're a great fit for workloads like batch processing, data analysis, machine learning training, or any job that can be paused or restarted without causing major disruptions.

On the other hand, if your workload is stateful, demands constant availability, or cannot handle unexpected interruptions, these instances might not be the right choice. It’s important to thoroughly assess your workload's needs and think about how interruptions could impact your operations before deciding to use them.

How can I reduce the risks of using Spot Instances or Preemptible VMs?

To reduce the risks tied to Spot Instances and Preemptible VMs, it's crucial to build fault-tolerant workloads that can adapt seamlessly to interruptions. A smart way to balance reliability and cost savings is by combining Spot Instances with on-demand instances.

You can minimise disruptions even further by using a variety of instance types and distributing workloads across multiple availability zones. Adding automation to handle preemption notices - like scaling or migrating workloads as needed - can also keep things running smoothly. These approaches help you save on costs without sacrificing performance.
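The combination of an on-demand baseline with diversified spot capacity can be sketched as a simple planning function. This is an illustration under stated assumptions: `plan_capacity` is a hypothetical helper, the pools are (instance type, zone) pairs chosen as examples, and a real deployment would normally delegate this to the provider's autoscaling features.

```python
from itertools import cycle


def plan_capacity(total: int, on_demand_percent: int, spot_pools: list) -> dict:
    """Split `total` instances into an on-demand baseline and spot capacity.

    Spot instances are spread round-robin across pools so that losing one
    instance type or availability zone removes only part of the fleet.
    """
    if total < 0 or not 0 <= on_demand_percent <= 100:
        raise ValueError("total must be >= 0 and on_demand_percent within 0..100")

    # Integer ceiling division: round the on-demand baseline up, not down.
    on_demand = -(-total * on_demand_percent // 100)
    spot_total = total - on_demand
    if spot_total > 0 and not spot_pools:
        raise ValueError("at least one spot pool is required")

    allocation = {pool: 0 for pool in spot_pools}
    for _, pool in zip(range(spot_total), cycle(spot_pools)):
        allocation[pool] += 1
    return {"on_demand": on_demand, "spot": allocation}


# Usage with illustrative pools: 10 instances, 30% kept on-demand.
pools = [("m5.large", "eu-west-2a"), ("m5.large", "eu-west-2b"), ("c5.large", "eu-west-2a")]
plan = plan_capacity(10, 30, pools)
```

Here the plan keeps three instances on-demand and spreads the remaining seven across the three pools, so a capacity shortage in any single pool interrupts at most three instances.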

How does Spot Instance pricing work, and what should you consider before using Spot Instances?

Spot Instances offer a budget-friendly way to utilise unused cloud capacity, with prices that adjust dynamically based on supply and demand. The days of setting manual bid prices are gone - instances are now automatically allocated as long as the current spot price fits within your budget.

When using Spot Instances, keep these considerations in mind:

Check historical pricing trends to get a sense of potential costs.
Assess how well your application handles interruptions, since Spot Instances can be terminated when demand spikes.
Design for scalability and fault tolerance, so critical operations remain unaffected even if instances are interrupted.
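The first consideration, checking historical pricing, can be sketched as a small summary over a list of past hourly prices. The helper name `summarise_prices` and the sample figures are hypothetical; in practice the history would come from the provider (on AWS, for instance, the EC2 spot price history API).

```python
from statistics import mean


def summarise_prices(prices: list, max_price: float) -> dict:
    """Summarise historical spot prices against a maximum price you would accept."""
    if not prices:
        raise ValueError("at least one price sample is required")

    within_cap = sum(1 for price in prices if price <= max_price)
    return {
        "min": min(prices),
        "mean": mean(prices),
        "max": max(prices),
        # Share of samples at or below the cap, as a percentage.
        "within_cap_pct": 100 * within_cap / len(prices),
    }


# Usage with made-up samples and a made-up cap of 0.05 per hour.
summary = summarise_prices([0.03, 0.04, 0.05, 0.08], max_price=0.05)
```

A low share of samples under the cap suggests the instance type or zone is frequently in demand, which is a hint to widen the set of instance types or raise the cap before relying on it.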

These instances work well for tasks like batch processing or testing, where occasional interruptions are acceptable and saving money is a key objective.