What is container-based autoscaling?
Whether you’re using a managed Kubernetes service like AWS EKS, GCP GKE, or Azure AKS, or self-managing a DIY cluster deployed with open source tools like kops or Kubespray, the underlying hardware can vary from node to node. Each container requires specific resources (CPU/memory/GPU/network/disk), and as long as the underlying infrastructure can provide those resources, the container can execute its business logic.
In practice, when scheduling pods, Kubernetes checks that their resource requests can be met, but it does not consider the size, type, or price of the instance a container runs on, only that the instance has enough capacity. As a result, users often over-provision infrastructure to ensure applications stay operational, which leads to significant cloud waste and unnecessary expense.
Container-driven autoscaling is an approach that aims to reduce overprovisioning and enable high performance by providing containers with the most suitable infrastructure possible, based on specified application-level requirements and constraints. Container and pod characteristics such as labels, taints, and tolerations define the type of instance a pod can be scheduled on.
The importance of choosing the right type of machine
Container orchestration platforms such as Kubernetes require users to manage scaling of the underlying infrastructure (e.g. adding/removing nodes from the cluster) as well as scaling at the workload level. Kubernetes offers pod scaling services to add and remove pod replicas (Horizontal Pod Autoscaler) and to change resources for specific containers (Vertical Pod Autoscaler), but it has no native capability to scale or manage the underlying infrastructure.
Managing the infrastructure of a Kubernetes cluster is a difficult task and often requires supporting multiple machine types with various hardware features to meet application needs. To do this efficiently, users end up managing multiple node groups, each supporting one machine type (or several types with similar hardware capabilities). Each application has its own requirements for instance shapes and sizes, and as the number of node groups grows, so does the amount of wasted infrastructure resources.
Learn more about this challenge and explore examples of how choosing the right machine type can lead to dramatic cost reductions in the following blog post.
How does Ocean work?
Ocean, Spot by NetApp’s serverless infrastructure engine, takes the container-driven scaling approach and allows users to support an unlimited number of machine types and sizes in a single cluster. This considerably simplifies infrastructure management and results in substantial cost reductions. By monitoring events at the container level, Ocean can automatically scale the right size and type of infrastructure to meet application requirements, at the lowest possible cost.
In this article, we will explain how Ocean by Spot implements the container-driven autoscaling approach.
Ocean integrates with the Kubernetes cluster using a controller pod, which communicates with the Kubernetes API on one side and the Ocean SaaS backend on the other.
The controller retrieves metadata from Kubernetes resources deployed on the cluster (e.g. Pods, Nodes, Deployments, DaemonSets, Jobs, StatefulSets, PersistentVolumes, PersistentVolumeClaims, etc.) and sends it to the Ocean SaaS. The Ocean SaaS analyzes real-time workload requirements and makes smart decisions to scale the cluster up and/or down.
In effect, the Ocean autoscaler constantly simulates the actions of the Kubernetes scheduler and acts accordingly to satisfy all Kubernetes resource requirements.
Ocean supports the following Kubernetes configurations:
- Resource requests (CPU, memory, and GPU)
- Required affinity and anti-affinity rules
- Taints and tolerations
- Well-known labels, annotations, and taints
- Spot proprietary labels and taints
- The cluster-autoscaler.kubernetes.io/safe-to-evict: false label
- Pod Disruption Budgets
- PersistentVolumes and PersistentVolumeClaims
- Pod topology spread constraints
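To make the list above concrete, here is a hedged sketch of a pod spec that combines several of those configurations: resource requests, a required node affinity, a toleration, and a topology spread constraint. All names and values are hypothetical examples, modeled as a Python dict and serialized to JSON for readability:

```python
import json

# Hypothetical pod spec illustrating configurations Ocean honors when
# choosing infrastructure. Names, images, and values are examples only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        "containers": [{
            "name": "web",
            "image": "nginx:1.25",
            # Resource requests drive the size of the node Ocean picks.
            "resources": {"requests": {"cpu": "500m", "memory": "256Mi"}},
        }],
        # Required node affinity restricts the eligible machine types.
        "affinity": {"nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{"matchExpressions": [{
                    "key": "kubernetes.io/arch",
                    "operator": "In",
                    "values": ["amd64"],
                }]}],
            },
        }},
        # Tolerations allow scheduling onto tainted nodes.
        "tolerations": [{
            "key": "dedicated", "operator": "Equal",
            "value": "web", "effect": "NoSchedule",
        }],
        # Spread replicas across availability zones.
        "topologySpreadConstraints": [{
            "maxSkew": 1,
            "topologyKey": "topology.kubernetes.io/zone",
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": {"matchLabels": {"app": "web"}},
        }],
    },
}
print(json.dumps(pod, indent=2))
```

A spec like this tells any scheduler-aware autoscaler not just how much capacity a pod needs, but which kinds of nodes are acceptable at all.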
If needed, Ocean will decide to increase the cluster’s worker node capacity by adding new nodes to the cluster.
There are two main reasons for a scale-up event:
- Pending pods that the Kubernetes scheduler cannot place on existing nodes
- Missing headroom (spare capacity below the configured target)
- The controller pod sends pending pods’ metadata as soon as they appear in the cluster.
- Every minute, Ocean checks whether there are pending pods and verifies that headroom is at its desired level.
- Analyze the required resources: for each pending pod, Ocean checks the requested resources and the pod’s configured constraints. Next, it identifies the appropriate Virtual Node Group (VNG) that can host the pod (in terms of supported node labels, taints/tolerations, and instance types). Finally, a node request is triggered against the instance selection process, specifying the list of supported machine types and sizes.
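The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Ocean’s actual VNG-matching logic is proprietary, and the VNG definitions, pod fields, and function names here are hypothetical:

```python
# Hypothetical sketch of matching a pending pod to a Virtual Node Group
# (VNG): the pod's node selector must be satisfied by the VNG's labels,
# and every taint on the VNG must be tolerated by the pod.

def tolerates(pod_tolerations, taint):
    """True if any toleration matches the taint's key (and value)."""
    return any(
        t.get("key") == taint["key"]
        and (t.get("operator") == "Exists" or t.get("value") == taint.get("value"))
        for t in pod_tolerations
    )

def matching_vngs(pod, vngs):
    """Return the names of VNGs that could host the pod."""
    matches = []
    for vng in vngs:
        selector_ok = all(
            vng["labels"].get(key) == value
            for key, value in pod.get("nodeSelector", {}).items()
        )
        taints_ok = all(
            tolerates(pod.get("tolerations", []), taint)
            for taint in vng.get("taints", [])
        )
        if selector_ok and taints_ok:
            matches.append(vng["name"])
    return matches

# Two example VNGs: a general-purpose pool and a tainted GPU pool.
vngs = [
    {"name": "general", "labels": {"pool": "general"}, "taints": []},
    {"name": "gpu", "labels": {"pool": "gpu"},
     "taints": [{"key": "nvidia.com/gpu", "value": "true"}]},
]
pod = {"nodeSelector": {"pool": "gpu"},
       "tolerations": [{"key": "nvidia.com/gpu", "operator": "Exists"}]}
print(matching_vngs(pod, vngs))  # ['gpu']
```

Only once a pod maps to at least one VNG does it make sense to ask which concrete instance type and size to launch.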
Instance selection: this process is responsible for maximizing cost efficiency. It prioritizes instances in the following order:
- Unused Reserved Instances (RIs) or Savings Plans (SPs)
- Spot instances, using Spot.io’s unique spot instance selection process that incorporates market scoring, cost, and the current distribution of instances in the cluster
- If spot instance capacity is unavailable, the system automatically falls back to On-Demand (OD) instances, backed by a 99.99% SLA, to ensure workloads continue to run
- Once spot capacity becomes available again, Ocean automatically reverts to spot instances, gracefully replacing the OD nodes. This enables continuous infrastructure optimization, allowing users to save money without manual intervention each time
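The priority order above can be sketched as a simple two-level sort. This is a hedged illustration only; Ocean’s real pricing and selection logic is proprietary and considerably more involved, and the candidate list and prices here are made up:

```python
# Hypothetical sketch of the instance-selection priority described above:
# prefer unused RIs/Savings Plans, then spot capacity, then on-demand.

PRIORITY = {"reserved": 0, "spot": 1, "on-demand": 2}

def pick_instance(candidates):
    """candidates: list of (lifecycle, hourly_price) tuples.

    Sort by lifecycle tier first, then by price within a tier.
    """
    return min(candidates, key=lambda c: (PRIORITY[c[0]], c[1]))

candidates = [
    ("on-demand", 0.096),
    ("spot", 0.029),
    ("reserved", 0.0),  # already paid for, effectively free to use
]
print(pick_instance(candidates))  # ('reserved', 0.0)
```

If the reserved candidate is removed (no unused RIs/SPs), the same sort falls through to the cheapest spot market, and only then to on-demand, mirroring the fallback behavior described above.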
Termination of instances
Ocean works to ensure graceful termination of nodes and pods in the cluster. Since a Kubernetes cluster typically runs many dynamic workloads, there usually comes a time when some of the nodes in the cluster are no longer needed. When this happens, Ocean identifies underutilized nodes and bin-packs pods more efficiently to achieve better resource allocation. Every minute, Ocean simulates whether any running pods (starting with the least utilized nodes) can be gracefully moved to other nodes in the cluster. If so, Ocean drains those nodes (cordons them, then gracefully evicts their pods) while adhering to Pod Disruption Budgets (PDBs), ensuring continued infrastructure optimization and increased cloud savings.
When scaling down a node, Ocean uses a configurable drain timeout of at least 300 seconds. During this time, Ocean marks the node as unschedulable and evicts all pods running on it (respecting each pod’s configured terminationGracePeriodSeconds).
- When a pod is evicted, a replacement pod is created on a different node in the cluster.
- After the drain timeout expires, Ocean terminates the node and deletes any pods that were not successfully evicted.
- Ocean’s drain process takes PodDisruptionBudgets into account and evicts pods while respecting the configured PDBs (when possible).
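The scale-down steps above can be modeled as a small control-flow sketch. This is not Ocean’s implementation (real drains go through the Kubernetes eviction API); the node/PDB structures and function name below are hypothetical, and PDB accounting is heavily simplified:

```python
# Hypothetical sketch of the drain flow described above: cordon the
# node, evict each pod only if its PDB allows it, keep the rest.

def drain_node(node, pdbs):
    node["unschedulable"] = True  # cordon: no new pods land here
    remaining = []
    for pod in node["pods"]:
        pdb = pdbs.get(pod["app"])
        if pdb is None or pdb["healthy"] - 1 >= pdb["min_available"]:
            if pdb:
                pdb["healthy"] -= 1  # eviction reduces healthy replicas
            # (a replacement pod would be created on another node)
        else:
            remaining.append(pod)  # eviction would violate the PDB; skip
    node["pods"] = remaining
    return node

node = {"pods": [{"app": "web"}, {"app": "db"}], "unschedulable": False}
pdbs = {"web": {"min_available": 1, "healthy": 3},
        "db": {"min_available": 2, "healthy": 2}}
drain_node(node, pdbs)
print([p["app"] for p in node["pods"]])  # ['db'] (blocked by its PDB)
```

In the sketch, the "db" pod stays put because evicting it would drop its healthy replicas below the PDB minimum; a real drain would retry such evictions until the timeout expires.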
Some workloads aren’t as resilient to instance replacement as others, so you might want to prevent node replacement while still benefiting from spot instance pricing. A good example is batch jobs/processes that need to complete their work without being interrupted by the Ocean autoscaler.
Ocean makes it easy to prevent scale-down of nodes running pods configured with either of the following labels:
- spotinst.io/restrict-scale-down:true – a proprietary Spot label (see additional Spot labels) that can be configured at the pod level. When set, it instructs the Ocean autoscaler not to scale down a node running a pod with this label.
- cluster-autoscaler.kubernetes.io/safe-to-evict: false – a cluster-autoscaler label that works the same way as the restrict-scale-down label. Ocean supports it to make migration from cluster-autoscaler to Ocean easy.
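For illustration, either label goes in the pod’s (or pod template’s) metadata. A minimal sketch, again modeled as a Python dict; in practice only one of the two labels is needed:

```python
import json

# Minimal sketch: pod metadata carrying the scale-down protection
# labels described above. In practice you would set just one of them.
pod_template = {
    "metadata": {
        "labels": {
            # Spot proprietary label:
            "spotinst.io/restrict-scale-down": "true",
            # Equivalent cluster-autoscaler label Ocean also honors:
            "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
        }
    }
}
print(json.dumps(pod_template, indent=2))
```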
Container instance replacement
There are several scenarios in which Ocean actively replaces existing nodes in the cluster:
- Revert to Spot – Ocean launched on-demand nodes due to a lack of spot capacity. Once spot capacity becomes available again, Ocean strives to replace these on-demand nodes with spot instances to maximize savings.
- Utilize RIs/SPs – Ocean recognizes that there are unused Reserved Instances or Savings Plans and endeavors to replace existing nodes with on-demand nodes that utilize those reservations.
- Self-healing – Ocean detects that a node has become unhealthy and works to replace it.
- Predictive rebalancing – Ocean predicts that a spot interruption is about to occur and proactively replaces the node.
- Spot interruption – the instance is reclaimed by the cloud provider.
- Cluster roll – a user or a scheduled task triggers a cluster roll, and all (or a subset) of the nodes in the cluster are gradually replaced.
Ocean makes smart decisions to further optimize the cluster and ensure the required capacity is available.
Instance replacement behavior:
- First, Ocean will launch a new instance to replace the old one.
- Once the new machine is registered with the cluster, Ocean starts draining the old instance, similar to a scale-down event.
In the event of a spot interruption, Ocean replaces the instance and immediately initiates draining of the old instance, without waiting for the new instance to register as healthy in the cluster. In cases like this, having some spare capacity in the form of headroom is very useful, as it allows pods to be drained safely.
Spot Ocean provides a continuously optimized, container-driven autoscaler to ensure infrastructure availability at the lowest possible cost. To learn more about Ocean, check out this demo, or head to https://docs.spot.io/ocean/ to get started.