Mastering Fault Tolerance: Ensuring High Availability for Kubernetes Pods
Are Kubernetes pods highly available by default? Only partially! :p
Let's consider a scenario: you have provisioned a cluster (let's stick with EKS for simplicity and my preference) with node groups spread across 3 availability zones, giving you 6 worker nodes. Now, when you schedule a deployment with 3 replicas, can you guarantee that they will be evenly spread across those nodes?
The answer is no!
In my experience, you can't rely on such an even distribution out of the box. However, Kubernetes does provide a built-in way to address this problem; it just requires some preparation and planning during the setup stage. Let's delve into the details in this blog post.
Strategies for Making Pods Highly Available in Kubernetes
Node affinity: Node affinity is similar to nodeSelector but gives you additional flexibility. There are two types of node affinity:
requiredDuringSchedulingIgnoredDuringExecution: The scheduler can't schedule the Pod unless the rule is met. This functions like nodeSelector, but with a more expressive syntax.**
preferredDuringSchedulingIgnoredDuringExecution: The scheduler tries to find a node that meets the rule. If a matching node is not available, the scheduler still schedules the Pod.**
Node affinity lets us attract pods to specific nodes based on node labels and constraints, as shown in the sketch below.
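For illustration, here is a minimal sketch of a Pod that combines both types. The topology.kubernetes.io/zone key is a standard node label; the disktype: ssd label and the zone values are assumptions for this example:

---
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only schedule onto nodes in one of these zones
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - us-east-1a
                  - us-east-1b
      # Soft rule: prefer SSD-backed nodes when they are available
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: disktype
                operator: In
                values:
                  - ssd
  containers:
    - name: app
      image: nginx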
To spread the pods across the worker nodes, we use:
Pod Topology Spread Constraints
TopologySpreadConstraints describes how a group of pods ought to spread across topology domains. The scheduler will schedule pods in a way that abides by the constraints. All topologySpreadConstraints are ANDed.**
Here is an example Pod definition:
---
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  labels:
    app-name: example-pod
spec:
  # Configure a topology spread constraint
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      # Spreading is calculated over the pods that match this selector
      labelSelector:
        matchLabels:
          app-name: example-pod
  ### other Pod fields go here
maxSkew - defines the degree to which pods may be unevenly distributed: the maximum permitted difference in the number of matching pods between any two topology domains. It must be greater than zero; with maxSkew: 1, the pod counts in any two domains can differ by at most one.
For example, in a cluster with three zones and maxSkew: 1, a deployment with 5 replicas may end up with 2 pods in zone 1, 2 in zone 2, and 1 in zone 3 (a 2/2/1 distribution). If the deployment is scaled up to 6 replicas, the new pod must land in zone 3, giving a 2/2/2 distribution.
topologyKey - in simplified terms, the node label used to group nodes into topology domains: all nodes that carry the same value for this label form one domain. With kubernetes.io/hostname, every node is its own domain, so pods are spread across individual nodes; with topology.kubernetes.io/zone, pods are spread across availability zones.
labelSelector - selects the matching pods over which the spread is calculated.
The definition above deploys and spreads the pods across the nodes, keeping the workload highly available.
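To tie this back to the zone example earlier, here is a hedged sketch of a Deployment that spreads 6 replicas across availability zones instead of individual nodes. The web-app name and its labels are assumptions for this example:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app-name: web-app
  template:
    metadata:
      labels:
        app-name: web-app
    spec:
      topologySpreadConstraints:
        # Keep the zone-to-zone difference in matching pods at most 1,
        # so 6 replicas across 3 zones settle into a 2/2/2 distribution
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app-name: web-app
      containers:
        - name: web
          image: nginx

Using whenUnsatisfiable: DoNotSchedule makes the constraint hard: a pod stays Pending rather than violating the spread, whereas ScheduleAnyway only treats it as a preference.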
Other Useful Tools:
NodeLabeller - automatically labels nodes that have GPUs (for example, the AMD k8s-node-labeller), so those nodes can be targeted with node affinity.
Descheduler - evicts pods that violate placement policies (including topology spread constraints) so they can be rescheduled, helping utilize node resources effectively; see the policy sketch below.
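As a rough sketch, a descheduler policy that re-enforces topology spread constraints could look like the following. This uses the v1alpha1 policy format; verify the schema against the descheduler version you run:

---
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  # Evict pods that currently violate their topology spread constraints
  # so the default scheduler can place them on better-suited nodes
  "RemovePodsViolatingTopologySpreadConstraint":
    enabled: true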
Thank you!
** Definition from the Kubernetes documentation.
References:
https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller