Kubernetes node redundancy
Description
Test your Kubernetes cluster's node redundancy by dropping all network traffic to a node.
Hypothesis
When a node fails, Kubernetes automatically re-routes traffic to healthy nodes and recreates failed pods.
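One way to check this hypothesis after the experiment is to count how many Ready pods are running on nodes other than the targeted one. The sketch below is a minimal illustration, not part of the Scenario itself. It assumes the pod list has the shape produced by `kubectl get pods -o json`; the function name `pods_recovered`, the node names, and the sample data are hypothetical.

```python
def pods_recovered(pod_list, failed_node, desired_replicas):
    """Return True if at least desired_replicas Ready pods run off failed_node.

    pod_list is assumed to be a dict shaped like `kubectl get pods -o json`.
    """
    healthy = 0
    for pod in pod_list.get("items", []):
        # Pods still bound to the unreachable node do not count as recovered.
        if pod.get("spec", {}).get("nodeName") == failed_node:
            continue
        conditions = pod.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True"
               for c in conditions):
            healthy += 1
    return healthy >= desired_replicas


def _pod(node, ready):
    """Build a minimal, hypothetical pod record for the example below."""
    return {
        "spec": {"nodeName": node},
        "status": {"conditions": [{"type": "Ready",
                                   "status": "True" if ready else "False"}]},
    }


# Hypothetical sample: node-a is the node whose traffic was dropped.
SAMPLE = {"items": [
    _pod("node-a", True),   # stale status reported before the node went dark
    _pod("node-b", True),
    _pod("node-c", True),
    _pod("node-c", False),  # replacement pod that is not Ready yet
]}

if __name__ == "__main__":
    print(pods_recovered(SAMPLE, "node-a", desired_replicas=2))
```

In practice, the dictionary would come from parsing the JSON output of `kubectl get pods -o json` for the workload under test, and `desired_replicas` would match the replica count you expect to stay available.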
Why run this Scenario?
Network connections can fail for any number of reasons, including:
- Failures in downstream systems (dependencies) or networking hardware.
- Misconfigured firewall and router rules, as in the 2021 Facebook outage.
- Saturation caused by unexpected surges in user traffic, high-bandwidth data transfers, or other causes.
- Applications with poorly configured connection, timeout, and/or retry logic.
With this Scenario, you can validate that:
- Your Kubernetes cluster can gracefully handle losing a node.
- Your control plane (master) nodes are redundant and won't create a split-brain situation.
- Load balancers and API gateways are configured properly.
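The split-brain point above comes down to quorum arithmetic. As a rough sketch, assuming the control plane (master) nodes are backed by a store that requires a strict majority of members to agree (as etcd does), the following shows how many members can be lost before the cluster can no longer make decisions. The function names are illustrative only.

```python
def quorum(members):
    """Smallest number of members that forms a strict majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


def survives_single_node_loss(members):
    """True if losing one member still leaves a majority."""
    return tolerated_failures(members) >= 1


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, quorum(n), tolerated_failures(n), survives_single_node_loss(n))
```

Under this majority assumption, a three-member control plane tolerates one lost member, while a two-member one tolerates none, which is why odd member counts are the usual choice.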