Kubernetes node redundancy

Description

Test your Kubernetes cluster's node redundancy by dropping all network traffic to a node.

Hypothesis

When a node fails, Kubernetes automatically re-routes traffic to healthy nodes and recreates the failed node's pods on them.
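
One way to check this hypothesis after the experiment is to look at where pods ended up. The sketch below is a minimal illustration, not part of the Scenario itself. It assumes you have the parsed output of `kubectl get pods -o json` available as a dictionary; the function names, the sample data, and the node names `node-a` and `node-b` are hypothetical. It reads only the standard `spec.nodeName` and `status.conditions` fields of each pod.

```python
from collections import Counter


def is_ready(pod):
    """Return True if the pod reports a Ready condition with status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )


def summarize_pods(pod_list, failed_node):
    """Count Ready pods per healthy node and list pods still bound to failed_node.

    pod_list is assumed to be the parsed JSON from `kubectl get pods -o json`.
    """
    ready_per_node = Counter()
    stranded = []
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        name = pod.get("metadata", {}).get("name")
        if node == failed_node:
            stranded.append(name)
        elif node is not None and is_ready(pod):
            ready_per_node[node] += 1
    return dict(ready_per_node), stranded


def make_pod(name, node, ready):
    """Build a tiny stand-in for one pod entry (hypothetical sample data)."""
    return {
        "metadata": {"name": name},
        "spec": {"nodeName": node},
        "status": {"conditions": [{"type": "Ready", "status": "True" if ready else "False"}]},
    }


# Hypothetical snapshot taken while node-a is cut off from the network.
sample = {
    "items": [
        make_pod("web-1", "node-a", False),
        make_pod("web-2", "node-b", True),
        make_pod("web-3", "node-b", True),
    ]
}

ready, stranded = summarize_pods(sample, "node-a")
print("Ready pods per healthy node:", ready)
print("Pods still bound to the failed node:", stranded)
```

If the hypothesis holds, the list of pods bound to the failed node should shrink over time while the Ready counts on healthy nodes return to the expected replica totals.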

Why run this Scenario?

Network connections can fail for any number of reasons, including:

Failures in downstream systems (dependencies) or networking hardware.
Misconfigured firewall and router rules, such as the October 2021 Facebook outage, which was traced to a faulty router configuration change.
Saturation caused by unexpected surges in user traffic, high-bandwidth data transfers, or other causes.
Applications with poorly configured connection, timeout, and/or retry logic.

With this Scenario, you can validate that:

Your Kubernetes cluster can gracefully handle losing a node.
Your control plane (master) nodes are redundant and won't create a split-brain situation.
Load balancers and API gateways are configured properly.
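
The split-brain check in the list above comes down to quorum arithmetic. As a rough illustration, assuming the control plane keeps its state in a majority-quorum store such as etcd (an assumption; the Scenario does not say how your cluster is built), a group of n members needs floor(n/2) + 1 of them reachable to keep accepting writes. The helper names below are hypothetical.

```python
def quorum(members):
    """Smallest majority of a group with the given number of members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority still remains."""
    return members - quorum(members)


# Compare a few common control plane sizes.
for n in (1, 2, 3, 5):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Under that assumption, two members tolerate no failures at all, so losing one node to this Scenario would stall the control plane, while three members tolerate one loss.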