Linux host redundancy

Description

Test resilience to host failures by shutting down a randomly selected Linux host. Verify that your platform automatically restarts or replaces it.

What this Scenario does

This Scenario shuts down a randomly selected Linux host, simulating an unexpected host failure. This forces your infrastructure to detect the failure and initiate recovery, whether through auto-scaling groups, load balancer health checks, or manual failover processes.
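To make the selection step concrete, here is a minimal sketch of choosing one target uniformly at random from a pool of eligible hosts. The function name `pick_target` and the host names are hypothetical, and this illustrates the idea only; it is not how the product itself selects a host.

```python
import random
from typing import Optional, Sequence


def pick_target(hosts: Sequence[str], rng: Optional[random.Random] = None) -> str:
    """Return one host chosen uniformly at random from a non-empty pool.

    `hosts` is a hypothetical list of eligible Linux host names.
    Pass a seeded `random.Random` to make the choice reproducible.
    """
    if not hosts:
        raise ValueError("no eligible hosts to target")
    rng = rng or random.Random()
    return rng.choice(hosts)
```

Passing a seeded generator is useful when you want to replay the same selection in a rehearsal; leaving it out gives a fresh random choice on each run.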

Why run this Scenario?

This Scenario uses the same principle as Chaos Monkey: if a host or container shuts down unexpectedly, the underlying platform should detect this and automatically restart or replace it.

  • Validate that Linux instances restart within a reasonable timeframe and workloads successfully migrate to healthy hosts.
  • Verify that load balancers automatically route traffic away from the failed Linux host.
  • Test that losing a critical node (such as a Kafka broker or database primary) doesn't cause a split-brain scenario.
  • Build confidence that losing a host is a routine event your platform handles without manual intervention.

Expected outcome

When a Linux host fails, the cloud platform or infrastructure automatically restarts or replaces it, and workloads migrate to healthy instances.
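One way to check this outcome is to poll a health signal until capacity returns to its pre-shutdown level. The sketch below assumes a caller-supplied `healthy_count` function (hypothetical; it might wrap your load balancer or orchestrator API) and treats recovery as "at least `expected` healthy hosts". The 300-second default mirrors this Scenario's 5-minute runtime and is an assumption, not a documented recovery threshold.

```python
import time
from typing import Callable


def wait_for_recovery(
    healthy_count: Callable[[], int],
    expected: int,
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until healthy_count() >= expected and return the seconds elapsed.

    Raises TimeoutError if capacity is not restored within timeout_s.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if healthy_count() >= expected:
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(
                f"fewer than {expected} healthy hosts after {timeout_s} seconds"
            )
        sleep(interval_s)
```

The returned elapsed time gives you a recovery measurement to compare across runs, and a `TimeoutError` indicates the platform did not restart or replace the host within the window you chose.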

Target: Linux
Experiments: Shutdown
Runtime: 5 minutes