Windows host redundancy
Description
Test resilience to host failures by shutting down a randomly selected Windows host. Verify that your platform automatically restarts or replaces it.
What this Scenario does
This Scenario shuts down a randomly selected Windows Server host, simulating an unexpected host failure. This forces your infrastructure to detect the failure and initiate recovery through Windows failover clustering, cloud auto-scaling, or load balancer health checks.
Why run this Scenario?
This Scenario uses the same principle as Chaos Monkey: if a host or container shuts down unexpectedly, the underlying platform should detect the failure and automatically restart or replace it. Use this Scenario to:
- Validate that Windows Server instances restart and rejoin the cluster within acceptable timeframes.
- Verify that Windows services exit gracefully and restart cleanly after an unexpected shutdown.
- Test that load balancers automatically route traffic away from the failed Windows host.
- Confirm that Windows failover clustering promotes a secondary node when the primary fails.
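One way to check the "acceptable timeframes" goal above is to time how long a host or service stays unreachable after the shutdown. The following is a minimal sketch, not part of the Scenario itself: the helper names `tcp_probe` and `wait_for_recovery`, the timeout and interval defaults, and the use of a plain TCP connection as the health signal are all assumptions chosen for illustration.

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical health signal)."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_for_recovery(
    is_healthy: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll is_healthy until it returns True.

    Returns the elapsed seconds until recovery, or None if timeout_s passes first.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if is_healthy():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)


# Example usage (host name and port are placeholders, not values from this Scenario):
# elapsed = wait_for_recovery(lambda: tcp_probe("win-host.example.internal", 443))
# print("recovered in", elapsed, "seconds" if elapsed is not None else "(timed out)")
```

Start the polling only after the host has actually gone down, otherwise the first probe succeeds immediately and reports a recovery time of zero. A TCP connect only shows that a port is open; an application-level health endpoint is usually a better signal if one exists.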
Expected outcome
When a Windows host fails, the cloud platform or infrastructure automatically restarts or replaces it, and workloads migrate to healthy instances.
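The expected outcome can be checked by comparing the set of healthy hosts before and after the Scenario runs. The sketch below assumes you can already obtain those sets from your own inventory or load balancer; the function name `classify_outcome`, the outcome labels, and the host names in the usage comment are hypothetical.

```python
from typing import Set


def classify_outcome(before: Set[str], after: Set[str], failed_host: str) -> str:
    """Classify what happened to capacity after failed_host was shut down.

    before / after: identifiers of healthy hosts before and after the Scenario.
    Returns one of three illustrative labels:
      "degraded"  - fewer healthy hosts than before, so capacity was not restored
      "restarted" - the same host is healthy again
      "replaced"  - capacity is restored, but by a different host
    """
    if len(after) < len(before):
        return "degraded"
    if failed_host in after:
        return "restarted"
    return "replaced"


# Example usage with placeholder host names:
# classify_outcome({"win-a", "win-b"}, {"win-a", "win-c"}, "win-b")
```

Either "restarted" or "replaced" matches the expected outcome described above; "degraded" suggests that recovery did not complete within the observation window.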