How to test your systems for scalability and redundancy with Fault Injection
Do you know if your services can tolerate losing a node? What about an entire availability zone? Or a region?
Large-scale outages aren’t unheard of. When you’re running critical services, it’s vital that those services keep running even if an AZ or region fails. Beyond failing over, these services also need to scale quickly so that shifted traffic doesn’t overwhelm the remaining capacity. How do you prove that a service is both scalable and redundant? With Fault Injection.
In this webinar, we’ll show you how to verify the scalability and redundancy of your systems by testing them directly instead of assuming they work. We’ll use Fault Injection to simulate large-scale failures, use observability tools to monitor the state of our systems, and discuss ways of using our findings to make our systems more resilient.
You'll learn:
- What Fault Injection is, and why simulating incidents is the first step towards resolving them.
- How to run blackhole and shutdown experiments using Gremlin.
- How to use observability to monitor your system's response, then use these insights to make reliability improvements.