This is Fine: The SRE's Guide to Chaos & Observability
Today’s distributed, cloud-based environments are incredibly complex. Not only does each component depend on many others, but modern systems are also highly dynamic, changing frequently as teams push new code or update infrastructure.
Taming this complexity to ensure reliability requires end-to-end observability to understand how components depend on each other. Additionally, proactive Chaos Engineering combined with AI-driven observability lets you uncover “unknown unknowns” that impact how your system will respond to different failure scenarios.
About this webinar
Join Gremlin and Dynatrace as we discuss techniques for maintaining and improving reliability in complex cloud environments. We will cover how to establish end-to-end observability across your environments and how to map their complex relationships. We will then provide a framework for safely and thoughtfully conducting Chaos Engineering experiments with Gremlin.
Finally, we will share how teams can incorporate continuous chaos experimentation into build and deploy pipelines using the concept of “quality gates” in Dynatrace to help you establish and adhere to reliability SLOs.
- Learn the history, principles and practice of Chaos Engineering
- Discover how to improve your team's on-call skills
- See how observability and chaos work together to improve the reliability of distributed systems
- Learn how to use Gremlin and Dynatrace to drive continuous improvement across your engineering team
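To make the "quality gate" idea concrete, here is a minimal sketch of how a pipeline step might compare metrics observed during a chaos experiment against reliability SLOs and decide whether to promote a build. It does not use the Dynatrace or Gremlin APIs: the `Slo` class, the `evaluate_gate` function, the metric names, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical reliability objective for a single metric."""
    metric: str
    threshold: float
    higher_is_better: bool


def evaluate_gate(slos, observed):
    """Return (passed, violations) for metrics observed during an experiment.

    `observed` maps metric names to measured values. A metric with no data
    counts as a violation, so a silent monitoring gap cannot pass the gate.
    """
    violations = []
    for slo in slos:
        value = observed.get(slo.metric)
        if value is None:
            violations.append(f"{slo.metric}: no data")
            continue
        if slo.higher_is_better:
            ok = value >= slo.threshold
        else:
            ok = value <= slo.threshold
        if not ok:
            violations.append(
                f"{slo.metric}: observed {value}, objective {slo.threshold}"
            )
    return (not violations, violations)


# Assumed objectives and measurements, for illustration only.
SLOS = [
    Slo("availability_pct", 99.9, higher_is_better=True),
    Slo("p95_latency_ms", 300.0, higher_is_better=False),
]

observed = {"availability_pct": 99.95, "p95_latency_ms": 420.0}
passed, violations = evaluate_gate(SLOS, observed)
```

In a build or deploy pipeline, a step like this would run after the chaos experiment finishes and fail the stage when `passed` is false, reporting `violations` so the team can see which objective was missed.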
Proactively improve reliability
Explore our tutorials to learn about the technologies and processes that help you manage reliability to a higher standard.