Reliability Management
Prevent outages, innovate faster, and earn customer trust with Gremlin’s Reliability Management and Chaos Engineering platform.
This section contains documentation for using Gremlin Reliability Management (RM).
Gremlin RM lets you run tests on services within your environment. It tests several key reliability behaviors of each service, including scalability, redundancy, and the ability to tolerate failed or slow dependencies. Gremlin then assigns the service a reliability score based on the outcome of these tests.
Gremlin defines a service as a process running on one or more hosts, containers, or Kubernetes resources. For example, a Java application deployed across three hosts is a service. A Kubernetes Deployment or ReplicaSet is also a service. Defining services this way makes it easier to test distributed applications and more closely matches how teams build, test, and deploy applications.
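To make the definition concrete, the following Python sketch models a service as a named process with one or more targets. It is purely illustrative and is not Gremlin's API or data model: the names `TargetKind`, `Target`, and `Service`, and the example service and host names, are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class TargetKind(Enum):
    """Kinds of infrastructure a service can run on (hypothetical names)."""
    HOST = "host"
    CONTAINER = "container"
    KUBERNETES = "kubernetes"


@dataclass(frozen=True)
class Target:
    """One place where the service's process runs."""
    kind: TargetKind
    name: str


@dataclass
class Service:
    """A process running on one or more targets."""
    name: str
    targets: list[Target] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The definition requires at least one host, container, or resource.
        if not self.targets:
            raise ValueError("a service needs at least one target")


# A Java application deployed across three hosts is one service.
java_app = Service(
    name="checkout-java",
    targets=[Target(TargetKind.HOST, f"host-{i}") for i in range(1, 4)],
)

# A Kubernetes Deployment is also one service.
k8s_app = Service(
    name="cart",
    targets=[Target(TargetKind.KUBERNETES, "Deployment/cart")],
)
```

In this sketch, the three hosts belong to a single `Service` object rather than three separate ones, which mirrors the idea that tests and scores apply to the service as a whole rather than to individual machines.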