Datadog is a SaaS-based data analytics platform for monitoring servers, databases, tools, and services. If you already use Datadog, enabling the Gremlin Datadog integration lets your engineering team see Gremlin attacks in action while they are running.

With this integration enabled, you can overlay attacks on your dashboards to pinpoint exactly when Gremlin events occur and how each attack affects your metrics.


You can also click from within your Datadog Event Stream to:

  • Rerun Gremlin attacks
  • Show logs for Gremlin attacks
  • Halt Gremlin attacks

Setting up the Gremlin Datadog integration

To activate this integration, you will need to enter your Datadog API key into Gremlin.

First, retrieve your API key from Datadog. You can find it in Datadog Settings.
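Before pasting the key into Gremlin, it can help to confirm that it was copied correctly. The sketch below is not part of the Gremlin setup; it assumes Datadog's key validation endpoint (`/api/v1/validate` with a `DD-API-KEY` header) and the default `datadoghq.com` site. The function names and the `site` parameter are hypothetical, and accounts hosted on another Datadog site would need a different domain.

```python
import json
import urllib.error
import urllib.request


def build_validate_request(api_key, site="datadoghq.com"):
    """Build (but do not send) a GET request to Datadog's key validation endpoint.

    `site` is an assumption: adjust it if your account lives on another Datadog site.
    """
    url = f"https://api.{site}/api/v1/validate"
    return urllib.request.Request(url, headers={"DD-API-KEY": api_key}, method="GET")


def is_key_valid(api_key, site="datadoghq.com"):
    """Send the validation request and report whether Datadog accepts the key."""
    request = build_validate_request(api_key, site)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return bool(json.load(response).get("valid"))
    except urllib.error.HTTPError as error:
        # Datadog rejects unknown keys with 403 Forbidden; treat that as "not valid".
        if error.code == 403:
            return False
        raise
```

Calling `is_key_valid("your-api-key-here")` performs a real network request, so run it only from a machine that is allowed to reach Datadog.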


Next, add the Datadog API key in Gremlin Settings: click Integrations, then click the Add button in the Datadog row.

You will be prompted for your Datadog API key. Paste the key into the box and click Save. The Gremlin Datadog integration will now be initialized.


Halting attacks using the Datadog webhook

A Datadog alert can trigger the webhook integration, which sends an API call to Gremlin that halts all attacks in progress.

Go to the alert you would like to use to trigger the webhook and add @webhook-Gremlin-Halt-All to the "Say what's happening" field.

In the Datadog webhook integration, name your webhook Gremlin-Halt-All and set its URL to the Gremlin API endpoint that halts attacks. Check the box for a custom payload, and add the JSON {"reason": "$ALERT_STATUS", "reference": "$LINK"}. When posted, this JSON records the alert as the reason for halting any ongoing attacks and includes a link back to the alert in Datadog. For the headers, add the JSON {"Authorization": "Key your-api-key-here"} to authorize the API call.
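To check the payload and header before wiring them to a real alert, the request that Datadog would post can be approximated by hand. The following is a minimal sketch, not Gremlin's or Datadog's own tooling: `build_halt_request` is a hypothetical helper, the halt URL is left as a parameter because it is not given in this text, the `Content-Type` header is an assumption, and the `alert_status` and `alert_link` arguments stand in for the values Datadog substitutes for `$ALERT_STATUS` and `$LINK`.

```python
import json
import urllib.request


def build_halt_request(gremlin_halt_url, api_key, alert_status, alert_link):
    """Build (but do not send) the POST request described for the webhook.

    gremlin_halt_url: supply the Gremlin halt endpoint yourself; it is not assumed here.
    alert_status / alert_link: stand-ins for Datadog's $ALERT_STATUS and $LINK.
    """
    payload = {"reason": alert_status, "reference": alert_link}
    headers = {
        "Authorization": f"Key {api_key}",
        # Assumption: the body is declared as JSON; the text above does not specify this.
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(gremlin_halt_url, data=body, headers=headers, method="POST")
```

Inspecting `request.data` and `request.header_items()` on the returned object shows exactly what would be sent. Passing the object to `urllib.request.urlopen` would actually halt running attacks, so do that only deliberately.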

Additional resources

Gremlin’s Developer Guide is a great resource and reference for using Gremlin to do Chaos Engineering. You can view additional Gremlin Attacks including attacks that impact State and Network. You can also explore the Gremlin Blog for more information on how to use Chaos Engineering with your application infrastructure.