Installing the Failure Flags agent
This document walks you through setting up the Failure Flags agent. The agent runs alongside your application and manages Chaos Engineering experiments and reliability tests.
You'll need to use one of two versions of the agent, depending on your environment:
- Gremlin-Lambda is for applications running on AWS Lambda.
- Gremlin-Sidecar is a generic container sidecar agent for applications running on Amazon ECS and similar managed container platforms.
Gremlin-Lambda is a Lambda Extension that you can add to your Lambda Functions. Gremlin-Lambda supports both AMD64/x86_64 and ARM64 architectures. You can learn more about the AWS Lambda Extensions API in the AWS documentation.
Before deploying the Extension, you'll need to configure the following environment variables:
FAILURE_FLAGS_ENABLED: to enable Failure Flags, set this to 1. Otherwise, the agent won't fetch experiments.
GREMLIN_LAMBDA_ENABLED: to enable Gremlin-Lambda, set this to
GREMLIN_TEAM_ID: this is your Gremlin Team ID. You can retrieve your Team ID from the bottom-left corner of the Gremlin web app.
GREMLIN_TEAM_CERTIFICATE: this contains the contents of your Gremlin Team certificate. Read Signature-based authentication for more information on generating, downloading, and using certificates.
GREMLIN_TEAM_PRIVATE_KEY: this contains the contents of your Gremlin Team private key. Read Signature-based authentication for more information on generating, downloading, and using private keys.
GREMLIN_TEAM_PRIVATE_KEY accepts content either with the newline characters preserved or with them omitted. If you're providing the value through a web UI, it might be easier to remove the newlines.
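As a rough illustration of the variables above, the following Python sketch assembles them into a dictionary and optionally removes the newlines from the private key. The helper name `build_gremlin_env` and its parameters are hypothetical and not part of any Gremlin tooling. The value used for GREMLIN_LAMBDA_ENABLED is an assumption that mirrors FAILURE_FLAGS_ENABLED, so confirm it against the Gremlin documentation before relying on it.

```python
def build_gremlin_env(team_id, certificate_pem, private_key_pem,
                      strip_key_newlines=False):
    """Return the environment variables described above as a dict.

    Hypothetical helper for illustration only.
    """
    if strip_key_newlines:
        # The private key is accepted with or without newline characters,
        # so a single-line form is easier to paste into a web UI.
        private_key_pem = private_key_pem.replace("\r", "").replace("\n", "")

    env = {
        "FAILURE_FLAGS_ENABLED": "1",
        # Assumption: same enabling value as FAILURE_FLAGS_ENABLED.
        "GREMLIN_LAMBDA_ENABLED": "1",
        "GREMLIN_TEAM_ID": team_id,
        "GREMLIN_TEAM_CERTIFICATE": certificate_pem,
        "GREMLIN_TEAM_PRIVATE_KEY": private_key_pem,
    }

    # Fail early if any value is empty or whitespace-only.
    missing = [name for name, value in env.items() if not value.strip()]
    if missing:
        raise ValueError("missing values for: " + ", ".join(missing))
    return env
```

If you manage your function with boto3, a dictionary like this could be passed as `Environment={"Variables": env}` to `update_function_configuration`. That call replaces the function's existing variables, so merge in any variables your function already uses first.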
The Extension's Amazon Resource Name (ARN) varies depending on where your Lambda Function is deployed and which architecture you're using. Retrieve the correct ARN for your architecture and AWS region from this table.
Once you have the correct ARN, add the extension to your Lambda Function by following the instructions in the AWS Lambda Developer Guide.
If your project is already using several Lambda Layers and is at risk of reaching the limit, you can integrate the Failure Flags binary and package into your own layer. For more information, see the AWS Lambda documentation.
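Before attaching the extension, it can help to check whether adding it would exceed the layer limit. The Python sketch below assumes the AWS Lambda limit of five layers per function. The helper name `with_extension_layer` is hypothetical, and the ARN in the usage example is a placeholder, not a real Gremlin ARN.

```python
MAX_LAMBDA_LAYERS = 5  # AWS Lambda allows up to five layers per function.


def with_extension_layer(existing_layers, extension_arn,
                         max_layers=MAX_LAMBDA_LAYERS):
    """Return a new layer list that includes extension_arn.

    Hypothetical helper for illustration only. Raises ValueError if
    adding the extension would exceed the layer limit.
    """
    layers = list(existing_layers)  # Copy so the caller's list is untouched.
    if extension_arn in layers:
        return layers  # Already attached; nothing to add.
    if len(layers) + 1 > max_layers:
        raise ValueError(
            "adding the extension would exceed the limit of "
            f"{max_layers} layers; consider bundling it into your own layer"
        )
    layers.append(extension_arn)
    return layers


if __name__ == "__main__":
    # Placeholder ARN for illustration; use the ARN for your region
    # and architecture.
    placeholder_arn = "arn:aws:lambda:REGION:ACCOUNT:layer:EXAMPLE:1"
    print(with_extension_layer(["arn:example:existing-layer"], placeholder_arn))
```

With boto3, the resulting list could be passed as the `Layers` argument of `update_function_configuration`. That argument replaces the function's full layer list, which is why the sketch keeps the existing layers.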
Gremlin provides a generic container image for use in containerized environments. Gremlin supports the Docker, containerd, and CRI-O runtimes. To install the container image, follow the instructions in the Docker section of Installing Gremlin on a virtual machine. You may need to adjust these instructions depending on your container management tool (Docker, Podman, Rancher, etc.).
To verify that you installed the agent correctly, log into the Gremlin web app and navigate to Agents > Application. You should see your function listed. You'll also be able to select the function as a target when creating a new experiment.