How to Install and use Gremlin with EKS

Gremlin is a simple, safe and secure service for performing Chaos Engineering experiments through a SaaS-based platform. This tutorial will walk through how to install Gremlin on Amazon’s Managed Kubernetes Service (EKS) with a demo environment and perform a Chaos Engineering experiment using a Gremlin Shutdown attack.


Before you begin this tutorial, you’ll need an AWS account with the AWS CLI installed, and a Gremlin account.


This tutorial will walk you through the required steps to run an EKS cluster, deploy two applications and then run a Chaos Engineering experiment using Gremlin.

  • Step 0 - Verify your AWS CLI Installation
  • Step 1 - Create an EKS cluster using eksctl
  • Step 2 - Load up the kubeconfig for the cluster and deploy the Kubernetes Dashboard
  • Step 3 - Install Gremlin using Helm
  • Step 4 - Deploy a Microservice Demo Application
  • Step 5 - Run a Shutdown Container Attack using Gremlin

Step 0 - Verify your AWS CLI Installation

In this step, you’ll verify that the AWS CLI is installed before using eksctl to create the EKS cluster:

 aws --version

This should give you an output similar to:

aws-cli/1.16.150 Python/3.7.3 Darwin/18.5.0 botocore/1.12.140

If you’re having issues, refer back to the AWS CLI Installation documentation.
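
Before moving on, it can help to confirm that every command-line tool used in this tutorial is on your PATH. The following is a minimal sketch: the `require_cli` helper is a hypothetical name introduced here, and the tool list is assumed from the steps that follow.

```shell
# Hypothetical helper (not part of any tool): report whether a CLI is on the PATH.
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Tools used later in this tutorial; a missing one is reported but does not abort.
for tool in aws eksctl kubectl helm git; do
  require_cli "$tool" || true
done
```

If `aws` is found, running `aws sts get-caller-identity` is one way to check that your credentials are configured as well.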

Step 1 - Create an EKS cluster using eksctl

For this tutorial, we are going to use Weaveworks’ open source tool, eksctl, to create our EKS cluster. On your local machine, install eksctl:

curl --silent --location "_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin

After installing eksctl, create a basic cluster:

eksctl create cluster

This will create a cluster and the resources it needs in us-west-2. eksctl auto-generates a cluster name, creates two m5.large EC2 instances using the official AWS EKS AMI, and sets up a dedicated VPC.
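
The defaults described above can also be spelled out explicitly, which makes the cluster easier to reproduce. This is a minimal sketch, assuming a hypothetical cluster name `gremlin-eks-demo`. It only prints the command unless `DRY_RUN` is set to an empty value, because creating a cluster incurs AWS charges.

```shell
# Assumed values: the cluster name is a placeholder chosen for this example.
CLUSTER_NAME="${CLUSTER_NAME:-gremlin-eks-demo}"
REGION="${REGION:-us-west-2}"
DRY_RUN="${DRY_RUN-1}"   # non-empty: print only; empty: really run eksctl

create_cluster() {
  # ${DRY_RUN:+echo} expands to "echo" when DRY_RUN is non-empty.
  ${DRY_RUN:+echo} eksctl create cluster \
    --name "$CLUSTER_NAME" \
    --region "$REGION" \
    --nodes 2 \
    --node-type m5.large
}

create_cluster
```

Naming the cluster yourself also means you already know the value to pass to later commands that ask for the cluster name.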

Step 2 - Load up the kubeconfig for the cluster

Verify that the EKS cluster has been set up properly:

eksctl get clusters

The output should display the name of your cluster and its region, similar to:


gremlin-eks	fabulous-mushroom-1527688624

You can now grab the kubeconfig file from AWS using the AWS CLI and passing the cluster name and region:

sudo aws eks --region us-west-2 update-kubeconfig --name fabulous-mushroom-1527688624

To verify the nodes that eksctl has set up for us, run:

kubectl get nodes
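
To turn that check into something scriptable, the node list can be reduced to a count of ready nodes. This is a minimal sketch: `count_ready_nodes` is a hypothetical helper, and it assumes the default column layout of `kubectl get nodes`, where STATUS is the second column.

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes --no-headers | count_ready_nodes
fi
```

With the default two-node cluster from Step 1, the expected count is 2.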

Step 2.1 - Deploy Kubernetes Dashboard

We now want to deploy the Kubernetes Dashboard, Heapster and InfluxDB.

To deploy the dashboard to your EKS cluster:

kubectl apply -f

To deploy heapster:

 kubectl apply -f

To deploy influxdb:

 kubectl apply -f

Now create the Heapster cluster role binding for the dashboard:

kubectl apply -f

Next, create an eks-admin service account, which lets you connect to the Kubernetes dashboard with admin permissions. To create the service account:

kubectl apply -f

To connect to the Kubernetes dashboard, first retrieve the authentication token for the eks-admin service account:

kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
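
The command above prints the whole secret description, while the dashboard only needs the token value. This is a minimal sketch of extracting just that field: `extract_token` is a hypothetical helper, and it assumes the description contains a line beginning with `token:`.

```shell
# Hypothetical helper: print only the value of the "token:" field
# from `kubectl describe secret` output read on stdin.
extract_token() {
  awk '$1 == "token:" { print $2 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  secret_name=$(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
  kubectl -n kube-system describe secret "$secret_name" | extract_token
fi
```

The printed value is what you paste into the dashboard sign-in form.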

On your local machine, start the proxy that gives you access to the Kubernetes dashboard:

kubectl proxy

In a web browser, access the dashboard by visiting this URL.

To sign in, select Token and paste the token from the output of the previous step.

Step 3 - Install Gremlin using Helm

Step 3.1 - Download your Gremlin certificates

After you have created your Gremlin account (sign up here), you will need to find your Gremlin Daemon credentials. Log in to the Gremlin App using your company name and sign-on credentials, which were emailed to you when you signed up to start using Gremlin.

Navigate to Team Settings and click on your Team. Click the blue Download button to save your certificates to your local computer. The download contains both a public-key certificate and a matching private key.

Unzip the download and save its contents to a gremlin folder on your desktop. Rename your certificate and key files to gremlin.cert and gremlin.key.

Step 3.2 - Create gremlin namespace and secret

Create a Kubernetes namespace for Gremlin:

kubectl create namespace gremlin

Create a Kubernetes secret for your certificate and private key:

kubectl create secret generic gremlin-team-cert \
--namespace=gremlin \
--from-file=/path/to/gremlin.cert \
--from-file=/path/to/gremlin.key
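
Because the secret is built from local files, a quick check that both files exist under the expected names can save a failed command. This is a minimal sketch: `check_cert_files` is a hypothetical helper, and the directory is assumed from the gremlin folder on the desktop mentioned in Step 3.1.

```shell
# Hypothetical helper: succeed only if both files named in Step 3.1 exist in "$1".
check_cert_files() {
  dir="$1"
  for f in gremlin.cert gremlin.key; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      return 1
    fi
  done
  echo "ok: certificate and key found in $dir"
}

# Assumed location; adjust to wherever you unzipped the download.
check_cert_files "$HOME/Desktop/gremlin" || true
```

Once the secret has been created, `kubectl get secret gremlin-team-cert --namespace=gremlin` should list it.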

Step 3.3 - Installation with Helm

The simplest way to install the Gremlin client on your Kubernetes cluster is to use Helm. If you do not already have Helm installed, go here to get started. Once Helm is installed and configured, the next steps are to add the Gremlin repo and install the client.

To run the Helm install, you will need your Gremlin Team ID. It can be found in the Gremlin app on the Team Settings page, where you downloaded your certs earlier. Click on the name of your team in the list. The ID you’re looking for is found under Configuration as Team ID.

Export your Team ID as an environment variable:

export GREMLIN_TEAM_ID="YOUR_TEAM_ID"

Replace YOUR_TEAM_ID with the Team ID you obtained from the Gremlin UI.

Next, export your cluster ID, which is just a friendly name for your Kubernetes cluster. It can be whatever you want.

export GREMLIN_CLUSTER_ID="Your cluster id"

Now add the Gremlin Helm repo, and install Gremlin:

helm repo add gremlin
helm install gremlin/gremlin \
--namespace gremlin \
--name gremlin \
--set gremlin.teamID="$GREMLIN_TEAM_ID" \
--set gremlin.clusterID="$GREMLIN_CLUSTER_ID"
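
The install command reads both values from environment variables, so an unset variable would silently pass an empty ID to the chart. This is a minimal sketch of a guard to run beforehand; `require_env` is a hypothetical helper introduced for illustration.

```shell
# Hypothetical helper: fail with a message if the named variable is unset or empty.
require_env() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "$1 is not set"
    return 1
  fi
  echo "$1 is set"
}

# Check both variables used by the helm install command.
require_env GREMLIN_TEAM_ID || true
require_env GREMLIN_CLUSTER_ID || true
```

After the install completes, `kubectl get pods --namespace gremlin` should show the Gremlin pods starting up.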

For more information on the Gremlin Helm chart, including more configuration options, check out the chart on GitHub.

Step 4 - Deploy a Microservice Demo Application

The demo environment we are going to deploy onto our EKS cluster is the Hipster Shop: Cloud-Native Microservices Demo Application.

On your local machine, clone the repo:

git clone

Then, change directories to the directory we have just created:

cd microservices-demo

To deploy the application:

kubectl apply -f ./release/kubernetes-manifests.yaml

Wait until the pods are in a ready state. To check their readiness, run:

kubectl get pods
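
Rather than scanning the whole pod list by eye, the output can be filtered down to the pods that are not ready yet. This is a minimal sketch: `list_unready_pods` is a hypothetical helper, and it assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS).

```shell
# Hypothetical helper: list pods whose READY column (e.g. "0/1") shows
# fewer ready containers than expected, or whose STATUS is not Running.
# Reads `kubectl get pods --no-headers` output on stdin.
list_unready_pods() {
  awk '{ split($2, r, "/"); if (r[1] != r[2] || $3 != "Running") print $1 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods --no-headers | list_unready_pods
fi
```

An empty result means every pod is running with all of its containers ready. `kubectl wait --for=condition=Ready pods --all --timeout=300s` is an alternative that blocks until the pods are ready.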

Grab the external address the frontend lives on:

kubectl get svc frontend-external -o wide
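
The address can also be pulled out directly with a JSONPath query instead of being read from the table. This is a minimal sketch: `frontend_url` is a hypothetical helper, and it assumes that the load balancer is exposed as a hostname (as is typical on AWS) and that the service listens on port 80.

```shell
# Hypothetical helper: turn a load balancer address into the URL to open.
# Assumes the frontend-external service listens on port 80.
frontend_url() {
  if [ -z "$1" ]; then
    echo "load balancer address not assigned yet"
    return 1
  fi
  echo "http://$1"
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  # On AWS the load balancer is usually exposed as a hostname rather than an IP.
  host=$(kubectl get svc frontend-external \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') || host=""
  frontend_url "$host" || true
fi
```

If the address is not assigned yet, wait a minute or two and run the check again; provisioning the load balancer can take a little while.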

The EXTERNAL-IP column of the output holds the address of the frontend. Visit it in your web browser.

Step 5 - Run a Shutdown Container Attack using Gremlin

We are going to create our first Chaos Engineering experiment. We want to validate EKS reliability. Our hypothesis is, “When shutting down my cart service container, I will not suffer downtime and EKS will give me a new one.”

Going back to the Gremlin UI, select Attacks from the menu on the left and press the green “New Attack” button. We’re going to target a Kubernetes resource, so click Kubernetes in the upper right.

We will be shutting down the “cartservice” containers. Gremlin has imported the objects from Kubernetes and we can see them in the UI. We can find the container we want to target by expanding the Deployments field and selecting cartservice.

Next, we choose the gremlin. We will be running a State attack, so select “State” and choose “Shutdown” from the options. We will leave the delay set to 1 minute and turn off the reboot. Then click on the green Unleash Gremlin button.

Head back over to the Kubernetes dashboard, and select Pods on the left menu bar to display the state of the pods. Also, check the demo app from a user’s perspective to see whether your hypothesis holds.
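
Checking the user experience by hand can miss a short outage, so it may help to poll the frontend while the attack runs. This is a minimal sketch under stated assumptions: `classify_status` is a hypothetical helper, `FRONTEND_URL` is an assumed variable holding the frontend address from Step 4, and the `/cart` path is assumed to be a page that depends on the cart service.

```shell
# Hypothetical helper: classify an HTTP status code for the hypothesis check.
classify_status() {
  case "$1" in
    2??|3??) echo "up ($1)" ;;
    *)       echo "DOWN ($1)" ;;
  esac
}

# FRONTEND_URL is an assumed variable; the loop is skipped when it is unset.
if [ -n "${FRONTEND_URL:-}" ] && command -v curl >/dev/null 2>&1; then
  # Poll every 10 seconds for 2 minutes, which covers the 1-minute attack delay.
  for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$FRONTEND_URL/cart") || true
    echo "$(date '+%H:%M:%S') $(classify_status "$code")"
    sleep 10
  done
fi
```

In a second terminal, `kubectl get pods --watch` shows the pod state changing as the attack takes effect.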

Experiment Results

Our hypothesis was, "When shutting down my cart service container, I will not suffer downtime and EKS will give me a new one."

The experiment did not confirm this hypothesis. The Hipster Shop demo application did not handle the shutdown gracefully and instead threw a 500 internal server error. To mitigate this issue, we would first need to investigate why we saw the error by looking into the logs, where we can see the error "could not retrieve cart". When we run kubectl get pods, we see that only one cartservice pod is running, so the service has no redundancy.
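
The lack of redundancy can also be confirmed from the deployment itself rather than from the pod list. This is a minimal sketch: `has_redundancy` is a hypothetical helper, and it assumes that the deployment is named cartservice, as shown in the Gremlin UI, and lives in the current namespace.

```shell
# Hypothetical helper: succeed only when more than one replica is configured.
has_redundancy() {
  if [ "${1:-0}" -gt 1 ]; then
    echo "redundant: $1 replicas"
  else
    echo "single point of failure: ${1:-0} replica(s)"
    return 1
  fi
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  replicas=$(kubectl get deployment cartservice -o jsonpath='{.spec.replicas}') || replicas=0
  has_redundancy "${replicas:-0}" || true
fi
```

A result of one replica matches what the experiment showed: shutting down that single container leaves nothing to serve cart requests until a replacement starts.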

When we view cartservice.yaml, we see that the cart service uses Redis, but not clustered Redis.


Congrats! You’ve set up an AWS EKS cluster, deployed the Kubernetes Dashboard, deployed a microservice demo application, installed the Gremlin agent as a daemon set, and run your first Chaos Engineering attack to validate Kubernetes reliability! If you have any questions at all or are wondering what else you can do with this demo environment, feel free to DM me on the Chaos Slack: @anamedina (join here!).

