Chaos Engineering with Redis

Tammy Butow
Principal SRE
Last Updated: February 15, 2019
Categories: Chaos Engineering

Introduction

Gremlin is a simple, safe and secure service for performing Chaos Engineering experiments through a SaaS-based platform. Redis is an open source in-memory data structure store. Datadog is a monitoring service for cloud-scale applications, providing monitoring of servers, databases, tools, and services, through a SaaS-based data analytics platform. Datadog provides an integration to monitor Redis.

Chaos Engineering Hypothesis

For the purposes of this tutorial we will run Chaos Engineering experiments on Redis. Our hypothesis is that latency and crash frequency are common issues when running Redis in production, so we need to ensure we are constantly monitoring both. To begin, view the guide on Problems With Redis published by the Redis team.
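One rough way to watch crash frequency is sketched below, under the assumption that a restart is a reasonable proxy for a crash. Redis reports uptime_in_seconds in the output of INFO server, and a value lower than the previous reading means the server restarted in between. The function names redis_uptime and restarted_since are hypothetical helpers, not part of Redis or Gremlin.

```shell
# Extract uptime_in_seconds from `redis-cli info server` output read on stdin.
# INFO lines end in CRLF, so strip carriage returns first.
redis_uptime() {
  tr -d '\r' | awk -F: '$1 == "uptime_in_seconds" { print $2 }'
}

# Succeed (exit 0) when the current uptime is lower than the previous one,
# which suggests Redis restarted between the two readings.
# $1 = previous uptime in seconds, $2 = current uptime in seconds.
restarted_since() {
  previous="$1"
  current="$2"
  [ "$current" -lt "$previous" ]
}

# Example usage against a live server (commented out so the sketch has no side effects):
# before=$(redis-cli info server | redis_uptime)
# sleep 60
# after=$(redis-cli info server | redis_uptime)
# restarted_since "$before" "$after" && echo "Redis restarted"
```

Counting how often restarted_since succeeds over a day gives a crude crash-frequency number that can sit alongside your monitoring dashboards.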

Latency measures the average time it takes the Redis server to respond. Typical Redis latency for a 1 Gbit/s network is about 200 μs. Tracking latency is the most useful way to see impacts to Redis performance.
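To get a feel for this number on your own host, a minimal sketch follows. It times individual PING round trips and averages them. It assumes GNU date (for the %N nanosecond format), and each sample includes the cost of starting redis-cli, so the result will be higher than the server-side latency. The helper names now_us, ping_us and avg_us are hypothetical.

```shell
# Current time in microseconds. Assumes GNU date, which supports %N (nanoseconds).
now_us() {
  echo $(( $(date +%s%N) / 1000 ))
}

# Time a single PING round trip, in microseconds (includes redis-cli startup cost).
ping_us() {
  start=$(now_us)
  redis-cli ping > /dev/null
  end=$(now_us)
  echo $(( end - start ))
}

# Average a list of samples (one integer per line on stdin).
avg_us() {
  awk '{ sum += $1; n++ } END { if (n > 0) printf "%d\n", sum / n }'
}

# Example usage against a live server:
# for i in 1 2 3 4 5; do ping_us; done | avg_us
```

Recording this average before an experiment gives you a baseline to compare against once latency is injected.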

We know from the official Redis Cluster documentation that a Redis Cluster does not guarantee strong consistency. This tutorial will focus on doing Chaos Engineering with one Redis instance. In a future tutorial we will focus on Redis Cluster.

Prerequisites

To complete this tutorial you will need the following:

  • 1 cloud infrastructure instance running Ubuntu 16.04
  • A Gremlin account (sign up here)
  • A Datadog account

Overview

This tutorial will walk you through the required steps to do Chaos Engineering with Redis.

  • Step 1 - Creating a Redis server
  • Step 2 - Installing Docker on each host
  • Step 3 - Installing Gremlin in a Docker container on each host
  • Step 4 - Installing Datadog in a Docker container on each host
  • Step 5 - Running the Redis Latency Chaos Engineering experiment
  • Step 6 - Additional Chaos Engineering experiments you can run with Gremlin

Step 1 - Creating a Redis server

First update the server:

BASH

sudo apt update

To install Redis, run the following command:

BASH

sudo apt-get install redis-server

Now start Redis by running the following command:

BASH

redis-server --daemonize yes

A successful result will end with:

BASH

Configuration loaded

Now use the redis cli to confirm you can connect:

BASH

redis-cli

You will see the following as a successful result:

BASH

127.0.0.1:6379>

Now type the following at the redis prompt:

BASH

ping

The successful result will be:

BASH

PONG

Now type the following to store data in Redis:

BASH

set test "Time for Chaos Engineering with Redis!"

The successful result will be:

BASH

OK

Now type the following to retrieve your stored data:

BASH

get test

The successful result will be:

BASH

"Time for Chaos Engineering with Redis!"

Now exit the Redis prompt:

BASH

exit
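The store-and-retrieve check above can also be run from the regular shell without opening the Redis prompt, which is handy when you want to repeat it during an experiment. This is a minimal sketch: roundtrip_ok is a hypothetical helper, and the REDIS_CLI variable is an assumption added so the command can be swapped out.

```shell
# Command used to talk to Redis; override REDIS_CLI to point at a different binary.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Store a value, read it back, and succeed only if the two match.
# $1 = key, $2 = value.
roundtrip_ok() {
  key="$1"
  value="$2"
  "$REDIS_CLI" set "$key" "$value" > /dev/null || return 1
  [ "$("$REDIS_CLI" get "$key")" = "$value" ]
}

# Example usage against a live server:
# roundtrip_ok test "Time for Chaos Engineering with Redis!" && echo "round trip OK"
```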

Restart Redis:

BASH

sudo systemctl restart redis
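After a restart, Redis may take a moment before it accepts connections. A small polling loop, sketched here with a hypothetical wait_for_redis helper and an assumed REDIS_CLI override variable, retries PING until the server answers PONG.

```shell
# Command used to talk to Redis; override REDIS_CLI to point at a different binary.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Poll PING until Redis answers PONG or the attempts run out.
# $1 = max attempts (default 10), $2 = seconds between attempts (default 1).
wait_for_redis() {
  attempts="${1:-10}"
  delay="${2:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$("$REDIS_CLI" ping 2>/dev/null)" = "PONG" ]; then
      return 0
    fi
    i=$(( i + 1 ))
    sleep "$delay"
  done
  return 1
}

# Example usage against a live server:
# wait_for_redis 10 1 && echo "Redis is accepting connections"
```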

Now use the redis cli:

BASH

redis-cli

Type the following at the redis cli prompt to return the data you stored previously:

BASH

get test

You will see the following if it is successfully returned:

BASH

"Time for Chaos Engineering with Redis!"

Now exit the Redis prompt:

BASH

exit

Step 2 - Installing Docker

In this step, you’ll install Docker.

Add Docker’s official GPG key:

BASH

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Use the following command to set up the stable repository.

BASH

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Update the apt package index:

BASH

sudo apt-get update

Make sure you are about to install from the Docker repo instead of the default Ubuntu 16.04 repo:

BASH

apt-cache policy docker-ce

Install the latest version of Docker CE:

BASH

sudo apt-get install docker-ce

Docker should now be installed, the daemon started, and the process enabled to start on boot. Check that it is running:

BASH

sudo systemctl status docker

Type q to return to the prompt.

Add your user to the docker group, replacing redis with your username:

BASH

sudo usermod -aG docker redis
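To confirm the change, you can inspect the group list that id -nG prints for the user. The sketch below uses a hypothetical has_group helper that reads that list on stdin. Note that group changes made with usermod only apply to new login sessions, so you may need to log out and back in before running docker without sudo.

```shell
# Succeed when the space-separated group list on stdin contains group $1.
# Intended to be fed from `id -nG USER`.
has_group() {
  tr ' ' '\n' | grep -qxF "$1"
}

# Example usage (replace redis with your username):
# id -nG redis | has_group docker && echo "redis is in the docker group"
```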

Step 3 - Installing Gremlin On Each Host

After you have created your Gremlin account (sign up here), you will need to find your Gremlin Daemon credentials. Log in to the Gremlin App using your Company name and sign-on credentials. These were emailed to you when you signed up to start using Gremlin.

Navigate to Team Settings and click on your Team.

Store your Gremlin agent credentials as environment variables, for example:

BASH

export GREMLIN_TEAM_ID=3f242793-018a-5ad5-9211-fb958f8dc084

BASH

export GREMLIN_TEAM_SECRET=eac3a31b-4a6f-6778-1bdb813a6fdc
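Because the docker run command in the next part of this step reads both variables, it can help to fail early when one of them is missing. A minimal sketch with a hypothetical require_env helper follows. It relies on printenv, so it only sees variables that have been exported as shown above.

```shell
# Fail with a message if any named environment variable is unset or empty.
# Relies on printenv, so the variables must be exported.
require_env() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}

# Example usage before starting the daemon container:
# require_env GREMLIN_TEAM_ID GREMLIN_TEAM_SECRET || exit 1
```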

Next run the Gremlin Daemon in a Container.

Use docker run to pull the official Gremlin Docker image and run the Gremlin daemon:

BASH

sudo docker run -d \
    --net=host \
    --pid=host \
    --cap-add=NET_ADMIN \
    --cap-add=SYS_BOOT \
    --cap-add=SYS_TIME \
    --cap-add=KILL \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_SECRET="${GREMLIN_TEAM_SECRET}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log/gremlin:/var/log/gremlin \
    -v /var/lib/gremlin:/var/lib/gremlin \
    gremlin/gremlin daemon

Use docker ps to see all running Docker containers:

BASH

sudo docker ps

BASH

CONTAINER ID        IMAGE                COMMAND                  CREATED             STATUS              PORTS                    NAMES
b281e749ac33        gremlin/gremlin      "/entrypoint.sh daem…"   5 seconds ago       Up 4 seconds                                 relaxed_heisenberg

Jump into your Gremlin container with an interactive shell (replace b281e749ac33 with the real ID of your Gremlin container):

BASH

sudo docker exec -it b281e749ac33 /bin/bash
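If you would rather not copy the container ID by hand, the ID can be pulled out of the docker ps table. This is a sketch that assumes the default table layout shown above, where the first column is the container ID and the second is the image name. The function name gremlin_container_ids is hypothetical.

```shell
# Print the ID of every container whose IMAGE column starts with gremlin/gremlin.
# Reads `docker ps` table output on stdin and skips the header row.
gremlin_container_ids() {
  awk 'NR > 1 && index($2, "gremlin/gremlin") == 1 { print $1 }'
}

# Example usage:
# id=$(sudo docker ps | gremlin_container_ids | head -n 1)
# sudo docker exec -it "$id" /bin/bash
```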

From within the container, check out the available attack types:

BASH

gremlin help attack-container

BASH

Usage: gremlin attack-container CONTAINER TYPE [type-specific-options]

Type "gremlin help attack-container TYPE" for more details:

  blackhole       # An attack which drops all matching network traffic
  cpu             # An attack which consumes CPU resources
  io              # An attack which consumes IO resources
  latency         # An attack which adds latency to all matching network traffic
  memory          # An attack which consumes memory
  packet_loss     # An attack which introduces packet loss to all matching network traffic
  shutdown        # An attack which forces the target to shutdown
  dns             # An attack which blocks access to DNS servers
  time_travel     # An attack which changes the system time.
  disk            # An attack which consumes disk resources
  process_killer  # An attack which kills the specified process

Step 4 - Installing the Datadog agent in a Docker container

To install Datadog in a Docker container you can use the Datadog Docker easy one-step install.

Run the following command, replacing the DD_API_KEY value with your own API key:

BASH

sudo docker run -d --name dd-agent \
    -v /var/run/docker.sock:/var/run/docker.sock:ro \
    -v /proc/:/host/proc/:ro \
    -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
    -e DD_API_KEY=7cfe87ce1756aea \
    datadog/agent:latest

It will take a few minutes for Datadog to spin up the Datadog container, collect metrics on your existing containers and display them in the Datadog App.

Step 5 - Running the Redis Latency Chaos Engineering Experiment

We will use the Gremlin CLI (gremlin) to run a latency attack against the host from a Gremlin container:

BASH

sudo docker run -d \
    --net=host \
    --pid=host \
    --cap-add=NET_ADMIN \
    --cap-add=SYS_BOOT \
    --cap-add=SYS_TIME \
    --cap-add=KILL \
    -e GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID}" \
    -e GREMLIN_TEAM_SECRET="${GREMLIN_TEAM_SECRET}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log/gremlin:/var/log/gremlin \
    -v /var/lib/gremlin:/var/lib/gremlin \
    gremlin/gremlin attack latency

This attack will inject latency to the Redis host.
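To judge whether the attack had the expected effect, one option is to compare latency measured before and during the attack. The sketch below assumes you already have two averages in microseconds, collected from a client that reaches Redis over the affected network path. The function name latency_increased and the default factor of 2 are hypothetical choices, not Gremlin settings.

```shell
# Succeed when the latency measured during the attack is at least
# `factor` times the baseline.
# $1 = baseline (microseconds), $2 = during attack (microseconds), $3 = factor (default 2).
latency_increased() {
  baseline="$1"
  during="$2"
  factor="${3:-2}"
  [ "$during" -ge $(( baseline * factor )) ]
}

# Example usage with made-up numbers:
# latency_increased 200 100200 && echo "latency attack is visible to clients"
```

If the check fails while the attack is running, that is itself a finding: either the attack did not reach the traffic you measured, or your measurement is not sensitive enough to catch it.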

Now exit the container by running the following command:

BASH

exit

Step 6 - Additional Chaos Engineering experiments to run on Redis

There are many Chaos Engineering experiments you could possibly run on your Redis infrastructure:

  • Monitoring - do your monitoring tools enable you to monitor changes in Redis latency?
  • Time Travel Gremlin - will changing the clock time of the host impact how Redis processes data?
  • Latency & Packet Loss Gremlins - will they impact the ability to use Redis?
  • Disk Gremlin - will filling up the disk crash the host?
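For the Disk Gremlin experiment in particular, it helps to know how full the disk is before and during the attack. A minimal sketch using df follows. The helper names disk_used_pct and disk_nearly_full and the 90 percent default threshold are assumptions for illustration.

```shell
# Print the used percentage (0-100) of the filesystem that holds the given path.
# `df -P` uses the POSIX output format, so the fifth column is the capacity.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold.
# $1 = path (default /), $2 = threshold in percent (default 90).
disk_nearly_full() {
  [ "$(disk_used_pct "${1:-/}")" -ge "${2:-90}" ]
}

# Example usage:
# disk_nearly_full / 90 && echo "disk is nearly full"
```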

We encourage you to run these Chaos Engineering experiments and share your findings! To get access to Gremlin, sign up here.

Conclusion

This tutorial has explored how to install Redis on a host and run Gremlin and Datadog in Docker containers for your Chaos Engineering experiments. We then ran a latency Chaos Engineering experiment on the Redis host using the Gremlin Latency attack.
