How to Install and Use Gremlin on Ubuntu 16.04

Introduction

This tutorial walks through installing Gremlin on Ubuntu 16.04 and then running a Chaos Engineering experiment using a Gremlin CPU attack.

Prerequisites

Before you begin this tutorial, you'll need the following:

  • An Ubuntu 16.04 server
  • A Gremlin account
  • The apt-transport-https package, which lets apt install Gremlin from the Gremlin repository over HTTPS

Step 1 - Installing Gremlin

In this step, you'll install Gremlin.

First, SSH into your host and add the Gremlin repository:

ssh username@your_server_ip

echo "deb https://deb.gremlin.com/ release non-free" | sudo tee /etc/apt/sources.list.d/gremlin.list

Import the GPG key:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys C81FC2F43A48B25808F9583BDFF170F324D41134 9CDB294B29A5B1E2E00C24C022E8EF3461A50EF6

Then install the Gremlin client and daemon:

sudo apt-get update && sudo apt-get install -y gremlin gremlind

Step 2 - Downloading your Gremlin client certificates

After you have created your Gremlin account (sign up here), you will need to find your Gremlin Daemon credentials. Log in to the Gremlin App using your Company name and sign-on credentials. These were emailed to you when you signed up to start using Gremlin.

Navigate to Team Settings and click on your Team. Click the blue Download button to save your certificates to your local computer. The downloaded certificate.zip contains both a public-key certificate and a matching private key.

Unzip the downloaded certificate.zip on your laptop, then copy the files to your server using a file transfer tool such as rsync, sftp, or scp. Alternatively, you can store these certificates in a storage service such as AWS S3. For example, using rsync:

rsync -avz /Users/tammybutow/Desktop/tammy-client.pub_cert.pem tammy@142.93.31.189:/var/lib/gremlin
rsync -avz /Users/tammybutow/Desktop/tammy-client.priv_key.pem tammy@142.93.31.189:/var/lib/gremlin
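Before configuring the daemon, it can help to confirm on the server that both files actually arrived. The following is a minimal sketch, not part of Gremlin itself: check_cert_files is a hypothetical helper name, and it assumes the files keep the pub_cert.pem and priv_key.pem suffixes used in the example above.

```shell
# Hypothetical helper: confirm that a non-empty certificate and private key
# exist in the target directory. The suffixes are assumed from the example
# file names in this tutorial.
check_cert_files() {
  dir="${1:-/var/lib/gremlin}"   # directory the files were copied to
  for suffix in pub_cert.pem priv_key.pem; do
    found=0
    for f in "$dir"/*."$suffix"; do
      # -s is true only for an existing file with a size greater than zero
      if [ -s "$f" ]; then found=1; fi
    done
    if [ "$found" -eq 0 ]; then
      echo "no non-empty *.$suffix file in $dir" >&2
      return 1
    fi
  done
  echo "certificate and private key found in $dir"
}

# Usage on the server:
#   check_cert_files /var/lib/gremlin
```

The function returns a non-zero status if either file is missing, so it can gate the later steps in a script.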

Creating a gremlind file for your environment variables

Next, create the /etc/default/gremlind file:

sudo vim /etc/default/gremlind

Add your GREMLIN environment variables to the file, for example:

GREMLIN_TEAM_ID="3f242793-018a-5ad5-9211-fb958f8dc084"
GREMLIN_TEAM_CERTIFICATE_OR_FILE="file:///var/lib/gremlin/tammy-client.pub_cert.pem"
GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="file:///var/lib/gremlin/tammy-client.priv_key.pem"
GREMLIN_CLIENT_TAGS="service=prometheus"
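As an alternative to editing the file by hand, the same four variables can be generated from a script, which is convenient when you configure several hosts. This is a sketch under stated assumptions: print_gremlind_env is a hypothetical helper name, and the values passed to it are placeholders you replace with your own Team ID and file paths.

```shell
# Hypothetical helper: print the gremlind environment variables, one per line.
# Arguments: team ID, certificate path, private key path, client tags.
print_gremlind_env() {
  team_id="$1"
  cert_path="$2"   # absolute path, e.g. /var/lib/gremlin/tammy-client.pub_cert.pem
  key_path="$3"    # absolute path, e.g. /var/lib/gremlin/tammy-client.priv_key.pem
  tags="$4"        # e.g. service=prometheus
  printf 'GREMLIN_TEAM_ID="%s"\n' "$team_id"
  # file:// followed by an absolute path yields the file:/// form shown above
  printf 'GREMLIN_TEAM_CERTIFICATE_OR_FILE="file://%s"\n' "$cert_path"
  printf 'GREMLIN_TEAM_PRIVATE_KEY_OR_FILE="file://%s"\n' "$key_path"
  printf 'GREMLIN_CLIENT_TAGS="%s"\n' "$tags"
}

# Usage (tee runs under sudo because /etc/default is root-owned):
#   print_gremlind_env "your-team-id" \
#     /var/lib/gremlin/tammy-client.pub_cert.pem \
#     /var/lib/gremlin/tammy-client.priv_key.pem \
#     "service=prometheus" | sudo tee /etc/default/gremlind > /dev/null
```

Printing to standard output and piping into sudo tee keeps the helper itself unprivileged, and lets you inspect the result before writing it.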

Save the file. Restart the service:

sudo service gremlind restart

Confirming your gremlind configuration

Check /var/log/gremlin/daemon.log to confirm that the daemon picked up your configuration:

tail /var/log/gremlin/daemon.log

If the configuration was successful, you should see output similar to the following:

2018-10-31 02:34:20 - Logging successfully initialized
2018-10-31 02:34:23 - Using Team ID : 3f242793-018a-5ad5-9211-fb958f8dc084
2018-10-31 02:34:23 - Using Identifier : 142.93.31.189
2018-10-31 02:34:23 - Found GREMLIN_TEAM_CERTIFICATE_OR_FILE in file:///var/lib/gremlin/tammy-client.pub_cert.pem
2018-10-31 02:34:23 - Found GREMLIN_TEAM_PRIVATE_KEY_OR_FILE in file:///var/lib/gremlin/tammy-client.priv_key.pem
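If you would rather script this confirmation than read the log by eye, the check can be sketched as a few grep calls. The helper name check_gremlind_log is hypothetical, and the patterns are copied from the sample output above, so a different daemon version may word these messages differently.

```shell
# Hypothetical helper: confirm that the daemon log mentions the Team ID,
# the certificate, and the private key. Patterns are assumed from the
# sample log output in this tutorial.
check_gremlind_log() {
  log="${1:-/var/log/gremlin/daemon.log}"
  for pattern in \
    'Using Team ID' \
    'Found GREMLIN_TEAM_CERTIFICATE_OR_FILE' \
    'Found GREMLIN_TEAM_PRIVATE_KEY_OR_FILE'
  do
    # -F matches the pattern as a fixed string, -q suppresses output
    if ! grep -qF "$pattern" "$log"; then
      echo "missing in $log: $pattern" >&2
      return 1
    fi
  done
  echo "gremlind configuration looks loaded"
}

# Usage:
#   check_gremlind_log /var/log/gremlin/daemon.log
```

A non-zero exit status names the first message that was not found, which usually points at the variable to recheck in /etc/default/gremlind.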

Step 3 - Creating attacks using the Gremlin App

Log in to the Gremlin App using your Company name and sign-on credentials. These details were emailed to you when you created your Gremlin account.

Select Create Attack in the Gremlin App.

Example: The Hello World of Chaos Engineering (a CPU attack)

You can use the Gremlin App or the Gremlin API to trigger Gremlin attacks. You can view the available range of Gremlin Attacks in Gremlin Help.

The Hello World of Chaos Engineering is the CPU Resource Attack. To create a CPU Resource Attack, select Resource and then CPU in the dropdown menu.

The CPU Resource Attack will consume CPU resources based on the settings you select. The most popular settings are pre-selected: a default attack will utilize 1 core for 60 seconds. Before you can run the attack, you need to choose its targets: click Exact to pick the specific hosts to run the attack on, or click Random.

Click Exact and select a Gremlin Client in the list.

Your attack will begin to run. You can view its progress under Gremlin Attacks in the Gremlin Control Panel.

On your server, run top to check the impact of the Gremlin Attack:

$ top

top - 06:26:47 up 7 days,  7:00,  1 user,  load average: 0.28, 0.07, 0.02
Tasks: 105 total,   1 running, 104 sleeping,   0 stopped,   0 zombie
%Cpu(s): 79.7 us, 20.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1016120 total,   127140 free,    93956 used,   795024 buff/cache
KiB Swap:        0 total,        0 free,        0 used.   712192 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND     
23768 gremlin   20   0   13268  11136   3576 S 99.3  1.1   0:14.05 gremlin     
23766 root      20   0   40388   3600   3072 R  0.3  0.4   0:00.03 top         
    1 root      20   0   37760   5760   3940 S  0.0  0.6   0:13.74 systemd     
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd    
    3 root      20   0       0      0      0 S  0.0  0.0   0:01.28 ksoftirqd/0
    5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S  0.0  0.0   0:06.14 rcu_sched   
    8 root      20   0       0      0      0 S  0.0  0.0   0:00.00 rcu_bh      
    9 root      rt   0       0      0      0 S  0.0  0.0   0:00.00 migration/0
   10 root      rt   0       0      0      0 S  0.0  0.0   0:04.09 watchdog/0  
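To record the same reading from a script rather than the interactive display, you can run top in batch mode and filter for the gremlin process. This is a sketch, with gremlin_cpu as a hypothetical helper name. It assumes the default top column layout shown above, where %CPU is the ninth column and COMMAND is the twelfth.

```shell
# Hypothetical helper: read `top -b -n 1` output on standard input and print
# the PID and %CPU of every process whose COMMAND column is "gremlin".
# Assumes the default top layout (column 9 = %CPU, column 12 = COMMAND).
gremlin_cpu() {
  awk '$12 == "gremlin" { print $1, $9 }'
}

# Usage while the attack is running:
#   top -b -n 1 | gremlin_cpu
```

A single batch sample may not match the interactive display exactly, so treat the number as a rough confirmation that the attack is consuming CPU.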

When your attack is complete, it will move to Completed Attacks.

Step 4 - Halting a CPU resource attack using the Gremlin Control Panel

You can stop a Gremlin Attack at any time using the Gremlin Control Panel. Navigate to Gremlin Attacks and click the Halt button.

Conclusion

You've installed Gremlin on a server running Ubuntu 16.04 and validated that Gremlin works by running the Hello World of Chaos Engineering, the CPU Resource Attack. You can now explore additional Gremlin Attacks, including attacks that impact State and Network.

Gremlin's Developer Guide is a great resource and reference for using Gremlin to do Chaos Engineering. You can also explore the Gremlin Blog for more information on how to use Chaos Engineering with your application infrastructure.
