How to Use Gremlin Scenarios to Reproduce the AWS S3 Outage

Tammy Butow
Principal SRE
Last Updated: September 24, 2019
Categories: Chaos Engineering

Overview

Gremlin is a simple, safe and secure service for performing Chaos Engineering experiments through a SaaS-based platform.

This tutorial will show you how to use Gremlin Scenarios to reproduce the AWS S3 Outage.

If you are interested in learning more about the outage, we have shared a detailed analysis of the 2017 S3 Outage on our Gremlin Blog. A good question to ask yourself at every postmortem is “how do we ensure this never happens again?”

  • Step 1 - Create a Sample App
  • Step 2 - Build and run your Sample app using Docker and NGINX
  • Step 3 - View your Sample app in your browser
  • Step 4 - View your Sample app via VNC
  • Step 5 - Running the Gremlin Unavailable Dependencies Scenario to reproduce the S3 outage
  • Step 6 - Viewing the result of the Gremlin Unavailable Dependencies Scenario

Prerequisites

To follow this tutorial you will need:

  • A Linux host you can reach over ssh (the commands below use apt, so an Ubuntu or Debian host is assumed)
  • Docker installed on that host
  • A Gremlin account, with Gremlin installed on the host so that it appears as a target in the Gremlin UI

Step 1 - Create a Sample App

Connect to your host with ssh and create a Dockerfile:

BASH

ssh username@your_server_ip

vim Dockerfile

Add the following contents to the Dockerfile:

FROM nginx:alpine
COPY index.html /usr/share/nginx/html/index.html

Create the following index.html file:

BASH

vim index.html

HTML

<html lang="en">
  <head>
    <title>Mythical Mysfits</title>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.6/angular.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.3/umd/popper.min.js"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/js/bootstrap.min.js"></script>
    <link
      rel="stylesheet"
      href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css"
    />
  </head>

  <body ng-app="mysfitsApp">
    <style>
      @media (max-width: 800px) {
        img {
          max-width: 300px;
        }
      }
    </style>

    <div style="text-align: center;">
      <img
        src="https://www.mythicalmysfits.com/images/aws_mythical_banner.png"
        width="800px"
        align="center"
      />
    </div>

    <div class="container" ng-controller="mysfitsFilterController">
      <div id="filterMenu">
        <ul class="nav nav-pills">
          <li
            class="nav-item dropdown"
            ng-repeat="filterCategory in filterOptionsList.categories"
          >
            <a
              class="nav-link dropdown-toggle"
              data-toggle="dropdown"
              href="#!"
              role="button"
              aria-haspopup="true"
              aria-expanded="false"
            >
              {{filterCategory.title}}
            </a>

            <div class="dropdown-menu">
              <button
                class="dropdown-item"
                ng-repeat="filterCategorySelection in filterCategory.selections"
                ng-click="queryMysfits(filterCategory.title, filterCategorySelection)"
              >
                {{filterCategorySelection}}
              </button>
            </div>
          </li>

          <li class="nav-item ">
            <button
              type="button"
              class="btn btn-success"
              ng-click="removeFilter()"
            >
              View All
            </button>
          </li>
        </ul>
      </div>
    </div>

    <br />

    <div class="container">
      <div id="mysfitsGrid" class="row" ng-controller="mysfitsListController">
        <div
          class="col-md-4 border border-warning"
          ng-repeat="mysfit in mysfits"
        >
          <br />

          <p align="center">
            <strong> {{mysfit.name}}</strong>

            <br />

            <img ng-src="{{mysfit.thumbImageUri}}" alt="{{mysfit.name}}" />
          </p>

          <p>
            <br />
            <b>Species:</b> {{mysfit.species}}
            <br />
            <b>Age:</b> {{mysfit.age}}
            <br />
            <b>Good/Evil:</b> {{mysfit.goodevil}}
            <br />
            <b>Lawful/Chaotic:</b> {{mysfit.lawchaos}}
          </p>
        </div>
      </div>
    </div>

    <p>
      <br />
      <br />
      This site was created for use in the AWS Modern Application Workshop.
      <a href="https://github.com/aws-samples/aws-modern-application-workshop"
        >Please see details here.</a
      >
    </p>
  </body>

  <script>
    var mysfitsApiEndpoint =
      'http://mysfits-nlb-9c8e61c17ef3cd1d.elb.us-east-1.amazonaws.com';
    var app = angular.module('mysfitsApp', []);
    var gridScope;
    var filterScope;

    app.controller('clearFilterController', function($scope) {});

    app.controller('mysfitsFilterController', function($scope) {
      filterScope = $scope;

      // The possible options for Mysfits to populate the dropdown filters.
      $scope.filterOptionsList = {
        categories: [
          {
            title: 'Good/Evil',
            selections: ['Good', 'Neutral', 'Evil'],
          },
          {
            title: 'Lawful/Chaotic',
            selections: ['Lawful', 'Neutral', 'Chaotic'],
          },
        ],
      };

      $scope.removeFilter = function() {
        // Reload the unfiltered list. getAllMysfits returns nothing, so there is no value to store.
        getAllMysfits(applyGridScope);
      };

      $scope.queryMysfits = function(filterCategory, filterValue) {
        var filterCategoryQS = '';

        if (filterCategory === 'Good/Evil') {
          filterCategoryQS = 'GoodEvil';
        } else {
          filterCategoryQS = 'LawChaos';
        }

        var mysfitsApi =
          mysfitsApiEndpoint +
          '/mysfits?' +
          'filter=' +
          filterCategoryQS +
          '&value=' +
          filterValue;

        $.ajax({
          url: mysfitsApi,

          type: 'GET',

          success: function(response) {
            applyGridScope(response.mysfits);
          },

          error: function(response) {
            console.log('could not retrieve mysfits list.');
          },
        });
      };
    });

    app.controller('mysfitsListController', function($scope) {
      gridScope = $scope;

      getAllMysfits(applyGridScope);
    });

    function applyGridScope(mysfitsList) {
      gridScope.mysfits = mysfitsList;

      gridScope.$apply();
    }

    function getAllMysfits(callback) {
      var mysfitsApi = mysfitsApiEndpoint + '/mysfits';

      $.ajax({
        url: mysfitsApi,

        type: 'GET',

        success: function(response) {
          callback(response.mysfits);
        },

        error: function(response) {
          console.log('could not retrieve mysfits list.');
        },
      });
    }
  </script>
</html>

Save the index.html file.
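
The page you just saved depends on several external hosts, and the Scenario later in this tutorial makes one such dependency unavailable. As a rough sketch, the helper below lists the hosts referenced by absolute URLs in a file. The name list_external_hosts is hypothetical, and the pattern is a simple approximation rather than a full HTML parser.

```shell
# List the unique external hosts referenced by absolute http(s) URLs in a file.
# Usage: list_external_hosts FILE
# (list_external_hosts is a hypothetical helper name used for illustration.)
list_external_hosts() {
  # Match scheme + host, stopping at the first slash, quote, angle bracket, or space.
  grep -oE "https?://[^/\"'<> ]+" "$1" |
    sed -E 's#^https?://##' |   # strip the scheme, leaving only the host
    sort -u                     # de-duplicate
}
```

Run it as list_external_hosts index.html. The mysfit thumbnail URLs are returned by the API at runtime, so they will not appear in this list. Only URLs written directly in the file are found.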

Step 2 - Build and run your Sample app using Docker and NGINX

Next, build a Docker image from the Dockerfile by running the following:

BASH

docker build -t simple-nginx .

Now run the image in the background, publishing the container's port 80 on port 8080 of the host:

BASH

docker run -d -p 8080:80 simple-nginx
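
Because docker run -d starts the container in the background, it can help to confirm from the command line that NGINX is serving the page before you open a browser. The following is a minimal sketch that assumes curl is installed on the host. The function name wait_for_app and the default of 10 attempts are illustrative choices, not part of the original tutorial.

```shell
# Poll a URL until it returns HTTP 200, or give up after a number of attempts.
# Usage: wait_for_app URL [ATTEMPTS]
# (wait_for_app is a hypothetical helper; it assumes curl is installed.)
wait_for_app() {
  url=$1
  attempts=${2:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    # If curl itself fails (for example, connection refused), treat it as code 000.
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url") || code=000
    if [ "$code" = "200" ]; then
      echo "app is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "app did not respond with HTTP 200 at $url" >&2
  return 1
}
```

On the host, run wait_for_app http://localhost:8080 after starting the container.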

Step 3 - View your Sample app in your browser

Your sample app is now running. Open http://your_server_ip:8080 in your browser to view it.

Step 4 - View your Sample app via VNC

The Gremlin Unavailable Dependencies Scenario uses a Blackhole attack. We will use this Blackhole attack to prevent images stored in S3 from loading. Because the attack affects network traffic on the host, we need a browser running on the host itself to see the result. We will use VNC to view that browser remotely.

On your host, install the Xfce and TightVNC packages:

BASH

sudo apt-get update

sudo apt install xfce4 xfce4-goodies tightvncserver

To complete the VNC installation, run the following command. You will be prompted to set a password:

BASH

vncserver

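Next, test the VNC connection from your local computer. Run the following command, which forwards local port 5901 to port 5901 on the host (-N runs no remote command, -f sends ssh to the background, and -l sets the login name):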

BASH

ssh -L 5901:127.0.0.1:5901 -N -f -l username server_ip_address
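
Since -f sends ssh to the background, the command gives no visible confirmation that the tunnel is listening. One way to check before opening a VNC client is sketched below. It assumes a netcat (nc) that supports the -z and -w options is installed on your local computer, and port_open is a hypothetical helper name.

```shell
# Return success if something is accepting TCP connections on HOST:PORT.
# Usage: port_open HOST PORT
# (port_open is a hypothetical helper; it assumes nc supports -z and -w.)
port_open() {
  # -z: only check for a listener, send no data; -w 2: give up after 2 seconds
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}
```

For example, port_open 127.0.0.1 5901 && echo "tunnel is up" prints the message only when the forwarded port accepts connections.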

Now you can use a VNC client to connect to the VNC server at localhost:5901. You’ll be prompted to authenticate; use the password you set up earlier. To view your Xfce desktop, you can use VNC Viewer or, on a Mac, the built-in Screen Sharing app.

You will also need a browser on your host. Install Firefox by running the following command:

BASH

sudo apt-get install firefox

Before you move on to the next step, ensure that you are able to view the sample app using VNC. Connect to localhost:5901 and click on Applications > Internet > Firefox Web Browser.

Using Firefox, navigate to localhost:8080. You should see the sample app with its images loaded.

Now we’re ready to run the Gremlin Unavailable Dependency Scenario to reproduce the S3 Outage.

Step 5 - Running the Gremlin Unavailable Dependencies Scenario to reproduce the S3 outage

First, navigate to Recommended Scenarios within the Gremlin UI and choose the Unavailable Dependency Scenario.

Next, select Add Targets and run. Then select your host using the local-hostname option.

Then click customize.

Next, click Add Attacks. This will take you to the Gremlin Attack configuration. To reproduce the S3 outage, modify the default scenario to include 1 x Blackhole attack impacting AWS S3 us-east-1. You will need to make these changes in the Providers section of the Blackhole attack.

Next, click Unleash Scenario. Your Gremlin Scenario will now be running and it will begin to reproduce the S3 Outage.

Step 6 - Viewing the result of the Gremlin Unavailable Dependencies Scenario

To view the S3 Outage being reproduced, open your VNC viewer and reload your Firefox tab. You will notice that the images stored in us-east-1 on S3 no longer load.
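
The same effect can be checked from a shell on the host while the Scenario is running. A Blackhole attack drops matching network traffic, so a request to an affected URL should fail or hang until it times out, rather than return a response. The sketch below assumes curl is installed; probe_url is a hypothetical helper, and the URL you pass to it is one you copy yourself from an image on the page (it is not specified in this tutorial).

```shell
# Report whether a URL can be fetched within a short timeout.
# Usage: probe_url URL [TIMEOUT_SECONDS]
# (probe_url is a hypothetical helper; it assumes curl is installed.)
probe_url() {
  url=$1
  timeout=${2:-5}
  # --max-time bounds the whole request, so dropped traffic ends in a timeout
  # instead of hanging indefinitely.
  if code=$(curl -s -o /dev/null --max-time "$timeout" -w '%{http_code}' "$url"); then
    echo "reachable (HTTP $code)"
  else
    echo "unreachable"
  fi
}
```

Running probe_url with an S3-hosted image URL before and during the Scenario gives a quick comparison: it should report a status code beforehand and "unreachable" while the Blackhole attack is active.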

Conclusion

This tutorial has demonstrated how you can use the Gremlin Recommended Scenario “Unavailable Dependency” to reproduce the AWS S3 Outage. The result shows that our sample app is not resilient to this outage.

To make sure the app can handle this scenario reliably, we could address the problem with one of the following options and then run this Gremlin Scenario again:

  • S3 failover
  • Multi-cloud storage
  • Multi-CDN
  • Handling image errors in the browser using React