Chaos Monkey Tutorial

A Step-by-Step Guide to Creating Failure on AWS

Last Updated
October 17, 2018

This chapter provides a step-by-step guide for setting up and using Chaos Monkey with AWS. We also examine the scenarios where Chaos Monkey is the right solution, as well as its limitations, since it only handles random instance terminations.

How to Quickly Deploy Spinnaker for Chaos Monkey

Modern Chaos Monkey requires the use of Spinnaker, which is an open-source, multi-cloud continuous delivery platform developed by Netflix. Spinnaker allows for automated deployments across multiple cloud platforms (such as AWS, Azure, Google Cloud Platform, and more). Spinnaker can also be used to deploy across multiple accounts and regions, often using pipelines that define a series of events that should occur every time a new version is released. Spinnaker is a powerful tool, but since both Spinnaker and Chaos Monkey were developed by and for Netflix's own architecture, you'll need to do the extra legwork to configure Spinnaker to work within your application and infrastructure.

In this first section we'll explore the fastest and simplest way to get Spinnaker up and running, which will then allow you to move on to installing and then using Chaos Monkey.

We'll be deploying Spinnaker on AWS, and the easiest method for doing so is to use the CloudFormation Quick Start template.

Looking to Deploy Spinnaker In Another Environment?
If you're looking for the utmost control over your Spinnaker deployment you should check out our How to Deploy a Spinnaker Stack for Chaos Monkey guide, which provides a step-by-step tutorial for setting up Halyard and Spinnaker on a local or virtual machine of your choice.

The AWS Spinnaker Quick Start will create a simple architecture for you containing two subnets (one public and one private) in a Virtual Private Cloud (VPC). The public subnet contains a Bastion host instance designed to be tightly restricted, with just port 22 open for SSH access. The Bastion host then allows a pass-through connection to the private subnet that is running Spinnaker.


AWS Spinnaker Quick Start Architecture - Courtesy of AWS

This quick start process will take about 10 - 15 minutes and is mostly automated.

Creating the Spinnaker Stack

  1. (Optional) If necessary, sign up for or log in to your AWS account.
  2. (Optional) You'll need at least one AWS EC2 Key Pair for securely connecting via SSH.
  3. If you don't have a key pair already, start by opening the AWS Console and navigating to EC2 > NETWORK & SECURITY > Key Pairs.
  4. Click Create Key Pair and enter an identifying name in the Key pair name field.
  5. Click Create to download the private .pem key file to your local system.
  6. Save this key to an appropriate location (typically your local user ~/.ssh directory).
  7. After you've signed into the AWS console visit this page, which should contain a link to the quickstart-spinnakercf.template.
  8. Click Next.
  9. (Optional) If you haven't already done so, you'll need to create at least one AWS Access Key.
  10. Select the KeyName of the key pair you previously created.
  11. Input a secure password in the Password field.
  12. (Optional) Modify the IP address range in the SSHLocation field to indicate what IP range is allowed to SSH into the Bastion host. For example, to allow only your own machine, enter your public IP address as a /32 CIDR block. If you aren't sure, you can enter 0.0.0.0/0 to allow any IP address to connect, though this is obviously less secure.
  13. Click Next.
  14. (Optional) Select an IAM Role with proper CloudFormation permissions necessary to deploy a stack. If you aren't sure, leave this blank and deployment will use your account's permissions.
  15. Modify any other fields on this screen you wish, then click Next to continue.
  16. Check the "I acknowledge that AWS CloudFormation might create IAM resources with custom names" checkbox and click Create to generate the stack.
    If your AWS account already contains the BaseIAMRole AWS::IAM::Role you may have to delete it before this template will succeed.
  17. Once the Spinnaker stack has a CREATE_COMPLETE Status, select the Outputs tab, which has some auto-generated strings you'll need to paste in your terminal in the next section.

Connecting to the Bastion Host

  1. Copy the Value of the SSHString1 field from the stack Outputs tab above.
  2. Execute the SSHString1 value in your terminal and enter yes when prompted to continue connecting to this host.
    ssh -A -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 ec2-user@
    Permission denied (publickey)
    If you received a permission denied SSH error, the .pem private key file that you downloaded from the AWS EC2 Key Pair creation page is probably not where SSH expects to find it. Make sure it is located in your ~/.ssh user directory. Otherwise you can specify the key by adding an optional -i flag, indicating the path to the .pem file.
  3. You should now be connected as the ec2-user to the Bastion instance. Before you can connect to the Spinnaker instance you'll probably need to copy your .pem file to the Bastion instance's ~/.ssh directory, since the next SSH command is run from the Bastion host.
    Once the key is copied, make sure you set proper permissions otherwise SSH will complain.
    chmod 400 ~/.ssh/my_key.pem
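
The key handling described above can be sketched as a pair of small shell helpers. This is only a minimal sketch, assuming a Linux machine with GNU stat; the function names, the key path, and the host value are hypothetical and not part of the Quick Start.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and paths are illustrative assumptions.

# Succeeds only when the key file exists and its mode is 400 or 600,
# two common owner-only modes that keep SSH from rejecting the key.
key_is_private() {
  local key_path="$1" mode
  [ -f "$key_path" ] || return 1
  # Assumption: GNU stat (Linux), where -c '%a' prints the octal mode.
  mode="$(stat -c '%a' "$key_path")" || return 1
  case "$mode" in
    400|600) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints (does not run) the tunnel command with an explicit -i key path.
print_bastion_ssh() {
  local key_path="$1" bastion_host="$2"
  printf 'ssh -A -i %s -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 ec2-user@%s\n' \
    "$key_path" "$bastion_host"
}
```

For example, key_is_private ~/.ssh/my_key.pem && print_bastion_ssh ~/.ssh/my_key.pem BASTION_HOST prints the command only when the key permissions look safe, where BASTION_HOST stands for the address shown in your own SSHString1 output.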

Connecting to the Spinnaker Host

  1. To connect to the Spinnaker instance copy and paste the SSHString2 Value into the terminal.
    ssh -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 ubuntu@ -i ~/.ssh/my_key.pem
  2. You should now be connected to the SpinnakerWebServer!
    System restart required
    Upon connecting to the Spinnaker instance you may see a message indicating the system needs to be restarted. You can do this through the AWS EC2 console, or just enter the sudo reboot command in the terminal, then reconnect after a few moments.

Configuring Spinnaker

The Spinnaker architecture is composed of a collection of microservices that each handle various aspects of the entire service. For example, Deck is the web interface you'll spend most time interacting with, Gate is the API gateway that handles most communication between microservices, and Clouddriver is the service that communicates with and configures all cloud providers Spinnaker is working with.

Since so much of Spinnaker is broken out into smaller microservices, configuring Spinnaker can require editing a few different files. If there's an issue you'll likely have to look through the individual logs of each service, depending on the problem.

Spinnaker is configured through the /opt/spinnaker/config/spinnaker.yml file. However, this file will be overwritten by Halyard or other changes, so for user-generated configuration you should actually modify the /opt/spinnaker/config/spinnaker-local.yml file. Here's a basic example of what that file looks like.


    # /opt/spinnaker/config/spinnaker-local.yml
    global:
      spinnaker:
        timezone: 'America/Los_Angeles'

    providers:
      aws:
        # For more information on configuring Amazon Web Services (aws), see

        enabled: ${SPINNAKER_AWS_ENABLED:false}
        defaultRegion: ${SPINNAKER_AWS_DEFAULT_REGION:us-west-2}
        defaultIAMRole: Spinnaker-BaseIAMRole-GAT2AISI7TMJ
        primaryCredentials:
          name: default
          # Store actual credentials in $HOME/.aws/credentials. See spinnaker.yml
          # for more information, including alternatives.

        # {{name}} will be interpolated with the aws account name (e.g. "my-aws-account-keypair").
        defaultKeyPairTemplate: '{{name}}-keypair'
        # ...

Standalone Spinnaker installations (such as the one created via the AWS Spinnaker Quick Start) are configured directly through the spinnaker.yml and spinnaker-local.yml override configuration files.
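
Values such as ${SPINNAKER_AWS_ENABLED:false} in the example file read as "use the SPINNAKER_AWS_ENABLED setting if one is defined, otherwise fall back to false". The shell sketch below mimics that lookup so you can reason about which value wins; resolve_setting is a hypothetical name, and the shell fallback is only an approximation of how Spinnaker resolves placeholders (here an empty value also falls back to the default).

```bash
#!/usr/bin/env bash
# Shell analogue of a placeholder such as ${SPINNAKER_AWS_ENABLED:false}.
# resolve_setting is a hypothetical helper used only for illustration.
resolve_setting() {
  local var_name="$1" default_value="$2"
  # ${!var_name} is bash indirect expansion: it reads the variable whose
  # name is stored in var_name. ":-" substitutes the default when that
  # variable is unset or empty.
  printf '%s\n' "${!var_name:-$default_value}"
}
```

With nothing set, resolve_setting SPINNAKER_AWS_ENABLED false prints false; after export SPINNAKER_AWS_ENABLED=true it prints true.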

Creating an Application

In this section we'll manually create a Spinnaker application containing a pipeline that first bakes a virtual machine image and then deploys that image to a cluster.

  1. Open the Spinnaker web UI (Deck) and click Actions > Create Application.
  2. Input bookstore in the Name field.
  3. Input your own email address in the Owner Email field.
  4. (Optional) If you've enabled Chaos Monkey in Spinnaker you can opt to enable Chaos Monkey by checking the Chaos Monkey > Enabled box.
  5. Input My bookstore application in the Description field.
  6. Under Instance Health, tick the Consider only cloud provider health when executing tasks checkbox.
  7. Click Create to add your new application.


Adding a Firewall

  1. Navigate to the bookstore application, INFRASTRUCTURE > FIREWALLS, and click Create Firewall.
  2. Input dev in the Detail field.
  3. Input Bookstore dev environment in the Description field.
  4. Within the VPC dropdown select SpinnakerVPC.
  5. Under the Ingress header click Add new Firewall Rule. Set the following Firewall Rule settings.
  6. Firewall: default
    Protocol: TCP
    Start Port: 80
    End Port: 80
  7. Click the Create button to finalize the firewall settings.


Adding a Load Balancer

  1. Navigate to the bookstore application, INFRASTRUCTURE > LOAD BALANCERS, and click Create Load Balancer.
  2. Select Classic (Legacy) and click Configure Load Balancer.
  3. Input test in the Stack field.
  4. In the VPC Subnet dropdown select internal (vpc-...).
  5. In the Firewalls dropdown select bookstore--dev (...).
  6. Click Create to generate the load balancer.


Creating a Pipeline in Spinnaker

The final step is to add a pipeline, which is where we tell Spinnaker what it should actually "do"! In this case we'll tell it to bake a virtual machine image containing Redis, then to deploy that image to our waiting EC2 instance.

  1. Navigate to the bookstore application, PIPELINES and click Create Pipeline.
  2. Select Pipeline in the Type dropdown.
  3. Input Bookstore Dev to Test in the Pipeline Name field.
  4. Click Create.

Adding a Bake Stage

  1. Click the Add stage button.
  2. Under Type select Bake.
  3. Input redis-server in the Package field.
  4. Select trusty (v14.04) in the Base OS dropdown.
  5. Click Save Changes to finalize the stage.

Ignoring Jenkins/Travis
In production environments you'll likely also want to incorporate Travis, Jenkins, or another CI solution as a preceding stage to the bake stage. Otherwise, Spinnaker will default to baking and deploying the most recently built package. For our purposes here we don't care, since we're using an unchanging image.

Adding a Deploy Stage

  1. Click the Add stage button.
  2. Under Type select Deploy.
  3. Click the Add server group button to begin creating a new server group.

Adding a Server Group

  1. Select internal (vpc-...) in the VPC Subnet dropdown.
  2. Input dev in the Stack field.
  3. Under Load Balancers > Classic Load Balancers select the bookstore-test load balancer we created.
  4. Under Firewalls > Firewalls select the bookstore--dev firewall we also created.
  5. Under Instance Type select the Custom Type of instance you think you'll need. For this example we'll go with something small and cheap, such as t2.large.
  6. Input 3 in the Capacity > Number of Instances field.
  7. Under Advanced Settings > Key Name select the key pair name you used when deploying your Spinnaker CloudFormation stack.
  8. In the Advanced Settings > IAM Instance Profile field input the Instance Profile ARN value of the BaseIAMRole found in the AWS > IAM > Roles > BaseIAMRole dialog (e.g. arn:aws:iam::123456789012:instance-profile/BaseIAMRole).
  9. We also need to ensure the user/Spinnaker-SpinnakerUser that was generated has permission to pass the role/BaseIAMRole role during deployment.
  10. Navigate to AWS > IAM > Users > Spinnaker-SpinnakerUser-### > Permissions.
  11. Expand the Spinnakerpassrole policy and click Edit Policy.
  12. Select the JSON tab and you'll see the auto-generated Spinnaker-BaseIAMRole listed in Resource.
  13. Convert the Resource key value to an array so you can add a second value. Insert the ARN for the role/BaseIAMRole of your account (the account number will match the number above).
  14. Click Review Policy and Save Changes.
  15. Click the Add button to create the deployment cluster configuration.
  16. Finally, click Save Changes again at the bottom of the Pipelines interface to save the full Configuration > Bake > Deploy pipeline.
  17. You should now have a Bookstore Dev to Test two-stage pipeline ready to go!
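
To make the Resource array change from the policy steps above more concrete, here is a rough sketch of what the edited statement might look like. It assumes the Spinnakerpassrole policy grants the iam:PassRole action, which the tutorial does not show; the account ID and the passrole_statement helper are placeholders invented for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a policy statement whose Resource is an
# array holding both the auto-generated role and role/BaseIAMRole.
# Assumption: the generated policy uses the iam:PassRole action.
passrole_statement() {
  local account_id="$1" generated_role="$2"
  cat <<EOF
{
  "Effect": "Allow",
  "Action": "iam:PassRole",
  "Resource": [
    "arn:aws:iam::${account_id}:role/${generated_role}",
    "arn:aws:iam::${account_id}:role/BaseIAMRole"
  ]
}
EOF
}
```

Running passrole_statement 123456789012 Spinnaker-BaseIAMRole-GAT2AISI7TMJ prints a statement you can compare against the JSON tab; keep whatever other fields your generated policy already contains.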

Executing the Pipeline

  1. Navigate to the bookstore application, select Pipelines, and click Start Manual Execution next to the Bookstore Dev to Test pipeline.
  2. Click Run to begin manual execution.
  3. After waiting a few moments, assuming none of the potential setbacks below bite you, you'll shortly see output indicating the bookstore-dev-v000 Server Group has been successfully created. Browse over to AWS > EC2 and you'll see your three new instances launched!

To resize this Server Group use the Resize Server Group dialog in Spinnaker. Alternatively, you can find additional options under Server Group Actions, such as Disable or Destroy to stop or terminate instances, respectively.

Troubleshooting Pipeline Executions

While a lot can go wrong, below are a few potential issues you may encounter running through this tutorial, depending on changes to software versions, default configurations, and the like between present and time of writing.

Error: Unknown configuration key ena_support
If you get an ena_support error during deployment (see: #2237) the solution is to remove the ena_support reference line within the builders block in the /opt/rosco/config/packer/aws-ebs.json Rosco configuration file.


sudo nano /opt/rosco/config/packer/aws-ebs.json


    {
      "builders": {
        "ena_support": "{{user `aws_ena_support`}}",
      },
    }

Error: 0.000000 is an invalid spot instance price
If you get such an error during deployment (see: #2889) the solution is to remove the spot_price reference lines within the builders block in the /opt/rosco/config/packer/aws-ebs.json Rosco configuration file.


sudo nano /opt/rosco/config/packer/aws-ebs.json


    {
      "builders": {
        "spot_price": "{{user `aws_spot_price`}}",
        "spot_price_auto_product": "{{user `aws_spot_price_auto_product`}}",
      },
    }
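
If you'd rather not edit the file by hand, both fixes above come down to deleting the lines that define a given key. The helper below is a minimal sketch of that idea, assuming GNU sed and a file where each key sits on its own line; remove_json_key_line is a hypothetical name, not a Rosco or Packer command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: deletes every line that defines the given JSON key,
# after saving a .bak copy next to the original file.
remove_json_key_line() {
  local file="$1" key="$2"
  cp -p "$file" "$file.bak" || return 1
  # Assumption: GNU sed, where -i edits the file in place.
  # The pattern matches the exact quoted key followed by a colon.
  sed -i "/\"${key}\"[[:space:]]*:/d" "$file"
}
```

From a root shell you might call remove_json_key_line /opt/rosco/config/packer/aws-ebs.json spot_price and then repeat for spot_price_auto_product. Because the match is on the exact quoted key, each key needs its own call, and it is worth checking afterward that no dangling trailing comma was left behind.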

Error: Bake stage failure after provisioning script
This error is typically due to an outdated script. To resolve it, overwrite the script with the latest downloaded version.


sudo curl --output /opt/rosco/config/packer/

How to Install Chaos Monkey

Before you can use Chaos Monkey you'll need to have Spinnaker deployed and running. We've created a handful of step-by-step tutorials for deploying Spinnaker, depending on the environment and level of control you're looking for.

Installing MySQL

Chaos Monkey requires MySQL, so make sure it's installed on your local system.

Chaos Monkey is currently incompatible with MySQL version 8.0 or higher, so 5.X is recommended.

  1. Download the latest mysql-apt.deb file from the official website, which we'll use to install MySQL.
    curl -OL
  2. Install the MySQL APT repository configuration package by using the dpkg command.
    sudo dpkg -i mysql-apt-config_0.8.10-1_all.deb
  3. In the UI that appears press enter to change the MySQL Server & Cluster version to mysql-5.7. Leave the other options as default and move down to Ok and press Enter to finalize your choice.
  4. Now use sudo apt-get update to update the MySQL packages related to the version we selected (mysql-5.7, in this case).
    sudo apt-get update
  5. Install mysql-server from the packages we just retrieved. You'll be prompted to enter a root password.
    sudo apt-get install mysql-server
  6. You're all set. Check that MySQL server is running with systemctl.
    systemctl status mysql
  7. (Optional) You may also wish to issue the mysql_secure_installation command, which will walk you through a few security-related prompts. Typically, the defaults are just fine.
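
Since Chaos Monkey is incompatible with MySQL 8.0 or higher, it can help to check the installed version before going further. The function below is a small sketch of such a check; mysql_major_supported is a hypothetical name, and it expects you to pass in a version string such as the one returned by SELECT VERSION();.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the major version is below 8.
# Returns 2 when the argument does not start with a number.
mysql_major_supported() {
  local version="$1" major
  # Keep everything before the first dot, e.g. "5" from "5.7.23".
  major="${version%%.*}"
  case "$major" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$major" -lt 8 ]
}
```

For example, mysql_major_supported 5.7.23 succeeds, while mysql_major_supported 8.0.12 fails.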

Set Up MySQL for Chaos Monkey

We now need to add a MySQL database for Chaos Monkey to use and create an associated user with appropriate permissions.

  1. Launch the mysql CLI as the root user.
    mysql -u root -p
  2. Create a chaosmonkey database for Chaos Monkey to use.
    CREATE DATABASE chaosmonkey;
  3. Add a chaosmonkey MySQL user.
    CREATE USER 'chaosmonkey'@'localhost' IDENTIFIED BY 'password';
  4. Grant all privileges in the chaosmonkey database to the new chaosmonkey user.
    GRANT ALL PRIVILEGES ON chaosmonkey.* TO 'chaosmonkey'@'localhost';
  5. Finally, save all changes made to the system by reloading the privileges.
    FLUSH PRIVILEGES;
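
If you need to repeat this setup on several machines, the statements above can be generated from a small shell function and piped into the mysql client. This is only a sketch: chaosmonkey_db_sql is a hypothetical helper, it performs no escaping, and it assumes the names and password contain no quote characters.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the SQL used in the steps above for a given
# database name, user name, and password. No escaping is performed.
chaosmonkey_db_sql() {
  local db="$1" user="$2" password="$3"
  cat <<EOF
CREATE DATABASE ${db};
CREATE USER '${user}'@'localhost' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'localhost';
FLUSH PRIVILEGES;
EOF
}
```

A possible invocation is chaosmonkey_db_sql chaosmonkey chaosmonkey 'password' | mysql -u root -p, which runs the same statements as the manual steps.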

Installing Chaos Monkey

  1. (Optional) Install go if you don't have it on your local machine already.
    Go to this download page and download the latest binary appropriate to your environment.
    curl -O
    Extract the archive to the /usr/local directory.
    sudo tar -C /usr/local -xzf go1.11.linux-amd64.tar.gz
    Add /usr/local/go/bin to your $PATH environment variable.
    export PATH=$PATH:/usr/local/go/bin
    echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
  2. (Optional) Check if the $GOPATH and $GOBIN variables are set with echo $GOPATH and echo $GOBIN. If not, export them and add them to your bash profile.
    export GOPATH=$HOME/go
    echo 'export GOPATH=$HOME/go' >> ~/.bashrc
    export GOBIN=$HOME/go/bin
    echo 'export GOBIN=$HOME/go/bin' >> ~/.bashrc
    export PATH=$PATH:$GOBIN
    echo 'export PATH=$PATH:$GOBIN' >> ~/.bashrc
  3. Install the latest Chaos Monkey binary.
    go get
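
After these steps it is worth confirming that the tools are actually reachable on your $PATH. The check below is a minimal sketch; require_cmd is a hypothetical helper name, and it assumes the installed binary is called chaosmonkey.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the named command can be found on
# $PATH, otherwise prints a short message to stderr and fails.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  printf 'missing command: %s\n' "$1" >&2
  return 1
}
```

Running require_cmd go && require_cmd chaosmonkey tells you right away whether $GOBIN made it onto your $PATH.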

Configure Spinnaker for Chaos Monkey

Spinnaker includes the Chaos Monkey feature as an option, but it is disabled by default.

  1. (Optional) If necessary, enable the Chaos Monkey feature in your Spinnaker deployment.
    On a Halyard-based Spinnaker deployment you must enable the Chaos Monkey feature via the Halyard --chaos flag.
    hal config features edit --chaos true
    On a quick start Spinnaker deployment you'll need to manually enable the Chaos Monkey feature flag within the /opt/deck/html/settings.js file. Make sure the var chaosEnabled is set to true, then save and reload Spinnaker.
    sudo nano /opt/deck/html/settings.js
    // var chaosEnabled = ${services.chaos.enabled};
    var chaosEnabled = true
  2. Navigate to Applications > (APPLICATION_NAME) > CONFIG and select CHAOS MONKEY in the side navigation.
  3. Check the Enabled box to enable Chaos Monkey.
  4. The UI provides useful information for what every option does, but the most important options are the mean and min times between instance termination. If your setup includes multiple clusters or stacks, altering the grouping may also make sense. Finally, you can add exceptions as necessary, which act as a kind of exclude list of instances that will be ignored by Chaos Monkey, so you can keep the most critical services up and running.
  5. Once your changes are made click the Save Changes button.
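
For quick start deployments, the manual settings.js edit from step 1 can also be scripted. The sketch below assumes GNU sed and that the active assignment sits on its own line starting with var chaosEnabled =, as in the snippet above; enable_chaos_flag is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sets the uncommented chaosEnabled assignment to
# true, after saving a .bak copy of the settings file.
enable_chaos_flag() {
  local settings_file="$1"
  cp -p "$settings_file" "$settings_file.bak" || return 1
  # Assumption: GNU sed (-i edits in place). Lines starting with "//"
  # do not match, so a commented-out original is left untouched.
  sed -i 's/^\([[:space:]]*var chaosEnabled = \).*$/\1true;/' "$settings_file"
}
```

From a root shell, enable_chaos_flag /opt/deck/html/settings.js applies the change; reload Spinnaker afterward as described in step 1.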

How to Configure Chaos Monkey

  1. Start by creating the chaosmonkey.toml, which Chaos Monkey will try to find in all of the following locations, until a configuration file is found:
    (current directory)
    Generally, if you're configuring multiple Chaos Monkey installations on the same machine you should use application-specific configurations, so putting them in separate directories is ideal. However, if you're just using one installation on the machine then /apps/chaosmonkey/chaosmonkey.toml works well.
  2. Add the following basic configuration structure to your chaosmonkey.toml file, replacing appropriate configuration values with your own settings.
    [chaosmonkey]
    enabled = true
    schedule_enabled = true
    leashed = false
    accounts = ["aws-primary"]
    start_hour = 9      # time during day when starts terminating
    end_hour = 15       # time during day when stops terminating
    # location of command Chaos Monkey uses for doing terminations
    term_path = "/apps/chaosmonkey/"
    # cron file that Chaos Monkey writes to each day for scheduling kills
    cron_path = "/etc/cron.d/chaosmonkey-schedule"
    [database]
    host = "localhost"
    name = "<DATABASE_NAME>"
    user = "<DATABASE_USER>"
    encrypted_password = "<DATABASE_USER_PASSWORD>"
    [spinnaker]
    endpoint = "http://localhost:8084"
  3. With Chaos Monkey configured it's time to apply the database migrations to MySQL.
    chaosmonkey migrate
    [16264] 2018/09/04 14:11:16 Successfully applied database migrations. Number of migrations applied:  1
    [16264] 2018/09/04 14:11:16 database migration applied successfully
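
Before relying on the schedule, you may want a quick sanity check of the termination window in chaosmonkey.toml. The sketch below reads start_hour and end_hour with sed and verifies that the window is not empty. Both function names are hypothetical, and the start-before-end rule is our own assumption about what a sensible window looks like, not a documented Chaos Monkey requirement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the integer value of the first
# "key = value" line for the given key, ignoring trailing "# comments".
toml_int() {
  local file="$1" key="$2"
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$file" | head -n 1
}

# Succeeds when both hours are present and start_hour is before end_hour.
kill_window_valid() {
  local file="$1" start end
  start="$(toml_int "$file" start_hour)"
  end="$(toml_int "$file" end_hour)"
  if [ -z "$start" ] || [ -z "$end" ]; then
    return 1
  fi
  [ "$start" -lt "$end" ]
}
```

With the example configuration above, toml_int /apps/chaosmonkey/chaosmonkey.toml start_hour prints 9 and kill_window_valid succeeds.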

Error: 1298: Unknown or incorrect time zone: 'UTC'
If you experience a timezone error this typically indicates a configuration problem with MySQL. Just run the mysql_tzinfo_to_sql command to update your MySQL installation.


mysql_tzinfo_to_sql /usr/share/zoneinfo/ | mysql -u root mysql -p

How to Use Chaos Monkey

Using the chaosmonkey command line tool is fairly simple. Start by making sure it can connect to your spinnaker instance with chaosmonkey config spinnaker.


chaosmonkey config spinnaker


 Enabled: (bool) true,
 RegionsAreIndependent: (bool) true,
 MeanTimeBetweenKillsInWorkDays: (int) 2,
 MinTimeBetweenKillsInWorkDays: (int) 1,
 Grouping: (chaosmonkey.Group) cluster,
 Exceptions: ([]chaosmonkey.Exception) {
 Whitelist: (*[]chaosmonkey.Exception)()

Track Kubernetes Nodes
If you're running Spinnaker on Kubernetes you can use the kubectl get nodes --watch command to keep track of your Kubernetes nodes while running Chaos Experiments.


kubectl get nodes --watch
# OUTPUT
# Ready   3d   v1.10.3
# Ready   3d   v1.10.3
# Ready   3d   v1.10.3

To manually terminate an instance with Chaos Monkey use the chaosmonkey terminate command.


chaosmonkey terminate <app> <account> [--region=<region>] [--stack=<stack>] [--cluster=<cluster>] [--leashed]

For this example our application is spinnaker and our account is aws-primary, so using just those two values and leaving the rest default should work.


chaosmonkey terminate spinnaker aws-primary


[11533] 2018/09/08 18:39:26 Picked: {spinnaker aws-primary us-west-2 eks spinnaker-eks-nodes-NodeGroup-KLBYTZDP0F89 spinnaker-eks-nodes-NodeGroup-KLBYTZDP0F89 i-054152fc4ed41d7b7 aws}
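
If you'd like to see which instance would be picked before terminating anything for real, the --leashed flag from the usage line above is worth trying first. The helper below only assembles the command string from the flags shown earlier; terminate_cmd is a hypothetical name, and we are assuming --leashed behaves as a dry run, matching the leashed setting in chaosmonkey.toml.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) a chaosmonkey terminate
# command. region is optional; pass "yes" as the fourth argument to
# append --leashed.
terminate_cmd() {
  local app="$1" account="$2" region="${3:-}" leashed="${4:-no}"
  local cmd="chaosmonkey terminate ${app} ${account}"
  if [ -n "$region" ]; then
    cmd="${cmd} --region=${region}"
  fi
  if [ "$leashed" = "yes" ]; then
    cmd="${cmd} --leashed"
  fi
  printf '%s\n' "$cmd"
}
```

For instance, terminate_cmd spinnaker aws-primary us-west-2 yes prints chaosmonkey terminate spinnaker aws-primary --region=us-west-2 --leashed, which you can review and then run yourself.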

Now look at the AWS EC2 console (or at the terminal window running kubectl get nodes --watch) and after a moment you'll see one of the instances has been terminated.

    Ready      3d   v1.10.3
    Ready      3d   v1.10.3
    NotReady   3d   v1.10.3
    Ready      3d   v1.10.3
    Ready      3d   v1.10.3

If you quickly open up the Spinnaker Deck web interface you'll see only two of the three instances in the cluster are active, as we see in kubectl above. However, wait a few more moments and Spinnaker will notice the loss of an instance, recognize it has been stopped/terminated due to an EC2 health check, and will immediately propagate a new instance to replace it, thus ensuring the server group's desired capacity remains at 3 instances.

For Kubernetes Spinnaker deployments, the kubectl get nodes --watch output confirms these changes (in this case, a new instance was added).

    Ready      <none>   3d    v1.10.3
    Ready      <none>   3d    v1.10.3
    NotReady   <none>   10s   v1.10.3
    Ready      <none>   3d    v1.10.3
    Ready      <none>   3d    v1.10.3
    Ready      <none>   20s   v1.10.3

Spinnaker also tracks this information. Navigate to your Spinnaker application INFRASTRUCTURE > CLUSTERS > spinnaker-eks-nodes-NodeGroup > CAPACITY and click View Scaling Activities to see the Spinnaker scaling activities log for this node group. In this case we see the successful activities that led to the health check failure and new instance start.


How to Schedule Chaos Monkey Terminations

Before we get to scheduling anything you'll want to copy the chaosmonkey executable to the /apps/chaosmonkey directory. While you can leave it in the default $GOBIN directory, it'll be easier to use with cron jobs and other system commands if it's in a global location.


sudo cp ~/go/bin/chaosmonkey /apps/chaosmonkey/

Now that we've confirmed we can manually terminate instances via Chaos Monkey you may want to set up an automated system for doing so. The primary way to do this is to create a series of scripts that regenerate a unique crontab job that is scheduled to execute on a specific date and time. This cron job is created every day (or however often you like), and the execution time is randomized based on the start_hour, end_hour, and time_zone settings in the chaosmonkey.toml configuration. We'll be using four files for this: two crontab files and two bash scripts.

  1. Start by creating the four files we'll be using for this.
    sudo touch /apps/chaosmonkey/
    sudo touch /apps/chaosmonkey/
    sudo touch /etc/cron.d/chaosmonkey-schedule
    sudo touch /etc/cron.d/chaosmonkey-daily-scheduler
  2. Now set executable permissions for the two bash scripts so the cron (root) user can execute them.
    sudo chmod a+rx /apps/chaosmonkey/
    sudo chmod a+rx /apps/chaosmonkey/
  3. Now we'll add some commands to each script in the order they're expected to call one another. First, the /etc/cron.d/chaosmonkey-daily-scheduler is executed once a day at a time you specify. This will call the /apps/chaosmonkey/ script, which will perform the actual scheduling for termination. Paste the following into /etc/cron.d/chaosmonkey-daily-scheduler (as with any cron job you can freely edit the schedule to determine when the cron job should be executed).
    # min  hour  dom  month  day  user  command
    0      12    *    *      *    root  /apps/chaosmonkey/
  4. The /apps/chaosmonkey/ script should perform the actual chaosmonkey schedule command, so paste the following into /apps/chaosmonkey/
    #!/bin/bash
    /apps/chaosmonkey/chaosmonkey schedule >> /var/log/chaosmonkey-schedule.log 2>&1
  5. When the chaosmonkey schedule command is called by the /apps/chaosmonkey/ script it will automatically write to the /etc/cron.d/chaosmonkey-schedule file with a randomized timestamp for execution based on the Chaos Monkey configuration. Here's an example of what the generated /etc/cron.d/chaosmonkey-schedule looks like.
    # /etc/cron.d/chaosmonkey-schedule
    9 16 9 9 0 root /apps/chaosmonkey/ spinnaker aws-primary --cluster=spinnaker-eks-nodes-NodeGroup-KLBYTZDP0F89 --region=us-west-2
  6. Lastly, the /apps/chaosmonkey/ script that is called by the generated /etc/cron.d/chaosmonkey-schedule cron job should issue the chaosmonkey terminate command and output the result to the log. Paste the following into /apps/chaosmonkey/
    #!/bin/bash
    /apps/chaosmonkey/chaosmonkey terminate "$@" >> /var/log/chaosmonkey-terminate.log 2>&1
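
To get a feel for what the generated schedule file contains, the sketch below builds a cron line in the same field order (minute, hour, day of month, month, day of week, user, command) with a random time inside a start and end hour window. It is not Chaos Monkey's own scheduling logic, just an illustration; random_cron_line is a hypothetical name, and the date format flags assume GNU date.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a cron line for today at a random minute
# between start_hour (inclusive) and end_hour (exclusive).
random_cron_line() {
  local start_hour="$1" end_hour="$2" command="$3"
  if [ "$end_hour" -le "$start_hour" ]; then
    return 1
  fi
  # $RANDOM is a bash built-in that yields an integer from 0 to 32767.
  local hour=$(( start_hour + RANDOM % (end_hour - start_hour) ))
  local minute=$(( RANDOM % 60 ))
  local dom month dow
  # Assumption: GNU date, where %-d and %-m drop the leading zero.
  dom="$(date +%-d)"
  month="$(date +%-m)"
  dow="$(date +%w)"
  printf '%d %d %s %s %s root %s\n' "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}
```

Calling random_cron_line 9 15 "/path/to/terminate-script spinnaker aws-primary" prints one line in that format, where /path/to/terminate-script is a placeholder for your own terminate script.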

Next Steps

You're all set now! You should have a functional Spinnaker deployment with Chaos Monkey enabled, which will run a cron job once a day to terminate random instances based on your configuration!

Chaos Monkey is just the tip of the Chaos Engineering iceberg, and there are a lot more failure modes you can experiment with to learn about your system.

The rest of this guide will cover the other tools in The Simian Army family, along with an in-depth look at the Chaos Monkey Alternatives. We built Gremlin to provide a production-ready framework to safely, securely, and easily simulate real outages with an ever-growing library of experiments.
