Continuous Chaos Engineering in your CI/CD Pipelines with Gremlin and Harness

Moving a software change from development through a build process to production takes many steps. Each step builds confidence in the quality of the code, but the steps are tedious to perform manually. Continuous integration/continuous delivery (CI/CD) solutions orchestrate and automate them, for example by triggering traditional QA tests or running Chaos Engineering experiments to verify that a build is reliable before it is pushed to production.

Chaos Engineering is the science of running intentional experiments on a system by injecting precise, measured amounts of harm and observing how the system responds, with the goal of improving its resilience. With Chaos Engineering, we can proactively uncover and address failure modes before they cause outages. Automating chaos experiments with every build helps prevent the drift into failure as our systems continually evolve.

In this tutorial, we will walk through the steps to leverage Harness, a continuous delivery platform, to install Gremlin onto a Kubernetes cluster and orchestrate Gremlin Attacks.

Prerequisites

  • A Harness Account, which can be the free tier
  • Access to a Kubernetes Cluster, such as this example Amazon EKS Cluster
  • A Gremlin Account, which can be the free tier

Step 1 - Create your Gremlin API key

Navigate to your Gremlin Account. A great way to authenticate between external systems and Gremlin is by using an API Key. You can create one by navigating to Company -> Team -> API Keys. I have an existing key which I will use.

Get a Gremlin API key

In addition to the API Key, you also need your Team ID and Team Secret Key. Both are found under Company -> Team -> Configuration. If you have forgotten your Team Secret Key, use the Reset button to generate a new one.

Gather your Team ID and Team Secret Key
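
Before moving on, it can help to confirm that all three credentials are on hand. The following is a minimal sketch, assuming you keep them in environment variables; the names GREMLIN_API_KEY, GREMLIN_TEAM_ID, and GREMLIN_TEAM_SECRET and the helper require_vars are hypothetical and not part of Gremlin or Harness.

```shell
# Hypothetical helper: report any named variable that is unset or empty.
# Returns 0 when every variable has a value, 1 otherwise.
require_vars() {
  local missing=0 name value
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_vars GREMLIN_API_KEY GREMLIN_TEAM_ID GREMLIN_TEAM_SECRET` at the top of a script stops you from getting halfway through the setup with an empty credential.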

Step 2 - Install a Harness Delegate

Open another tab in your browser and navigate to the Harness Platform.

Navigate to Harness

Harness uses a worker node model called Delegates. The Harness Delegates will perform work on your behalf and are needed to connect your artifacts, infrastructure, collaboration, verification, and other providers with the Harness Manager. In our example, we will leverage a Kubernetes Delegate.

Install a Harness Delegate in Kubernetes in the Harness Platform at Setup -> Harness Delegates -> Download Delegates -> Kubernetes YAML.

Install a Harness Delegate in Kubernetes

Download the YAML and give it a name you will remember, perhaps using “gremlin” as a prefix as I do in the example.

Unpack the TAR archive file you downloaded.

Unpack the tar archive

Use kubectl to install the YAML file, like this:

bash
kubectl apply -f harness-delegate.yaml

When the installation is complete, the Delegate will be listed in the Harness UI.
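
If you prefer to confirm the Delegate from the command line as well, the sketch below waits for its pods to become Ready. The function name wait_for_delegate is hypothetical, and the default namespace harness-delegate is an assumption; check the namespace declared in the YAML you downloaded.

```shell
# Hypothetical helper: block until every pod in the Delegate namespace is Ready.
wait_for_delegate() {
  # Assumed default namespace; pass the one from your YAML as the first argument.
  local namespace="${1:-harness-delegate}"
  kubectl wait --for=condition=Ready pod --all \
    --namespace "$namespace" --timeout=300s
}
```

Run `wait_for_delegate` after `kubectl apply`. A non-zero exit status means the pods did not become Ready within five minutes, or that no pods were found in that namespace.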

Next, add a Kubernetes Cluster to Harness. Open Setup -> Cloud Providers + Add Cloud Provider -> Kubernetes Cluster and select the entry to inherit the cluster details from the Delegate.

Select the entry to inherit the cluster details from the Delegate

Once you click submit, your Kubernetes cluster will be available.

The Kubernetes cluster will be available

With the Kubernetes Cluster available to Harness, now we can perform the Gremlin steps to install and launch an Attack.

Step 3 - Connect Gremlin and Harness

To install Gremlin we will connect the Gremlin Helm repository with Harness. See the Gremlin Helm Chart Values in GitHub to determine whether modifications are needed beyond our example.

Open Setup -> Connectors -> Artifact Servers + Add Artifact Server to enter the repository URL, as shown. Use this URL: https://helm.gremlin.com.

Add Artifact Server to enter the repository URL
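
If you have the Helm client on your workstation, you can check that the repository URL resolves before relying on it in Harness. This is a sketch that assumes Helm 3; the local repository alias gremlin and the function name check_gremlin_repo are choices made for this example.

```shell
# Hypothetical helper: register the Gremlin chart repository locally
# and list the chart it publishes. Assumes the Helm 3 client is installed.
check_gremlin_repo() {
  helm repo add gremlin https://helm.gremlin.com &&
    helm repo update &&
    helm search repo gremlin/gremlin
}
```

This only touches your local Helm configuration. Harness reads the repository through the Artifact Server you configured above.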

Step 4 - Prepare your Harness workflow

Harness uses an abstraction model that lets you model all the pieces of your workflow. The top-level unit of this abstraction is the Harness Application, which represents the application you deploy.

Create a Harness Application to house your workflow at Setup -> Applications + Add Application and enter the required details.

Create a Harness Application to house your workflow

Next, add a Harness Service which will be a Kubernetes-based deployment. We will leverage this service to install Gremlin via Helm. Add the service at Setup -> Gremlin Chaos App -> Services and enter the details.

Add a Harness Service

With the service created, link the Gremlin Helm manifest as a remote manifest by clicking the ellipsis on the side. Enter gremlin as the chart name.

Link the Gremlin Helm manifest as a remote manifest

Since we need to provide credentials to install and connect Gremlin, the easiest way will be to pass those key values to the Helm Chart. Your Gremlin Account details can be added to the Configuration section in Values YAML Override. You will also need your Kubernetes cluster name. The one in our example is gremlinchaos.

Update the configuration

Add Inline Values

Add inline values

The Values:

yaml
gremlin:
  secret:
    teamID: YourTeamID
    teamSecret: YourTeamSecret
    managed: true
    clusterID: gremlinchaos
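
If you would rather not type the values by hand, the override can be generated from environment variables and pasted into the inline editor. The sketch below assumes the keys all sit under gremlin.secret, as in the example values; the variable names and the function render_gremlin_values are hypothetical.

```shell
# Hypothetical helper: print a values override for the Gremlin Helm chart.
# Expects GREMLIN_TEAM_ID, GREMLIN_TEAM_SECRET, and GREMLIN_CLUSTER_ID to be set.
render_gremlin_values() {
  # Abort with a message if any input is missing.
  : "${GREMLIN_TEAM_ID:?set GREMLIN_TEAM_ID}"
  : "${GREMLIN_TEAM_SECRET:?set GREMLIN_TEAM_SECRET}"
  : "${GREMLIN_CLUSTER_ID:?set GREMLIN_CLUSTER_ID}"
  cat <<EOF
gremlin:
  secret:
    teamID: "${GREMLIN_TEAM_ID}"
    teamSecret: "${GREMLIN_TEAM_SECRET}"
    managed: true
    clusterID: "${GREMLIN_CLUSTER_ID}"
EOF
}
```

The values are quoted so that IDs or secrets containing YAML-significant characters are still read as strings. Treat the output as sensitive, since it contains your Team Secret Key.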

With the Helm values entered, it is time to create a Harness Workflow to install Gremlin and run your Gremlin Attack.

Step 5 - Build your Harness Workflow

To begin, we must designate the Kubernetes Cluster as a Harness Environment.

In the Harness Platform, go to Setup -> Gremlin Chaos App -> Environments + Add Environment and enter your details. Note that Environment Type is used only for labeling; your selection has no impact on the quality or usage of the environment.

Add Environment

Next, add an Infrastructure Definition inside the Environment. Make sure to set the Deployment Type to Kubernetes. You can optionally change the namespace to gremlin, as we do in our example.

Infrastructure Definition and set the Deployment Type to Kubernetes

Now we add a Harness Workflow to connect these steps at Setup -> Gremlin Chaos App -> Workflows + Add Workflow.

Add Workflow

In the Workflow, in the Rolling Deployment dropdown add a Verify Step. Let’s add a step called “Call Gremlin API” to validate that our Gremlin Installation was successful. Gremlin has a great list of API Call Examples to help if you want to explore what other options are available.

Add a Verify Step

From Add Step -> New Step -> Utility -> Shell Script enter a shell script, like the one in our example. Use the following format:

bash
# Launch a 30-second, single-core CPU attack on a random target.
# The URL is quoted so the shell does not interpret the "?" character.
curl -X POST \
  --header "Content-Type: application/json" \
  --header "Authorization: Key $yourAPI_Key" \
  "https://api.gremlin.com/v1/attacks/new?teamId=$yourTeamID" \
  --data '
  {
    "command": { "type": "cpu", "args": ["-c", "1", "--length", "30"] },
    "target": { "type": "Random" }
  }'

Configure Call Gremlin API
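
A plain curl call exits successfully even when the API returns an error status, so the Workflow step could pass without an attack having started. One possible variation, sketched below, wraps the same request in a function and adds curl's --fail flag so that an HTTP error fails the step. The function name gremlin_cpu_attack and the variables GREMLIN_API_KEY and GREMLIN_TEAM_ID are hypothetical; the endpoint and payload are the ones used above.

```shell
# Hypothetical helper: start a CPU attack on a random target.
# Usage: gremlin_cpu_attack [cores] [length_seconds]
gremlin_cpu_attack() {
  local cores="${1:-1}" length="${2:-30}" payload
  # Build the same JSON body as the original script, with the two values filled in.
  payload=$(printf '{"command":{"type":"cpu","args":["-c","%s","--length","%s"]},"target":{"type":"Random"}}' \
    "$cores" "$length")
  # --fail makes curl return non-zero on HTTP errors (for example, a bad API key).
  curl --fail --silent --show-error -X POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Key ${GREMLIN_API_KEY}" \
    --data "$payload" \
    "https://api.gremlin.com/v1/attacks/new?teamId=${GREMLIN_TEAM_ID}"
}
```

Calling `gremlin_cpu_attack 1 30` reproduces the attack from the tutorial. Because the function returns curl's exit status, Harness can mark the step as failed when the request is rejected.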

Now that we are all set up, it’s time to run the attack.

Step 6 - Run the attack

Now is the time to see Chaos Engineering and continuous delivery working together.

To begin, start a new deployment in the Workflow Overview, as shown here in the “Chaos is Cool” Workflow UI.

New deployment

Alternatively, you can launch the attack by navigating to Continuous Deployment -> Start New Deployment and selecting the Harness Application and Harness Workflow you created. In this case, the Application is Gremlin Chaos App and the Workflow is Chaos is Cool.

Start a new Deployment

Watch the Harness UI for a status of Success.

Harness UI
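
As a cross-check on the Harness status, you can also look at the pods that the deployment created. The sketch below assumes Gremlin was installed into the gremlin namespace, as in our example; the function name list_gremlin_pods is hypothetical.

```shell
# Hypothetical helper: list the pods in the namespace where Gremlin was installed.
list_gremlin_pods() {
  # Defaults to the "gremlin" namespace used in this tutorial's example.
  local namespace="${1:-gremlin}"
  kubectl get pods --namespace "$namespace" --output wide
}
```

If you kept the default namespace in your Infrastructure Definition, pass that name as the first argument instead.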

While you wait, you can open the Gremlin UI and see the attack in progress.

Gremlin UI to see the attack in progress

We can confirm the CPU utilization spike in the Gremlin UI in the attack details at Gremlin Platform -> Attacks -> Completed.

Gremlin CPU charting

Congratulations, you just initiated and ran your very first Gremlin Attack using Harness!

Share the results of your chaos experiment with our community of over 5,000 engineers in the Chaos Engineering Slack.


© 2020 Gremlin Inc. San Jose, CA 95113