How to use your Gremlin reliability score in Jenkins to ensure reliable releases

Andre Newman
Sr. Reliability Specialist
Last Updated:
November 20, 2023
Categories: Chaos Engineering, Reliability Management

Introduction

Adding Gremlin to your CI/CD pipeline is a key step in automating your reliability efforts. We previously wrote a tutorial on how to run a Chaos Engineering experiment as part of a Jenkins pipeline. The result ran a chaos experiment every time you deployed your code to a test environment. But this approach has a limitation: you have to either wait for the test to finish and check the results programmatically, or allow the build process to continue regardless of the results.

This tutorial expands on the previous one by using the Gremlin reliability score, which is a more proactive indicator of reliability. A reliability score is calculated by running a series of experiments (called a Test Suite). The main benefits are:

  1. We can run these experiments at any time, not just at deployment time.
  2. The score is standardized across all services, so we can set a single minimum score to apply to all services.

In this tutorial, we'll create a complete Jenkins pipeline that checks a service's reliability score using the Gremlin API. We'll compare the score against a required minimum score, and if it passes, we'll promote the service to production. You'll learn how to create API keys in Gremlin and use the Gremlin API. And while this tutorial uses code specific to Jenkins, you can use the same concepts with any CI/CD tool.

Overview

This tutorial will show you how to:

  • Use the Gremlin REST API
  • Create a Jenkins Pipeline using Groovy
  • Check and compare a service's reliability score using the Gremlin API and Groovy

Prerequisites

Before starting this tutorial, you’ll need the following:

  • A Gremlin account
  • A Jenkins instance where you can create and run pipelines
  • A service defined in Gremlin that has a reliability score

Step 1 - Download the Jenkins pipeline template

The first step is to define the Jenkins pipeline. We've already written a simple Groovy file that you can download from GitHub. Copy and paste its contents into a new file on your computer, or use the "Download raw file" button. Alternatively, you can copy the file contents from the code block below. The rest of this tutorial refers to this file as releasePipeline.groovy.

GROOVY

/*
This Pipeline example demonstrates how to use the Gremlin API to check the Gremlin Score of a service
before promoting it to production. The Gremlin Score is a measure of the reliability of a service.
If the Gremlin Score is less than the value set, the pipeline will fail and the service will not be promoted to production.
 */

pipeline {
    agent any

    stages {
        stage('Check Gremlin Score') {
            steps {
                script {
                    def serviceId = 'Replace with your service ID'
                    def teamId = 'Replace with your team ID'
                    def apiUrl = "https://api.gremlin.com/v1/services/${serviceId}/score?teamId=${teamId}"
                    def apiToken = 'Bearer Replace with your Bearer token or API token'
                    def minScore = 80.0 // Replace with your minimum Gremlin Score

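                    // First call: check only curl's exit code. The -f flag makes curl
                    // exit non-zero on HTTP errors (4xx/5xx) instead of reporting success.
                    def response = sh(script: "curl -s -f -X GET '${apiUrl}' -H 'Authorization: ${apiToken}' -H 'accept: application/json'", returnStatus: true)

                    if (response != 0) {
                        error("API call to Gremlin failed with curl exit code: ${response}")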
                    } else {
                        def apiResponse = sh(script: "curl -s -X GET '${apiUrl}' -H 'Authorization: ${apiToken}' -H 'accept: application/json'", returnStdout: true).trim()

                        echo "API Response: ${apiResponse}" // Debug logging

                        // Attempt to capture numbers using a permissive regex
                        def scoreMatches = (apiResponse =~ /(\d+(\.\d+)?)/)

                        if (scoreMatches) {
                            def score = null

                            for (match in scoreMatches) {
                                def potentialScore = match[0]
                                try {
                                    score = Float.parseFloat(potentialScore)
                                    break
                                } catch (NumberFormatException e) {
                                    // Continue searching for a valid score
                                }
                            }

                            if (score != null) {
                                echo "Gremlin Score: ${score}" // Debug logging

                                if (score < minScore) {
                                    error("Gremlin Score ${score} is less than defined ${minScore}. Cannot promote to production.")
                                }
                            }
                        } else {
                            echo "No valid score found in API response." // Debug logging
                            error("Unable to extract Gremlin Score from the API response.")
                        }
                    }
                }
            }
        }

        stage('Promote to Production') {
            steps {
                // Add the steps to promote to production here
                // This could involve deployment and other production-related tasks
                // You can replace this comment with the actual steps for your deployment process
                echo "Promoting to production..."
            }
        }
    }

    post {
        failure {
            echo "The pipeline has failed. Not promoting to production."
        }
        success {
            echo "The pipeline has succeeded. Promoting to production."
        }
    }
}
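
To see how the score extraction in the template behaves, here is a minimal standalone Groovy sketch of the same "first number wins" regex. The extractScore helper and the sample strings are hypothetical. This tutorial doesn't show the actual shape of the API response, so check the "API Response" debug line in your own build log before relying on it.

```groovy
// Hypothetical helper mirroring the pipeline's regex: returns the first
// number found in the response text, or null if there is none.
Float extractScore(String apiResponse) {
    def matcher = (apiResponse =~ /(\d+(\.\d+)?)/)
    return matcher.find() ? Float.parseFloat(matcher.group(1)) : null
}

// Sample inputs are illustrative, not real Gremlin responses.
println extractScore('85.5')
println extractScore('no score here')

// If another number appears before the score, the regex picks that one.
println extractScore('{"id": 12, "score": 85.5}')   // prints 12.0, not 85.5
```

The last call shows the trade-off of a permissive regex: it works when the score is the first number in the response, and silently reads the wrong value otherwise.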

Step 2 - Create a Gremlin API key and add it to the file

In order to use Gremlin's REST API, we need to add our authentication details to the script. You'll need two things:

  1. A Gremlin API key (you can create a new one or reuse an existing one), and
  2. Your Gremlin team ID (covered in Step 3).

Note
You'll need either the Team Manager or Team User role to view and create API keys.

To get an API key:

  1. Log into the Gremlin web app if you haven't yet.
  2. Open your account settings by clicking the user icon in the top-right corner of the page and selecting Account Settings.
  3. Click API Keys. If you already have an API key you want to reuse, simply click the Copy icon next to the key name.
  4. If you want to create a new API key, click New API Key.
    1. Enter a name for the API key, such as "Jenkins score check". You can also enter a more detailed description, but this is optional.
    2. Click Save, and copy the API key that appears in the modal.

Once you have the API key, paste it into the following line in the releasePipeline.groovy file:

GROOVY

def apiToken = 'Bearer Replace with your Bearer token or API token'

Note
Gremlin's API documentation shows API keys being sent with a Key prefix (Authorization: Key followed by the API key) and bearer tokens with a Bearer prefix. If you paste in an API key and the request is rejected, check that the prefix matches the credential type.

Save the file.

Step 3 - Retrieve your Gremlin team ID and service ID

You'll need two additional pieces of information from Gremlin: your team ID and the service ID. The team ID is the unique ID for your Gremlin team, and the service ID is the unique ID of the service you want to check the score for.

We'll start with the team ID. To get it, look in the bottom-left corner of the Gremlin web app. You'll see your name, and underneath that, your team name. Click the icon next to the team name to copy your team ID to your clipboard. Then open your releasePipeline.groovy file and paste the ID into the following line:

GROOVY

def teamId = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

For the service ID:

  1. Find the service in the Services list, then click on it to see its overview page.
    Note
    If this service is running in production and its score drops below the threshold, this script could prevent you from deploying updates. To avoid this, consider using a pre-production environment to test the service and check its score. An easy way to differentiate a non-production instance of a service from a production instance is by using the production flag.
  2. Click Settings at the top of the page.
  3. Under Details, look for the Service ID box. Highlight and copy the ID shown in the box, or click the icon at the right side to copy the ID directly to your clipboard.
  4. Paste the ID into the following line:

GROOVY

def serviceId = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

Save the file.
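
It's easy to leave one of the template's placeholders in place by accident. As a rough sketch, the check below builds the same score URL the template uses and fails fast if an ID still looks like a placeholder. The buildScoreUrl helper and the example IDs are hypothetical and not part of the downloaded file.

```groovy
// Hypothetical helper: builds the score URL used in the template and
// rejects values that still look like the template's placeholders.
String buildScoreUrl(String serviceId, String teamId) {
    [serviceId: serviceId, teamId: teamId].each { name, value ->
        if (!value || value.startsWith('Replace with') || value ==~ /[x-]+/) {
            throw new IllegalArgumentException("${name} still looks like a placeholder: '${value}'")
        }
    }
    return "https://api.gremlin.com/v1/services/${serviceId}/score?teamId=${teamId}"
}

// Example IDs are made up for illustration.
println buildScoreUrl('11111111-2222-3333-4444-555555555555', 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee')
```

A check like this turns a confusing API error into an immediate, readable failure message.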

Step 4 - Add the Groovy file to your Jenkins pipeline

In this step, we'll create a pipeline using our Groovy file. But before we do, there's one last tweak we need to make: we need to set the score threshold.

The score threshold is the minimum reliability score the service must have before it can deploy to production. This is defined in the minScore variable. In the sample file, we set minScore = 80.0, which means the service must have a score of at least 80% to deploy. Anything below this score will stop the pipeline and raise an error. You can change this threshold to any value between 0 and 100 by editing this line:

GROOVY

def minScore = 80.0 // Replace with your minimum Gremlin Score
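
The comparison itself is simple, but the boundary matters: a score exactly equal to the threshold passes. The sketch below isolates that logic in a hypothetical passesGate helper (not part of the template) and adds a range check on the threshold as an assumption of what a sensible guard might look like.

```groovy
// Hypothetical helper: true when the score meets or exceeds the threshold.
boolean passesGate(Number score, Number minScore) {
    if (minScore < 0 || minScore > 100) {
        throw new IllegalArgumentException("minScore must be between 0 and 100, got ${minScore}")
    }
    return score >= minScore
}

println passesGate(80.0f, 80.0)   // a score equal to the threshold passes
println passesGate(79.9f, 80.0)   // anything below the threshold fails
```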

Now we're ready to add this file to our Jenkins Pipeline. To do this:

  1. Open your Jenkins web application.
  2. From the Dashboard, click on New Item.
  3. Enter a name for the pipeline (e.g. "[service name]-gremlin-release-gate").
  4. Select Pipeline as the type, then click OK.
  5. Click the Pipeline tab at the top of the page to scroll down to the Pipeline section.
  6. Enter the contents of the Groovy file in the Script text area.
  7. Click Save.

Step 5 - Run your Jenkins pipeline

After you click Save in the previous step, click Build Now to run the pipeline. Jenkins will retrieve the service's score from Gremlin and check whether it's greater than or equal to minScore. If so, it will mark the build as successful. Otherwise, it will mark the build as failed.

From here, you can make changes to better integrate the pipeline into your build process. Instead of hard-coding values like your service ID, use environment variables or build parameters so you can pass different IDs for each service, and use Jenkins credentials to store your Gremlin API key.
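
As one possible shape for this, the sketch below uses build parameters for the IDs and the Credentials Binding plugin's withCredentials step for the key. The parameter names and the gremlin-api-key credential ID are assumptions you'd replace with your own, and the Bearer prefix simply mirrors the template. Score parsing is left out for brevity.

```groovy
pipeline {
    agent any

    parameters {
        // Hypothetical parameter names; pass a different ID for each service.
        string(name: 'SERVICE_ID', defaultValue: '', description: 'Gremlin service ID')
        string(name: 'TEAM_ID', defaultValue: '', description: 'Gremlin team ID')
    }

    stages {
        stage('Check Gremlin Score') {
            steps {
                // 'gremlin-api-key' is an assumed ID of a "Secret text" credential
                // created in Jenkins beforehand.
                withCredentials([string(credentialsId: 'gremlin-api-key', variable: 'GREMLIN_API_KEY')]) {
                    // Single quotes keep Groovy from interpolating the secret;
                    // the shell expands the variables at run time.
                    sh '''
                        curl -s -f -X GET "https://api.gremlin.com/v1/services/$SERVICE_ID/score?teamId=$TEAM_ID" \
                            -H "Authorization: Bearer $GREMLIN_API_KEY" \
                            -H "accept: application/json"
                    '''
                }
            }
        }
    }
}
```

Keeping the key out of the Groovy string means it isn't written into the pipeline script, and Jenkins masks the bound value in the build log.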

We've also included a section in the Groovy script where you can enter commands for deploying your service to production. This runs immediately after Jenkins compares the service's reliability score against minScore:

GROOVY

stage('Promote to Production') {
    steps {
        // Add the steps to promote to production here
        // This could involve deployment and other production-related tasks
        // You can replace this comment with the actual steps for your deployment process
        echo "Promoting to production..."
    }
}

Lastly, you can change the "failure" condition to perform other steps, such as notifying the service's owner by sending an email or calling a service like PagerDuty. You can also track the status of your builds by integrating with a monitoring tool like Datadog and alert on failed builds that way.
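
For example, an email notification could look roughly like the sketch below. It uses the built-in mail step and assumes Jenkins already has an SMTP server configured. The recipient address is a placeholder.

```groovy
post {
    failure {
        echo "The pipeline has failed. Not promoting to production."
        // The recipient address is a placeholder; replace it with the
        // service owner's address or a team mailing list.
        mail to: 'service-owner@example.com',
             subject: "Reliability gate failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
             body: "The Gremlin reliability score check failed. See ${env.BUILD_URL} for details."
    }
}
```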

Conclusion

Congratulations on setting up a reliability gate in Jenkins! This will ensure that your service only gets pushed to production if it meets your minimum reliability score.

To ensure your scores stay up to date, make sure to autoschedule reliability tests on your service to run at least once a week. Going longer than one week without re-running a test will cause that test to expire, reducing your score. Remember that you can also use the Run All button to re-run all of the service's tests and regenerate its score.
