Validating horizontal pod autoscaling on EKS with Gremlin

Using the Gremlin UI is an easy, visual way to run Chaos Experiments. But as your applications become more reliable and better able to withstand attacks, automating your Chaos Experiments helps keep you from drifting back into failure. This tutorial will get you started with Gremlin's Python SDK for the Gremlin API.
To successfully complete this course, you’ll need:
The Gremlin Python SDK makes it easy to interact with Gremlin's API. You can use it to get information about your account, launch attacks and see their status, create scenarios, and more. In this tutorial, we'll use the SDK to connect to Gremlin, launch an attack, poll its status, and return the results when it finishes.
Log in to Repl.it, Gremlin, and Datadog with the credentials provided.
After you've logged into Repl.it, click the "My Repls" link in the left side navigation, then open the Gremlin Python SDK project.
The project has already loaded the gremlinapi Python library, and the code has imported the modules that we'll be using for the tutorial.
To connect to the Gremlin API, you'll first need to set your Gremlin Team ID. In the Gremlin application, open the user menu and go to Company Settings.
Select your team from the teams listing.
Then, click the "Configuration" tab to copy your Team ID and paste it into the code. For example:
config.team_id = "648737d4-XXXX-XXXXX-XXXX-9e3ac67e287d"
You'll also need an API key to connect to Gremlin. In the "API Keys" tab, click the "New API Key" button. Give your new API Key a name and a description. Your API Key name and the description may only contain letters, numbers, hyphens, and spaces.
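The naming rule above can be checked before you submit the form. The following is a minimal sketch, not part of the Gremlin SDK: the function name is hypothetical, and it assumes "letters" means ASCII letters, which the tutorial does not specify.

```python
import re

# Assumption: "letters" means ASCII letters; Gremlin may accept more.
_ALLOWED_CHARS = re.compile(r"[A-Za-z0-9\- ]+")


def is_valid_key_label(text: str) -> bool:
    """Return True if text contains only letters, digits, hyphens, and spaces.

    Hypothetical helper for checking an API key name or description
    against the rule stated in the tutorial. Empty strings are rejected.
    """
    return bool(_ALLOWED_CHARS.fullmatch(text))


if __name__ == "__main__":
    for label in ("python-sdk tutorial 1", "sdk_key!"):
        print(label, "->", is_valid_key_label(label))
```

The underscore and exclamation mark in the second label fall outside the allowed set, so that label is rejected.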
Copy your API Key and paste it into the code. You can copy the API Key by clicking the eye icon next to the API Key name, then clicking the "Click to copy" message that appears.
config.api_key = "8fdb3329XXXXXXXXXXXXXXXXXXXXXX4b13e6cc"
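Pasting credentials into the source is fine for a workshop project, but you may prefer to keep them out of the code once you automate your experiments. The sketch below reads both values from environment variables. The variable names GREMLIN_TEAM_ID and GREMLIN_API_KEY and the helper function are assumptions made for illustration; they are not defined by this tutorial.

```python
import os
from typing import Mapping, Tuple

# Hypothetical variable names chosen for this sketch.
REQUIRED_VARS = ("GREMLIN_TEAM_ID", "GREMLIN_API_KEY")


def load_gremlin_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (team_id, api_key) from env, or raise if either is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GREMLIN_TEAM_ID"], env["GREMLIN_API_KEY"]


# Usage in the tutorial project (left commented out because it needs the
# variables to be set and the config object to be imported):
# team_id, api_key = load_gremlin_credentials()
# config.team_id = team_id
# config.api_key = api_key
```

Failing fast with a clear message is usually easier to debug than a rejected API call later in the script.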
After you've copied your Team ID and API Key, click the "run" button.
Your application will use your credentials to connect to the Gremlin API, retrieve a list of your Gremlin organizations, and print the results.
all_orgs = orgs.list_orgs()
pprint(all_orgs)
Next, we'll run an attack by using the attacks.create_attack() method. We'll also use GremlinAttackHelper() to format our attack parameters. Copy and paste the following code into your application. This will launch a memory attack against 100% of your hosts, as defined by GremlinTargetHosts. The memory attack will run for 240 seconds and consume 85% of memory.
# Memory Attack - This attack will run a memory attack consuming
# 85% of memory for 240 seconds. It will target all hosts (100%).

attack_id = attacks.create_attack(
    body=GremlinAttackHelper(
        target=GremlinTargetHosts(percent=100),
        command=GremlinMemoryAttack(length=240, amount=85, amount_type="%")
    )
)
pprint("Your attack id is: {}".format(attack_id))
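If you plan to vary the attack, it can help to keep the three tunable values in one place and check them before calling the API. The following is a minimal sketch under stated assumptions: the MemoryAttackParams class is hypothetical and not part of the SDK, and the range checks (a positive length, percentages between 1 and 100) are assumptions for illustration rather than documented Gremlin limits. The keyword names match the ones used in the attack code above.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class MemoryAttackParams:
    """Hypothetical container for the values used in the memory attack."""

    length: int = 240          # attack duration in seconds
    amount: int = 85           # percentage of memory to consume
    target_percent: int = 100  # percentage of hosts to target

    def __post_init__(self) -> None:
        # Assumed sanity checks; Gremlin's real limits may differ.
        if self.length <= 0:
            raise ValueError("length must be a positive number of seconds")
        if not 0 < self.amount <= 100:
            raise ValueError("amount must be between 1 and 100 percent")
        if not 0 < self.target_percent <= 100:
            raise ValueError("target_percent must be between 1 and 100")

    def command_kwargs(self) -> Dict[str, Union[int, str]]:
        """Keyword arguments in the shape passed to GremlinMemoryAttack above."""
        return {"length": self.length, "amount": self.amount, "amount_type": "%"}

    def target_kwargs(self) -> Dict[str, int]:
        """Keyword arguments in the shape passed to GremlinTargetHosts above."""
        return {"percent": self.target_percent}


# Usage with the SDK objects from the tutorial (commented out because it
# requires the gremlinapi imports and valid credentials):
# params = MemoryAttackParams(length=120, amount=50)
# body = GremlinAttackHelper(
#     target=GremlinTargetHosts(**params.target_kwargs()),
#     command=GremlinMemoryAttack(**params.command_kwargs()),
# )
```

Validating locally catches typos such as a negative duration before an attack is ever submitted.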
Run your application to launch the attack. Click "Attacks" in the left side navigation in Gremlin to view your currently running attacks, and you'll see the attack you just created via code!
Among your team discuss:
Although the Gremlin application is a great way to visually see what attacks are running, we need to be able to verify the attack via code and report its status.
To do this, we'll create a simple loop that polls the Gremlin attack status every 5 seconds by calling the attacks.get_attack() method.
Copy and paste the following code into your application:
# Loop every 5 seconds, get the attack information and report
# the status. Break the loop when the attack is complete.

while True:
    time.sleep(5)
    attack_details = attacks.get_attack(guid=attack_id)
    pprint("Attack status: {}".format(attack_details["stage"]))
    if attack_details["stage_lifecycle"] == "Complete" \
            or attack_details["stage_lifecycle"] == "Error":
        break

# Print the final attack status details.
pprint("Your attack details:")
pprint(attack_details)
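The loop above runs until the attack reaches a terminal state, so it would spin forever if the status never changed. For unattended automation you may want an upper bound on the wait. The following is a minimal sketch: the wait_for_attack function and its parameters are hypothetical, and the only details taken from the tutorial are the "stage_lifecycle" key and the "Complete" and "Error" values. The fetch callable is injected so the logic can be exercised without calling the Gremlin API.

```python
import time
from typing import Any, Callable, Dict

# Terminal values taken from the tutorial's polling loop.
TERMINAL_STATES = ("Complete", "Error")


def wait_for_attack(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch() until the attack is finished or the timeout expires.

    fetch must return a dict with a "stage_lifecycle" key. Raises
    TimeoutError if no terminal state is seen within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        details = fetch()
        if details["stage_lifecycle"] in TERMINAL_STATES:
            return details
        if time.monotonic() >= deadline:
            raise TimeoutError("attack did not finish within the timeout")
        sleep(interval)


# Usage with the SDK call from the tutorial (commented out because it
# requires the gremlinapi imports and a running attack):
# attack_details = wait_for_attack(lambda: attacks.get_attack(guid=attack_id))
# pprint(attack_details)
```

Unlike the tutorial loop, this sketch checks the status before the first sleep, so an attack that has already finished returns immediately.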
Run your application again to launch a new attack and watch the application report its status.
This tutorial provides a brief look at how you can run an attack and get the attack status using the Gremlin Python SDK. To take your learning further, try changing the attack parameters or launching a different attack. You can find more information in the Gremlin Python SDK repository README file and in the Examples directory.
By using the Gremlin Python SDK, you can programmatically interact with the Gremlin platform. This provides an easy way to automate your Chaos Engineering practices.