Chaos Engineering with Memcached and Kubernetes

Last Updated:

July 22, 2020

This is an older tutorial and may not represent the latest or most up-to-date information. If anything in this tutorial is incorrect, please let us know.

Introduction

Gremlin is a simple, safe and secure service for performing Chaos Engineering experiments through a SaaS-based platform. Memcached is a general-purpose distributed memory caching system. Datadog is a monitoring service for cloud-scale applications, providing monitoring of servers, databases, tools, and services through a SaaS-based data analytics platform. Datadog provides an integration to monitor Memcached.

Chaos Engineering Hypothesis

For the purposes of this tutorial, we will run Chaos Engineering experiments for Memcached on Kubernetes. We will use Gremlin to run an IO attack on our cluster that increases the number of reads. This will give us confidence in the reliability and resiliency of our Memcached cluster. Additional recommended experiments include shutting down Memcached instances and pods and ensuring this does not take down your database/storage layer.

Prerequisites

To complete this tutorial you will need the following:

4 cloud infrastructure hosts running Ubuntu 16.04 with 4GB RAM and private networking enabled
A Datadog account (sign up here)
A Gremlin account (request a free trial)

You will need to install the following on each of your 4 cloud infrastructure hosts. This will enable you to run your Chaos Engineering experiments.

Memcached
Kubernetes
Helm
Docker
Gremlin
Datadog

Overview

This tutorial will walk you through the required steps to run the Memcached IO Chaos Engineering experiment.

Step 1 - Creating a Kubernetes cluster with 3 nodes
Step 2 - Installing Memcached
Step 3 - Installing Helm
Step 4 - Installing Gremlin
Step 5 - Installing Datadog
Step 6 - Performing Chaos Engineering experiments on Memcached
Step 7 - Installing mcrouter
Step 8 - Performing Chaos Engineering experiments on mcrouter

Step 1 - Creating a Kubernetes cluster with 3 nodes

We will start by creating four Ubuntu 16.04 servers: one master and three nodes. Create 4 hosts and call them kube-01, kube-02, kube-03 and kube-04. Each host needs a minimum of 4GB RAM.

Set your hostnames for your servers as follows:

Server 1 - Hostname: k8-01
Server 2 - Hostname: k8-02
Server 3 - Hostname: k8-03
Server 4 - Hostname: k8-04

Kubernetes will need to assign specialized roles to each server. We will set up one server to act as the master:

k8-01 - role: master
k8-02 - role: node
k8-03 - role: node
k8-04 - role: node

Set up each server in the cluster to run Kubernetes

On each of the four Ubuntu 16.04 servers run the following commands as root:

SHELL


apt-get update && apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl docker.io

‍

Set up the Kubernetes Master

On the kube-01 node run the following command:

SHELL


kubeadm init

‍

To start using your cluster, you need to run the following as a regular user:

SHELL


mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

‍

Your Kubernetes master has initialized successfully!

Join your nodes to your Kubernetes cluster

You can now join any number of machines by running the kubeadm join command on each node as root. The kubeadm init output in your terminal includes this command for you to copy and run. An example of what it looks like is below:

SHELL


kubeadm join --token 702ff6.bc7aacff7aacab17 174.138.15.158:6443 --discovery-token-ca-cert-hash sha256:68bc22d2c631800fd358a6d7e3998e598deb2980ee613b3c2f1da8978960c8ab

‍

When you join your kube-02, kube-03 and kube-04 nodes you will see the following on each node:

SHELL


This node has joined the cluster:
* Certificate signing request was sent to master and a response was received.
* The Kubelet was informed of the new secure connection details.

‍

To check that all nodes are now joined to the master run the following command on the Kubernetes master kube-01:

SHELL


kubectl get nodes

‍

Set up a Kubernetes Add-On For Networking Features And Policy

Kubernetes Add-Ons are pods and services that implement cluster features. Add-Ons extend the functionality of Kubernetes. You can install Add-Ons for a range of cluster features including Networking and Visualization.

We are going to install the Weave Net Add-On, which provides networking and network policy, on the kube-01 master. It will continue working on both sides of a network partition and does not require an external database.

Next, you will deploy a pod network to the cluster. The options are listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/

Installing the Weave Net Add-On

Get the Weave Net yaml:

SHELL


curl -o weave.yaml https://cloud.weave.works/k8s/v1.8/net.yaml

‍

Inspect the yaml contents:

SHELL


cat weave.yaml

‍

On the kube-01 Kubernetes master node run the following command:

SHELL


kubectl apply -f weave.yaml

‍

The result will look like this:

SHELL


serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.extensions/weave-net created

‍

It may take a minute or two for DNS to be ready. Continue to check for DNS to be ready before moving on by running the following command:

SHELL


kubectl get pods --all-namespaces

‍

A successful result will look like this; every container should be running:

SHELL


NAMESPACE     NAME                            READY   STATUS    RESTARTS   AGE
kube-system   coredns-576cbf47c7-gm6kt        1/1     Running   0          3m20s
kube-system   coredns-576cbf47c7-h5v5k        1/1     Running   0          3m20s
kube-system   etcd-k8-01                      1/1     Running   0          2m14s
kube-system   kube-apiserver-k8-01            1/1     Running   0          2m14s
kube-system   kube-controller-manager-k8-01   1/1     Running   0          2m18s
kube-system   kube-proxy-7m87q                1/1     Running   0          111s
kube-system   kube-proxy-mk9h9                1/1     Running   0          113s
kube-system   kube-proxy-wkxxm                1/1     Running   0          3m20s
kube-system   kube-scheduler-k8-01            1/1     Running   0          2m35s
kube-system   weave-net-lvp6x                 2/2     Running   0          34s
kube-system   weave-net-pjxk2                 2/2     Running   0          34s
kube-system   weave-net-qrrvl                 2/2     Running   0          34s

‍

Congratulations, now your Kubernetes cluster running on Ubuntu 16.04 is up and ready for you to deploy a microservices application.
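
The readiness check above can also be scripted rather than read by eye. The following is a minimal sketch, assuming you have captured the text printed by kubectl get pods; the function name unready_pods and the SAMPLE text are our own illustration, not real cluster output.

```python
from typing import List

def unready_pods(kubectl_output: str) -> List[str]:
    """Return the names of pods that are not Running or not fully ready."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Locate columns by header name so the function works with or
    # without the NAMESPACE column.
    name_idx = header.index("NAME")
    ready_idx = header.index("READY")
    status_idx = header.index("STATUS")
    problems = []
    for line in lines[1:]:
        cols = line.split()
        ready, total = cols[ready_idx].split("/")
        if cols[status_idx] != "Running" or ready != total:
            problems.append(cols[name_idx])
    return problems

# Hypothetical sample in the same format as the kubectl output.
SAMPLE = """\
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-576cbf47c7-gm6kt   1/1     Running   0          3m20s
kube-system   weave-net-lvp6x            1/2     Running   0          34s
default       mycache-memcached-2        0/1     Pending   0          48s
"""

print(unready_pods(SAMPLE))
```

An empty list means every listed pod is Running with all of its containers ready.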

Step 2 - Deploying Memcached

First download the helm binary on your Kubernetes master, kube-01:

SHELL


wget https://kubernetes-helm.storage.googleapis.com/helm-v2.6.0-linux-amd64.tar.gz

‍

Create a helm directory and unzip the helm binary to your local system:

SHELL


mkdir helm-v2.6.0
tar zxfv helm-v2.6.0-linux-amd64.tar.gz -C helm-v2.6.0

‍

Add the helm binary's directory to your PATH environment variable:

SHELL


export PATH="$(echo ~)/helm-v2.6.0/linux-amd64:$PATH"

‍

Create a service account with the cluster admin role for Tiller, the Helm server:

SHELL


kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller

‍

Initialize Tiller in your cluster, and update information of available charts:

SHELL


helm init --service-account tiller
helm repo update

‍

You will need to wait until the tiller deploy pod is ready before proceeding. Use the following command to check for when the tiller deploy pod is ready:

SHELL


kubectl -n kube-system get pods

‍

You will see the following output:

SHELL


NAME                            READY   STATUS    RESTARTS   AGE
coredns-576cbf47c7-gm6kt        1/1     Running   0          14m
coredns-576cbf47c7-h5v5k        1/1     Running   0          14m
etcd-k8-01                      1/1     Running   0          13m
kube-apiserver-k8-01            1/1     Running   0          13m
kube-controller-manager-k8-01   1/1     Running   0          13m
kube-proxy-7m87q                1/1     Running   0          12m
kube-proxy-mk9h9                1/1     Running   0          12m
kube-proxy-wkxxm                1/1     Running   0          14m
kube-scheduler-k8-01            1/1     Running   0          13m
tiller-deploy-9cfccbbcf-6f8j9   1/1     Running   0          93s
weave-net-lvp6x                 2/2     Running   0          11m
weave-net-pjxk2                 2/2     Running   0          11m
weave-net-qrrvl                 2/2     Running   0          11m

‍

To check the logs for the tiller pod, run the following command, replacing tiller-deploy-9cfccbbcf-kflph with your pod name:

SHELL


kubectl logs --namespace kube-system tiller-deploy-9cfccbbcf-kflph

‍

You will see the following output:

SHELL


[main] 2018/11/20 20:00:41 Starting Tiller v2.6.0 (tls=false)
[main] 2018/11/20 20:00:41 GRPC listening on :44134
[main] 2018/11/20 20:00:41 Probes listening on :44135
[main] 2018/11/20 20:00:41 Storage driver is ConfigMap

‍

Install a new Memcached Helm chart release with three replicas, one for each node:

SHELL


helm install stable/memcached --name mycache --set replicaCount=3

‍

You will see the following output:

SHELL


NAME                  READY   STATUS    RESTARTS   AGE
mycache-memcached-0   1/1     Running   0          89s
mycache-memcached-1   1/1     Running   0          61s
mycache-memcached-2   0/1     Pending   0          48s

‍

Execute the following command to see the running pods:

SHELL


kubectl get pods

‍

You should see the following:

SHELL


NAME                  READY   STATUS    RESTARTS   AGE
mycache-memcached-0   1/1     Running   0          3m54s
mycache-memcached-1   1/1     Running   0          3m26s
mycache-memcached-2   0/1     Pending   0          3m13s

‍

Discovering Memcached service endpoints

First, run the following command to retrieve the endpoints' IP addresses:

SHELL


kubectl get endpoints mycache-memcached

‍

The output should be similar to the following:

SHELL


NAME                ENDPOINTS                         AGE
mycache-memcached   10.40.0.1:11211,10.46.0.4:11211   4m10s
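
The ENDPOINTS column is a comma-separated list of host:port pairs. As a minimal sketch, assuming you have copied that column out of the kubectl output, the following Python turns it into a list of (host, port) tuples, the shape a Memcached client library typically expects for its server list. The function name parse_endpoints is our own and not part of any library.

```python
from typing import List, Tuple

def parse_endpoints(column: str) -> List[Tuple[str, int]]:
    """Split an ENDPOINTS value such as 'a:11211,b:11211' into (host, port) tuples."""
    servers = []
    for item in column.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the port is always the final part.
        host, _, port = item.rpartition(":")
        servers.append((host, int(port)))
    return servers

# The value shown in the ENDPOINTS column of the example output.
print(parse_endpoints("10.40.0.1:11211,10.46.0.4:11211"))
```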

‍

Test the deployment by opening a telnet session with one of the running Memcached servers on port 11211:

SHELL


kubectl run -it --rm alpine --image=alpine:3.6 --restart=Never telnet mycache-memcached-0.mycache-memcached.default.svc.cluster.local 11211

‍

At the telnet prompt, run these commands using the Memcached ASCII protocol:

SHELL


set mykey 0 0 5
hello
get mykey
quit

‍

The resulting session, including the server's responses, looks like this:

SHELL


If you do not see a command prompt, try pressing enter.
set mykey 0 0 5
hello
STORED
get mykey
VALUE mykey 0 5
hello
END
quit
Connection closed by foreign host
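
In the session above, set mykey 0 0 5 tells Memcached to store mykey with flags 0, an expiration time of 0 (never expire), and a value of 5 bytes, which follows on the next line. The sketch below builds and parses the same ASCII protocol messages in Python so you can see the framing. The helper names build_set and parse_get_response are our own, and no network connection is made.

```python
from typing import Dict

def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a 'set' request: command line, then the data block, each ending in CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"

def parse_get_response(data: bytes) -> Dict[str, bytes]:
    """Parse a complete 'get' reply (VALUE blocks terminated by END) into a dict."""
    items = {}
    pos = 0
    while True:
        end = data.index(b"\r\n", pos)
        line = data[pos:end]
        pos = end + 2
        if line == b"END":
            return items
        # Line format: VALUE <key> <flags> <bytes>
        _, key, _flags, size = line.split()[:4]
        length = int(size)
        items[key.decode("ascii")] = data[pos:pos + length]
        pos += length + 2  # skip the data block and its trailing CRLF

print(build_set("mykey", b"hello"))
print(parse_get_response(b"VALUE mykey 0 5\r\nhello\r\nEND\r\n"))
```

A reply of just END means the key was not found, which the parser returns as an empty dict.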

‍

Implementing the service discovery logic

Next we will implement service discovery logic with Python. Run the following command to create a python pod in your Kubernetes cluster:

SHELL


kubectl run -it --rm python --image=python:3.6-alpine --restart=Never sh

‍

Install the pymemcache library:

SHELL


pip install pymemcache

‍

You will see the following output:

SHELL


Collecting pymemcache
  Downloading https://files.pythonhosted.org/packages/91/14/f4fb51de1a27b12df6af42e6ff794a13409bdca6c8880e562f7486e78b5b/pymemcache-2.0.0-py2.py3-none-any.whl
Collecting six (from pymemcache)
  Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Installing collected packages: six, pymemcache
Successfully installed pymemcache-2.0.0 six-1.11.0

‍

Start a Python interactive console by running the following command:

SHELL


python

‍

In the Python console, run these commands:

SHELL


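# Resolve the service name to the IP addresses of the Memcached pods
import socket
from pymemcache.client.hash import HashClient
_, _, ips = socket.gethostbyname_ex('mycache-memcached.default.svc.cluster.local')
servers = [(ip, 11211) for ip in ips]
# HashClient spreads keys across all of the discovered servers
client = HashClient(servers, use_pooling=True)
client.set('mykey', 'hello')
client.get('mykey')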

‍

You will see the following output:

SHELL


b'hello'
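
HashClient decides on the client side which server holds each key. To give a feel for how such a choice can be made, here is a minimal sketch of one scheme, rendezvous (highest-random-weight) hashing. It is an illustration written for this tutorial and not pymemcache's internal code; the function name pick_server is our own, and the third IP address is a hypothetical addition to the two endpoints shown earlier.

```python
import hashlib
from typing import List, Tuple

Server = Tuple[str, int]

def pick_server(key: str, servers: List[Server]) -> Server:
    """Choose the server with the highest hash score for this key."""
    def score(server: Server) -> int:
        host, port = server
        digest = hashlib.sha256(f"{host}:{port}-{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(servers, key=score)

# Two endpoints from the example output plus one hypothetical address.
SERVERS = [("10.40.0.1", 11211), ("10.46.0.4", 11211), ("10.32.0.3", 11211)]

for key in ("mykey", "user:42", "session:abc"):
    print(key, "->", pick_server(key, SERVERS))
```

A useful property for the experiments later in this tutorial: when one server disappears, only the keys that lived on that server move elsewhere, so the rest of the cache stays warm.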

‍

Exit the Python console:

SHELL


exit()

‍

Exit the pod's shell session by pressing Control+D. You will see the following:

SHELL


/ # pod "python" deleted

‍

Step 4 - Installing Gremlin for Chaos Engineering experiments

After you have created your Gremlin account, you will need to find your Gremlin Daemon credentials. Log in to the Gremlin App using your Company name and sign-on credentials, which were emailed to you when you signed up to start using Gremlin. Navigate to Company Teams Settings and click on your Team. Click the blue Download button to get your Team Certificate. The downloaded certificate.zip contains both a public-key certificate and a matching private key.

Unzip certificate.zip and save its contents to a gremlin folder on your desktop. Rename your certificate and key files to gremlin.cert and gremlin.key.

Next create your secret as follows:

BASH


kubectl create secret generic gremlin-team-cert --from-file=./gremlin.cert --from-file=./gremlin.key

‍

Installation with Helm

The simplest way to install the Gremlin agent on your Kubernetes cluster is to use Helm. If you do not already have Helm installed, go here to get started. Once Helm is installed and configured, the next steps are to add the Gremlin repo and install the agent.

To run the Helm install, you will need your Gremlin Team ID. It can be found in the Gremlin app on the Team Settings page, where you downloaded your certs earlier. Click on the name of your team in the list. The ID you’re looking for is found under Configuration as Team ID.

Export your Team ID as an environment variable:

SHELL


export GREMLIN_TEAM_ID="YOUR_TEAM_ID"

‍

Replace YOUR_TEAM_ID with the Team ID you obtained from the Gremlin UI.

Next, export your cluster ID, which is just a friendly name for your Kubernetes cluster. It can be whatever you want.

SHELL


export GREMLIN_CLUSTER_ID="Your cluster id"

‍

Now add the Gremlin Helm repo, and install Gremlin:

For Helm 3

SHELL


helm repo add gremlin https://helm.gremlin.com
helm install gremlin gremlin/gremlin \
    --namespace gremlin \
    --set gremlin.teamID="$GREMLIN_TEAM_ID" \
    --set gremlin.clusterID="$GREMLIN_CLUSTER_ID"

‍

For Helm 2

SHELL


helm repo add gremlin https://helm.gremlin.com
helm install gremlin/gremlin \
    --namespace gremlin \
    --name gremlin \
    --set gremlin.teamID="$GREMLIN_TEAM_ID" \
    --set gremlin.clusterID="$GREMLIN_CLUSTER_ID"

‍

For more information on the Gremlin Helm chart, including more configuration options, check out the chart on GitHub.
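
If you prefer to drive the install from a script, the Helm 3 command above can be assembled from the two environment variables. This is a minimal sketch under our own assumptions: the function name gremlin_helm3_command is hypothetical, and the script only prints the command rather than running it. Passing the arguments as a list means a cluster ID that contains spaces reaches Helm as a single argument.

```python
import shlex
from typing import List, Mapping

REQUIRED = ("GREMLIN_TEAM_ID", "GREMLIN_CLUSTER_ID")

def gremlin_helm3_command(env: Mapping[str, str]) -> List[str]:
    """Build the Helm 3 install command as an argument list."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return [
        "helm", "install", "gremlin", "gremlin/gremlin",
        "--namespace", "gremlin",
        "--set", f"gremlin.teamID={env['GREMLIN_TEAM_ID']}",
        "--set", f"gremlin.clusterID={env['GREMLIN_CLUSTER_ID']}",
    ]

# Placeholder values; in practice you would pass os.environ instead.
example_env = {"GREMLIN_TEAM_ID": "YOUR_TEAM_ID", "GREMLIN_CLUSTER_ID": "Your cluster id"}
cmd = gremlin_helm3_command(example_env)
# Print a shell-safe rendering; subprocess.run(cmd, check=True) would execute it.
print(" ".join(shlex.quote(part) for part in cmd))
```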

Step 5 - Installing the Datadog agent using a Kubernetes DaemonSet

To install Datadog in a Kubernetes pod you can use the Datadog Kubernetes easy one-step install. It will take a few minutes for Datadog to spin up the Datadog container, collect metrics on your existing containers and display them in the Datadog App.

You will simply copy the Kubernetes DaemonSet manifest, save it as datadog-agent.yaml and then run the following command:

SHELL


kubectl apply -f datadog-agent.yaml

‍

Next install the Memcached Datadog Integration by clicking Install Integration:

You will see the following notification in your event stream:

You can read more about setting up Memcached monitoring in Datadog.
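
Memcached itself exposes counters through the stats command of its ASCII protocol, which replies with STAT name value lines followed by END. As a minimal sketch of the kind of signal worth watching during an experiment, the following parses such a reply and computes the cache hit ratio. The function names and the sample numbers are our own illustration, not measurements from a real cluster.

```python
from typing import Dict, Optional

def parse_stats(reply: bytes) -> Dict[str, str]:
    """Parse a 'stats' reply into a name -> value dict."""
    stats = {}
    for line in reply.decode("ascii").split("\r\n"):
        if line == "END":
            break
        if line.startswith("STAT "):
            _, name, value = line.split(" ", 2)
            stats[name] = value
    return stats

def hit_ratio(stats: Dict[str, str]) -> Optional[float]:
    """Return get_hits / (get_hits + get_misses), or None if there were no gets."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else None

# Hypothetical reply with made-up counter values.
SAMPLE_REPLY = b"STAT get_hits 30\r\nSTAT get_misses 10\r\nSTAT curr_connections 5\r\nEND\r\n"
print(hit_ratio(parse_stats(SAMPLE_REPLY)))
```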

Step 6 - Chaos Engineering experiments for Memcached with Gremlin

We will use the Gremlin Web App to create an IO attack on the Memcached pods. The purpose of this experiment is to ensure that we are able to identify an increase in IO for our Memcached cluster. We will also use this attack to understand how the pods and servers handle an increase in IO.

First click Attacks in the left navigation bar and then New Attack. Then click the Kubernetes tab to view all the available Kubernetes objects that you can run Chaos Engineering experiments on.

Scroll down and expand the StatefulSets section, and select memcached.

Next, select the Resource Gremlin and then choose IO. Scroll down and click the Unleash Gremlin button.

You can now monitor your IO attack using Datadog.
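
Alongside the Datadog dashboards, it can help to measure what a client sees while the attack runs. The sketch below is one possible approach, not part of Gremlin or Datadog: it times a repeated operation and reports nearest-rank percentiles. The function names are our own, and the example times a dictionary lookup as a stand-in so that it runs anywhere; on the cluster you would pass a function that calls client.get('mykey') on the Memcached client created earlier.

```python
import math
import time
from typing import Callable, List

def measure_latencies(operation: Callable[[], object], samples: int) -> List[float]:
    """Run the operation repeatedly and return each call's duration in seconds."""
    latencies = []
    for _ in range(samples):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)
    return latencies

def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Stand-in for a cache read so the example runs without a cluster.
fake_cache = {"mykey": b"hello"}
latencies = measure_latencies(lambda: fake_cache.get("mykey"), samples=1000)
print("p50 seconds:", percentile(latencies, 50))
print("p99 seconds:", percentile(latencies, 99))
```

Running the measurement once before the attack and once during it gives you a baseline to compare against.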

Step 9 - Additional Chaos Engineering experiments to run on Memcached

There are many Chaos Engineering experiments you could possibly run on your Memcached infrastructure:

Shutdown Gremlin - will shutting down a memcached node cause unexpected issues?
Latency & Packet Loss Gremlins - will they impact the ability to use the Memcache API endpoints?
Disk Gremlin - will filling up the disk crash the host?
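
For the Shutdown experiment in particular, the behavior under test is usually the application's fallback path: when a cache node is unreachable, reads go to the database instead, and the question is whether the database can absorb that load. The following is a minimal sketch of such a fallback, with hypothetical names throughout. It treats OSError (which covers connection errors and socket timeouts in Python) as "cache unavailable"; a real client library may raise its own exception types that you would need to catch as well.

```python
from typing import Callable, Optional

def get_with_fallback(
    cache_get: Callable[[str], Optional[bytes]],
    load_from_store: Callable[[str], bytes],
    key: str,
) -> bytes:
    """Read from the cache; on a miss or an unreachable cache, read from the store."""
    try:
        value = cache_get(key)
    except OSError:
        # Cache node is down or unreachable: treat it like a miss.
        value = None
    if value is None:
        value = load_from_store(key)
    return value

def unreachable_cache(key: str) -> Optional[bytes]:
    """Simulates a Memcached node that has been shut down."""
    raise ConnectionRefusedError("cache node is down")

def database(key: str) -> bytes:
    """Stand-in for the database/storage layer."""
    return b"value-from-database"

print(get_with_fallback(unreachable_cache, database, "mykey"))
```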

Conclusion

This tutorial has explored how to install Memcached, Gremlin and Datadog on Kubernetes for your Chaos Engineering experiments. We then ran an IO Chaos Engineering experiment on Memcached using the Gremlin IO attack.

Share your results and swap best practices with 5,000+ engineers practicing Chaos Engineering in the Chaos Engineering Slack.

