Infrastructure Layer

Kubernetes

Gremlin allows targeting objects within your Kubernetes clusters. After selecting a cluster, you can filter the visible set of objects by selecting a namespace. Select any of your Deployments, ReplicaSets, StatefulSets, DaemonSets, or Pods. When one object is selected, all child objects will also be targeted. For example, when selecting a DaemonSet, all of the pods within will be selected.

Installation

The simplest way to install Gremlin on Kubernetes is with Helm. Check out Gremlin's Helm Chart Repository for full documentation and usage.

shell
helm repo add gremlin https://helm.gremlin.com/
kubectl create namespace gremlin
helm install gremlin gremlin/gremlin --namespace gremlin \
  --set gremlin.hostPID=true \
  --set gremlin.container.driver=docker-runc \
  --set gremlin.secret.managed=true \
  --set gremlin.secret.type=secret \
  --set gremlin.secret.teamID=$GREMLIN_TEAM_ID \
  --set gremlin.secret.clusterID=$GREMLIN_CLUSTER_ID \
  --set gremlin.secret.teamSecret=$GREMLIN_TEAM_SECRET
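
The install command above reads three environment variables. If one of them is empty, Helm would still run and pass an empty value. The following is a minimal pre-flight sketch, not part of Gremlin or Helm; the function name check_gremlin_env is hypothetical.

```shell
# Hypothetical helper (not shipped with Gremlin or Helm).
# Prints the name of each required variable that is unset or empty
# to stderr and returns non-zero if any of them is missing.
check_gremlin_env() {
  _missing=0
  for _name in GREMLIN_TEAM_ID GREMLIN_CLUSTER_ID GREMLIN_TEAM_SECRET; do
    # Indirect lookup: expand the variable whose name is stored in _name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

# Usage (run before the helm install command):
#   check_gremlin_env && helm install gremlin gremlin/gremlin --namespace gremlin ...
```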

Some environments require more configuration. The resources below can help you find the best configuration for your environment.

Cri-O and Containerd

As of version 2.16.0, you can now install Gremlin on Kubernetes running Cri-O or Containerd. Follow this guide to get started.

OpenShift

As of version 2.16.0, you can now install Gremlin on OpenShift 3 and OpenShift 4 running Cri-O or Containerd.

Install manually

If the above sections are not what you're looking for, follow this guide to install Gremlin manually, from nothing but YAML files and a text editor.

Other considerations

Enabling Gremlin on the Kubernetes Master

Most Kubernetes deployments configure master nodes with the node-role.kubernetes.io/master:NoSchedule taint. You can run the following command to see if any of your nodes have this taint:

shell
kubectl get no -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

NAME      TAINTS
kube-01   [map[effect:NoSchedule key:node-role.kubernetes.io/master]]
kube-02   <none>
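
On larger clusters it can help to reduce that listing to just the tainted node names. The sketch below assumes the output format shown above; the function name master_tainted_nodes is hypothetical. It matches per line, so on a node with several taints the NoSchedule effect it finds could belong to a different taint.

```shell
# Hypothetical helper: reads the NAME/TAINTS listing on stdin and prints
# the names of nodes whose taints mention both the master key and the
# NoSchedule effect.
master_tainted_nodes() {
  awk '/key:node-role\.kubernetes\.io\/master/ && /effect:NoSchedule/ { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get no -o=custom-columns=NAME:.metadata.name,TAINTS:.spec.taints | master_tainted_nodes
```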

If you wish to install Gremlin on a Kubernetes master that has been tainted, add a tolerations section to the PodSpec of the Gremlin Client Manifest.

yaml
tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule

You will need to reapply the Gremlin client manifest after making this change.

AppArmor support

If your cluster has AppArmor enabled (e.g. Azure Kubernetes Service), add the following line to your Helm deployment. This allows the Gremlin container to run without a security profile:

shell
--set gremlin.apparmor=unconfined
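
If you are unsure whether AppArmor is active, one way to check is to read the kernel's AppArmor flag on a node. This is a sketch under assumptions: it must run on the node itself (not from your workstation), and the function name apparmor_enabled is hypothetical.

```shell
# Hypothetical helper: returns 0 when the given flag file reads "Y".
# Defaults to the kernel's AppArmor parameter file, which only exists
# on hosts where the AppArmor module is loaded.
apparmor_enabled() {
  _flag_file=${1:-/sys/module/apparmor/parameters/enabled}
  [ -r "$_flag_file" ] && [ "$(cat "$_flag_file")" = "Y" ]
}

# Usage (on a cluster node):
#   apparmor_enabled && echo "AppArmor is enabled; consider gremlin.apparmor=unconfined"
```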

Proxy configuration

Both Gremlin and Chao can be configured to use a proxy for outgoing HTTP traffic. The conventional https_proxy and no_proxy variables can be passed as environment variables for this purpose.

Proxy certificate authorities

When a proxy supports HTTPS communication, or is otherwise configured with a TLS certificate, you may need to configure Gremlin to trust the proxy's certificate authority. To do this, set the SSL_CERT_FILE environment variable to the file system path of a PEM-encoded file containing the certificate authority's certificate.

Configuring Gremlin

yaml
- name: https_proxy
  value: http://proxy.local:3128
# Pass SSL_CERT_FILE when the proxy requires trusting a TLS certificate
- name: SSL_CERT_FILE
  value: /etc/gremlin/ssl/proxy-ca.pem

Configuring Chao

Because the Gremlin Kubernetes client (Chao) communicates with the local Kubernetes API server in addition to the internet, traffic bound for the API server must bypass internet proxies:

yaml
- name: https_proxy
  value: http://proxy.local:3128
- name: no_proxy
  value: $(KUBERNETES_SERVICE_HOST):$(KUBERNETES_SERVICE_PORT)
# Pass SSL_CERT_FILE when the proxy requires trusting a TLS certificate
- name: SSL_CERT_FILE
  value: /etc/gremlin/ssl/proxy-ca.pem
# Pass SSL_CERT_DIR when SSL_CERT_FILE contains only the proxy certificate. This will ensure Chao trusts api.gremlin.com
# The value of SSL_CERT_DIR varies depending on the operating system on which the cluster hosts run
# See https://www.gremlin.com/docs/infrastructure-layer/kubernetes/#ssl_cert_dir
- name: SSL_CERT_DIR
  value: /etc/ssl/

SSL_CERT_DIR

Supplying SSL_CERT_DIR ensures Chao is still configured with the certificate authorities it needs to trust api.gremlin.com. However, most Gremlin installations do not need it, because Chao trusts Gremlin servers by default. This variable is only required for Chao deployments when both of the following conditions are true:

  • Chao is configured with https_proxy and this proxy is configured to accept TLS connections
  • Chao is also configured with SSL_CERT_FILE, and the file it points to contains only the certificate authority for the https proxy

The value of SSL_CERT_DIR should point to the root of the certificate authority directory for the operating system on which Chao runs.

Path                                OS
/etc/ssl/certs/                     Debian / Ubuntu
/etc/pki/tls/                       Fedora / RHEL 6 / OpenELEC
/etc/ssl/                           OpenSUSE / Alpine Linux
/etc/pki/ca-trust/extracted/pem/    CentOS / RHEL 7
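
If you template your manifests from a script, the table above can be encoded as a small lookup. This is only a sketch: the function name ssl_cert_dir_for and the lowercase OS labels it accepts are hypothetical, and operating systems not listed in the table are reported as unknown rather than guessed.

```shell
# Hypothetical lookup: maps an OS label to the SSL_CERT_DIR value
# listed in the table above. Labels are illustrative, not an official list.
ssl_cert_dir_for() {
  case "$1" in
    debian|ubuntu)         echo /etc/ssl/certs/ ;;
    fedora|rhel6|openelec) echo /etc/pki/tls/ ;;
    opensuse|alpine)       echo /etc/ssl/ ;;
    centos|rhel7)          echo /etc/pki/ca-trust/extracted/pem/ ;;
    *)
      echo "unknown OS: $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   SSL_CERT_DIR=$(ssl_cert_dir_for ubuntu)
```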

Using a PodSecurityPolicy

Gremlin does not support running within the restricted PodSecurityPolicy (PSP) that is configured by default on clusters that enable such policies. You can install a Gremlin PodSecurityPolicy that grants Chao and Gremlin the permissions they need and nothing more.

When installing Gremlin with Helm, you can supply --set gremlin.podSecurity.podSecurityPolicy.create=true to install Gremlin's custom pod security policies. Check out Gremlin's Helm Chart Repository for full documentation and usage.

Without Helm, you can download Gremlin's PSP files and install them with kubectl:

shell
mkdir gremlin-psp
wget -P gremlin-psp/ https://k8s.gremlin.com/resources/psp/v1/chao-psp.yaml
wget -P gremlin-psp/ https://k8s.gremlin.com/resources/psp/v1/gremlin-psp.yaml
kubectl create -f gremlin-psp/

Using a custom Seccomp policy

All Gremlin behavior works under Docker's default seccomp policy. However, some environments use a more restrictive default seccomp profile that prevents Gremlin from working.

Gremlin has a custom seccomp profile which can be automatically installed when you install with Helm and pass --set gremlin.podSecurity.seccomp.enabled=true. Check out Gremlin's Helm Chart Repository for full documentation and usage.

You can also download this seccomp policy in order to install it manually.

Gremlin container drivers

Gremlin currently has four drivers for integrating with the underlying container runtime powering Kubernetes:

Driver: docker
Requirements and file access:
  • Connect: /var/run/docker.sock
More info: No support for the systemd cgroup driver

Driver: docker-runc
Requirements and file access:
  • Connect: /var/run/docker.sock
  • Write: /run/docker/runtime-runc/moby
  • Write: host's cgroup root
  • Minimum Docker version: 17.11.0
More info: Recommended for the Docker runtime

Driver: crio-runc
Requirements and file access:
  • Connect: /run/crio/crio.sock
  • Write: /run/runc
  • Write: host's cgroup root
  • Access to the host's PID namespace
More info: Used with the Cri-O container runtime

Driver: containerd-runc
Requirements and file access:
  • Connect: /run/containerd/containerd.sock
  • Write: /run/containerd/runc/k8s.io
  • Write: host's cgroup root
  • Access to the host's PID namespace
More info: Used with the Containerd container runtime

Gremlin automatically chooses one of the above container drivers when its requirements are met. Users installing with Helm can provide all requirements automatically by declaring the intended container driver:

shell
--set gremlin.container.driver=$driver
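
One way to pick the value of $driver is to look at the container runtime each node reports in its status (the containerRuntimeVersion field, for example "containerd://1.3.4"). The mapping below is a sketch based on the driver descriptions above; the function name gremlin_driver_for is hypothetical, and it assumes every node in the cluster uses the same runtime.

```shell
# Hypothetical helper: maps a node's containerRuntimeVersion string
# to the Gremlin container driver described for that runtime above.
gremlin_driver_for() {
  case "$1" in
    docker://*)     echo docker-runc ;;      # recommended for the Docker runtime
    cri-o://*)      echo crio-runc ;;
    containerd://*) echo containerd-runc ;;
    *)
      echo "unrecognized runtime: $1" >&2
      return 1
      ;;
  esac
}

# Usage (requires cluster access; inspects the first node only):
#   runtime=$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.containerRuntimeVersion}')
#   driver=$(gremlin_driver_for "$runtime")
```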

Verify your installation

Finally, check that Gremlin is installed properly:

bash
kubectl get pods -n gremlin

This should list one Gremlin agent per node (physical or virtual machine in your cluster), plus one pod for Chao.

Example

shell
kubectl get pods -n gremlin

NAME                    READY   STATUS    RESTARTS   AGE
chao-78bbc7cbf6-9hn7q   1/1     Running   0          5d20h
gremlin-9r4t7           1/1     Running   0          5d20h
gremlin-bwmtz           1/1     Running   1          126d
gremlin-bx6dn           1/1     Running   0          5d20h
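
To compare the number of agents with the number of nodes without counting by hand, the listing can be piped through a small filter. This sketch assumes agent pod names start with gremlin- as in the example above; the function name running_agents is hypothetical. Nodes with taints that the agent does not tolerate will not run an agent, so the two counts may legitimately differ.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and
# prints how many pods named gremlin-* are in the Running state.
running_agents() {
  awk '$1 ~ /^gremlin-/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage (requires cluster access):
#   agents=$(kubectl get pods -n gremlin | running_agents)
#   nodes=$(kubectl get nodes --no-headers | wc -l)
#   echo "agents running: $agents, nodes: $nodes"
```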

Pending Pods

If any pods are pending, your installation is incomplete. Contact your cluster administrator to find out why Gremlin cannot run on those nodes.

shell
kubectl get pods -n gremlin

NAME                    READY   STATUS    RESTARTS   AGE
chao-78bbc7cbf6-9hn7q   1/1     Running   0          5d20h
gremlin-c25ld           0/1     Pending   0          112d
gremlin-n5gt7           0/1     Pending   0          112d
gremlin-zn4kq           1/1     Running   0          126d
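
When reporting the problem to an administrator, it helps to list exactly which pods are pending. The filter below is a sketch that assumes the column layout shown above; the function name pending_pods is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and
# prints the name of every pod whose STATUS column is Pending.
pending_pods() {
  awk '$1 != "NAME" && $3 == "Pending" { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get pods -n gremlin | pending_pods
# The scheduling events for a pending pod can then be inspected with:
#   kubectl describe pod <pod-name> -n gremlin
```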

Selecting Containers

For state and resource attack types, you can choose to target one, all, or specific containers within a selected pod. Once targets have been selected, all state and resource attack types present this option. Selecting 'any' targets a single container within each pod, chosen at runtime. If you've selected more than one target (e.g., a Deployment), you can select from a list of the containers common to all of these targets.

Running an attack

Once you select the Kubernetes objects to be targeted, select and configure your desired Gremlin attack. When the attack is run, the underlying containers within the objects selected will be impacted.

Namespace access control

With the Kubernetes client installed on your cluster, you can share individual namespaces with other Gremlin teams. Once it is installed, head to the Clients section to view all of the clusters installed across your company.

By sharing individual namespaces with teams across your company, you can let users run attacks only on the services relevant to them, while limiting their access to the hosts or nodes themselves.

Managing Cluster access

As the Team Manager on a team where a Kubernetes cluster is installed, or as a Company Manager, you can click the gear icon to manage access. To share a namespace with a team, use the search box in the cluster view to filter the list of available teams. Then use the search box on the team's row and click the namespace you'd like to share. To share all of the namespaces, use the options menu.

To remove a team's access to a namespace, click the x on the blue namespace bubble. You can also remove all namespaces at once from the options menu.

Requesting Namespace access

As a member of a different team in your company, you can view the list of clusters installed across your company. To request access to a namespace on a cluster that is not installed on your team, click the Request Access button. You can then check off the namespaces you'd like access to, or use the select all switch.

You can also request access to a namespace within a cluster when creating an attack. Once you've selected a cluster, the drop down list of namespaces will have an option to request access.

Approving access requests

As the Team Manager of a team where a cluster is installed, you'll receive an email when a user in your company requests access to a namespace. Open the cluster view to approve or deny the request.