Security

Overview


Introduction

For an overview of Gremlin’s security practices, see gremlin.com/security.

Gremlin’s “Failure as a Service” makes it easy to find weaknesses in your system before they cause problems for your customers. Gremlin is a simple, safe, and secure way to use Chaos Engineering to improve system resilience.

Gremlin attacks are generated on the control plane. Clients make outbound TLS calls to poll for attacks. Gremlin provides secure command execution, security auditing, multi-factor authentication (MFA), and SAML SSO.

Linux

Gremlin is installed on Linux with a least-privilege setup. When installed directly on the host, Gremlin does not require root privileges on any machine in your infrastructure. Gremlin operations run as a gremlin user created with default Linux privileges.

Gremlin needs the following Linux capabilities to perform the corresponding attacks.

| Capability | Purpose |
| --- | --- |
| cap_sys_boot | Used by shutdown to shut down (and optionally reboot) your hosts |
| cap_sys_time | Used by time travel to move your hosts forward and backward through time |
| cap_net_admin | Used by the network gremlins for all network attacks |
| cap_kill | Used by process killer to kill the requested process(es) |
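
To confirm that an installed binary carries the capabilities listed above, its file capabilities can be compared against that list. The following is a minimal sketch and not part of Gremlin's tooling: the helper name `missing_caps` is hypothetical, and it assumes that the `getcap` utility from libcap is installed and that the capabilities are attached to the `gremlin` binary found on the PATH.

```bash
# Capabilities from the table above.
REQUIRED_CAPS="cap_sys_boot cap_sys_time cap_net_admin cap_kill"

# missing_caps (hypothetical helper): given one line of `getcap` output,
# print the required capabilities that do not appear in it.
missing_caps() (
  missing=""
  for cap in $REQUIRED_CAPS; do
    case "$1" in
      *"$cap"*) ;;                      # capability present
      *) missing="$missing $cap" ;;     # capability absent
    esac
  done
  echo "${missing# }"
)

# Usage: runs only when both tools exist on this machine.
# Assumption: the capabilities are set on the binary named `gremlin`.
if command -v getcap >/dev/null 2>&1 && command -v gremlin >/dev/null 2>&1; then
  missing_caps "$(getcap "$(command -v gremlin)")"
fi
```

An empty line of output means every capability in the list was found. Any names printed are the ones to investigate.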

Windows

The Gremlin daemon is installed as a Windows service under the LocalSystem account. Attacks created from the user interface run as child processes of the daemon, so they also run under the LocalSystem account.

Gremlin configuration and work files are placed in the %ALLUSERSPROFILE%\Gremlin\Agent directory, which by default resolves to C:\ProgramData\Gremlin\Agent. The Gremlin folders and files inherit permissions from the parent %ALLUSERSPROFILE% folder (C:\ProgramData by default). Normally these permissions are read-write for administrators and read-only for all other users, which prevents non-administrators from running attacks from the command line.

The Gremlin agent includes a kernel driver, which is used for latency attacks. Like the Gremlin daemon, the kernel driver loads with the operating system.

Network Access

Gremlin never intercepts the content or payload of any network traffic. Gremlin only looks at routing information in order to apply its impact to the intended network traffic.

No Ingress Ports Required

All communication between the Gremlin daemon and our service is initiated by the Gremlin daemon. For this reason, the daemon must have an outbound network path to the Gremlin service (api.gremlin.com). Since all connections from the daemon are outbound, it is not necessary to open ports in your security groups or firewall to allow inbound communications.

Proxy support

The Gremlin client supports HTTP and HTTPS proxies via the environment variables http_proxy and https_proxy, which route HTTP and HTTPS traffic, respectively, through a proxy server. Values should take the form http[s]://[username:password@]address:port, such as export https_proxy=https://proxy.your_company.com:8080 or export https_proxy=https://your_username:your_password@proxy.your_company.com:8080.
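
A malformed proxy value is easy to miss, so it can help to check a candidate value against the documented form before exporting it. The sketch below is illustrative only: `valid_proxy_url` is a hypothetical helper, and its pattern is a simplified reading of http[s]://[username:password@]address:port that does not handle IPv6 literals or credentials containing `:`, `@`, or `/`.

```bash
# valid_proxy_url (hypothetical helper): succeed only if the argument matches
# http[s]://[username:password@]address:port.
valid_proxy_url() {
  printf '%s\n' "$1" |
    grep -Eq '^https?://([^:@/]+:[^@/]+@)?[^:@/]+:[0-9]+$'
}

# The first two values come from the examples above; the third lacks a scheme.
for url in \
  "https://proxy.your_company.com:8080" \
  "https://your_username:your_password@proxy.your_company.com:8080" \
  "proxy.your_company.com:8080"
do
  if valid_proxy_url "$url"; then
    echo "ok: $url"
  else
    echo "rejected: $url"
  fi
done
```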

On Linux, the Gremlin daemon typically runs as a service, so these environment variables must be set in /etc/default/gremlind:

```bash
echo "https_proxy=https://localhost:8888" | sudo tee -a /etc/default/gremlind
sudo systemctl restart gremlind
```

On Windows, the environment variables can be set through the Control Panel or with PowerShell commands.

Note that the Gremlin Service only functions via encrypted communication (HTTPS). Attempts to connect to it via unencrypted protocols (HTTP) are denied.

Secure Command Execution

The Gremlin daemon periodically communicates with our service over a TLS-protected channel which is authenticated using your organization's credentials. Once authenticated, the daemon sends heartbeat messages to the service and receives instructions from the service as responses to the heartbeat messages. If an attack has been scheduled, the daemon receives the instructions for executing that attack. Each instruction action is pre-defined within the daemon. Arbitrary instructions cannot be executed.
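
The idea that only pre-defined actions can run is essentially an allow-list: an instruction names an action, and anything outside the known set is refused rather than executed. The sketch below illustrates that general pattern only. It is not Gremlin's code, and the action names and the `handle_instruction` helper are hypothetical.

```bash
# handle_instruction (hypothetical helper): accept an instruction only if it
# names one of a fixed set of actions; never execute the input itself.
handle_instruction() {
  case "$1" in
    cpu|latency|shutdown|time_travel|process_killer)   # hypothetical action names
      echo "accepted: $1" ;;
    *)
      echo "rejected: $1"
      return 1 ;;
  esac
}

handle_instruction cpu
handle_instruction "rm -rf /" || true   # not in the allow-list, so it is refused
```

Because the input is only compared against fixed names and never passed to a shell, an unexpected string has no path to execution in this pattern.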

The service API only supports TLSv1.2 connections.
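
One way to check that a host can both reach the service and negotiate TLS 1.2 is to force that version in a test request. This is a rough sketch under stated assumptions: `check_tls12` is a hypothetical helper, curl 7.54.0 or later is assumed to be installed (for `--tls-max`), and the default host is api.gremlin.com as named earlier in this document. curl also honors the https_proxy variable, so the same check exercises a configured proxy.

```bash
# check_tls12 (hypothetical helper): attempt an HTTPS request restricted to TLS 1.2.
check_tls12() {
  host="${1:-api.gremlin.com}"
  # --tlsv1.2 sets the minimum version and --tls-max 1.2 the maximum, so the
  # request can only succeed if TLS 1.2 is negotiated. On success, the HTTP
  # status code is printed; on a failed handshake, curl exits non-zero.
  curl --silent --show-error --output /dev/null \
       --tlsv1.2 --tls-max 1.2 \
       --write-out '%{http_code}\n' "https://$host"
}

# Usage (requires outbound network access):
#   check_tls12
```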

Security Auditing

The Gremlin client, daemon, API, and website undergo regular security auditing, including penetration testing, by the external security auditor Bishop Fox. All identified vulnerabilities are remediated promptly and confirmed via remediation testing by our auditors. We can provide a Letter of Assessment from our auditors outlining our most recent audit findings and remediation results upon request.

Two-Factor Authentication (MFA)

Gremlin offers two-factor authentication. See User Management.

SAML SSO

Gremlin supports SAML SSO. See User Management.

Docker (Linux)

User Namespace Isolation

Gremlin currently uses the host's file system to store temporary log and state information about attacks. When running Docker with user namespace remapping (userns-remap), Gremlin needs to assume the user namespace of the host. This applies both to the Gremlin daemon container and to gremlin attack-container. Note that by assuming the user namespace of the host, we are creating an exception to namespace isolation for the Docker containers running Gremlin.

For running the Gremlin daemon in a container

```bash
# --userns=host opts this container out of user namespace remapping
docker run -d \
  --userns=host \
  -e GREMLIN_BYPASS_USERNS_REMAP=1 \
  -v /var/lib/gremlin:/var/lib/gremlin \
  -v /var/log/gremlin:/var/log/gremlin \
  gremlin/gremlin daemon
```

For running the Gremlin daemon on the host

```bash
echo "GREMLIN_BYPASS_USERNS_REMAP=1" | sudo tee -a /etc/default/gremlind
sudo systemctl restart gremlind
```

For running a Gremlin attack from the command line

```bash
export GREMLIN_BYPASS_USERNS_REMAP=1
gremlin attack-container 38dbd9016529 cpu
```
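
Whether any of these settings are needed depends on whether the Docker daemon actually has user namespace remapping enabled. The sketch below shows one way to check. It assumes that `docker info` reports the feature as `name=userns` in its SecurityOptions field, and `userns_remap_enabled` is a hypothetical helper name.

```bash
# userns_remap_enabled (hypothetical helper): succeed if the given
# SecurityOptions string lists the user namespace option.
userns_remap_enabled() {
  case "$1" in
    *name=userns*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: runs only when the docker CLI is installed. If the daemon cannot be
# reached, the options string is empty and remapping is reported as not detected.
if command -v docker >/dev/null 2>&1; then
  opts="$(docker info --format '{{.SecurityOptions}}' 2>/dev/null || true)"
  if userns_remap_enabled "$opts"; then
    echo "userns-remap is enabled: set GREMLIN_BYPASS_USERNS_REMAP=1"
  else
    echo "userns-remap not detected"
  fi
fi
```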

SELinux and Gremlin in Containers

Gremlin performs some actions that are not allowed by the default SELinux process label for containers (container_t):

  • Install and manipulate files on the host: /var/lib/gremlin, /var/log/gremlin
  • Load kernel modules for manipulating network transactions during network attacks, such as net_sch
  • Communicate with the container runtime socket (e.g. /var/run/docker.sock) to launch containers that carry out attacks
  • Read files in /proc

Bypass container_t restrictions

These restrictions on container_t can be relaxed by installing the following policy. However, doing so grants the privileges Gremlin requires to every other container on your system that uses container_t.

If you wish to run Gremlin with the container_t process label and bypass its restrictions, add the following type enforcement rules to a new SELinux policy:

```
# WARNING: This policy adds capabilities to all containers run under the default type: container_t
# Gremlin needs access to /var/log/gremlin
allow container_t container_log_t:dir { read write create getattr setattr unlink link add_name remove_name rmdir open };
allow container_t container_log_t:file { read write create getattr setattr append unlink link open };
allow container_t var_log_t:dir { write add_name };

# Gremlin needs access to /var/run/docker.sock
allow container_t container_runtime_t:unix_stream_socket connectto;

# Gremlin needs access to /var/lib/gremlin
allow container_t container_var_lib_t:dir { read write create getattr setattr unlink link add_name remove_name rmdir open };
allow container_t container_var_lib_t:file { read write create getattr setattr append unlink link open };

# Gremlin needs to load the kernel modules: net_sch
allow container_t kernel_t:system module_request;
```
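
Type enforcement rules like these are normally saved to a .te source file, which also needs a module declaration and a require block naming the types and classes used, and are then compiled and loaded. The sketch below prints the usual command sequence for review instead of running it, since loading policy changes the security posture of the whole host. The file name gremlin_container.te and the helper `policy_build_commands` are hypothetical, and the standard SELinux tools checkmodule, semodule_package, and semodule are assumed to be installed.

```bash
# policy_build_commands (hypothetical helper): print the commands that would
# compile and install a policy module from the given .te source file.
policy_build_commands() {
  te_file="$1"
  base="${te_file%.te}"
  echo "checkmodule -M -m -o $base.mod $te_file"    # compile the source into a module
  echo "semodule_package -o $base.pp -m $base.mod"  # wrap the module in a policy package
  echo "sudo semodule -i $base.pp"                  # install the policy package
}

policy_build_commands gremlin_container.te
```

Printing the commands first leaves a chance to inspect the source file and confirm the module name before anything is installed.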