To find an overview of Gremlin’s security practices, check out gremlin.com/security.
Gremlin makes it easy to find weaknesses in your system before they cause problems for your customers. Gremlin is a simple, safe, and secure way to use Chaos Engineering to improve system resilience.
Gremlin experiments are generated on the Control Plane. Gremlin Agents make outbound TLS calls to poll for experiments. Gremlin provides secure command execution, security auditing, multi-factor authentication (MFA), and SAML SSO.
Gremlin is installed on Linux with a least-privilege setup. When installed directly on the host, Gremlin does not require root privileges on any machine in your infrastructure. Gremlin operations run as a gremlin user created with default Linux privileges.
Gremlin needs the following Linux capabilities to perform the corresponding experiments.
|used by shutdown to shut down (and optionally reboot) your hosts|
|used by time travel to move your hosts forward and backward through time|
|used by the network gremlins for all network experiments|
|used by process killer to kill requested process(es)|
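One way to confirm that a process actually holds a given capability is to inspect the CapEff bitmask in /proc/<pid>/status. The following is a minimal sketch, assuming a Linux host. The helper names has_cap and effective_caps are hypothetical and not part of Gremlin. The bit numbers in the comments come from the kernel's linux/capability.h, not from Gremlin's documentation, so match them against the capability names your installation requires.

```sh
#!/bin/sh
# Hypothetical helpers for checking Linux capability bitmasks.

# has_cap MASK_HEX BIT
# Succeeds (exit 0) if bit number BIT is set in the hex mask MASK_HEX.
# Example kernel bit numbers: CAP_KILL=5, CAP_NET_ADMIN=12,
# CAP_SYS_BOOT=22, CAP_SYS_TIME=25.
has_cap() {
    cap_mask=$1
    cap_bit=$2
    # Shell arithmetic accepts 0x-prefixed hexadecimal constants.
    [ $(( (0x$cap_mask >> $cap_bit) & 1 )) -eq 1 ]
}

# effective_caps PID
# Prints the effective capability mask (hex) of process PID.
effective_caps() {
    awk '/^CapEff:/ { print $2 }' "/proc/$1/status"
}
```

For example, `has_cap "$(effective_caps "$pid")" 12 && echo "CAP_NET_ADMIN present"` reports whether the process with ID `$pid` can perform network administration.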
When targeting containers, Gremlin spawns its own sidecars to impact those containers so that you don't need to restart the targets. This is necessary so that the attack specifically impacts the container target (e.g. its virtual network, resource limits, etc.). To do this, Gremlin may require additional capabilities when running without elevated/root privileges. These are the additional capabilities:
|this allows Gremlin to pass the user id into the target container's namespace|
|this allows Gremlin to enter the target container's filespace for IO and Disk based experiments|
|necessary for communicating with the Kernel's audit log for containers spawned by containerd and cri-o|
|used to set up devices attached to a given container being targeted by Gremlin|
|needed to enter the target container's process namespace (see: setns(2))|
|grants Gremlin the ability to execute directories (list contents) without access granted by the file owner/mode, in order to obtain sockets for certificate expiry experiments|
|used by process collection to grant access to absolute path to process binary for hosts and container services, see proc(5) and ptrace(2)|
The Gremlin daemon is installed as a Windows service under the LocalSystem account. Experiments created from the user interface run as child processes of the daemon, so they too run under the LocalSystem account.
Gremlin configuration and work files are placed in the %ALLUSERSPROFILE%\Gremlin\Agent directory. By default, Windows places that location at C:\ProgramData\Gremlin\Agent. The Gremlin folders and files inherit permissions from the parent C:\ProgramData folder. Normally the permissions are read-write for administrators and read-only for all others. Those permissions prevent non-administrators from being able to run experiments from the command line.
The Gremlin agent includes a kernel driver, which is used for latency experiments. Like the Gremlin daemon, the Gremlin kernel driver loads with the operating system.
Gremlin never intercepts the content or payload of any network traffic. Gremlin only looks at routing information in order to apply its impact to the intended network traffic.
The primary communication between Gremlin installations and the Gremlin Control Plane is handled by the Gremlin daemon. However, when targeting a container or Kubernetes pod Gremlin spawns a sidecar that communicates directly with the Gremlin control plane for the duration of the experiment. For this reason, the daemon and experiment targets (including containers and Kubernetes pods) must have an outbound network path to the Gremlin service (
The Gremlin Agent supports http/https proxies via the environment variables
https_proxy. These direct HTTP and HTTPS traffic, respectively, through a proxy server. Values used should be of the form
http[s]://[username:password@]address:port, such as
export https_proxy=https://proxy.your_company.com:8080 or
For Linux, the Gremlin daemon, which is typically run as a service, requires these environment variables to be set in
echo "https_proxy=https://localhost:8888" | sudo tee -a /etc/default/gremlind
sudo systemctl restart gremlind
For Windows, the environment variables can be set through the Control Panel or with PowerShell commands.
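Because a malformed proxy value is easy to miss, it can help to check it against the documented form, http[s]://[username:password@]address:port, before restarting the daemon. The following is a rough sketch only. The function name is_valid_proxy_url is hypothetical, and the pattern is a simplification that assumes a hostname or IPv4 address (it does not handle bracketed IPv6 literals or unusual characters in credentials).

```sh
#!/bin/sh
# is_valid_proxy_url URL
# Hypothetical helper: succeeds (exit 0) if URL looks like
#   http[s]://[username:password@]address:port
is_valid_proxy_url() {
    printf '%s\n' "$1" |
        grep -Eq '^https?://([^:@/]+:[^@/]+@)?[A-Za-z0-9._-]+:[0-9]+$'
}
```

A possible use is `is_valid_proxy_url "$https_proxy" || echo "https_proxy is malformed" >&2`, run before the value is written to the daemon's environment file.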
Note that the Gremlin Service only functions via encrypted communication (HTTPS). Attempts to connect to it via unencrypted protocols (HTTP) are denied.
The Gremlin Daemon periodically communicates with our service over a TLS-protected channel which is authenticated using your organization's credentials. Once authenticated, the daemon sends heartbeat messages to the service and receives instructions from the service as responses to the heartbeat messages. If an experiment has been scheduled, the daemon receives the instructions for executing that experiment. Each instruction action is pre-defined within the daemon. Arbitrary instructions cannot be executed.
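The idea that each instruction is pre-defined and arbitrary instructions cannot be executed can be illustrated with an allowlist dispatcher. This is a conceptual sketch, not Gremlin's implementation: the function name handle_instruction and the action names noop, run_experiment, and halt_experiment are all hypothetical, and the real wire format is not described here. The point it shows is that the received value only selects among fixed code paths and is never evaluated as a command.

```sh
#!/bin/sh
# handle_instruction ACTION [ARGUMENT]
# Hypothetical dispatcher: only actions listed in the case statement run.
# The received text is never passed to eval or executed as a command.
handle_instruction() {
    action=$1
    argument=${2:-}
    case "$action" in
        noop)
            echo "no experiment scheduled"
            ;;
        run_experiment)
            echo "starting experiment $argument"
            ;;
        halt_experiment)
            echo "halting experiment $argument"
            ;;
        *)
            # Anything not pre-defined is refused.
            echo "rejected unknown instruction: $action" >&2
            return 1
            ;;
    esac
}
```

With this shape, a response such as `rm -rf /tmp/x` falls through to the default branch and is rejected, because it does not match any pre-defined action.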
The service API only supports TLSv1.2 connections.
The Gremlin Agent, Daemon, API, and web app undergo regular security auditing, including penetration testing, by the external security auditor Bishop Fox. All identified vulnerabilities are remediated promptly and confirmed via remediation testing by our auditors. We can provide a Letter of Assessment from our auditors outlining our most recent audit findings and remediation results upon request.