Services and Dependencies

A service is a discrete unit of functionality provided by one or more systems in your environment. For example, a web server deployed as a load balancer for your backend systems is a service. In Gremlin, services are the units used to test and measure the reliability of your system. This page will show you how to add, manage, and test your services using the Gremlin web app.

Viewing your list of services

You can access your list of services using the Services menu item in the nav bar. This is the main view of any services that you or your teammates have added to Gremlin, along with their reliability scores. If no services have been added yet, this list will be empty.

To open a service, simply click on its entry in the list. You can search for a specific service by name using the search box, or use the "Sort by" box to sort the list by name, reliability score, or last modified date.

Viewing a list of services in the Gremlin web app

Adding a service

To add a new service, click the + Service button in the top-right corner of the services list. This opens a short wizard with the following steps:

  • Give your service a name and define the type of service. Gremlin supports host-based, container-based, and Kubernetes-based services.
  • Define your service's fingerprint. This is where you select the resources in your environment that comprise your service. The selection will change depending on the type of service selected in step 1. For example, selecting Kubernetes will show all of the Kubernetes resources detected by the Gremlin agent.
    • Note that you can select multiple resources. For example, you can select multiple Kubernetes Deployments, a Deployment and a DaemonSet, etc.
  • Select the process you want to use for dependency discovery. Gremlin will use this process' network traffic data to detect dependencies and generate reliability tests for each one.
    • Note that if only one process is detected, it will be selected by default.
  • Add your Golden Signals. These are most often monitors running in your observability tool, such as Datadog or New Relic. To learn more about Golden Signals, see the Golden Signals documentation page.
    • Start by selecting the integration you want to use, then click Add.
    • Enter the URL of the Golden Signal you want to use. If you've already set up the integration on the Team Integrations page, you can simply enter the URL of the monitor you want Gremlin to check. Otherwise, you may need to add authentication details, such as an API key.
    • Click Test Golden Signal to verify that Gremlin can access and read the Golden Signal.
      • If the test is unsuccessful, double-check your URL and whether your authentication credentials are correct.
    • Click Save to save your Golden Signal.
    • Repeat these steps for each Golden Signal you want to add.
  • Configure autoscheduling of reliability tests for this service. If enabled, Gremlin will automatically run the full suite of reliability tests on this service during the specified window.
    • Toggle whether to enable or disable the test schedule. You can choose whether to run:
      • All reliability tests
      • Reliability tests that have passed at least once
      • Reliability tests that have been run at least once
    • Specify the testing window. This is the period of time during which Gremlin will run the suite of tests. At a minimum, Gremlin requires a two-hour window.
      • Select the day of week for the tests to run.
      • Enter the starting time for the testing window.
      • Specify the length of the window.
      • Optionally, specify whether to use UTC time or your local time zone when setting the start time.
  • Review your service's configuration on the Summary screen. If you want to make any changes, click the Edit button next to the section you want to edit. Otherwise, click Create Service.
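
The testing window described above (day of week, start time, length, and the two-hour minimum) can be sketched in code. This is only an illustration of the rule, not Gremlin's implementation: the ScheduleWindow class, its field names, and the contains method are hypothetical, and the sketch assumes the start time and the checked moment are expressed in the same time zone.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta

# Minimum window length stated in the documentation above.
MIN_WINDOW = timedelta(hours=2)


@dataclass(frozen=True)
class ScheduleWindow:
    """Hypothetical model of a weekly testing window (not a Gremlin API)."""

    day_of_week: int   # 0 = Monday ... 6 = Sunday (Python's weekday() convention)
    start: time        # start time, in the same time zone as the moments checked
    length: timedelta  # how long the window stays open

    def __post_init__(self) -> None:
        if not 0 <= self.day_of_week <= 6:
            raise ValueError("day_of_week must be between 0 and 6")
        if self.length < MIN_WINDOW:
            raise ValueError("testing window must be at least two hours long")

    def contains(self, moment: datetime) -> bool:
        """Return True if `moment` falls inside the most recent weekly window."""
        # Step back to the most recent occurrence of the configured weekday.
        days_back = (moment.weekday() - self.day_of_week) % 7
        window_start = datetime.combine(
            moment.date() - timedelta(days=days_back),
            self.start,
            tzinfo=moment.tzinfo,
        )
        # Same weekday but before the start time: use last week's window.
        if window_start > moment:
            window_start -= timedelta(days=7)
        return window_start <= moment < window_start + self.length
```

For example, ScheduleWindow(2, time(1, 0), timedelta(hours=3)) models a window that opens every Wednesday at 01:00 and stays open for three hours, while passing a length of one hour raises a ValueError.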

Viewing service details

The service details page is your dashboard for managing and testing each service. You can perform tasks such as viewing the service's reliability score, running reliability tests, adding Golden Signals, adding other integrations, deleting the service, and viewing the service's selection criteria (i.e. the systems in your environment that comprise the service). You can also view, manage, and run tests on the service's dependencies.

A detailed overview of a service in the Gremlin web app

Viewing the reliability score

Each service has a reliability score ranging from 0 to 100. This score is a calculated value that represents how reliable the service is. Running a reliability test will increase your score. To learn how the score is calculated, see Reliability Score.

Editing service settings

You can modify a service by clicking the Settings button at the top of the service's page. This page lets you change the service's name, add or remove Golden Signals, change its testing schedule, manage integrations (e.g. load generators), or delete the service.

Managing dependencies

In addition to testing a service, Gremlin can also test each service's dependencies. While Gremlin will try to auto-detect all relevant dependencies using the service's network traffic, you can also manually add, edit, or remove dependencies.

Adding dependencies

To add a dependency, open your service page in the Gremlin web app, scroll down to Dependencies, and click the Add Dependency button. Then follow these steps:

  • Enter a name for the dependency.
  • Enter the dependency's network identifier. This can be a hostname, IP address, CIDR subnet, URL, or cloud service.
  • Optionally, enter the port(s) to target. You can enter a single port number, a port range, or a comma-separated (CSV) string of multiple ports and/or port ranges. Leaving this blank will target all ports.

Click Add to add the dependency.
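
To make the accepted port formats concrete, here is a minimal sketch of how a port string of this kind could be parsed and validated before you enter it. It is not Gremlin's parser: the parse_ports function is hypothetical, the hyphen as range separator (for example 8000-8080) is an assumption, and representing "all ports" as 1 through 65535 is also an assumption.

```python
def parse_ports(spec: str) -> list[tuple[int, int]]:
    """Parse a port string into a list of inclusive (low, high) ranges.

    Hypothetical helper for illustration only. Assumes ranges are written
    as "low-high" and that a blank string means every port (1-65535).
    """
    spec = spec.strip()
    if not spec:
        # Leaving the field blank targets all ports.
        return [(1, 65535)]

    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty entry in port list")
        # "8000-8080" -> ("8000", "-", "8080"); "443" -> ("443", "", "")
        low_text, sep, high_text = part.partition("-")
        low = int(low_text)  # int() raises ValueError on non-numeric input
        high = int(high_text) if sep else low
        if not 1 <= low <= high <= 65535:
            raise ValueError(f"invalid port or range: {part!r}")
        ranges.append((low, high))
    return ranges
```

Under these assumptions, parse_ports("80, 8000-8080") yields one single-port range and one multi-port range, and a reversed range such as "9000-8000" is rejected with a ValueError.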

Editing dependencies

To edit a dependency, click the pencil icon under the Actions column. After making your edits, click Save to save your changes.

Removing dependencies

To remove a dependency, click the delete icon under the Actions column. A confirmation modal window will appear. Click Delete again to confirm the deletion.

FAQ

Q: How often are services discovered?

A: Gremlin currently discovers services once every hour.

Q: How often are characteristics of an existing service discovered and/or modified?

A: Gremlin currently discovers and/or modifies the characteristics of existing services once every hour.

Q: How often are targets resolved to an existing service?

A: Gremlin resolves targets instantly, as soon as they change on a service. If a new pod is registered with the control plane, it's immediately registered as a target to a service.

Q: How often does Gremlin associate pods, containers and hosts with existing services?

A: Every 30 seconds.