Memory (or RAM, short for random-access memory) is a critical computing resource that stores temporary data on a system. Memory is a finite resource, and the amount of memory available determines the number and complexity of processes that can run on the system. Running out of RAM can cause significant problems such as system-wide lockups, terminated processes, and increased disk activity. Understanding how and when these issues can happen is vital to creating stable and resilient systems.
In this blog post, we’ll take an in-depth look at how memory attacks work and how you can use them to improve resilience, optimize your infrastructure, reduce your operating expenses, and more.
Memory directly impacts system performance and stability, and it’s also a major factor in the cost of computing hardware. When provisioning systems, especially cloud infrastructure, engineers often need to estimate how much memory their applications will consume in the future and provision accordingly. This is a tricky balance to strike: too much memory means paying for unused hardware, while too little leaves you without enough capacity to run your workloads.
Running memory attacks helps teams better understand how memory management impacts system performance, stability, and cost. In addition to helping right-size infrastructure, running memory attacks can also help with:
- Simulating memory-intensive workloads such as in-memory caches, databases, and machine learning models.
- Proactively preparing for out-of-memory (OOM) scenarios.
- Testing performance when using disk swapping/paging.
These experiments contribute towards achieving higher-level business objectives, such as:
- Validating system stability and resiliency in preparation for high traffic events, like Black Friday, Cyber Monday, and other major sales holidays.
- Reducing operating expenses by optimizing infrastructure capacity based on application memory requirements.
- Streamlining cloud migrations by simulating cloud conditions on-premises, such as noisy neighbors on shared infrastructure.
Simply put, a memory attack consumes memory. You can configure the specific amount of memory to consume in MB (megabytes), GB (gigabytes), or as a percentage of total memory. This is called the magnitude of the memory attack. As with other Gremlin attacks, you can run a memory attack on multiple systems simultaneously. This is called the blast radius.
A memory attack consumes memory in one of two ways, depending on how you specify the magnitude. Imagine we have a virtual machine instance that has 2 GB (2000 MB) of total RAM and is currently using 500 MB. If we configure a memory attack to consume a specific number of MB or GB, then the attack allocates that amount from available memory. For example, if we run an attack that consumes 500 MB, Gremlin allocates an additional 500 MB of RAM, raising the memory in use to 1000 MB (1 GB). However, if we configure the memory attack to consume a percentage, then the attack brings overall memory usage up to that percentage of total memory. For example, if we run a memory attack on the same instance and specify 75%, Gremlin allocates 1000 MB (1 GB) to bring the memory in use to 1500 MB (1.5 GB), which is 75% of 2 GB. In this mode, the amount of memory Gremlin consumes varies based on how much memory is already in use by the system and services running on the instance.
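The arithmetic behind the two modes can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not Gremlin's implementation: the function name `attack_allocation_mb` and its parameters are hypothetical, and capping a fixed-size attack at the memory still available is an assumption made for this sketch.

```python
def attack_allocation_mb(total_mb, used_mb, amount_mb=None, percent=None):
    """Estimate how many MB a memory attack would allocate (illustrative only).

    Pass exactly one of:
      amount_mb - a fixed amount to consume, in MB
      percent   - a target for overall memory usage, as a percentage of total
    """
    if (amount_mb is None) == (percent is None):
        raise ValueError("specify exactly one of amount_mb or percent")

    available_mb = total_mb - used_mb

    if amount_mb is not None:
        # Fixed magnitude: allocate the requested amount from available memory.
        # Capping at available_mb is an assumption of this sketch.
        return min(amount_mb, available_mb)

    # Percentage magnitude: allocate only the gap between current usage
    # and the target share of total memory.
    target_mb = total_mb * percent / 100
    return max(0.0, target_mb - used_mb)


# The 2 GB instance from the example, with 500 MB already in use.
fixed = attack_allocation_mb(2000, 500, amount_mb=500)
by_percent = attack_allocation_mb(2000, 500, percent=75)
print(fixed, by_percent)
```

With these inputs the fixed-size attack allocates 500 MB and the 75% attack allocates 1000 MB, matching the example. If the instance were already using 1200 MB, the same 75% attack would only need to allocate 300 MB.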
Choosing how much memory to consume depends on your use case. For example, if your goal is to monitor system stability when available memory is low, setting the magnitude to 80% or 90% is an easy way to create low memory conditions. If you’re migrating an application to the cloud and you know (generally) how much memory that application consumes, setting the magnitude to a specific number of MB or GB is the better approach. The added benefit of using MB or GB is that if your blast radius includes multiple targets, the attack will consume the same amount of memory on each target regardless of how much total memory they have.
When running experiments, it’s important to have visibility into the performance of your systems. While you don’t need a full observability practice, you should be able to monitor memory usage, as this will factor into your attack’s parameters. This is also important for making sure you don’t completely run out of memory, which could cause system instability and crashes.
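If you don't have a monitoring tool in place yet, a small script can give you a baseline reading before you choose a magnitude. The following is a minimal sketch that assumes a Linux host exposing `/proc/meminfo` with the `MemTotal` and `MemAvailable` fields. The helper names are hypothetical, and this is not part of Gremlin.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def used_percent(meminfo):
    """Return memory in use as a percentage of total memory."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def read_used_percent(path="/proc/meminfo"):
    """Read current memory usage from a Linux host (assumes the file exists)."""
    with open(path) as f:
        return used_percent(parse_meminfo(f.read()))
```

Calling `read_used_percent()` before and during an attack gives a rough view of how close the host is to running out of memory. On other operating systems you would need a different data source.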
As you run these experiments, remember to record your observations in the Gremlin web app, discuss the outcomes with your team, and track any changes or improvements made to your systems as a result. This way, you can demonstrate the value of the experiments you’ve run to your team and to the rest of the organization.
Now that you know how a memory attack works, try running one yourself:
- Log into your Gremlin account (or sign up for a Gremlin Free account).
- Select a host to target. Start with a single host to limit your blast radius, then increase your blast radius as you become more comfortable with running attacks.
- Under Choose a Gremlin, select the Resource category, then select Memory.
- Next to Memory Amount, click the dropdown and select %. Then, enter a percentage in the text box next to it. Start with a relatively low percentage, such as
- Click Unleash Gremlin to start the attack. Make sure to monitor your systems using your monitoring solution or a command-line tool like htop. Note any changes to how the system behaves as memory usage increases, then record those notes in the Attack Details screen in the Gremlin web app.
- Increase the magnitude and repeat the experiment.
If you’d like a more guided approach, Gremlin provides several Recommended Scenarios. These Scenarios start with low-magnitude memory attacks and gradually increase in consumption, letting you test your systems in stages. You can also run a memory attack alongside other resource attacks including CPU, disk, and IO. Click on the “Run Scenario” button below to try out a Recommended Scenario: