A couple of weeks ago I attended my first AWS re:Invent in Las Vegas, and with over 55,000 people in attendance it was every bit as overwhelming as expected. Most exciting for me was meeting existing and potential customers and talking to the community about Chaos Engineering. It was also very cool to watch Adrian Cockcroft and our very own Ana Medina kick off the conference with a session on Breaking Containers, followed by Ana and Ho Ming Li joining the AWS Launchpad Livestream.

I’m excited to have recently joined Gremlin as Director of Product. Before Gremlin, I worked as a Product Manager on App Engine at Google Cloud Platform, where I gained an appreciation and deep respect for both the SRE and SWE teams who treat reliability as a product feature. It’s amazing to see the SRE practice expanding at industry shows like re:Invent.

I currently work out of our San Francisco office. When I’m not at work or biking up a mountain, you can find me on Twitter @lklig and Instagram @lornekligerman.

Chaos Engineering Breaking Through

While at re:Invent, I was pleasantly surprised by how many people knew and understood the concept of Chaos Engineering. Breaking things on purpose through thoughtfully planned experiments with a defined blast radius is becoming not only an accepted practice, but a necessary one. I had great conversations about how to use failure to test the reliability of a system with the different types of attacks Gremlin provides, whether network attacks that simulate an unreliable internet connection or resource attacks that starve an application.

Containers and Kubernetes Taking Hold

It was interesting to see the widespread adoption of containers, orchestrated by Kubernetes, as well as the use of serverless functions. The number of offerings around containers and Kubernetes continues to grow, in all shapes and sizes, such as AWS ECS, EKS, Fargate, and the newly announced Firecracker, not to mention on-prem solutions like AWS Outposts.

This means there are more failure cases than ever, with so many dependencies and interconnected pieces of technology. It’s our goal to meet you where your code runs, whether on bare metal, on a VM, in a container, on Kubernetes, or in a serverless environment such as AWS Lambda.

We’re focused on helping you deliver the best user experience possible to your customers, even when things don’t go quite right behind the scenes. If you can simulate failure on your own terms and observe what happens when dependencies fail, gracefully or otherwise, you can have peace of mind that your business and reputation will continue to trend up and to the right.

More Managed Services, More Potential Failure

Looking forward, I expect the world of managed services to continue to grow, with more configuration done for you so that your team can focus on your core features and the business. This progression opens the door to more unknowns and complexity as each system talks to the next. It’s a perfect scenario for designing experiments and unleashing Gremlins to see how your system behaves.

To learn more about Chaos Engineering and Gremlin, join the Chaos Engineering community on Slack, check out the content on our Community website, or request a demo from our team.

Lorne Kligerman
Director of Product