A couple of weeks ago I attended my first AWS re:Invent in Las Vegas, and it was every bit as overwhelming as expected, with over 55,000 people in attendance. Most exciting for me was meeting existing and potential customers along with talking to the community about Chaos Engineering. It was also very cool watching Adrian Cockcroft and our very own Ana Medina kick off the conference with a session on Breaking Containers, followed by Ana and Ho Ming Li joining the AWS Launchpad Livestream.
I’m excited to have recently joined Gremlin as Director of Product. Prior to Gremlin, I worked as a Product Manager on App Engine at Google Cloud Platform, where I gained an appreciation and deep respect for both the SRE and SWE teams who treat reliability as a product feature. It’s amazing to see the SRE practice expanding at industry shows like re:Invent.
While at re:Invent, I was pleasantly surprised by how many people knew and understood the concept of Chaos Engineering. Breaking things on purpose through thoughtfully planned experiments with a defined blast radius is becoming not only an accepted practice, but a necessary one. I had great conversations about how to use failure to test the reliability of a system with the different types of attacks Gremlin provides, whether network-related attacks that simulate an unreliable internet connection or resource-specific attacks that starve an application.
It was interesting to see the widespread adoption of containers, orchestrated by Kubernetes, as well as the use of serverless functions. The number of managed services around containers and Kubernetes continues to grow, with offerings in all shapes and sizes, such as AWS ECS, EKS, Fargate, and the newly announced Firecracker, not to mention on-prem solutions like AWS Outposts.
This means there are more failure cases than ever, with so many dependencies and interconnected pieces of technology. It’s our goal to meet you where your code runs, whether on bare metal, on a VM, in a container, on Kubernetes, or in a serverless environment such as AWS Lambda.
We’re focused on helping you deliver the best user experience possible to your customers, even when things don’t go quite right behind the scenes. If you can simulate failure on your own terms and observe what happens when dependencies fail, gracefully or not gracefully at all, you can have peace of mind that your business and reputation will continue to trend up and to the right.
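To make "fail gracefully" a little more concrete, here is a minimal sketch of the idea in Python. It is not Gremlin’s API or anything shown at re:Invent; the dependency, the fallback list, and every name in it (`flaky_dependency`, `get_recommendations`, `run_experiment`) are hypothetical and exist only for illustration. The sketch injects failures into a pretend downstream call at a chosen rate and counts how often the caller had to serve a degraded, but still working, response.

```python
import random


class DependencyError(Exception):
    """Raised when the hypothetical downstream dependency fails."""


def flaky_dependency(failure_rate, rng):
    # Hypothetical dependency: fails with probability `failure_rate`.
    if rng.random() < failure_rate:
        raise DependencyError("injected failure")
    return ["personalized-1", "personalized-2"]


def get_recommendations(failure_rate, rng):
    # Graceful degradation: serve a static default instead of an error.
    try:
        return flaky_dependency(failure_rate, rng)
    except DependencyError:
        return ["bestseller-1", "bestseller-2"]


def run_experiment(failure_rate, requests=1000, seed=0):
    # Send `requests` calls and count how many got the fallback response.
    rng = random.Random(seed)
    degraded = 0
    for _ in range(requests):
        result = get_recommendations(failure_rate, rng)
        if result[0].startswith("bestseller"):
            degraded += 1
    return degraded


if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.5):
        print(f"failure rate {rate}: {run_experiment(rate)} degraded responses")
```

In a real experiment the failure would be injected from outside the application (for example by a network or resource attack) rather than in code, but the question is the same: when the dependency misbehaves, does the user still get a usable answer?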
Looking forward, I expect the world of managed services to continue to grow, along with more configuration being done for you so that your team can focus on your core features and the business. This progression opens the door to more unknowns and complexity, with each system talking to the next. That makes it a perfect scenario for designing experiments and unleashing Gremlins to see how your system behaves.