How Twilio Built a Culture of Reliability

Building a culture of reliability is a priority for a lot of engineering organizations, but it's difficult to find a playbook for how to accomplish this.

In this webinar, Twilio engineering leader Tyler Wells explains how Twilio built this culture inside its organization, and how that culture has allowed the company to reach five nines of availability.


About this webinar

To deliver a seamless experience for developers building on their platform, Twilio has developed a culture of reliability that starts in their engineering onboarding process and extends into their ongoing practices and tools. In this webinar, we'll chat with Tyler Wells, Twilio's director of engineering, who leads Twilio's globally distributed Programmable Video (WebRTC) and Client SDK teams, to learn how we can build a similar culture in our own organizations.

Tyler is known for saying that building great software starts with customer empathy, and that it's impossible to achieve five nines without deeply understanding how your software affects your customers. This webinar will start with customer empathy as a foundation for reliability and Chaos Engineering initiatives at your company.

  • How Twilio engineers develop empathy for their customers, the foundation of 99.999% availability
  • Where Chaos Engineering fits into a culture of reliability, and how it helps developers code with more confidence
  • How to weigh potential delays on the product roadmap against work that increases reliability and availability
  • How Twilio develops SLOs that win customer trust
  • How to apply Twilio's definition of reliability (available + functional + resilient) in your organization

About the speakers

Tyler Wells

Senior Director of Engineering - SRE Platform

Tyler Wells is a Senior Director of Engineering, leading Platform SRE and Observability at Twilio. He began his career 23 years ago in a field where failure wasn’t an option and all systems had redundancy. From modeling and testing satellite communication protocols to modern-day WebRTC, Tyler has been deeply entrenched in real-time communications throughout his career. He experienced the problems of scale firsthand while leading the development of Facebook’s first version of real-time video as a Principal Engineer at Skype. During his tenure at Twilio, Tyler has had the privilege of starting two Twilio offices, co-leading an acquisition, and creating and building the Video Platform, and he now leads Twilio’s SRE and Observability teams. When he’s not learning from incidents or exploring new ways to lower MTTD/MTTR, Tyler spends his time cooking and running through the vast open space in Walnut Creek, where he lives with his wife and two daughters. He also has an adult son who lives in San Jose.

Jason Yee

Director of Advocacy

Jason Yee is director of advocacy at Gremlin, where he helps people build more resilient systems by learning from how they fail. He also leads Gremlin’s internal Chaos Engineering practices to make the company’s own systems more reliable. Previously, he worked at Datadog, O’Reilly Media, and MongoDB. His pandemic-coping activities include drinking whiskey, cooking everything in a waffle iron, and making chocolate.
