KubeCon + CloudNativeCon North America 2019 was held November 19-21 at the San Diego Convention Center and somewhere between 10,000 and 12,000 people attended. I had a great time at the conference and will be sharing some of the things I really enjoyed.
The TL;DR is that the Kubernetes community is full of pretty amazing people, the software is becoming very mainstream, and it was a very fun event.
There was a lot of great information about new Kubernetes features, new tools in the ecosystem, and practices people can adopt to improve their reliability.
Most of the videos for the talks have already been posted on YouTube, but it looks like they are still being uploaded. I’ll link to the ones I can find. The YouTube playlist is here if you’d like to keep an eye out for the others.
I tweeted some advice that morning that seemed to resonate with people:
Conferences can be hard if you’re introverted or anxious, and #KubeCon is a pretty big, long conference. Make sure you actively think about self care. It’s ok to hide in your hotel room for a while or find a quiet space if you need to.— Rich Burroughs (@richburroughs) November 19, 2019
It’s important to pace yourself at big events. You can’t possibly do everything you’d like to, and you don’t want to burn out on day 1.
Heading into the opening day’s keynotes, I was pretty blown away by the size of the room:
The opening day kicked off with Dan Kohn from the Cloud Native Computing Foundation (CNCF) talking about Kubernetes using the analogy of Minecraft.
There were a lot of project updates in the keynotes. I was particularly interested in Vitess, a cloud native database clustering system for MySQL. Vitess was originally created by the YouTube engineering team. Several of those engineers (including Jiten Vaidya) recently decided to focus on Vitess full-time by launching PlanetScale. The PlanetScale team also includes several early Dropbox Engineering hires such as John Watson. This is definitely a team to keep your eyes on! 👀
Another topic that came up in the keynotes and then again in later talks was Open Policy Agent (OPA), a declarative way to manage policies in Kubernetes. There seems to be a lot of energy around OPA and it was one of the big takeaways of the conference for me. Definitely something people should be looking into.
And yay Helm 3 is out and it got rid of Tiller!
Other talks in the keynotes included Erin Boyd from Red Hat talking about k8s storage, Derek Collison of Synadia talking about NATS, and Lachlan Evenson from Microsoft talking about advances in trusted computing.
Leaving the keynotes I was once again struck by the size of the crowd, as thousands of people tried to get up the escalators and stairs for the breakout sessions. I got some coffee and waited it out, and eventually made it up the stairs :)
The first breakout talk I attended was Paul Fisher from Lyft talking about the open source tool they use to manage their network. It’s on GitHub, and Datadog is another user and contributor. The tool builds on AWS networking primitives and it sounds pretty cool.
Next up I watched Gareth Rushgrove talk about Open Policy Agent (view video) and how you can use it to test your Kubernetes configurations. I worked with Gareth at a previous job and he’s an expert in the area of testing configs, so I was really looking forward to his talk. He’s written a tool called Conftest that lets you test your configuration files against OPA policies, which looks very cool. You can run the tests by hand or put them in CI.
Gareth also talked about using Gatekeeper, which evaluates policies in a running Kubernetes cluster. As I mentioned, I think OPA is kind of a big deal and it’s worth looking into.
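To make the idea of testing configs against policy a bit more concrete, here is a minimal sketch in plain Python. It is not Conftest, Gatekeeper, or Rego (OPA's policy language), and it is not taken from Gareth's talk. The function name `check_deployment` and both rules are hypothetical, chosen only to show the shape of the technique: load a manifest as data, run rules over it, and fail when any rule reports a violation.

```python
from typing import Dict, List


def check_deployment(manifest: Dict) -> List[str]:
    """Return a list of policy violations; an empty list means the manifest passes."""
    violations = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        image = container.get("image", "")

        # Hypothetical rule 1: images must be pinned to a tag other than "latest".
        # Only the last path segment is inspected so a registry port is not
        # mistaken for a tag (e.g. "registry:5000/app").
        last_segment = image.rsplit("/", 1)[-1]
        tag = last_segment.split(":", 1)[1] if ":" in last_segment else ""
        if tag in ("", "latest"):
            violations.append(f"{name}: image '{image}' must use a pinned tag")

        # Hypothetical rule 2: containers must set runAsNonRoot.
        security = container.get("securityContext", {})
        if not security.get("runAsNonRoot", False):
            violations.append(f"{name}: securityContext.runAsNonRoot must be true")
    return violations


# Two illustrative manifests, already parsed into dictionaries.
good_manifest = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "web", "image": "nginx:1.17",
         "securityContext": {"runAsNonRoot": True}},
    ]}}},
}

bad_manifest = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "web", "image": "nginx:latest"},
    ]}}},
}

if __name__ == "__main__":
    for violation in check_deployment(bad_manifest):
        print("FAIL:", violation)
```

In CI, a wrapper could exit non-zero whenever the returned list is not empty, which is roughly the workflow the talk described for running policy tests by hand or in a pipeline.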
The last breakout talk I watched was one I was super looking forward to from the title, “10 Weird Ways to Blow Up Your Kubernetes” (view video) from Melanie Cebula and Bruce Sherrod of Airbnb. It didn’t disappoint. There was a lot of real talk about failures they experienced, and I appreciated them sharing the stories. If you are into Chaos Engineering you will likely enjoy the talk and learn some things too.
The closing keynotes kicked off with Vicki Cheung of Lyft, the conference co-chair, talking about new things in Kubernetes 1.16 (view video). I’m most excited about Ephemeral Containers, which allow you to attach a container to a running pod for debugging purposes. Now you can build a container with your favorite debugging tools and spin it up when you need to dig into something. Very cool 🤩
Ephemeral containers are a very important update in the latest release of Kubernetes. Debugging apps and pods running in Kubernetes is a challenge, but ephemeral containers can help. #Kubecon19 #KubeCon #CloudNativeCon pic.twitter.com/EDRFwIKFGU— Kaslin Fields (@kaslinfields) November 20, 2019
I was also really interested to see Sarah Novotny from Microsoft and Liz Fong-Jones of Honeycomb.io talk about OpenTelemetry (view video). The short version is that OpenCensus and OpenTracing have decided to combine on one standard, OpenTelemetry, that should clear up some of the confusion in the tracing space.
Kelsey Hightower’s keynote (view video) was very powerful and one of the highlights of the conference for me. I’ve known Kelsey for a few years and have seen him speak about Kubernetes a lot, but this talk was more about the community than tools. He told a story about standing in a circle having a conversation and a new person walking up to join in, which was a great analogy for building communities. I was legit fighting back some tears by the end.
The day wrapped up with a booth crawl in the sponsor showcase area. I spent some time at the Gremlin booth chatting with folks and it was a lot of fun. I find Chaos Engineering super interesting to talk about, and it’s great to hear what people are doing.
After a long day I headed to my hotel to play a bit of Pokémon Shield and get some sleep! 😴
The day 2 keynotes featured Vicki Cheung talking about how “Everything Worked Before Kubernetes.” (Narrator: It did not.)
I really appreciated this look back. It’s easy to wax nostalgic about the old days, but we have advanced a lot in the industry. I saw one of Kelsey Hightower’s talks about Kubernetes years ago, and one thing that hooked me right away was that a lot of the operational practices I’d already been using were baked into the platform. You get things like health checks for free, without having to figure out how to implement them.
One of my favorite keynotes of the day was Tim Hockin of Google and Kal Henidak from Microsoft talking about how IPv6 was implemented in k8s. A lot of people came together to volunteer and make it happen, and the story was great to hear. I remember hearing horror stories about the politics involved with OpenStack, and I’ve wondered how well the big companies involved collaborate. It sounds like they are doing it pretty well based on Tim and Kal’s story.
I attended a couple of breakout talks that I really enjoyed. First there was a talk from Jian Cheung and Stephen Chan of Airbnb called “Did Kubernetes Make My p95s Worse?” When Airbnb moved over to Kubernetes they ran into some latency issues, and the talk focused on how the team debugged them and whether Kubernetes was actually to blame. It turned out that some of the differences came from things like operating systems, host size, and autoscaling issues. A couple of the examples were also in the Airbnb talk I loved from day 1, but the focus was a bit different here. At the end they had a final scorecard:
I also really enjoyed the talk by Jesse Dearing of VMware (by way of Heptio) about events in Kubernetes. He showed how to emit and consume events in your cluster, with lots of code examples in Python.
This is a great talk to watch if you’d like to learn more about events in Kubernetes, or see examples of someone coding against the k8s APIs.
After Jesse’s talk I went to an event called Puppy Pawlooza, which was a hangout with a bunch of very sweet therapy dogs and their kind volunteers. I really have to thank the organizers for putting this on the schedule. It was such a big conference with so many people, and it was great to take some time and chill with some sweet puppies. I mean, look at them! 🐶🐶🐶
This was part of a Wellness track that also included a quiet room, running in the mornings, yoga and massages. 🏃🏿‍♀️
One of the best things about being at a conference is meeting folks you know from Twitter in person. I spent some time wandering around the sponsor area and got to hang out with some really cool folks, including Mark Mandel from Google.
Mark is working on Agones, which is a platform for gaming servers that runs on Kubernetes. I find gaming infrastructure really interesting, and the project sounds rad.
If you attend conferences, the hallway track is often the best part :) Don’t be afraid to skip a talk and hang out with awesome people.
The evening event was a block party across the street from the convention center. They had blocked off several blocks of what’s called the Gaslamp Quarter, which was a fun idea. Unfortunately it was raining pretty hard as we were trying to walk over to the event. People were ducking under awnings for shelter and stepping in puddles. My feet were pretty soaked. But we made it and I got some food, and I had a chance to catch up with lots of friends. All of the restaurants in the area were open and you could just walk in and eat buffet style, which made for a great evening.
After day 2 I was wet and tired so there wasn’t any Pokémon catching going on when I got back to the hotel. As tired as I was, my brain just didn’t want to shut off after two days of great talks and conversations :)
The day 3 keynotes were probably my favorites of the conference. Bryan Liles of VMware, one of the conference co-chairs, talked about finding the Rails moment for Kubernetes. Bryan’s talk was about user experience and how the community can make it easier for people to use Kubernetes. I really appreciated his focus on users. It’s easy to get bogged down in technical details and lose sight of the people who will be using the software.
Next up was one of my favorite talks of the conference, Ian Coldwater from Heroku talking about security and the mindset of attackers (view video). Ian talked about how we should view attackers as users of the system, which I thought was a very interesting way to look at it. Features you add for ease of use, like listing all permissions in a namespace or the very cool Ephemeral Containers feature I mentioned earlier, are also things an attacker can take advantage of.
How do attackers think? Builders and breakers look at the system differently. You can list all permissions in a namespace which saves time when trying to attack a cluster. @IanColdwater #KubeCon pic.twitter.com/DOr5iWwxMS— Rich Burroughs (@richburroughs) November 21, 2019
Ian also mentioned the brilliant idea of using Untitled Goose Game as a way to learn to think like attackers. I picked up the game after the conference and it’s a lot of laughs. Also big bonus points to Ian for using lots of images of black women in the slides. I saw several people on Twitter mention that it impacted them.
I was super excited to see Ana Medina from Gremlin and Lenny Sharpe from Target talking about how to bring the joy to Chaos Engineering (view video). I sit on a Developer Advocacy team with Ana and we’ve worked together a lot over the past year. I was so excited to see her on the keynote stage. I’ve also met Lenny a few times and have always really enjoyed talking to him. And this is the only talk I saw on the schedule that was about Chaos Engineering, which made me want to see it even more.
Lenny talked a lot about the work they’ve done at Target to market their Chaos Engineering program to service owners. Evangelizing the adoption of a new tool can be hard, and I love that they thought about marketing. They even came up with a logo for their resiliency platform.
The Target team built TRAP, the Target Resilience Automation Platform. They market their platform and Chaos Engineering to the teams building and running software inside of Target. @LennySharpe #KubeCon pic.twitter.com/K9hwZWMYF8— Rich Burroughs (@richburroughs) November 21, 2019
Ana gave some info on Chaos Engineering and showed how to experiment on a cloud native microservice architecture. Testing dependency failures is a great use case for Chaos Engineering :)
“How does your application handle failures when your dependencies go down?” @Ana_M_Medina asks the essential resilience question. The answer: plan for failure. Understand your dependency graph! #kubecon pic.twitter.com/ShNdeVcFYd— Bridget Kromhout (@bridgetkromhout) November 21, 2019
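The question of what happens when a dependency goes down can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not code from Ana and Lenny's talk or from any Gremlin tooling. Every name here (`get_recommendations`, `healthy_dependency`, `failing_dependency`) is hypothetical, and the failure is simulated in process rather than injected by a real chaos experiment.

```python
from typing import Callable, List


def get_recommendations(fetch: Callable[[], List[str]],
                        fallback: List[str]) -> List[str]:
    """Call a dependency and degrade to a static fallback if it is unreachable."""
    try:
        return fetch()
    except (ConnectionError, TimeoutError):
        # The dependency is down or too slow: serve a degraded but usable answer
        # instead of failing the whole request.
        return fallback


def healthy_dependency() -> List[str]:
    """Stands in for a working downstream service."""
    return ["item-1", "item-2"]


def failing_dependency() -> List[str]:
    """Stands in for the same service during a simulated outage."""
    raise ConnectionError("recommendations service unreachable")


if __name__ == "__main__":
    print(get_recommendations(healthy_dependency, ["popular-item"]))
    print(get_recommendations(failing_dependency, ["popular-item"]))
```

The point of an experiment like this is to confirm that the fallback path actually works before a real outage exercises it for you.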
In the breakout sessions, I really enjoyed the talk from Varsha Varadarajan and Adam Wolfe Gordon of DigitalOcean about a tool they built called clusterlint (view video). When they upgrade a k8s cluster at DigitalOcean they provision new nodes and delete the old ones, and the fact that things like hostnames change has a lot of implications for how Kubernetes needs to be configured. They built clusterlint to check people’s configs before upgrades to find those problems.
This was a very solid technical talk and I really recommend watching it if you want to learn how to talk about technical topics. They stated the problem, talked about design decisions, walked through the tool and did an awesome demo. It was very well done.
Next I attended another of my favorite talks of the conference, Leigh Capili from Weaveworks talking about doing zero downtime deploys on Kubernetes (view video). This was a great talk and contained a lot of information that the average person couldn’t be expected to just pick up on. Leigh was talking mainly about blue/green deploys, where new instances of an app are spun up and then the old ones killed. There are a lot of weird edge cases that can come up in that scenario, like pods still being sent traffic when they are already in the process of being shut down. The fixes included things like adding sleeps to your app :)
Leigh’s concluding slide should give you an idea of what he covered, but I really recommend watching this talk as many of these items are not at all intuitive.
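The "add a sleep" fix sounds odd until you see where it goes. Here is a minimal Python sketch of the general pattern, assuming the problem is that traffic keeps arriving for a short while after the pod has been told to terminate. It is not code from Leigh's talk. The class `GracefulServer` and the parameter `drain_delay_seconds` are hypothetical names, and the delay value is arbitrary.

```python
import signal
import threading
import time


class GracefulServer:
    """Tracks readiness and delays shutdown so in-flight traffic can drain."""

    def __init__(self, drain_delay_seconds: float = 5.0):
        self.drain_delay_seconds = drain_delay_seconds
        self.ready = True                  # what a readiness check would report
        self.stopped = threading.Event()   # set once it is safe to exit

    def handle_sigterm(self, signum=None, frame=None):
        # Step 1: stop advertising readiness so no new traffic is routed here.
        self.ready = False
        # Step 2: keep serving for a short while, because requests may still
        # arrive until the routing change has propagated.
        time.sleep(self.drain_delay_seconds)
        # Step 3: signal the main loop that it can shut down.
        self.stopped.set()

    def install(self):
        """Register the handler for SIGTERM, the signal sent on pod termination."""
        signal.signal(signal.SIGTERM, self.handle_sigterm)


if __name__ == "__main__":
    server = GracefulServer(drain_delay_seconds=5.0)
    server.install()
    print("serving; send SIGTERM to begin a graceful shutdown")
    server.stopped.wait()
    print("drained, exiting")
```

The delay has to be shorter than the pod's termination grace period, otherwise the process gets killed before it finishes draining.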
And the talk I watched in the final slot of the conference was Lita Cho and Ryan Cox of Lyft talking about debugging an Envoy service mesh (view video). This is something the Lyft folks have a lot of experience with, and they had a lot of great suggestions. One suggestion was to have outbound third party API calls route through Envoy:
How to set up Envoy to egress to a third party API. You get all of the Envoy stats by doing this, retries, rate limiting, and the ability to specify a log. @ryancox #KubeCon pic.twitter.com/et9JK1cSz0— Rich Burroughs (@richburroughs) November 22, 2019
I highly recommend this talk if you are running Envoy, or would just like to learn more about it. I learned a lot.
I’m very glad I had the chance to attend KubeCon in San Diego. This was my deepest dive into the Kubernetes community, and I came away feeling very optimistic and energized. There are a lot of super smart and nice folks in the community.
Some of the things that I’d like to look into more: Open Policy Agent, the new Ephemeral Containers feature, Octant (an open source tool from VMware for cluster management), and Helm 3.
If you’d like to see me tweet about future conferences I attend, follow me on Twitter.