Natalie Conklin, tamer of chaos and Head of Engineering here at Gremlin, joins us to talk about embracing change, working alongside each other, and building more reliable systems. Natalie has a talk coming up at DevOpsDays Boise which she has titled “Embracing Change Fearlessly.” Her talk is oriented around enabling teams to take calculated risks and having the guts to take those risks. Natalie spent time working in India, which helped solidify her “fearlessly” philosophy. She provides examples of taking chances and how teams can orient around them together, the cultural changes that need to happen in chaos engineering, how to communicate across whole teams, dealing with “adolescent” engineers and more!

Episode Highlights

In this episode, we cover:

  • “Embracing Change Fearlessly” (01:45)
  • Fearless change enabling good work (04:00)
  • The culture change that needs to happen (06:10)
  • How to talk to your leaders (10:45)
  • “The Adolescent Version” of engineering (14:40)
  • How Natalie prioritizes time, speed, and efficiency (18:42)
  • Natalie’s keynote (26:48)

Transcript

Natalie: I like this—I call it the adolescent version of engineering. It’s where, you know, we’re through the baby part, we need to start to grow up a little bit, we need to go from getting stuff done in some way or another, to something that’s repeatable and scalable. And so, it’s like, that adolescent years, that’s my fun. That’s what I enjoy doing. I call it creating something out of chaos.

Basically, taming the chaos is what it really looks like because it’s very chaotic initially, and that’s true of every, like, small organization; they always start like that. And as they start to grow, you know, you’ve got ten different engineers who have ten different opinions on how something should be done, and so they do it ten different ways. And that’s fine when you’re only ten, but then when you need to go from 10 to 20 to 30 to 100, it no longer works.

Julie: Welcome to Break Things on Purpose, a podcast about reliability, culture change, and learning from failure. In this episode, we talk with Natalie Conklin, head of engineering at Gremlin, about the importance of embracing change, and how we can all work through our fears and work together to build more reliable systems. Natalie, I’m so excited to have you here with us today. And today is actually a really big day because it is the fifth year of DevOpsDays Boise, which you are doing the closing keynote for. So, really excited to have you both on the podcast and at the conference today. And your talk is titled “Embrace Change Fearlessly.” So, do you want to kick off by telling our listeners a little bit about you and what you’re going to be talking about?

Natalie: Sure. Thanks for having me. I am excited about both, sort of, [laugh] which is exactly what the talk is about. [laugh]. The talk is really about being able to embrace change fearlessly, and that it’s rarely ever fearlessly truly, but mostly around being able to do what makes you afraid anyway.

I’m not a big public speaker, so that’s something I’ve had to work hard at trying to be able to be more comfortable doing. And so, this is an exciting time for me. But background-wise, I am the head of engineering currently for Gremlin and had been leading engineering teams for growth companies for just over a decade. And a lot of what I end up doing centers around this: It’s helping those engineering teams be willing to move forward in risky—because in growth companies, a lot of times you’re building things that are brand new, this is not something that, you know, has been out there and done, so they typically have to do something new for the first time. And so, being able to take calculated risks is tough. It’s hard stuff. And so, getting into the right mindset to be able to push through that, that’s a lot of what I ended up doing.

Julie: I love that. And that’s actually a really good point that you’re bringing up, you know, growth companies and being in the right mindset. So, one of the things you and I talked about when I was starting here at Gremlin and getting to know you a little bit about your background, which is really cool. You lived in India for a few years, correct?

Natalie: I did. I lived there for two years. I was working for a company, we were doing big data analytics for telcos, building big, large platform that we would then do some custom development work off the top of for these various telco companies. And the team over there had experienced some turnover, and so there was a lot of quality issues and things of that nature starting to show up for the first time. This had been a very rock-solid team, honestly, and so the company asked if I would be willing to go to India to figure out what was going on. And so, that was what I did. It was a great opportunity; loved doing it.

Julie: So now, as you work with teams to embrace change fearlessly, and we talk about you mentioned the ROI and doing things in new ways and building new things, do you have an example of maybe when you built something new or your team built something new, and it changed the way we work?

Natalie: Well yes, an easy answer would just be to fall back on the India example for a second, right? So, a lot of what I did when I went there was they were a very waterfall shop, converted them over to Agile practices and DevOps. They had really none of that practice existing. So, when you ask the company—or the, I’ll just say the team to go through that sort of transition, you’re pretty much asking them to change everything about the way they work. And we focused a lot more—there was a lot of manual processes that they had been doing previously and we were automating all of those; we had to do the automations, but then also, you know, make sure that work fit into this new automated way of doing things.

They also had, just, also the trepidation over am I going to still be needed, right? Those are all those things that come into your mind when you’re basically changing from a manual process to an automated process, “Am I still going to be needed? Is my work going to still be important? What am I going to do in this new world, in this new environment?” There’s a lot of that that pops up into people’s heads.

So, a lot of making the change successful, there’s certainly the technical aspects of getting it automated and all those things, but to really make a change successful on that kind of scale, it requires getting people to think about it differently and to be okay, and to realize that they can learn new stuff and they’ll come out of this better than how they went in. And a lot of that takes a lot of, just, communication and talking, being very personal with people, making sure that they personally understand how to do this, but then just also, things like training and coaching and making sure that there are people there to counter the negative energy that comes along with change. There’s always negative energy that comes along with it, people are nervous, they’re scared, and you have to be able to counter that in some way.

Julie: You know, there was a talk I gave a while ago, and I’m trying to remember the name of it, but one of the things that I talked about was the Pareto Principle, which is, what, 20% of people are going to be amazing in an organization, 60% are going to be, you know, middle of the road, then you have that bottom 20% that are going to kind of fight that change. And you shouldn’t really necessarily focus on that top 20%, but you should put a lot of the focus on bringing that bottom 20% along with you. And we talk a lot about just the cultural change that needs to happen when we talk about Chaos Engineering, for example. I mean, there’s a huge cultural change that organizations need to switch that mindset into embracing failure. Which we talk a lot about, but it’s hard for folks to embrace change fearlessly, embrace failure fearlessly.

When you’ve been going through these experiences in the past—and you mentioned that you really need to think about the people—what’s one of the common fears? You said, you know, people worry about their jobs and worry about being left behind. Work us through how do you help folks with that?

Natalie: Yeah, I think that’s actually one of the most interesting aspects of this. When you start looking at—when I [start talking about 00:07:18], you know, people don’t change, when it’s something that’s personal like getting married or having kids or going off to college, you know, these are all huge life changes, and we celebrate those, we have parties, we’re super happy, we think they’re fantastic, right? And I mean, if I go back to India for a second, these are the same people that are struggling on, you know, the fact that I’m going to change from a manual testing to an automated testing, will actually go through an arranged marriage where they’re marrying someone that they don’t know super well, but they’re very happy about it, right? So, that’s one of the things that I like to point out and have a discussion with people about is that you’re not afraid of change; you’re afraid of change in your work life, right? And we have to be very specific about that because we start talking about humans are afraid of change, I actually don’t agree. I think we’re just afraid of changing what we do at work.

And usually, that’s because that’s somehow tied to our needs pyramid, right? Like, that’s how we get our needs met from food and shelter and all of these other kinds of things. And so, when we start to threaten that, it gets really, you know, sketchy for a minute, right? So, that’s when we have to, like, take a minute and realize what we’re doing and realize that we’re being overly protective of a part of our world that, you know, we somehow feel like it’s going to then have us begging on the street, is the example I give in my talk, right? That’s not going to happen. Like, you know, that’s just an irrational fear.

And it’s highly unlikely that that’s your right answer. So, what I encourage people to do is to actually find a logical, kind of, sounding board, person, a mentor, a friend—and again, if you don’t have this person in your life, then you know, find that person, but start talking to them about, like, what’s most likely to happen in this scenario? Or, better yet, what can I get out of it? I think if you spent less time on that and spent, you know, more time on, like, what can I actually get out of this, how could this benefit me, and sort of flip that in your brain.

Because what our brains are incredibly good at doing is going down that worst possible path. But the real truth is, we’re just as capable of imagining the good. It’s just a matter of focus. So, why don’t we just focus on that instead? We can focus on what’s the positive part of this, what could happen, and we’re actually much more likely—there’s a whole lot of studies around manifestation—and we can manifest that in our life if we want to, right? So, we just need to focus on the positive side of it.

So, I—literally it’s honestly a bunch of personal conversations, and getting people to just calm down and realize that the likelihood of their worst-case scenario is not really real. And then start to think through, okay, what can you actually learn from this? You know, is there something that you would like to get out of this? Would you like to try a new role? Would you like to try to lead an initiative? Would you like to be part of this in some way, right?

So, those conversations—and again, it has to be personal. That’s the thing that I think, you know, when you start doing widespread, full organizational changes, which I was doing over there and I had 120 engineers, it’s hard to do it personally because you literally have to have one-on-one conversations with everybody and understand what they are going to get out of it. But that is what’s required. I think, to really get people to a comfort zone, you’ve got to make sure that they understand how they fit in, and their why; why they’re doing it.

Julie: And that is all amazing. Now, as the leader, as the head of engineering and an organization, how do you recommend individual contributors talk to their leaders? Or how do they bring up concerns in a way that’s productive in an organization? Because I know for me, sometimes—and you’re right, I am excellent at going down that every possible negative outcome path; I’ve planned it out pretty well, to my peril, but that means that when I bring up concerns with leadership, I tend to do so in a heightened emotional state. So, what’s your advice for folks?

Natalie: Well, and it’s just that. I think it’s exactly where you’re headed with that is that take the emotions out of it—or attempt to—and try to present your concerns logically. Because there’s going to be situations where what you’re bringing up is something they need to consider, and if you can present it in a logical way, chances are they will, and they’ll take that into consideration. So, I would—like, even if they are going to still move forward with the plans that you’ve somehow don’t agree with, like, let’s assume that some portion of this change, you don’t feel is correct, which is actually one of the most legitimate reasons to worry about this, then what you should do is say, “Okay, look, I have this concern, so here’s the Plan B. But just in case, this doesn’t work. But I think it might not, so here’s a Plan B.”

Like, that’s a way of presenting that in a way that’s not challenging to the situation. So, I’ll give you an example. In the India conversations, one of the things was that I actually did create a Plan B around was the fact that the person was bringing up—I was attempting to have Agile teams where they needed to have very strong ownership, they also needed to be able to self-manage. We talked about self-managed teams in Agile. And India is a very hierarchical culture, and so the thing that they brought up with me is that this isn’t going to work here; it culturally isn’t a good fit.

And frankly, I knew that I was going to—I had this issue within the company, but was it so widespread within India that I couldn’t possibly change it? I hadn’t lived there my whole life, I couldn’t say, right? So, I needed to actually answer that question. And I thought it was a legitimate question, right? And I thought—but it was presented in, you know, a very factual, logical way, and kind of without the emotions, and so it’s like, “Okay, let me think through that.”

And so, we did this as a—you know, we created an experimental team where we tried this out to see if it would work. And it actually did, ultimately, succeed with that team. And I love this team because—I mean, to be fair, I did handpick who went on this team. Like, I did, you know, try to pick people who I thought might be the most likely to succeed. I’m not crazy; I did want it to work, and so you know, I did sort of seed it a bit.

But at the same time, when they came out of that—and they tend to be a little bit younger than I think some of the, you know—because I think their minds were a little bit more open as part of that, but they came out of that, and after about nine sprints, you started to see the junior engineers challenging the more senior engineers, which in India is not like something that you see all that often. They were also able to—the junior engineers were having opinions, they were contributing to the technical discussions. Like, it was actually a pretty radical shift. But they also kind of walked around with this, like, certain swagger that I cannot describe. But it was, like, super fun to watch.

So, you know, you’ve got to see that this was actually going to work, and it could work. And then it became a really good example, for the rest. So, I think the main thing is to help mitigate risk. If you have a real concern over a change that’s coming your way, and it’s something you don’t feel like the company should do, just understand that they may do it instead and that’s not personal, but at the same time, you know, you can help by offering a Plan B or some risk mitigation to double-check that it is going to work or to help it work.

Julie: Absolutely. It’s kind of that whole testing hypothesis, right? We’re going to see if this works; we’re going to evaluate it. One of the things that you brought up that I love and it was something that when I was at PagerDuty, we used to talk about a lot with the postmortem process, which was to involve junior engineers because they tend to look at things differently with that fresh set of eyes.

Natalie: Right.

Julie: And they kind of get us a little bit—the people who’ve been doing it for a very long period of time—a little bit out of your comfort zone because all of a sudden, maybe you’re having to explain something. Jason and I have talked about this a few more times probably than necessary, but just, “Well, we’ve always done it this way because…” and then having to explain that because. You know, one of the things that I find interesting just from your background is—you know, we’ve talked about this, where you scaled that engineering team from 0 to 100, to deliver on custom software engineering contracts, and you’ve done quite a few things over your career. I mean, even working at Oracle—which we were actually just talking about an Oracle outage this morning—but, driving technical programs. And that seems to be a lot of your background. I mean, even at Facet, that you introduced engineering best practices to standardize code reviews and improve test coverage. Do you want to talk a little bit about that?

Natalie: Yeah, I think—I like this—I call it the adolescent version of engineering. It’s where, you know, we’re through the baby part, we need to start to grow up a little bit, we need to go from getting stuff done in some way or another, to something that’s repeatable and scalable. And so, it’s like, that adolescent years, that’s my fun. That’s what I enjoy doing. I call it creating something out of chaos.

Basically, taming the chaos is what it really looks like because it’s very chaotic initially, and that’s true of every, like, small organization; they always start like that. And as they start to grow, you know, you’ve got ten different engineers who have ten different opinions on how something should be done, and so they do it ten different ways. And that’s fine when you’re only ten, but then when you need to go from 10 to 20 to 30 to 100, it no longer works. And you do have to create some standards and still leave enough leeway for people to be able to have their tool of choice based on, you know, what makes sense, right?

So, there needs to be some pragmatism in there, you can’t just, like, also go the [unintelligible 00:16:54] where it’s just one thing. But at the same time, there is some standards and there is some consistency that needs to be created so that, like, when you’re onboarding a new engineer, there’s not 20 things to learn; you can reduce that down to something that’s manageable and you can get somebody onboard and productive within a reasonable amount of time. Otherwise, that’s difficult, even that becomes difficult. So, every part of it that needs to have some level of standards around it—I think the fun in it, too, is finding that balance between introducing enough process that you have some standardization, you have some consistency, but not so much that you slow it down to the point that it’s no longer moving. Because you can; you can strangle a small organization with too much process.

So, it’s finding that middle ground. And yeah, that’s what I’ve pretty much done, like, my whole career in some form or another; it’s what I enjoy. And if it gets to the point where things become too standard, too stable, too done, then I’m probably… I’m going to need to move on to something different and new. You know, that’s going to be where I go do this again, with somebody else.

Julie: Hashtag #startuplife, right?

Natalie: [laugh].

Julie: [laugh]. That’s interesting that you bring up, you know, going from ten people to more, right, where you can just buy any tool you want and reimburse it, and there might not even be a central repo of all the tools that the organization has, to whittling that down into processes that you own, that you control, versus processes that control you. And then bringing those ten people that were there at the beginning that could kind of do whatever they want because the whole goal is to bring this product to market, to refining that organization and helping build out features in service of the customer. So, when you’re looking at the new things that you want to do or prioritizing your time or the engineering team’s time, what are some of the things that you take into consideration?

Natalie: It’s kind of actually very similar to performance when you look at the performance of a system, right? The engineering organization is no different. You need to find your bottlenecks and then you work from there. And the bottlenecks are different depending on which team that you’re looking at, right? So, I like to start to kind of get a feel for what’s working, what’s not working, and where things are slow, [unintelligible 00:19:15] oftentimes what I’m trying to do is to get some speed, to get some speed and consistency tend to be really big things without losing quality. You know, all of those kinds of—those are always the buckets, right?

And so, when you start looking at speed, it really starts to look very much like that performance bottleneck exercise where you just start hitting them one at a time until you, you know, you get through the easy ones and then you start tweaking from there. But for instance, I’ll tell you when I first started with Gremlin, we had a very large team and because of that, stand-ups were very huge, there was too much conversation, they took too long, people—actually the odd thing is that you’ll find people have less ownership when the team is too large because they don’t feel like they’re as part of something that they’re making a huge—as much of an impact on; they don’t feel their impact on a team that’s too large, so when you’re organized in such a way that the teams are very large, you tend to lose some of the qualities of Agile that you’re trying to achieve when you’re doing these little small Agile teams, or at least that’s the thought. So, one of the things I did was split the team. And one of the first things that I did—and that automatically started to create a different dynamic within the teams, and we’re starting to see the results of that. And so, I feel like those are the kinds of things that you do.

Like, that was an easy one; we have to do this, like, that first. Now, like, what do we do next? It depends. It depends, like, where, like, in some cases—I’ll take India, for example—there was a lot of tech debt. So, I had some tech debt that I had to contend with and deal with that was—the way it was built, it was built with this very huge monolithic-style service, and I needed to help them start breaking that into smaller services, mainly because—and they were such a large team, and it was still a monolithic sort of situation, the problem was actually more so than the performance because they had tuned the heck out of that, so that wasn’t it.

Like, the data was very large, so they had already dealt with performance. But the conflict within the engineering teams was a lot because there was so much coordination. And so, by being able to split this up into services that make sense, then the teams can start to own the services and be able to deliver on that with some speed without having to coordinate so much. And every moment of coordination costs you time, right? So, that’s the type of things that you start to look at.

And it could be a technical solution, like in this case, it was breaking the technology, from an architectural standpoint, down into something that makes the teams operate differently, or it can be splitting the teams themselves without changing the architecture. It can be any number of things. But you really have to start to look at what’s causing this to go slow.

Julie: Now, I love that because when everybody owns everything, nobody owns anything, right? And you talked about breaking the teams down into service teams that makes sense. And so, it sounds like it was incredibly intentional; owning your services all the way through into production is really helpful with that speed and that quality. And you mentioned that briefly earlier, which is—what is that? The iron triangle, or whatever they call it, but speed, cost, quality. There’s three things; you can only have two. Which two do you pick?

Natalie: Right. [laugh]. Exactly.

Julie: And I’ve seen that titled as a fallacy saying that you can really have all three, but I don’t really know. What do you think? Speed, cost, quality, can you have all three?

Natalie: Well, so you can maybe have speed, cost, and quality, but if you throw scope in there, [laugh] and you throw that into your [unintelligible 00:22:41], right? Like, because [unintelligible 00:22:42] where you have to start throwing that in. Like, if you look at—so, you know, the triangle that we tend to look at is the time that you’re going to deliver it in, the scope, and the price. Those are the three that I think you can only hold two of. You can go—so by speed when you say speed, cost, and quality, if you go back to your you know, your original one, depends on what how you define speed on whether or not you get quality out of that, right? [laugh].

And so, when you say—but when you start putting deadlines on things, then yeah, you can get quality so long as I can control the scope, right? Because then I can scope it down enough that I can deliver something within that timeline that is of high quality, right? So, those are the trade-offs that you have to make. And no, I don’t—I still feel like in that particular three-legged stool, you know, there’s only two of those you get, that somebody else outside of your organization can handle. You do have to—otherwise, you know, you can’t possibly deliver everything in the world within a really short timeframe and expect the quality to be high.

Julie: Yeah, wouldn’t that be nice if you could, right? But that’s why we talk about learning from our failures. That’s why we talked about Chaos Engineering and understanding our systems. Because in all reality, we do have timeframes that we need to get things out, and we have to make our systems as reliable as possible. But then where do we find the gaps that we may have missed because of speed, because of that timeliness?

Natalie: Well, and when you start looking at things like, you know, quality, there’s certainly things that you can do, but if you go back to Chaos Engineering—we talk about that for just a second, and we look at the changes that people are afraid of. What happens when you go in and you tell a place, “To improve your quality I’m going to actually start shutting down your host.” They’re like, “I’m sorry, what?” [laugh].

Julie: [laugh].

Natalie: That’s a very difficult conversation, right? So, I feel like it’s one of those things where once you see that and why you would do it and then, like, you make the adjustments to that, and then it becomes a part of your—doing this sort of change is actually, you know, something that you just do on a continuous basis; it’s no longer something that you’re afraid of, right? And I think that’s true of just [unintelligible 00:24:48] in general. Like, you know, once you start getting into the habit of it, whatever that habit might be—and automation, by the way, is one of those things—and whether it be automating regular tests, whether it be automating Chaos Engineering tests, like any of this automation, that’s actually a key to speed with engineering. And the reason for that is because those are so closely linked.

I go back and I talk about automation and confident mindset. This is really the two things that give you speed in an engineering organization. And the reason is because if you can automate it enough, you can—you know, obviously there’s just some speed that comes from automation, you know, that you’re not doing things manually, that’s great. But the thing that you miss in that, or that you don’t necessarily think of, is the fact that there’s, like, an automated safety net under you, like, through testing, through, like, you know, the systems-level testing, Chaos Engineering, you know, the engineers now feel more free, they’re more confident, they’re able to make changes at a much more rapid pace. It feels less risky because they’re able to make this change and then they know that the tests are going to catch them, right?

So, if they’ve screwed something up, something else is going to stop it before it heads to production. So, they’re just more—they’re able to just move forward at a faster pace than they would otherwise, right? So, that automation, the speed that you get out of it goes far beyond just you taking the manual process down to an automated one; it’s creating the safety net that gives them the confidence to just move without thinking. And that’s huge. Like, that’s a big deal.

It’s also—back to your thoughts on junior engineers—it’s also why I think it’s really important to make sure there’s people in the engineering team who [unintelligible 00:26:26] three years, like, three years of experience. It’s like you know enough that you can make really good progress and you can be useful, but you don’t know so much that you’re afraid. Like, there—[laugh] because that confident mindset I’m back to, it really matters. Like, it makes such a big difference in the teams that will move quickly and teams that will not.

Julie: I love everything that you just said. And I just saw a tweet from Kelsey Hightower that he tweeted just a couple of days ago; I saw it just before we recorded this. So, he said, “…as an industry we’ve been pushing… Automate. Automate. Automate. And we haven’t been saying… Understand. Understand. Understand. Because if you understand what you’re doing, you can automate it if you want to.”

And I think you just touched on that. And I think you touched on a lot of the having confidence, that what you’re doing—that there’s safety and even if there are failures, that they’re going to be caught. And I think that all ties together beautifully. Now, with that, because I do realize that we are running out of time, I just want to say, so for you, you are giving the closing keynote today at DevOpsDays Boise. And we’ve talked a lot about overcoming fear during this podcast, and I know that this was something that made you a little bit uncomfortable. Can you tell me why you chose to do this? Why did you choose to overcome this fear?

Natalie: Because of my position and the fact that I’m female, I get offers. And I just made a deal with myself about, you know, a few months ago that said, you know, I wouldn’t turn these down. And primarily it’s because I feel like it’s important that at least some women are out there and are serving as examples for others. Like, I’m not saying that I’m going to have, like, the best things to say all the time, and I think that’s okay. I don’t think every man that comes on a podcast has the best things to say either, right?

So, I feel like it’s just one of those situations where we need examples for ourselves, and I think it’s important that, you know, we see ourselves in the—in what’s—in what’s, I guess, the speakers and the participants, right? And so, I want to make sure that I do my part in that, I guess.

Julie: Well, thank you. And you heard it here first, folks. If you need Natalie to speak at your conference, she made a deal with herself [laugh] that she would not say no. We’re really excited to have you both on the podcast and speaking at DevOpsDays Boise. So, thank you, Natalie, and thank you for joining us on Break Things on Purpose. And good luck on your talk today.

Natalie: Thank you. Appreciate it. Enjoyed it. [laugh].

Julie: Have a wonderful day.

Natalie: You too.

Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called Battle of Pogs by Komiku and is available on loyaltyfreakmusic.com.

Julie Gunderson
Senior Reliability Advocate