Podcast: Break Things on Purpose | Armon Dadgar, CTO and Co-founder of HashiCorp
Break Things on Purpose is a podcast for all-things Chaos Engineering. Check out our latest episode below.
In this episode of the Break Things on Purpose podcast, we speak with Armon Dadgar, CTO and co-founder of HashiCorp.
Jason Yee: Welcome everyone to the Build Things on Purpose podcast. This is a nice spin off of the Break Things on Purpose podcast, where we talk to people that are builders, that have created really cool software that hopefully helps us build really cool, reliable things. So joining me today, I've got Armon Dadgar from HashiCorp. Armon, why don't you say hello?
Armon Dadgar: Hey Jason, thanks so much for hosting me. Great to be here, happy to chat.
Jason Yee: Yeah, so how are things at HashiCorp?
Armon Dadgar: I think better than probably for most. We were a distributed company at the start of the year, so for us, getting from 90 to 100% was not so bad. But it's been a super fun year. We've had two brand new product announcements with Boundary and Waypoint, and we got to announce the HashiCorp Cloud Platform and shift into managed services with AWS and Azure this year. So it's been filled with a lot of new products and it's been fun, given everything else.
Jason Yee: Yeah. So given the theme of the podcast, you have been building a lot of really fantastic, interesting stuff. And I think most of the listeners are already familiar with the company just from using things like Vagrant, Terraform and all these other great products. So you did mention two new products, Waypoint and Boundary. So tell me a little bit about Waypoint. What is it?
A tool to unify devs, ops, and release engineers
Armon Dadgar: So Waypoint's probably the weirdest product in our portfolio to describe, if only because I think we're taking a different take, I guess, on application delivery. I think the big question for us, for a long time, from users has been, "What's the developer interface to this stuff?" And what I mean by that is we make a whole bunch of ops tooling, and so we often get asked, "Should developers learn Terraform? And is Terraform the way I deploy my apps? Should developers learn Kubernetes, and use CRDs or something? Is that how they should deploy it?"
And so I think there's been this question of what's the dev interface? And for us, the answer has always been none of those. Those are all operator tools. They're designed for operations and that's why they're complicated and powerful. They support all these things.
And so I think Waypoint was our, "Okay, what do you want the dev experience to be?" And what we wanted it to be was one command. It's waypoint up, like what we did with Vagrant. It was vagrant up. So with waypoint up, where we wanted to get to is that's all it takes to take your source code and a small manifest and build the app, whether it's a Docker container, whether that's a VM, whether it's a serverless thing. It does the build, it does the deploy, it does the release, and you don't really have to think about it as a dev. You're like, "Okay, cool, it just worked."
Jason Yee: Nice. So along with that, I guess, how does that actually work? As a developer, I think I'm used to understanding how my application gets built, but when it comes to that deploy side, what happens there?
Armon Dadgar: It goes back a little bit to the Tao of HashiCorp, the design ethos. So we talk about the Tao, we've published it before and spoken on it, which is there's a core set of, I'll call it guiding philosophies, that HashiCorp uses when we build products. And one of those that's the Holy Grail for us, I would say, is this notion that the workflow is always more important than the technology. And the extension to that, if you read the corollary, is that the technology has to be pluggable because there's never going to be one ring to rule them all. So I'd say that's probably the most deeply ingrained ethos for HashiCorp. And so for Waypoint, what does that mean?
I think the thing that we spent a lot of time on was, okay, from a developer's viewpoint, I want a super consistent workflow. Right? So that one command, waypoint up, does the whole shebang. Really, you could think about that as three sub-workflows: it's waypoint build to actually do the build and produce an artifact, it's waypoint deploy to actually do a deploy and get something out that's running, and it's waypoint release to do the release management. So there are these three sub-workflows all under one parent release workflow. But then I think the other side of it, that pluggability aspect, that technology sensibility, comes in terms of how Waypoint actually works, which is it's super plugin driven the same way Terraform is, right?
So it's the same way with Terraform as a consistent workflow, right? It's terraform plan, terraform apply. It doesn't matter if it's AWS, Google, Azure, whatever. Then we have these plugins that allow Terraform to reach into all that. Waypoint kind of works the same way, where we have build plugins, like buildpacks and Docker, right? We have deploy plugins like Nomad or Kubernetes or ECS or Google Cloud Run. And then we have release plugins, which will manage things like traffic control and traffic shaping and things like that. Right? So it's these different plugin surfaces, a common core that orchestrates everything, and then wrapping it all up into one common developer workflow.
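The build, deploy, and release split described above can be sketched in Go. This is only an illustration of the idea of a common core orchestrating swappable plugins; it is not Waypoint's actual plugin SDK. The `Builder`, `Deployer`, and `Releaser` interfaces, the `Pipeline` type, the toy plugins, and the `example.invalid` URL are all hypothetical names invented for this sketch.

```go
package main

import "fmt"

// Builder, Deployer and Releaser are hypothetical plugin interfaces made up
// for this sketch. They are NOT Waypoint's real plugin SDK.
type Builder interface {
	Build(source string) (artifact string, err error)
}

type Deployer interface {
	Deploy(artifact string) (deployment string, err error)
}

type Releaser interface {
	Release(deployment string) (url string, err error)
}

// Pipeline plays the role of the "common core": it only knows the workflow,
// not which technology sits behind each step.
type Pipeline struct {
	Builder  Builder
	Deployer Deployer
	Releaser Releaser
}

// Up runs the three sub-workflows in order and stops at the first failure.
func (p Pipeline) Up(source string) (string, error) {
	artifact, err := p.Builder.Build(source)
	if err != nil {
		return "", fmt.Errorf("build: %w", err)
	}
	deployment, err := p.Deployer.Deploy(artifact)
	if err != nil {
		return "", fmt.Errorf("deploy: %w", err)
	}
	url, err := p.Releaser.Release(deployment)
	if err != nil {
		return "", fmt.Errorf("release: %w", err)
	}
	return url, nil
}

// Toy plugins: they only build strings so the example stays runnable
// without any real infrastructure.
type dockerBuilder struct{}

func (dockerBuilder) Build(source string) (string, error) {
	return "image:" + source, nil
}

type nomadDeployer struct{}

func (nomadDeployer) Deploy(artifact string) (string, error) {
	return "nomad-job(" + artifact + ")", nil
}

type kubernetesDeployer struct{}

func (kubernetesDeployer) Deploy(artifact string) (string, error) {
	return "k8s-deployment(" + artifact + ")", nil
}

type staticReleaser struct{}

func (staticReleaser) Release(deployment string) (string, error) {
	return "https://example.invalid/" + deployment, nil
}

func main() {
	// Same workflow, two different deploy targets: only the plugin changes.
	for _, d := range []Deployer{nomadDeployer{}, kubernetesDeployer{}} {
		p := Pipeline{Builder: dockerBuilder{}, Deployer: d, Releaser: staticReleaser{}}
		url, err := p.Up("hello-world")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(url)
	}
}
```

Swapping `nomadDeployer` for `kubernetesDeployer` changes where the app lands without changing the `Up` call, which is the point of keeping the workflow fixed and the technology pluggable.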
Jason Yee: Nice. So this is really a tool where developers, your traditional ops and release engineers, would all collaborate together and work on their different parts. But the entire experience then is really unified.
Armon Dadgar: Yeah, and our thinking was, if you zoom into what we see happening in most organizations, you have your ops team, who's building this platform, and they're bringing in different pieces of technology. Maybe they are using Terraform for provisioning, Kubernetes as the runtime, and Vault for secrets management. And then they're building their own, I'll call it, build-your-own platform, right. They're kind of cobbling it together. And then they're creating an interface for the developers internally, which might be a Makefile. It might be a CI/CD pipeline, it might be an internal web portal. And so we looked at it and said, okay, instead of that DIY, could you just have your ops team configure Waypoint? You give that Waypoint config to your developers and say, great. Now you only use Waypoint on your command line or through the UI or whatever, and you get that platform experience, but your ops team still controls how the system works.
Jason Yee: Nice. So I'm curious, as someone who loves to try different things, and I'm sure a lot of our listeners do as well. If I wanted to try out Waypoint and actually get this running in my organization, what's the number one piece of advice that you'd have for me trying to roll this out?
Armon Dadgar: Ooh. So I will add the caveat that we literally just shipped the 0.2 release. So go in eyes wide open with that. This thing will be a little sharp around the edges. Right? We just released it, so forgive us. Where I would try it, and I think where so far we've seen the biggest wow factor, is some of these, I'll call it, low-infrastructure platforms. Take an AWS ECS or an ACI on Azure or Google Cloud Run, where you're like, really cool platform technology, but actually a pretty, I'll call it, bad workflow for developers. And I'd say take a simple Waypoint app, like our little Go Hello World thing, and try and deploy it onto an ECS or a Fargate or an ACI. And I think those have been the most fun to me, watching users come back and be like, "Holy cow, I went from downloading Waypoint to it running in five minutes. What? That's amazing."
Jason Yee: That is really cool. Although I would caution you that, I mean, you know our listeners, and you've been in this community of people that maintain things for a long time. You're telling us, "Oh, this is still early on, it's 0.2." But I should remind folks that tools that we've been using all along for years, like Consul and Vault, are still beta, right? So you build such great software that people will use it. So I'd imagine that next year, this time, even if you're still not 1.0, there's going to be a ton of people already using it.
Armon Dadgar: Yeah. We're so close now. The only product left that's still not a 1.0 is Terraform, but we're going to get there.
Jason Yee: That's the other one. Yeah, not a 1.0 and I still use a ton of Terraform.
Armon Dadgar: Only took us six years to get here.
Lowering the friction of security
Jason Yee: So onto that, the second product that you had just released, you mentioned Boundary, what is Boundary?
Armon Dadgar: Yeah. So Boundary was us really taking a look at, how do we get access to private resources, right. And what I mean by that is, great, I'm a developer. I need to access a private web service that I'm working on, or I need to connect to a database that's running in my production network. How do I actually do it? And I think part of what colors this for us is that so much of our time is also spent with our enterprise customers, where their environment looks even more different than if you ask, how would I do this at a small startup? Right. At a small startup, it's probably easier. Everyone just has SSH keys to production, and you just kind of jump into whatever you need. I think when you look at the backdrop of an enterprise setting, there are way more controls, way more overhead in terms of the hoops you jump through.
Right. So I think what we tried to do with Boundary was to say, great, what do we want the experience to be? I want you to basically be able to sign on through some single sign-on, maybe your GitHub identity. Maybe you're using a Google domain, maybe you're using Okta in the enterprise, whatever it is. I want to be able to see a catalog of what I'm allowed to connect to, be able to basically double click it, right, or run one command through the CLI, and then be connected to whatever that private resource is, and then that's it, right? Like that's how easy I want private access to these different things to be. But I think it's deceptive, given all the things that you have to solve for to actually get an experience like that to work.
Jason Yee: Yeah. That's interesting. Cause I mean, one of the things that HashiCorp has always done is take that developer, that engineer perspective. So I think when Boundary was first released, I thought, "Oh, this is interesting. Are you getting into the security space?" But it sounds like it's more that security is necessary, but it's really still an engineer's tool just to have access.
Armon Dadgar: Yeah. And I think what we see, right, is if you go into a slightly larger setting, right. Even at HashiCorp, we started having to have these controls, right. Because you start selling to customers who ask: are you PCI compliant? Are you SOC compliant? Right? Once you try and hit those things, you have to bring these controls into your environment. And so I think you look at it and you're like, okay, take the average developer who has, let's say, access to the system. First, they need a VPN. Right. So they have a set of VPN credentials, or maybe you have to distribute certificates to them. That's a whole pain, cause now you need a separate process for generating and distributing these certificates, and they have to expire. So the whole thing is kind of a nightmare. Then you're like, okay, cool.
You VPN into my private network for, let's say, HashiCorp Cloud. I don't want the people who are VPN'd in accessing any random thing within our production environment. So great, now I have to put a bunch of network controls around it as well. So it's like, okay, if you're coming through this subnet or from this IP, what can you actually have access to? Right. And third, if I actually want to do things like session recording of, great, Armon connected to the database, what queries did he run, did he exfil user data? Now I need a third thing, a jump box that I'm going through that's actually doing that kind of session recording, so I have forensic information. Because otherwise I'm telling my customers I have this forensic information, so I better have a control in place. So you look at that and say, okay, I actually had at least three or four different controls, right?
You had a VPN, you had a firewall, you had a privileged access management system. You need a username and password to actually get to that database. And you're like, so what does that developer experience look like? And that thing is terrible. There are so many different clients I need, and so many different hoops I'm jumping through just to connect to this database. Versus what we tried to do, which is to say, can we compress all that down into the Boundary workflow so that, great, it sources the credential for you from Vault automatically. It discovers the IP address from Consul automatically. It ties into your identity based on your single sign-on automatically. How do we make that experience super, super basic for the developers and the operators? We're trying to do this all the time.
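The idea of compressing those separate controls into one workflow can be sketched in Go. This is a rough illustration under stated assumptions, not Boundary's actual API or architecture: the `Broker` type, its fields, the `Session` struct, and the demo tokens, targets, and addresses are all hypothetical. The function fields merely stand in for single sign-on, service discovery (the role Consul plays in the conversation), and a secrets manager (the role Vault plays).

```go
package main

import (
	"errors"
	"fmt"
)

// Session is what the developer ends up with after one Connect call.
// Every name in this file is hypothetical and invented for the sketch.
type Session struct {
	User, Target, Address, Username, Password string
}

// ErrNotAuthorized is returned when the target is not in the user's catalog.
var ErrNotAuthorized = errors.New("not authorized for target")

// Broker bundles the separate controls behind a single call.
type Broker struct {
	Authenticate func(token string) (user string, err error)                 // single sign-on stand-in
	Allowed      map[string][]string                                         // user -> catalog of targets
	ResolveAddr  func(target string) (addr string, err error)                // service discovery stand-in
	IssueCreds   func(target string) (username, password string, err error) // secrets manager stand-in
}

// Catalog lists what a user is allowed to connect to.
func (b Broker) Catalog(user string) []string { return b.Allowed[user] }

// Connect runs identity, authorization, discovery and credential sourcing
// in order, so the developer never handles any of them directly.
func (b Broker) Connect(token, target string) (Session, error) {
	user, err := b.Authenticate(token)
	if err != nil {
		return Session{}, fmt.Errorf("authenticate: %w", err)
	}
	allowed := false
	for _, t := range b.Allowed[user] {
		if t == target {
			allowed = true
			break
		}
	}
	if !allowed {
		return Session{}, fmt.Errorf("%w: %s -> %s", ErrNotAuthorized, user, target)
	}
	addr, err := b.ResolveAddr(target)
	if err != nil {
		return Session{}, fmt.Errorf("resolve address: %w", err)
	}
	username, password, err := b.IssueCreds(target)
	if err != nil {
		return Session{}, fmt.Errorf("issue credentials: %w", err)
	}
	return Session{User: user, Target: target, Address: addr, Username: username, Password: password}, nil
}

// newDemoBroker wires the broker to in-memory fakes so the example runs
// without any real identity provider, registry or secrets store.
func newDemoBroker() Broker {
	return Broker{
		Authenticate: func(token string) (string, error) {
			if token == "token-alice" {
				return "alice", nil
			}
			return "", errors.New("unknown token")
		},
		Allowed: map[string][]string{"alice": {"orders-db"}},
		ResolveAddr: func(target string) (string, error) {
			return "10.0.0.12:5432", nil
		},
		IssueCreds: func(target string) (string, string, error) {
			return "tmp-" + target, "short-lived-secret", nil
		},
	}
}

func main() {
	b := newDemoBroker()
	fmt.Println("catalog:", b.Catalog("alice"))

	s, err := b.Connect("token-alice", "orders-db")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s connected to %s at %s as %s\n", s.User, s.Target, s.Address, s.Username)

	// A target outside the catalog is rejected before any credential is issued.
	if _, err := b.Connect("token-alice", "billing-db"); err != nil {
		fmt.Println("denied:", err)
	}
}
```

From the developer's side there is one call, `Connect`, and the credential never has to be copied around by hand; the operators keep control by deciding what goes into the catalog and how each function field is implemented.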
Jason Yee: Yeah. I like that. And again, I think I like it simply because it is that engineering workflow. Right? You've got the engineer in mind. I am curious though, because it does tie together so many different pieces, obviously a lot can go wrong. So back to that same question that I'd asked you about Waypoint. If I wanted to implement this or one of our listeners did, what's the number one piece of advice? Cause it sounds like this thing could go sideways really quickly.
Armon Dadgar: Yeah. It's a good question. You know, in some sense I would think about it as just as important to your infrastructure as your VPN or your SSH bastion host was, right. Cause this becomes a node at the edge of your network that allows you to actually get in. Right. So if it goes down, you could lose access to your production environment. So it's one of those things where you want to spend that little bit of extra effort, and we've published reference guides and documentation on the deployment architecture we recommend. But it's one of those things where you want it to be HA, because if you lose the one jump box that lets you get into production, you really regret not having that second host that was going to be there to fail over to.
So I'd say spend that little bit of effort in terms of, what does a reasonably production-grade, HA-grade deployment of this actually look like? But I'd say because we're HashiCorp, because we're deeply on the operations side ourselves, we've spent a bunch of time on how you make sure this thing is operable and not a nightmare. I have set up OpenVPN servers in my lifetime. I know what that pain is like, and we didn't want that to be the Boundary experience. So it's designed not to be an operational nightmare, unlike some of those other systems, but still. It is security-sensitive software. It is going to run at the edge of your network and give you private access. So it's worth actually setting it up correctly.
Jason Yee: Yeah. And I'm assuming that all of that HA configuration and that failover stuff is rather built-in, right?
Armon Dadgar: Right. Yeah. No, I mean, for us, that's a 0.1 requirement of the design. The design of the system has to support this out of the box.
Jason Yee: Nice. Well, thanks for joining us and sharing these really cool tools with us and things that you've built. Absolutely excited to try some of them out. So for our listeners, if you do manage to try them out, feel free to reach out and let us know how it goes. Thanks for joining us.
Jason Yee: Yeah. Where can I find those forums?
Armon Dadgar: You can find that through just discuss.hashicorp.com, and then we link to all of them from the community page on all the open source sites as well.