Towards a Better Dev Environment

One of the things I encounter frequently in employment¹ is “our developer environment is bad”. What “bad” means in this context can change in the specifics, but the main thrust is the same: doing something in our codebase feels harder than it should be or takes longer than it should.

I have seen many attempts to address this problem. Usually they fail. Sometimes the fixes are surface-level and address the symptoms but not the causes. Sometimes people don’t believe things can be better and refuse to try, or to let others try. Many times the dev environment’s woes are simply a manifestation of the suboptimal social conditions of that organization. Sometimes people burn out on the problem space to the point where there’s no appetite to even discuss it anymore, where the attempt at solving things hurts more than the dev environment did.

I have, rarely, seen this actually work. When it does work, it’s magic. Within a quarter, it’s obvious that developers are markedly more productive. A step-change has been unlocked for the organization. The local maximum has shifted upwards.

I think a lot about what the times I’ve seen it work have in common. Sometimes they have no state to manage, which eliminates an entire class of problem, and it turns out you’re more likely to succeed at things when they’re easier. Sometimes they have tiny teams, or teams with lots of experience and clear boundaries, and it turns out you’re more likely to succeed at things when their scope is narrower.

But I don’t think that’s it.

I think the most predictive indicator I’ve found for whether improving a dev environment is going to go anywhere is whether the people using it are willing and able to change how they write software.

Dev is just another environment

A decade ago, the idea was popularized that perhaps it wasn’t great for engineers to work on features and then throw them over the wall at ops people who were responsible for running them. That maybe devs needed to be running the stuff they write, partially because then they can’t externalize pain into the faces of an operator. If something is easy to write but painful to run, you’re way more likely to do it if you’re not the one running it.

We called the whole thing DevOps, and it lasted like 0.3 seconds before job postings for DevOps Engineers started happening and the whole point was missed. Anyways.

Sometimes people see the dev environment suffering, correctly point out that it’s nobody’s responsibility to make it not suck, and suggest that a dev tooling team should be made to take ownership of it. I am always against this proposal. It always ends in a monstrosity of Docker containers and shell scripts as your developers try to keep a whole Kubernetes cluster and a bunch of databases running locally. It’s easier to run this in prod if we outsource file serving to S3, and it’s easier to write it if we assume it will always get outsourced to S3, and we don’t need to care about how that will happen on developers’ laptops because that’s someone else’s job. And so some poor bastard is finding an S3-compatible Docker container to run and then needing to write guides on how to resolve permissions issues and how to configure clients for it.

I do not think that you can make any environment, be it dev, prod, or staging, better, faster, or more reliable by creating a team of Sin Eaters to try and wrangle it.

The cloud cannot save you

Sometimes people see the abominations on developer laptops and think “hey, we run this kind of stuff in prod already, we’re pretty good at it, let’s run a copy of it for dev, too.”

Unfortunately, prod is expensive. Also unfortunately, you have a lot of developers.

Multiplying your infrastructure costs by your headcount will make your CFO hunt you for sport.

“Okay,” you think, “we’ll cheap out. They can make do with smaller instance sizes. They can share a database.”

Congratulations, now dev is slow again. Also, people can’t develop on an airplane. Also, people can’t work if their home or the office loses internet. Also, if someone makes a mistake that borks the database, enjoy the lost day of productivity for all your engineers. Also, good luck securing that environment where unreviewed code runs, on the internet. Also, you need to keep that whole thing up to date and stand up new services and keep them online any time prod changes. Also, sucks to be a developer if you need to insert a breakpoint to debug.

You’ve not solved your problem, you’ve shifted it to the cloud, created new ones, and are now paying for the privilege.

There is a way out of this madness

Unfortunately, the way I’ve seen this work equates to eating your vegetables. There is no magic trick I can give you, no fancy technology I can sell you that solves this.

You need to, as an organization, decide that a dev environment that works is important to you. That it’s more important to be able to move fast in general than it is to get a speed boost in any specific case.

You need to trust your engineers to be capable, and set up the environment and incentives that will let them be capable. The answer to introducing a new tool or paradigm isn’t to paper over it with a bunch of half-assed shell scripts, it’s to provide the learning on-ramp to make it digestible and the infrastructure to make its community-supported tooling approachable to your org. It is upsetting the number of times I have paired with a more junior developer as we blew three hours together tracing our way through shell scripts that had accreted over an open source project until we found the actual thing the project was being told to do and could Google it. Learning is part of the job, and we cannot keep pretending that it isn’t.

The question “and how will we run this in dev?” needs to become a refrain. Every time a new dependency is introduced, the question needs to be asked. Sometimes the answer will be to use dependency injection so that dev never actually runs the dependency, only a substitute that is trivial to run. It is annoying and error-prone to run an S3 server in dev; it is straightforward to write a file to disk and serve it. Sometimes the answer will be to use a Docker container or even a cloud service. This needs to come with acceptance of the associated cost of teaching developers to understand and use this dependency appropriately.
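
To make the dependency injection answer concrete, here is a minimal sketch in Python of what the file storage case might look like. Every name in it (FileStore, LocalDiskStore, AvatarService) is made up for illustration; none of it comes from a real codebase or library. The assumption is that application code only ever talks to a small storage contract, that prod wires in an S3-backed class with the same two methods (not shown here), and that dev wires in the disk-backed one.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class FileStore(Protocol):
    """Hypothetical storage contract the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class LocalDiskStore:
    """Dev substitute: stores each key as a file under a local directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _path(self, key: str) -> Path:
        # Refuse keys such as "../secrets" that would escape the root.
        path = (self.root / key).resolve()
        if not path.is_relative_to(self.root):  # Python 3.9+
            raise ValueError(f"key escapes storage root: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


class AvatarService:
    """Hypothetical application code: it only knows about FileStore."""

    def __init__(self, store: FileStore) -> None:
        self.store = store

    def save_avatar(self, user_id: str, image: bytes) -> str:
        key = f"avatars/{user_id}.png"
        self.store.put(key, image)
        return key

    def load_avatar(self, user_id: str) -> bytes:
        return self.store.get(f"avatars/{user_id}.png")


if __name__ == "__main__":
    # In dev, inject the disk-backed store; nothing here needs a network.
    with tempfile.TemporaryDirectory() as tmp:
        service = AvatarService(LocalDiskStore(Path(tmp)))
        key = service.save_avatar("user-1", b"not really a png")
        print(key, service.load_avatar("user-1"))
```

The point of the shape is that AvatarService never imports anything S3-specific, so the question of how to run it in dev is answered by which store gets passed in at startup. The trade-off is that the disk-backed store will not reproduce S3 behaviours like permissions or eventual consistency, so those still need to be exercised somewhere other than a laptop.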

Sometimes the answer will be “not every dev needs to run this, and we will architect in such a way as to ensure they don’t need to.” And that is okay. The amount of time I have spent helping developers fix dependencies and subsystems in their dev environment that were wholly unrelated to the thing they were actually trying to do is criminal. Write your software with clear boundaries and explicit contracts and stop trying to run a copy of prod on your laptop.

The maximums, they are local

You can have nice things. You need to decide that nice things are possible. You need to decide that you deserve nice things. You need to decide that you want nice things.

If you want a better dev environment and don’t seem to be gaining any traction, consider whether the way you write software is conducive to having a better dev environment.

Footnotes

¹ It was a calculated decision to write this when I had no employer and therefore nobody in specific I could be subtweeting.