Bazel is too pure for this world
At the 10-year anniversary of Bazel's announcement, I reflect on my disappointment that Bazel never became a viable build system for small-time development.
A few weeks ago was the 10-year anniversary of Bazel’s public announcement. An engineer from the founding team published a retrospective outlining the history of the project.
As a Blaze-pilled former Googler, I originally wished Bazel would become a good build system for every language. I didn’t want it to be the only build system for each language. I didn’t care if it was the best one. I just wanted to be able to easily use it everywhere. But it never reached “viable” in many situations, for reasons that I’ll get into below. I’ve come to think that Bazel is for three groups of people:
Blaze-pilled former Googlers [1].
Companies that are large enough to staff a build infrastructure team.
Companies with insanely slow builds or tests.
There are certainly other people who use it. They could probably live without it.
Why are its use cases so narrow? It’s because Google is a separate branch of engineering evolution. If Google hired an experienced engineer off the street today, that engineer would think, “why is all of this so alien? Why can’t I recognize any of this?” It’s because Google is an old company by internet standards. They had to invent their ability to scale as they went along. So they invented a bunch of new engineering, and then Google ossified around that engineering. So now onboarding into Google means learning this ossified way, the Google way [2].
Bazel was born from Google’s build infrastructure. It has the same virus: Bazel makes you do things the Bazel way. This is a huge problem for Bazel. The world has drifted away from Bazel’s happy place.
Bazel wants every target to be fully specified. It should have an exact set of inputs and a well-defined configuration. It should be able to clearly define all of its outputs ahead of time. And everything must be reproducible: the same inputs should always produce the same outputs. The build system also wants to touch every directory of your project. Ideally, you would define per-module BUILD files and maintain them over time.
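To make “fully specified” concrete, here is a toy Python sketch of the idea. It is not Bazel’s actual implementation, and the function name action_key and the file contents are made up for illustration. A build step is identified by a digest of its command, its declared inputs, and its environment, so identical inputs always map to the same key.

```python
import hashlib


def action_key(command, input_files, env):
    """Digest of everything a build step is allowed to depend on (toy model).

    command:     list of argv strings
    input_files: dict mapping path -> file contents as bytes
    env:         dict of environment variables visible to the step
    """
    h = hashlib.sha256()
    h.update("\0".join(command).encode())
    # Sort paths so the order in which inputs were declared does not matter.
    for path in sorted(input_files):
        h.update(b"\0" + path.encode() + b"\0")
        h.update(hashlib.sha256(input_files[path]).digest())
    for name in sorted(env):
        h.update(f"\0{name}={env[name]}".encode())
    return h.hexdigest()


# Hypothetical inputs: the same files declared in a different order.
key_1 = action_key(["javac", "Foo.java"],
                   {"Foo.java": b"class Foo {}", "Bar.java": b"class Bar {}"},
                   {"LANG": "C"})
key_2 = action_key(["javac", "Foo.java"],
                   {"Bar.java": b"class Bar {}", "Foo.java": b"class Foo {}"},
                   {"LANG": "C"})
print(key_1 == key_2)
```

If a step reads a file that is not in input_files, the key cannot notice when that file changes. That is why a build system built on this model insists that you declare everything.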
In practice, the world has moved more aggressively towards “convention over configuration” and not worrying too much about the details. Take Go for instance: you can just run “go install” or “go get” or “go build” and it Just Works. Or take building up a Next.js directory structure.
And the world has also moved towards command runners and layers of compilers. For example, a modern frontend project would not have a monolithic compiler that understood how to use every single layer of the app. You might have the Tailwind compiler that dealt with the Tailwind, then a SCSS compiler that dealt with the SCSS, then some React compilers and plugins, and throw in a “CSS-in-JavaScript” plugin. Then all you need to do is run the correct npm run command, and your build system will invoke all of these compiler layers under the hood.
Bazel has tried to adapt to this over time. For example, additions like Gazelle, the modern frontend rules, and rules_foreign_cc acknowledge that it’s difficult to build and maintain BUILD.bazel files, and additionally that most things work with external build systems that Bazel must interface with.
And yet, every time I use Bazel, I fall into some corner case that just doesn’t work, and then I’m arms-deep in someone else’s rule trying to figure out why the quoting is broken or something like that. I always hated running into some obscure command quoting problem when using a rule, or writing patch files for imported third-party libraries to make them work in the hermetic build environment, or discovering that I was downloading a third-party dependency that doesn’t always have a consistent hash, or upgrading Bazel and getting a host of deprecation warnings [3]. I just don’t run into these problems when I don’t use Bazel.
Another common problem I’ve had: I work on OSX about 70% of the time and on my Windows gaming desktop about 30% of the time. Thanks to a job I had 15 years ago, I’m comfortable developing on Windows and would rather not use WSL unless forced to. And Bazel always forces you, since many rules are powered by genrules (glorified Bash scripts) and many libraries do not include the Windows-equivalent commands. If you think “lul who uses Windows?” you may be surprised to learn that a plurality (and almost a majority) of developers use Windows professionally.
And empirically, developers often don’t want Bazel. How do I know? I went through all of the open-source libraries on Bazel’s “Who's Using Bazel” page. Some of them are abandoned micro projects that never should have been listed. Even worse, some of those abandoned micro projects had already deleted their Bazel files by the time they were abandoned.
But I want to draw your attention to the following two projects:
https://github.com/google/nomulus, Google’s cloud service for operating TLDs. Bazel was removed in favor of Gradle.
https://github.com/kubernetes/kubernetes, the project that Google open-sourced which became the popular orchestration layer we all know and love. Bazel was removed in favor of Make.
Nomulus is owned by Google. Kubernetes started at Google. If anyone could figure out how to make Bazel work, it’s these two projects. But it turns out that Bazel just isn’t a killer feature here, relative to “we have lowered the barriers for high-quality contributions to our codebase.”
In an open-source context, you should just follow open-source conventions. Nobody wants to learn the BUILD.bazel syntax. Nobody wants to know what a genrule or a custom rule is. It’s a Java project. They just want to invoke gradlew or npm run or make or pip or whatever people typically do.
But oh man, if you’re in a company that can afford to maintain Bazel? It’s incredible. Building code at Google was incredible. I can see why entire companies have sprung up to provide consulting services or incremental technology improvements for Bazel. And I can see why ex-Googlers everywhere are torturing their new employers with the threat of Bazel [4].
You can do some really cool things with Bazel. When I was at Etsy, the Java codebase that powered the search stack was tragically slow to work on. Builds took a while, and rerunning the test suite took forever. It noticeably impacted development speed.
An engineer (somehow not ex-FAANG) prototyped a rewrite in Bazel. Of course, you need to do things The Bazel Way, so it took him a few weeks to break all of the circular dependencies that the previous build system was chill with, but that Bazel absolutely COULD NOT accept. But he broke the circles one at a time, and eventually he had something that could compile.
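For a sense of what “breaking the circles” involves, here is a minimal Python sketch of finding one cycle in a dependency graph with a depth-first search. The target names are hypothetical, and this is only an illustration of the general technique, not how Bazel reports cycles.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of targets, or None if there is none.

    deps maps each target to the list of targets it depends on.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []  # the chain of targets currently being explored

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for dep in deps.get(node, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # dep is already on the current chain, so we looped back to it.
                return stack[stack.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = DONE
        return None

    for node in deps:
        if state.get(node, UNVISITED) == UNVISITED:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical targets that depend on each other.
graph = {
    "//search:index": ["//search:query"],
    "//search:query": ["//search:index"],
}
print(find_cycle(graph))
```

Fixing the graph means repeating this until the function returns None, typically by extracting the shared code into a third target that both sides can depend on.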
The project was immediately greenlit when he demoed building the project and running the test suite. The build was already faster than it was before. And then he made a single change and reran the tests. Only the 2 tests that depended on the file rebuilt and reran. Instead of taking minutes to build and run the whole suite, it was over in a few seconds. After that, they went even further and used rules_k8s to quickly push containerized builds to any of their clusters. It was super cool, and I later tried it myself and had zero problems using this pattern. It just worked.
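The “only 2 tests reran” trick falls out of having an explicit dependency graph. Here is a rough Python sketch of the idea, assuming a made-up graph and made-up target names. It walks the graph backwards from the changed file and keeps only the tests it reaches. Bazel’s real machinery is far more involved than this.

```python
from collections import deque


def affected_tests(deps, tests, changed_file):
    """Return the sorted tests that transitively depend on changed_file.

    deps maps each target to the files and targets it depends on.
    """
    # Invert the edges: for each input, which targets consume it?
    reverse_deps = {}
    for target, inputs in deps.items():
        for item in inputs:
            reverse_deps.setdefault(item, set()).add(target)

    # Breadth-first walk from the changed file up through its consumers.
    seen = {changed_file}
    queue = deque([changed_file])
    while queue:
        node = queue.popleft()
        for parent in reverse_deps.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(t for t in tests if t in seen)


# Hypothetical project: two libraries and three tests.
deps = {
    "lib_a": ["a.java"],
    "lib_b": ["b.java"],
    "test_a": ["lib_a"],
    "test_ab": ["lib_a", "lib_b"],
    "test_b": ["lib_b"],
}
tests = {"test_a", "test_ab", "test_b"}
print(affected_tests(deps, tests, "a.java"))  # ['test_a', 'test_ab']
```

Everything that does not reach the changed file can be served from cache, which is where the minutes-to-seconds difference comes from.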
But inevitably, I ran into problems somewhere else.
[1] A director at a mid-size tech company once told me, “I love hiring ex-FAANG engineers. Preferably when they’ve worked somewhere between FAANG and here.” I knew exactly what he meant.
[2] Google tried recruiting me back a few years ago. They hadn’t reached out to me in a while, so I asked, “why now?” The recruiter admitted that it was disproportionately difficult to onboard Staff+ engineers into Google’s engineering culture because Google does things differently. So they were trialing interviewing ex-Googlers who had leveled up outside of the company.
[3] This is a vast improvement over what would happen 10 years ago, which is that your project would be hopelessly broken every time you upgraded.
[4] About once every 6 months, when we run into some random technical problem on our team, I suggest “I think this is the inflection point where we should use Bazel” just to troll my manager. He falls for it every time. I will not stop doing this.