A while back, I did a series of posts exploring the performance of SCons on builds of various sizes. The results were dismal: SCons demonstrated a classic O(n2) growth in runtime, meaning that the length of the build grew in proportion to the square of the number of files in the build, rather than linearly as one would hope. Naturally, that investigation and its results provoked a great deal of discussion at the time and since. Typically, SCons advocates fall back on one particular argument: “Sure, SCons may be slow,” they say, “but that’s the price you pay for a correct build.” Recently, Eric S. Raymond wrote an article espousing this same fundamental argument, with the addition of some algorithmic analysis intended to prove mathematically that a correct build, regardless of the build tool, must necessarily exhibit O(n2) behavior — a clever bit of circular logic, because it implies that any build tool that does not have such abyssmal performance must not produce correct builds!
Naturally, after spending nearly a decade developing a high-performance replacement for GNU make, I couldn’t let that statement stand. This post is probably going to be on the long side, so here’s the tl;dr summary:
- You can guarantee correct builds with make, provided you follow best practices.
- The worst-case runtime of any build tool if, of course, O(n2), but most, if not all, builds can be handled in O(n) time, without sacrificing correctness.
- SCons’ performance problem is caused by design and implementation decisions in SCons, not some pathology of build structure.
What is required to ensure a correct build?
One of the fundamental tenents of the pro-SCons mythos is the idea that it is unique in its ability to guarantee correct builds. In reality, SCons is not doing anything particularly special in this regard. It’s true that by virtue of its design SCons makes it easier to get it right, but there’s nothing keeping you from enjoying the same assurances in make.
First: what is a correct build? Simply put, a correct build is one in which everything that ought to be built, is built. Note that by definition, a from-scratch build is correct, since everything is built in that case. So the question of “correct” or “incorrect” is really only relevant in regards to incremental builds.
So, what do we need in order to ensure a correct incremental build? Only three things, actually:
- A single, full-build dependency graph.
- Complete dependency information for every generated file.
- A reliable way to determine if a file is up-to-date relative to its inputs.
What SCons has done is made it more-or-less impossible, by design, to not have these three things. There is no concept like recursive make in the SCons world, so the only option is a single, full-build dependency graph. Likewise, SCons automatically scans input files in several programming languages to find dependency information. Finally, SCons uses MD5 checksums for the up-to-date check, which is a pretty darn reliable way to verify whether a given file needs to be rebuilt.
But the truth is, you can guarantee correct builds with make — you just have to adhere to long-standing best practices for make. First, you have to avoid using recursive make. Then, you need to add automatic dependency generation. The only thing that’s a little tricky is the up-to-date check: make is hardwired to use file timestamps, which can be spoofed, deliberately or accidentally — although to be fair, in most cases, timestamps are perfectly adequate. But even here, there’s a way out. You can use a smarter version of make that has a more sophisticated up-to-date mechanism, like ElectricMake or ClearMake. You can even shoehorn MD5 checksums into GNU make, if you like.
I can’t deny that SCons has made it easier to get correct builds. But the notion that it can’t be done with make is simply absurd.
What is the cost of a correct build?
Now we turn to the question of the cost of ensuring correctness. At its core, any build tool is just a collection of graph algorithms — first contructing the dependency graph, then traversing it to find and update out-of-date files. These algorithms have well-understood complexity, typically given as O(n + e), where n is the number of nodes in the graph, and e is the number of edges. It turns out that e is actually the dominant factor here, since it is at least equal to n, and at worst as much as n2. That means we can simplify the complexity to O(n + n2), or just O(n2).
Does this absolve SCons of its performance sins? Unfortunately it does not, because O(n2) is a worst-case bound — you should only expect O(n2) behavior if you’ve got a build that has dependencies between every pair of files. Think about that for a second. A dependency between every. pair. of. files. Here’s what that would look like in makefile syntax:
all: foo bar foo.c bar.c foo.h bar.h foo: bar foo.c bar.c foo.h bar.h bar: foo.c bar.c foo.h bar.h foo.c: bar.c foo.h bar.h bar.c: foo.h bar.h foo.h: bar.h
It’s ridiculous, right? I don’t know about you, but I’ve certainly never seen a build that does anything even remotely like that. In particular, the builds I used in my benchmarks don’t look like that. Fortunately, those builds are small and simple enough that we can directly count the number of edges in the dependency graph. For example, the smallest build in my tests consisted of:
So we have about 6,000 nodes in the graph, but how many edges does the graph contain? Lucky for us, SCons will print the complete dependency graph if we invoke it with scons –tree=all:
+-. +-SConstruct +-d1_0 | +-d1_0/SConstruct | +-d1_0/f00000_sconsbld_d1_0 | | +-d1_0/f00000_sconsbld_d1_0.o | | | +-d1_0/f00000_sconsbld_d1_0.c | | | +-d1_0/lup001_sconsbld_d1_0/f00000_sconsbld_d1_0.h ...
The raw listing contains about 35,000 lines of text, but that includes duplicates and non-dependency information like filesystem structure. Filter that stuff out and you can see the graph contains only about 12,000 dependencies. That’s a far cry from the 1,800,000 or so you would expect if this truly were a “worst-case” build. It’s clear, in fact, that the number of edges is best described as O(n).
Although I don’t know how (or even if it’s possible) to prove that this is the general case, it does make a certain intuitive sense: far from being strongly-connected, most of the nodes in a build dependency graph have just one or two edges. Each C source file, for example, has just one outgoing edge, to the object file generated from that source. Each object file has just one outgoing edge too, to the library or executable the object is part of. Sure, libraries and headers probably have more edges, since they are used by multiple executables or objects, but the majority of the stuff in the graph is going to fall into the “small handful of edges” category.
Now, here’s the $64,000 question: if the algorithms in a build tool scale in proportion to the number of edges in the dependency graph, and we’ve just shown that the dependency graph in question has O(n) edges, why does SCons use O(n2) time to execute the build?
Why is SCons so slow?
SCons’ O(n2) performance stems from its graph traversal implementation. Essentially, SCons scans the entire dependency graph each time it is looking for a file to update. n scans of a graph with O(n) nodes and edges equals an O(n2) graph traversal. There’s no mystery here. In fact, the SCons developers are clearly aware of this deficiency, as described on their wiki:
It’s worth noting that the Jobs module calls the Taskmaster once for each node to be processed (i.e., it’s O(n)) and the Taskmaster has an amortized performance of O(n) each time it’s called. Thus, the overall time is O(n^2).
But despite recognizing this flaw, they severely misjudged its impact, because they go on to state that it requires a “pathological” dependency graph in order to elicit this worst-case behavior from SCons. As we’ve shown here and in previous posts, even a terribly mundane dependency graph elicits O(n2) behavior from SCons. I shudder to think what SCons would do with a truly pathological dependency graph!
Obviously the next question is: why does SCons do this? That’s not quite as easy for me to explain, as an outside observer. To the best of my understanding, they rescan the graph just in case new dependencies are added to the dependency graph while evaluating a node in the graph — remember, in SCons the commands to update a file are expressed in Python, so they can easily manipulate the dependency graph even while the build is running.
Is it really necessary to rescan the dependency graph over and over? I don’t think so. In fact, make is proof that it is not necessary. I think there are two ways that SCons could address this problem: first, it could adopt GNU make’s convention of partitioning the build into distinct phases, one that updates dependency information, and a second that actually executes the build. In GNU make, that strategy allows for the introduction of new dependency information, while imposing only a one-time O(n) cost for restarting the make process if any new dependencies are found.
Alternatively, SCons could probably be made smarter about when a full rescan is required. Most of the time, even if new dependencies are added to the graph, they are added to the node being evaluated, not to nodes that were already visited. That is, when you scan a source file for implicit dependencies, you find the dependencies for that file not for other files in the build (duh). So most of the time, a full rescan is massive overkill.
The final word…?
Hopefully this is my last post on the subject of SCons performance. It is clear to me that SCons does not scale to large projects, and that the problem stems from design and implementation decisions in SCons, rather than some pathology in the build itself. You can get comparable guarantees of correctness from make, if you’re willing to invest the time to do things the right way. The payoff is a build system that is not only correct but has vastly better performance than SCons as your project grows. Why wouldn’t you want that?