What factors affect build performance?

Recently a customer asked me to help them create a list of factors that affect build performance. They found themselves often tasked with explaining to their developers why one build had worse performance than another, or with finding ways to further improve the performance of a build. This is a very big, very complex question — I think perhaps much more so than they realized at first! In fact I think the question as posed is fundamentally unanswerable: I could never give an exhaustive list of the factors that affect build performance. There are too many, and there are surely some that I myself have yet to see — “unknown unknowns” as they say.

Nevertheless, there is value in making a list, even if incomplete, if for no other reason then to serve as a reference for people trying to understand or improve the performance of their builds. What follows is my attempt at creating that list, roughly in order of importance — but bear in mind that this ordering is somewhat subjective, and highly situation-dependent: your mileage may vary, and different builds will have different specific bottlenecks.

Factors that affect build performance

  1. Build size

    Builds can be measured in many ways: number of output targets, number of input files, total lines of code, aggregate bytes of output generated, etc. Generally speaking, the bigger the build is, the longer it will take to complete. If your build is long simply because of its size, you may think you have no opportunities, but that’s not so: parallel builds, caching outputs, componentization and beefier hardware can all help cope with this type of problem.

  2. Execution parallelism

    Most build tools support some form of parallel execution — GNU make’s -j option is the classic example. Assuming there is parallelizable work in the build, the build that runs on more cores (that is, with a higher -j value) will complete more quickly.

  3. Available parallelism / build structure

    Running a build on many cores only helps if there is exploitable parallelism in the build. If the build is defined in such a way that parallelism is limited, then it will take longer to complete regardless of the -j value. For example, some builds may have unnecessary serializations in the dependency graph, which will limit performance.

  4. Caching

    The use of (or failure to use) caching technology, such as ElectricAccelerator’s JobCache, ClearCase winkins, or ccache, can dramatically impact build performance. In my tests caching such as JobCache can reduce build duration by 50% or more for full builds.

  5. Conflicts (ElectricAccelerator only)

    For builds executed with ElectricAccelerator, conflicts can have a significant impact on performance. Briefly, an Accelerator conflict is any time your build “loses” a race condition between two steps that should have been serialized but weren’t due to missing dependencies in your makefile or build files. Accelerator can detect and correct such errors on-the-fly, but it comes at a cost. A few conflicts is usually not a problem, but if you have hundreds or thousands, it will make your build slower as Accelerator reruns portions of the build to get the correct results. Usually if you see such a scenario it’s a sign that you didn’t have a complete Accelerator history file for your build, so fixing such issues is as easy as using the history file generated by that build to augment the dependency information in future runs of the build.

  6. Code complexity and structure

    There are many attributes of the code itself which can affect build performance. For example, as a general rule, very long source files will take longer to compile than shorter files. Files containing very long individual functions will take longer to compile than files with only short functions. Heavy use of templates in C++ will cause slower compilation. Careless use of #include statements in C or C++ code will cause slower compilation and can be especially harmful to incremental builds by triggering excessive recompilation.

  7. Implementation language

    Some languages are easier to process and therefore faster to compile. In general, C++ is slower to compile, while languages like Java or Go are very quick to compile. Some languages require no compilation at all, so builds of code using such languages can be very very fast indeed!

  8. Build tool

    There are a staggering array of build tools which you might choose to drive your build: GNU make, ninja, maven, ant, scons, emake, tup and more. Some were designed for high performance on full builds, while others were designed for high performance on incremental builds, and still others were designed for ease-of-use, correctness, or other non-performance related attributes. The choice of build tool will affect the performance of your build, especially if your build is very large.

  9. Compiler

    For compiled languages like C and C++ there are often many different compilers that you could choose from for your build: gcc/g++, clang, icc, WindRiver, Microsoft cl, tcc, etc. These tools themselves have different performance profiles, and the performance may even vary from one version of the compiler to the next.

  10. Compiler options

    For a given compiler, the build options you enable may significantly affect the compile time. For example, when using gcc, building with -O3 is generally slower than building with -O0. Therefore for developer builds, you may consider to disable optimizations in order to reduce build cycle time. Other options that may influence compile speed include: pre-compiled headers (PCH); dependency generation (-MD, -MMD, -MF, etc); profiling or coverage analysis (-fprofile-arcs or -ftest-coverage); and include path definitions (-I), which if very long can cause the compiler to spend excessive time searching for header files.

  11. Linker

    As with the compiler, different linkers have different performance characteristics. For C/C++ compilation on Linux the default linker is GNU ld, but there are alternatives like Google gold which have much better performance, albeit for a subset of the use cases supported by GNU ld. If your use case is supported by gold, you will likely see much better build performance by switching.

  12. Memory

    As with any process involving computers, the amount of available memory will have a significant impact on build performance. Too little and your system will swap excessively. Fortunately there’s no such thing as “too much”, though it may be prohibitively expensive to get so much RAM that you can stop worrying about it. In practice most builds do not require a huge amount of memory, but if yours do, and you don’t have enough, your build speed will suffer.

  13. Disk performance

    Like memory, the performance of your disk can significantly affect build performance. In fact its easier in some ways to understand the impact of disk speed. If the build generates 10GB of output and your disk can only write at 10MB/s, the fastest that the build can possibly finish is about 1,000 seconds, or nearly 20 minutes. On the other hand, if the build generates only 5MB of output and uses the same disk, then only 1/2 second is needed to write the build outputs, so the disk is unlikely to be a bottleneck. You may find that the disk is adequate for your builds now, but as the build gets bigger you will reach a point where the disk is no longer fast enough. At that point you can upgrade to a faster disk, and that will be sufficient for some time until again your build grows to exceed the capacity of the disk.

    Even if your disk is not a primary bottleneck now, switching to a faster disk may improve performance somewhat. Many users have had good results from switching to SSD for temporary storage, or using striped RAID for those builds that generate truly enormous amounts of data.

  14. Network performance

    For distributed builds such as those executed with ElectricAccelerator, network performance is crucial because build data has to be transferred across the network. But even if the build itself is not distributed, it may make use of tools pulled from a network file share, so the network performance can affect the build.

  15. Operating system / kernel version

    Some operating systems have better performance for builds than others — in general, I’ve found builds on Linux to be relatively faster than builds on Windows, for example. Likewise, some versions of the operating system may be faster than others. Some users have reported as much as a 3x improvement by upgrading from an old version of Linux to a newer version due to optimizations in the kernel itself.

  16. Anti-virus software

    Use of anti-virus software can dramatically impact build performance, particularly if the A/V is configured in one of the more aggressive or intrusive modes of operation: sometimes every file operation is intercepted by the A/V scanner, adding a substantial drag on build speed.

  17. License management

    Some build tools, such as certain commercial compilers, require licenses in order to operate. If the license system is misconfigured it can add delays to the build process, sometimes causing each compile to take minutes instead of seconds as the compiler tries and fails to contact a license server, or contacts the wrong license server instance (for example, one on a different subnet).

A foundation for performance investigations

So there you have it: my (not entirely) comprehensive list of factors that can affect build performance. Of course these won’t all be relevant for every build: every build is different, and each has a unique performance profile. A slow disk may be mostly irrelevant for one build but absolutely critical for another. My hope is that this list will serve as a foundation for your build performance investigations — something to help get you started, even if it doesn’t get you all the way to a conclusion.

What do you think of my list? What would you add, and how would you change the ordering? Let me know in the comments below.

What’s new in GNU make 4.2?

In May 2016 the GNU make team released GNU make 4.2. I’m pleased to see another release, though I find myself underwhelmed by both the timeline and the content of this release. When 4.1 came out just one year after 4.0 I hoped it was a sign that the GNU make project was switching to a more frequent and regular release cycle, as many software projects have done in the last several years. Although it can be a difficult adjustment this release cadence can have significant benefits like improving user engagement and reducing risk. But with the 4.2 release arriving nineteen long months after 4.1 it seems that GNU make has failed to make the transition.

Of course infrequent releases are not necessarily a problem, as long as the releases contain compelling new functionality. Unfortunately the new features in GNU make 4.2 are charitably described as “uninspiring” — though I’m sure each enhancement will be handy for the corner case it was designed to address. Of course GNU make is a mature project by any definition, and frankly it does what it does pretty well and has for a very long time — maybe it’s just “done”. But consider this: the past few years has seen something of an explosion in the build tool space, with several new build tools cropping up. Each of the following alternative build tools has had multiple releases in the last year, and each has innovative features that could be adopted by GNU make:

  • gradle, the default build tool for Android apps. Monthly releases. Reports and notifications.
  • bazel, the open-source version of Google’s internal build system. Ten releases already in 2016. Checksum-based up-to-date checks and minimization of test suite executions.
  • ninja, a make-like build tool. Two releases in the last twelve months. Resource pools and unbelievably fast parsing / low overhead.

So, what does GNU make 4.2 have to offer? Read on to see, and let me know in the comments if you disagree with my analysis.

.SHELLSTATUS variable

GNU make has had the $(shell) function for many years. This provides a mechanism by which you can get the result (stdout) of an arbitrary command into a make variable, where you can do whatever you like with it. One curious thing about $(shell) is that it doesn’t care at all whether the command you execute succeeds or not, so if you try to read a non-existent file, for example, with $(shell cat missing.txt), GNU make will simply return an empty string as the result of the shell invocation. In some cases you may want to actually check the exit status of that command, and in GNU make 4.2 you now can, by checking the value of .SHELLSTATUS, a new built-in variable that is automatically set to the exit code of the most recent $(shell) (or != assignment). Here’s a contrived example:

1
2
3
4
5
FOO := $(shell exit 1)
ifneq ($(.SHELLSTATUS),0)
$(error shell command failed!)
endif
all: ; @echo done

As you can see, it’s now possible to make your makefile react in whatever manner you deem appropriate when a shell invocation fails. Be advised, however: if you find yourself doing this, it may be an indication that your makefile is poorly written — almost every use of $(shell) is better handled by creating an actual rule to do whatever you were going to do with $(shell).

Read files with $(file)

The $(file) function was added to GNU make in the 4.0 release, in order to enable the creation of files directly from make — quite handy for those cases in which the content you want to write is so long it exceeds command-line length limits on your system. In 4.2 the $(file) function was extended so that you can use it to read files in addition to writing files. For example, SRCS := $(file <sourcelist.txt) would capture the content of the file sourcelist.txt in the variable SRCS, less the final newline in the file, if any (that last bit is for consistency with the $(shell) function).

Improved error reporting

GNU make 4.2 includes a small but very useful improvement in error reporting: previously when make encountered an error while executing a recipe, it would report only the name of the target being built, such as make: *** [all] Error 1. Starting with 4.2, this error message includes the makefile and line number of the specific command that produced the error: make: *** [Makefile:6: all] Error 1. This should make it much easier to debug large, complex builds — especially anything that uses double-colon rules to composite functionality from many fragments in distinct makefiles.

Bug fixes

In addition to the modest enhancements described above, the 4.2 release includes about three dozen other bug fixes. A glance at the resolution dates on those reveals that sometimes months passed with no updates. This makes me wonder why they didn’t cut a release at those points, even if it were just for bug fixes. My guess is that the project is trapped, in a sense: because the interval between releases is so long there’s a sense that each release has to be “perfect”, and because there’s an attempt to ensure each release is “perfect”, the interval between releases must be very long. Contrast this with a more agile approach, which can tolerate imperfect releases because the next release is just around the corner anyway. Combined with an ever expanding automated regression test suite it’s possible to gradually increase the bar for release quality, such that in fact the likelihood of a bad release goes down when compared with a project that has a long release cycle and is dependent on mostly manual testing.

GNU make isn’t going to go away any time soon, but I think the writing is on the wall: if it doesn’t start innovating again, developers will inevitably migrate to other build tools that do.

What’s new in GNU make 4.1?

October 2014 saw the release of GNU make 4.1. Although this release doesn’t have any really remarkable new features, the release is notable because it comes just one year after the 4.0 release — that’s the least time between releases in more than a decade. Hopefully, this is the start of a new era of more frequent, smaller releases for this venerable project which is one of the oldest still active projects in the GNU suite. Read on for notes about the new features in GNU make 4.1.

MAKE_TERMOUT and MAKE_TERMERR

Starting with 4.1, GNU make defines two additional variables: MAKE_TERMOUT and MAKE_TERMERR. These are set to non-empty values if make believes stdout/stderr is attached to a terminal (rather than a file). This enables users to solve a problem introduced by the output synchronization feature that was added in GNU make 4.0: when output synchronization is enabled, all child processes in fact write to a temporary file, even though in effect they are writing to the console. In other words, the implementation details of output synchronization may interfere with behaviors in child processes like output colorization which require a terminal for correct operation. If MAKE_TERMOUT or MAKE_TERMERR is set, then the user may explicitly direct such commands to maintain colorized output despite the fact that they appear to be writing to a file.

Enhanced $(file) function

The $(file) function was added in GNU make 4.0 to enable writing to files from a makefile without having to invoke a second process to do so. For example, where previously you had to do something like $(shell echo hello > myfile), now you can instead use $(file > myfile,foo). In theory this is more efficient, since it avoids creating another process, and it enables the user to easily write large blocks of text which would exceed command-line length limitations on some platforms.

In GNU make 4.1, the $(file) function has been enhanced such that the text to be written may be omitted from the function call. This allows $(file) to work as a sort of “poor man’s” replacement for touch, although having reviewed the bug report that resulted in this change, I think this is more an “enhancement of convenience” than a deliberate attempt to evolve the program. Of course I have to give a shout out to my friend Tim Murphy, who filed the bug report that led to this enhancement — nice work, Tim!

Relaxed constraints for mixing explicit and implicit rules

The final feature change in GNU make 4.1 is that make will emit a regular error rather than a fatal error (which terminates the build) when both explicit and pattern targets are specified as outputs of a rule, like this:

1
foo bar%: baz

This is an interesting change mostly for the high level of drama surrounding it. That bit of syntax is clearly illegal — in fact, if the pattern target is listed first rather than the explicit, GNU make has long identified this as invalid syntax, terminating the parse with *** mixed implicit and normal rules. Stop. Unfortunately, due to a defect in older versions of GNU make this construct is not prohibited when the explicit rule is named first.

In 3.82, the GNU make maintainers fixed the defect: whether or not the explicit target is named first, GNU make would identify the invalid syntax and terminate parsing. Everything was fine for about a year, and then? People flipped out. As it turns out, this construct is used by a prominant open source project: the Linux kernel. The offending syntax had been eliminated from the main development branch shortly after the 3.82 release, but third-party developers suddenly found themselves unable to build legacy versions of the kernel with the latest release of GNU make. A bug report was filed and generated 21 reponses, when the average GNU make bug report has only 3. Ultimately, the maintainers relented by reducing the severity to a non-fatal error for the 4.1 release — but with a stern message that this will likely become a fatal error again in a future release.

Bug fixes and thoughts

In addition to the bigger items identified above, the 4.1 release includes about two dozen other bug fixes. Overall, this release feels like a minor one — as often happens when release frequency increases, the individual releases become less interesting. From an agile/continuous delivery standpoint, that’s exactly what you want. But I’ve found that it is also difficult for a team that’s accustomed to less frequent releases with larger payloads to transition to smaller, more frequent releases while still incorporating large changes that take longer than one release to implement. Of course, one point does not make a line — that is, we can’t tell from this release alone whether the intention is to switch to a more frequent release cadence, or whether this release is an exception. If they are trying to increase the frequency, I think it will be very interesting to see how the GNU make development team adapts to the new cadence. Regardless, I’d like to congratulate the team for this release and I look forward to seeing what comes next.

HOWTO: Intro to GNU make variables

One thing that many GNU make users struggle with is predicting the value of a variable. And it’s no wonder, with the way make syntax freely mingles text intended for two very distinct phases of execution, and with two “flavors” of variables with very different semantics — except, that is, when the one seems to behave like the other. In this article I’ll run you through the fundamentals of GNU make variables so you can impress your friends (well, your nerdy friends, anyway) with your ability to predict the value of a GNU make variable at social gatherings.

Contents

Basics

Let’s start with the basics: a GNU make variable is simply a shorthand reference for another string of text. Using variables enables you to create more flexible makefiles that are also easier to read and modify. To create a variable, just assign a value to a name:

1
CFLAGS=-g -O2

Later, when GNU make sees a reference to the variable, it will replace the reference with the value of the variable — this is called expanding the variable. A variable reference is just the variable name wrapped in parenthesis or curly braces and prefixed with a dollar-sign. For example, this simple makefile will print “Hello, world!” by first assigning that text to a variable, then dereferencing the variable and using echo to print the variable’s value:

1
2
3
MSG = Hello, world!
all:
@echo $(MSG)

Creating variables

NAME = value is just one of many ways to create a variable. In fact there are at least eight ways to create a variable using GNU make syntax, plus there are built-in variables, command-line declarations, and of course environment variables. Here’s a rundown of the ways to create a GNU make variable:

  • MYVAR = abc creates the variable MYVAR if it does not exist, or changes its value if it does. Either way, after this statement is processed, the value of MYVAR will be abc.
  • MYVAR ?= def will create the variable MYVAR with the value def only if MYVAR does not already exist.
  • MYVAR += ghi will create the variable MYVAR with the value if MYVAR does not already exist, or it will append ghi to MYVAR if it does already exist.
  • MYVAR := jkl creates MYVAR if it does not exist, or changes its value if it does. This variation is just like the first, except that it creates a so-called simple variable, instead of a recursive variable — more on that in a minute.

In addition to the various assignment operators, you can create and modify variables using the define directive — handy if you want to create a variable with a multi-line value. Besides that, the define directive is equivalent to the normal VAR=VALUE assignment.

1
2
3
4
define MYVAR
abc
def
endef

If you’re using GNU make 3.82 or later, you can add assignment operators to the define directive to modify the intent. For example, to append a multi-line value to an existing variable:

1
2
3
4
define MYVAR +=
abc
def
endef

But there are still more ways to create variables in GNU make:

  • Environment variables are automatically created as GNU make variables when GNU is invoked.
  • Command-line definitions enable you to create variables at the time you invoke GNU make, like this: gmake MYVAR=123.
  • Built-in variables are automatically created when GNU make starts. For example, GNU make defines a variable named CC which contains the name of the default C compiler (cc) and another named CXX which contains the name of the default C++ compiler (g++).

Variable flavors

Now that you know how to create a GNU make variable and how to dereference one, consider what happens when you reference a variable while creating a second variable. Let’s use a few simple exercises to set the stage. For each, the answer is hidden on the line following the makefile. You can reveal the answer by highlighting the hidden text in your browser.

  1. Q1: What will this makefile print?
    1
    2
    3
    4
    ABC = Hello!
    MYVAR = $(ABC)
    all:
    @echo $(MYVAR)

    A1: Hello!

  2. Q2: What will this makefile print?
    1
    2
    3
    4
    5
    6
    ABC = Hello!
    MYVAR = $(ABC)
    all:
    @echo $(MYVAR)
    ABC = Goodbye!

    A2: Goodbye!

  3. Q3: What will this makefile print?
    1
    2
    3
    4
    5
    6
    ABC = Hello!
    MYVAR := $(ABC)
    all:
    @echo $(MYVAR)
    ABC = Goodbye!

    A3: Hello!

Don’t feel bad if you were surprised by some of the answers! This is one of the trickiest aspects of GNU make variables. To really understand the results, you have to wrap your brain around two core GNU make concepts. The first is that there are actually two different flavors of variables in GNU make: recursive, and simple. The difference between the two is in how GNU make handles variable references on the right-hand side of the variable assignment — for brevity I’ll call these “subordinate variables”:

  • With simple variables, subordinate variables are expanded immediately when the assignment is processed. References to subordinate variables are replaced with the value of the subordinate variable at the moment of the assignment. Simple variables are created when you use := in the assignment.
  • With recursive variables, expansion of subordinate variables is deferred until the variable named on the left-hand side of the assignment is itself referenced. That leads to some funny behaviors, because the value of the subordinate variables at the time of the assignment is irrelevant — in fact, the subordinate variables may not even exist at that point! What matters is the value of the subordinate variables when the LHS variable is expanded. Recursive variables are the default flavor, and they’re created when you use simply = in the assignment.

The second concept is that GNU make processes a makefile in two separate phases, and each phase processes only part of the text of the makefile. The first phase is parsing, during which GNU make interprets all of the text of the makefile that is outside of rule bodies. During parsing, rule bodies are not interpreted — only extracted for processing during the second phase: execution, or when GNU make actually starts running the commands to update targets. For purposes of this discussion, that means that the text in rule bodies is not expanded until after all the other text in the makefile has been processed, including variable assignments that physically appear after the rule bodies. In the following makefile, the text highlighted in green is processed during parsing; the text highlighted in blue is processed later, during execution. Again, to put a fine point on it: all of the green text is processed before any of the blue text:

1
2
3
4
5
6
ABC = Hello!
MYVAR = $(ABC)
all:
@echo $(MYVAR)
ABC = Goodbye!

Now the examples above should make sense. In Question 2, we created MYVAR as a recursive variable, which means the value of ABC at the time MYVAR is created doesn’t matter. By the time GNU make needs to expand MYVAR, the value of ABC has changed, so that’s what we see in the output.

In Question 3, we created MYVAR as a simple variable, so the value of ABC was captured immediately. Even though the value of ABC changes later, that change doesn’t affect the value of MYVAR.

Target-specific variables

Most variables in GNU make are global: that is, they are shared across all targets in the makefile and expanded the same way for all targets, subject to the rules outlined above. But GNU make also supports target-specific variables: variables given distinct values that are only used when expanding the recipe for a specific target (or its prerequisites).

Syntactically, target-specific variables look like a mashup of normal variable definitions, using =, :=, etc.; and prerequisite declarations. For example, foo: ABC = 123 creates a target-specific definition of ABC for the target foo. Even if ABC has already been defined as a global variable with a different value, this target-specific definition will take precedence when expanding the recipe for foo. Consider this simple makefile:

1
2
3
4
5
6
7
8
ABC = Hello!
all: foo bar
foo:
@echo $(ABC)
bar: ABC = Goodbye!
bar:
@echo $(ABC)

At first glance you might expect this makefile to print “Goodbye!” twice — after all, ABC is redefined with the value “Goodbye!” before the commands for foo are expanded. But because the redefinition is target-specific, it only applies to bar. Thus, this makefile will print one “Hello!” and one “Goodbye!”.

As noted, target-specific variables are inherited from a target to its prereqs — for example, the following makefile will print “Surprise!”, because bar inherits the target-specific value for ABC from foo:

1
2
3
4
5
6
ABC = Normal.
foo: ABC = Surprise!
foo: bar
bar:
@echo $(ABC)

You can do some neat tricks with this, but I urge you not to rely on the behavior, because it doesn’t always work the way you might think. In particular, if a target is listed as a prereq for multiple other targets, each of which have a different target-specific value for some variable, the actual value used for the prereq may vary depending on which files were out-of-date and the execution order of the targets. As a quick example, compare the output of the previous makefile when invoked with gmake foo and when invoked with gmake bar. In the latter case, the target-specific value from foo is never applied, because foo itself was not processed. With GNU make 3.82 or later, you can prevent this inheritence by using the private modifier, as in foo: private ABC = Surprise!.

Finally, note that target-specific variables may be applied to patterns. For example, a line reading %.o: ABC=123 creates a target-specific variable for all targets matching the pattern %.o.

Conclusion

If you’ve made it this far, you now know just about everything there is to know about GNU make variables. Congratulations! I hope this information will serve you well.

Questions or comments? Use the form below or hit me up on Twitter @emelski.

The Twelve Days of Christmas, GNU make style

Well, it’s Christmas Day in the States today, and while we’re all recovering from the gift-opening festivities, I thought this would be the perfect time for a bit of fun with GNU make. And what better subject matter than the classic Christmas carol “The Twelve Days of Christmas”? Its repetitive structure is perfect for demonstrating how to use several of GNU make’s built-in functions for iteration, selection and sorting. This simple makefile prints the complete lyrics to the song:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
L01=Twelve drummers drumming,
L02=Eleven pipers piping,
L03=Ten lords-a-leaping,
L04=Nine ladies dancing,
L05=Eight maids-a-milking,
L06=Seven swans-a-swimming,
L07=Six geese-a-laying,
L08=Five golden rings,
L09=Four calling birds,
L10=Three french hens,
L11=Two turtle doves, and
L12=A partridge in a pear tree!
LINES=12 11 10 09 08 07 06 05 04 03 02 01
DAYS=twelfth eleventh tenth ninth \
eighth seventh sixth fifth \
fourth third second first
$(foreach n,$(LINES),\
$(if $(X),$(info ),$(eval X=X))\
$(info On the $(word $n,$(DAYS)) day of Christmas,)\
$(info my true love gave to me)\
$(foreach line,$(wordlist $n,12,$(sort $(LINES))),\
$(info $(L$(line)))))
all: ; @:

By count, most of the lines here just declare variables, one for each item mentioned in the song. Note how the items are ordered: the last item added is given the lowest index. That means that to construct each verse we simply enumerate every item in the list, in order, starting with the new item in each verse.

Line 18 is where the real meat of the makefile begins. Here we use GNU make’s foreach function to iterate through the verses. $(foreach) takes three arguments: a name for the iteration variable, a space-separated list of words to assign to the iteration variable in turn, and a body of text to expand repeatedly, once for each word in the list. Here, the list of words is given by LINES, which lists the starting line for each verse, in order — that is, the first verse starts from line 12, the second from line 11, etc. The text to expand on each iteration is all the text on lines 19-23 of the makefile — note the use of backslashes to continue each line to the next.

Line 19 uses several functions to print a blank line before starting the next verse, if we’ve printed a verse already: the $(if) function, which expands its second argument if its first argument is non-empty, and its third argument if its first argument is empty; the $(info) function to print a blank line; and the $(eval) function to set the flag variable. The first time this line is expanded, X does not exist, so it expands to an empty string and the $(if) picks the “else” branch. After that, X has a value, so the $(if) picks the “then” branch.

Lines 20 and 21 again use $(info) to print output — this time the prelude for the verse, like “On the first day of Christmas, my true love gave to me”. The ordinal for each day is pulled from DAYS using the $(word) function, which extracts a specified word, given by its first argument, from the space-separated list given as its second argument. Here we’re using n, the iteration variable from our initial $(foreach) as the selector for $(word).

Line 22 uses $(foreach) again, this time to iterate through the lines in the current verse. We use line as the iteration variable. The list of words is given again by LINES except now we’re using $(sort) to reverse the order, and $(wordlist) to select a subset of the lines. $(wordlist) takes three arguments: the index of the first word in the list to select, the index of the last word to select, and a space-separated list of words to select from. The indices are one-based, not zero-based, and $(wordlist) returns all the words in the given range. The body of this $(foreach) is just line 23, which uses $(info) once more to print the current line of the current verse.

Line 25 has the last bit of funny business in this makefile. We have to include a make rule in the makefile, or GNU make will complain *** No targets. Stop. after printing the lyrics. If we simply declare a rule with no commands, like all:, GNU make will complain Nothing to be done for `all’.. Therefore, we define a rule with a single “no-op” command that uses the bash built-in “:” to do nothing, combined with GNU make’s @ prefix to suppress printing the command itself.

And that’s it! Now you’ve got some experience with several of the built-in functions in GNU make — not bad for a Christmas day lark:

  • $(eval) for dynamic interpretation of text as makefile content
  • $(foreach), for iteration
  • $(if), for conditional expansion
  • $(info), for printing output
  • $(sort), for sorting a list
  • $(word), for selecting a single word from a list
  • $(wordlist), for selecting a range of words from a list

Now — where’s that figgy pudding? Merry Christmas!

What’s new in GNU make 4.0?

After a little bit more than three years, the 4.0 release of GNU make finally arrived in October. This release packs in a bunch of improvements across many functional areas including debuggability and extensibility. Here’s my take on the most interesting new features.

Output synchronization

For the majority of users the most exciting new feature is output synchronization. When enabled, output synchronization ensures that the output of each job is kept distinct, even when the build is run in parallel. This is a tremendous boon to anybody that’s had the misfortune of having to diagnose a failure in a parallel build. This simple Makefile will help demonstrate the feature:

1
2
3
4
5
6
7
8
9
10
11
12
all: a b c
a:
@echo COMPILE a
@sleep 1 && echo a, part 1
@sleep 1 && echo a, part 2
@sleep 2 && echo a, part 3
b c:
@echo COMPILE $@
@sleep 1 && echo $@, part 1
@sleep 1 && echo $@, part 2
@sleep 1 && echo $@, part 3

Now compare the output when run serially, when run in parallel, and when run in parallel with –output-sync=target:

Serial Parallel Parallel with –output-sync=target
$ gmake
COMPILE a
a, part 1
a, part 2
a, part 3
COMPILE b
b, part 1
b, part 2
b, part 3
COMPILE c
c, part 1
c, part 2
c, part 3
$ gmake -j 4
COMPILE a
COMPILE b
COMPILE c
b, part 1
a, part 1
c, part 1
b, part 2
a, part 2
c, part 2
b, part 3
c, part 3
a, part 3
$ gmake -j 4 --output-sync=target
COMPILE c
c, part 1
c, part 2
c, part 3
COMPILE b
b, part 1
b, part 2
b, part 3
COMPILE a
a, part 1
a, part 2
a, part 3

Here you see the classic problem with parallel gmake build output logs: the output from each target is mixed up with the output from other targets. With output synchronization, the output from each target is kept separate, not intermingled. Slick! The output doesn’t match that of the serial build, unfortunately, but this is still a huge step forward in usability.

The provenance of this feature is especially interesting, because the idea can be traced directly back to me — in 2009, I wrote an article for CM Crossroads called Descrambling Parallel Build Logs. That article inspired David Boyce to submit a patch to GNU make in 2011 which was the first iteration of the –output-sync feature.

GNU Guile integration

The next major addition in GNU make 4.0 is GNU Guile integration, which makes it possible to invoke Guile code directly from within a makefile, via a new $(guile) built-in function. Naturally, since Guile is a general-purpose, high-level programming language, this allows for far more sophisticated computation from directly within your makefiles. Here’s an example that uses Guile to compute Fibonacci numbers — contrast with my Fibonacci in pure GNU make:

1
2
3
4
5
6
7
8
9
10
11
define FIBDEF
(define (fibonacci x)
(if (< x 2)
x
(+ (fibonacci (- x 1)) (fibonacci (- x 2)))))
#f
endef
$(guile $(FIBDEF))
%:
@echo $(guile (fibonacci $@))

Obviously, having a more expressive programming language available in makefiles will make it possible to do a great deal more with your make-based builds than ever before. Unfortunately I think the GNU make maintainers made a couple mistakes with this feature which will limit its use in practice. First, Guile was a poor choice. Although it’s a perfectly capable programming language, it’s not well-known or in wide use compared to other languages that they might have chosen — although you can find Scheme on the TIOBE Index, Guile itself doesn’t show up, and even though it is the official extension language of the GNU project, fewer than 25 of the GNU project’s 350 packages use Guile. If the intent was to embed a language that would be usable by a large number of developers, Python seems like the no-brainer option. Barring that for any reason, Lua seems to be the de facto standard for embedded programming languages thanks to its small footprint and short learning curve. Guile is just some weird also-ran.

Second, the make/Guile integration seem a bit rough. The difficulty arises from the fact that Guile has a rich type system, while make does not — everything in make is a string. Consequently, to return values from Guile code to make they must be converted to a string representation. For many data types — numbers, symbols and of course strings themselves — the conversion is obvious, and reversible. But for some data types, this integration does a lossy conversion which makes it impossible to recover the original value. Specifically, the Guile value for false, #f, is converted to an empty string, rendering it indistinguishable from an actual empty string return value. In addition, nested lists are flattened, so that (a b (c d) e) becomes a b c d e. Of course, depending on how you intend to use the data, each of these may be the right conversion. But that choice should be left to the user, so that we can retain the additional information if desired.

Loadable objects

The last big new feature in GNU make 4.0 is the ability to dynamically load binary objects into GNU make at runtime. In a nutshell, that load of jargon means that it’s possible for you to add your own “built-in” functions to GNU make, without having to modify and recompile GNU make itself. For example, you might implement an $(md5sum) function to compute a checksum, rather than using $(shell md5sum). Since these functions are written in C/C++ they should have excellent performance, and of course they can access the full spectrum of system facilities — file I/O, sockets, pipes, even other third-party libraries. Here’s a simple extension that creates a $(fibonacci) built-in function:

#include <stdio.h>
#include <gnumake.h>

int plugin_is_GPL_compatible;

int fibonacci(int n)
{
    if (n < 2) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

char *gm_fibonacci(const char *nm, unsigned int argc, char **argv)
{
    char *buf  = gmk_alloc(33);
    snprintf(buf, 32, "%d", fibonacci(atoi(argv[0])));
    return buf;
}

int fibonacci_gmk_setup ()
{
    gmk_add_function ("fibonacci", gm_fibonacci, 1, 1, 0);
    return 1;
}

And here’s how you would use it in a makefile:

1
2
3
load ./fibonacci.so
%:
@echo $(fibonacci $@)

I’m really excited about this feature. People have been asking for additional built-in functions for years — to handle arithmetic, file I/O, and other tasks — but for whatever reason the maintainers have been slow to respond. In theory, loadable modules will enable people to expand the set of built-in functions without requiring the approval or involvement of the core team. That’s great! I only wish that the maintainers had been more responsive when we invited them to collaborate on the design, so we might have come up with a design that would work with both GNU make and Electric Make, so that extension authors need only write one version of their code. Ah well — que sera, sera.

Other features

In addition to the major feature described above there are several other enhancements worth mentioning here:

  • ::= assignment, equivalent to := assignment, added for POSIX compatibility.
  • != assignment, which is basically a substitute for $(shell), added for BSD compatibility.
  • –trace command-line option, which causes GNU make to print commnds before execution, even if they would normally be suppressed by the @ prefix.
  • $(file …) built-in function, for writing text to a file.
  • GNU make development migrated from CVS to git.

You can find the full list of updates in the NEWS file in the GNU make source tree.

Looking ahead

It’s great to see continued innovation in GNU make. Remember, this is a tool that’s now 25 years old. How much of the software you wrote 25 years ago is still in use and still in active development? I’d like to offer a heartfelt congratulations to Paul Smith and the rest of the GNU make team for their accomplishments. I look forward to seeing what comes next!

What’s new in ElectricAccelerator 7.0

ElectricAccelerator 7.0 was officially released a couple weeks ago now, on April 12, 2013. This version, our 26th feature release in 11 years, incorporates performance features that are truly nothing less than revolutionary: dependency optimization and parse avoidance. To my knowledge, no other build tool in the world has comparable functionality, is working on comparable functionality or is even capable of adding such functionality. Together these features have enabled us to dramatically cut Android 4.1.1 (Jelly Bean) build times, compared to Accelerator 6.2:

  • Full, from-scratch builds are 35% faster
  • “No touch” incremental builds are an astonishing 89% faster

In fact, even on this highly optimized, parallel-friendly build, Accelerator 7.0 is faster than GNU make, on the same number of cores. On a 48-core system gmake -j 48 builds Android 4.1.1 in 15 minutes. Accelerator 7.0 on the same system? 12 minutes, 21 seconds: 17.5% faster.

Read on for more information about the key new features in ElectricAccelerator 7.0.

Dependency optimization: use only what you need

Dependency optimization is a new application of the data that is used to power Accelerator’s conflict detection and correction features. But where conflict detection is all about finding missing dependencies in makefiles, dependency optimization is focused on finding surplus dependencies, which drag down build performance by needlessly limiting parallelism. Here’s a simple example:

1
2
3
4
5
foo: bar
@echo abc > foo && sleep 10
bar:
@echo def > bar && sleep 10

In this makefile you can easily see that the dependency between foo and bar is superfluous. Unfortunately GNU make is shackled by the dependencies specified in the makefile and is thus obliged to run the two jobs serially. In contrast, with dependency optimization enabled emake can detect this inefficiency and ignore the unnecessary dependency — so foo and bar will run in parallel.

Obviously you could trivially fix this simple makefile, but in real-world builds that may be difficult or impossible to do manually. For example, in the Android 4.1.1 build, there are about 2 million explicitly specified dependencies in the makefiles. For a typical variant build, only about 300 thousand are really required: over 85% of the dependencies are unnecessary. And that's in the Android build, which is regarded by some as a paragon of parallel-build cleanliness — imagine the opportunities for improvement in builds that don't have Google's resources to devote to the problem.

To enable dependency optimization in your builds, add --emake-optimize-deps=1 to your emake command-line. The first build with that option enabled will "learn" the characteristics of the build; the second and subsequent builds will use that information to improve performance.

Parse avoidance: the fastest job is the one you don't have to do

A common complaint with large build systems is incremental build performance — specifically, the long lag between the time that the user invokes make and the time that make starts the first compile. Some have even gone so far as to invent entirely new build tools with a specific focus on this problem. Parse avoidance delivers similar performance gains without requiring the painful (perhaps impossible!) conversion to a new build tool. For example, a "no touch" incremental build of Android 4.1.1 takes close to 5 minutes with Accelerator 6.2, but only about 30 seconds with Accelerator 7.0.

On complex builds, a large portion of the lag comes from parsing makefiles. The net result of that effort is a dependency graph annotated with targets and the commands needed to generate them. The core idea underpinning parse avoidance is the realization that we need not redo that work on every build. Most of the time, the dependency graph, et al, is unchanged from one build to the next. Why not cache the result of the parse and reuse it in the next build? So that's what we did.

To enable parse avoidance in your builds, add --emake-parse-avoidance=1 to your emake command-line. The first build with that option will generate a parse result to add to the cache; the second and subsequent builds will reload the cached result in lieu of reparsing the makefiles from scratch.

Other goodies

In addition to the marquee features, Accelerator 7.0 includes dozens of other improvements. Here are some of the highlights:

  • Limited GNU make 3.82 support. emake now allows assignment modifiers (like ?=, etc.) on define-style variable definitions, when --emake-emulation=gmake3.82
  • Order-only prerequisites in NMAKE emulation mode. GNU make introduced the concept of order-only prerequisites in 3.80. With this release we've extended our NMAKE emulation with the same concept.
  • Enhancements to electrify. The biggest improvement is the ability to match full command-lines to decide whether or not a particular command should be executed remotely (Linux only). Previously, electrify could only match against the process name.

What's next?

In my opinion, Accelerator 7.0 is the most exciting release we've put out in close to two years, with truly ground-breaking new functionality and performance improvements. It's not often that you can legitimately claim double-digit percentage performance improvements in a mature product. I'm incredibly proud of my team for this accomplishment.

With that said: there's always room to do more. We're already gearing up for the next release. The exact release content is not yet nailed down, but on the short list of candidates is a new job scheduler, to enable still better performance; "buddy cluster" facilities, to allow the use of Accelerator without requiring dedicated hardware; and possibly some form of acceleration for Maven-based builds. Let's go!

#pragma multi and rules with multiple outputs in GNU make

Recently we released ElectricAccelerator 6.2, which introduced a new bit of makefile syntax — #pragma multi — which allows you to indicate that a single rule produces multiple outputs. Although this is a relatively minor enhancement, I’m really excited about it because this it represents a new direction for emake development: instead of waiting for the GNU make project to add syntactic features and then following some time later with our emulation, we’re adding features that GNU make doesn’t have — and hopefully they will have to follow us for a change!

Unfortunately I haven’t done a good job articulating the value of #pragma multi. Unless you’re a pretty hardcore makefile developer, you probably look at this and think, “So what?” So let’s take a look at the problem that #pragma multi solves, and why #pragma multi matters.

Rules with multiple outputs in GNU make

The problem we set out to solve is simply stated: how can you specify to GNU make that one rule produces two or more output files? The obvious — but wrong — answer is the following:

1
2
foo bar: baz
touch foo bar

Unfortunately, this fragment is interpreted by GNU make as declaring two rules, one for foo and one for bar — it just so happens that the command for each rule creates both files. That will do more-or-less the right thing if you run a from-scratch, serial build:

$ gmake foo bar
touch foo bar
gmake: `bar' is up to date.

By the time GNU make goes to update bar, it’s already up-to-date thanks to the execution of the rule for foo. But look what happens when you run this same build in parallel:

$ gmake -j 2 foo bar
touch foo bar
touch foo bar

Oops! — the files were updated twice. No big deal in this trivial example, but it’s not hard to imagine a build where running the commands to update a file twice would produce bogus output, particularly if those updates could be happening simultaneously.

So what’s a makefile developer to do? In standard GNU make syntax, there’s only one truly correct way to create a rule with multiple outputs: pattern rules:

1
2
%.x %.y: %.in
touch $*.x $*.y

In contrast with explicit rules, GNU make interprets this fragment as declaring a single rule that produces two output files. Sounds perfect, but there’s a significant limitation to this solution: all of the output files must share a common sequence in the filenames (called the stem in GNU make parlance). That is, if your rule produces foo.x and foo.y, then pattern rules will work for you because the outputs both have foo in their names.

If your output files do not adhere to that naming limitation, then pattern rules can’t help you. In that case, you’re pretty much out of luck: there is no way to correctly indicate to GNU make that a single rule produces multiple output files. There are a variety of hacks you can try to coerce GNU make to behave properly, but each has its own limitations. The most common is to nominate one of the targets as the “primary”, and declare that the others depend on that target:

1
2
3
bar: foo
foo: baz
touch foo bar

Watch what happens when you run this build serially from scratch:

$ gmake foo bar
touch foo bar
gmake: Nothing to be done for `bar'.

Not bad, other than the odd “nothing to be done” message. At least the files weren’t generated twice. How about running it in parallel, from scratch?

$ gmake -j 2 foo bar
touch foo bar
gmake: Nothing to be done for `bar'.

Awesome! We still have the odd “nothing to be done” message, but just as in the serial build, the command was only invoked one time. Problem solved? Nope. What happens in an incremental build? If you’re lucky, GNU make happens to do the right thing and regenerate the files. But in one incremental build scenario, GNU make utterly fails to do the right thing. Check out what happens if the secondary output is deleted, but the primary is not:

$ rm -f bar && gmake foo bar
gmake: `foo' is up to date.
gmake: Nothing to be done for `bar'.

That’s right: GNU make failed to regenerate bar. If you’re very familiar with the build system, you might realize what had happened and think to either delete foo as well, or touch baz so that foo appears out-of-date (which would cause the next run to regenerate both outputs). But more likely at this point you just throw your hands up and do a full clean rebuild.

Note that all of the alternatives in vanilla GNU make have similar deficiencies. This kind of nonsense is why incremental builds have a bad reputation. This is why we created #pragma multi.

Rules with multiple outputs in Electric Make

By default Electric Make emulates GNU make, so it inherits all of GNU make’s limitations regarding rules with multiple outputs — with one critical exception. Even when running a build in parallel, Electric Make ensures that the output matches that produced by a serial GNU make build, which means that even the original, naive attempt will “work” for full builds regardless of whether the build is serial (single agent) or parallel (multiple agents).

Given that foundation, why did we bother with #pragma multi? There are a couple reasons:

  1. Correct incremental builds: with #pragma multi you can correctly articulate the relationships between inputs and outputs and thus ensure that all the outputs get rebuilt in incremental builds, rather than using kludges and hoping for the best.
  2. Out-of-the-box performance: although Electric Make guarantees correct output of the build, if you don’t have an up-to-date history file for the build you may waste time and compute resources running commands that don’t need to be run (work that will eventually be discarded when Electric Make detects the error). In the examples shown here the cost is negligible, but in real builds it could be significant.

Using #pragma multi is easy: just add the directive before the rule that will generate multiple outputs:

1
2
3
#pragma multi
foo bar: baz
touch foo bar

Watch what happens when this makefile is executed with Electric Make:

$ emake foo bar
touch foo bar

Note that there is no odd “is up to date” or “nothing to be done” message for bar — because Electric Make understands that both outputs are created by a single rule. Let’s verify that the build works as desired in the tricky incremental case that foiled GNU make — deleting bar without deleting foo:

$ rm -f bar && emake foo bar
touch foo bar

As expected, both outputs are regenerated: even though foo existed, bar did not, so the commands were executed.

Summary: rules with multiple outputs

Let’s do a quick review of the strategies for creating rules with multiple outputs. For simplicity we can group them into three categories:

  • #pragma multi
  • The naive approach, which does not actually create a single rule with multiple outputs at all.
  • Any of the various hacks for approximating rules with multiple outputs.

Here’s how each strategy fares across a variety of build modes:

Electric Make GNU make
Full (serial) Full (parallel) Incremental Full (serial) Full (parallel) Incremental
#pragma multi N/A
Naive
Hacks


The table paints a grim picture for GNU make: there is no way to implement rules with multiple outputs using standard GNU make which reliably gives both correct results and good performance across all types of builds. The naive approach generates the output files correctly in serial builds, but may fail in parallel builds. The various hacks work for full builds, but may fail in incremental builds. Even in cases where the output files are generated correctly, the build is marred by spurious “is up to date” or “nothing to be done for” messages — which is why most of the entries in the GNU make side are yellow rather than green.

In contrast, #pragma multi allows you to correctly generate multiple outputs from a single rule, for both full and incremental builds, in serial and in parallel. The naive approach also “works” with Electric Make, in that it will produce correct output files, but like GNU make the build is cluttered with spurious warnings. And, unless you have a good history file, the naive approach can trigger conflicts which may negatively impact build performance. Finally, despite its sophisticated conflict detection and correction smarts, even Electric Make cannot ensure correct incremental builds when you’ve implemented one of the multiple output hacks.

So there you have it. This is why we created #pragma multi: without it, there’s just no way to get the job done quickly and reliably. You should give ElectricAccelerator a try.

try_eade_button2

What’s new in ElectricAccelerator 6.2?

We released ElectricAccelerator 6.2 a couples weeks ago, our 25th feature release. 6.2 was a quick interim release primarily intended to address a couple long-standing stability issues, but we managed to squeeze in some really interesting feature enhancements as well. Here’s what’s new:

Rules with multiple outputs? Yeah, we can do that.

Every now and then, makefile authors need to write a single makefile rule that produces more than one output file, to accomodate tools that don’t fit gmake’s rigid one-command-one-output model. The classic example is bison, which produces both a C file and a header file from a single invocation of the tool.

Unfortunately in regular gmake the only way to write a rule with multiple outputs is to use a pattern rule. That’s great — if your needs happens to dovetail with the caveats and limitations of pattern rules (chiefly, that the output files share a common base name). If not, the answer has been basically that you’re out of luck. There are a variety of kludges that approximate the behavior, but despite numerous requests over the last decade (1, 2, 3, 4, 5, 6, 7, 8) and at least one patch implementing the feature, GNU make (as of 3.82) still has no way to create an explicit rule that produces multiple outputs.

When it comes to enhancements to the fundamental operation of GNU make, we’ve historically let the GNU make team take the lead, rather than risk introducing potentially incompatible changes. But after so many years it seems clear that this feature is not going to show up in GNU make — so we decided to forge ahead on our own. Enter #pragma multi:

1
2
3
#pragma multi
foo bar:
@touch foo bar

GNU make interprets this construct as two independent rules, one for foo and one for bar, which happen to each create both files. Thanks to the #pragma multi designation, Electric Make will interpret this as a single rule which produces both foo and bar. Using a #pragma to flag the rule is perfect, because it sidesteps any questions about syntax changes. And since #pragma starts with a #, GNU make will treat it as a comment, so this makefile will still be usable with GNU make — you’ll just get correct behavior and better performance with Electric Make.

New platforms and a faster installer

Accelerator 6.2 adds support for Linux kernels up to 3.5.x, which means that Accelerator now supports the following platforms:

  • Ubuntu 11.10
  • Ubuntu 12.04
  • SUSE Linux Enterprise Server 11 SP2

In addition, Accelerator 6.2 is expected to work correctly on both Ubuntu 12.10 and Windows 8, although we cannot officially claim support for those platforms since they were themselves not finalized at the time Accelerator 6.2 was released. This release also incorporates enhancements to the Linux installer which make the installation process about 25% faster compared to previous releases.

A complete list of platforms supported by ElectricAccelerator 6.2 can be found in the Electric Cloud Knowledge Base.

Key robustness improvements

Raise your hand if you’ve ever seen this error on your Linux Accelerator agent hosts:

unable to unmount EFS at “/some/path”: EBUSY

That error shows up sometimes when your build starts background processes — kind of a distributed build anti-pattern itself, but unfortunately it’s not always something you can control thanks to some third-party toolchains. Or rather, that error used to show up sometimes, because in Accelerator 6.2 we’ve bulletproofed the system against such rogue background processes, so that error is a thing of the past (nota bene: this enhancement is not available on Solaris).

In addition, we bulletproofed the system against external processes (any process running on an agent host which is not part of your build) accessing the EFS. In certain rare circumstances, such accesses could lead to agent host instability.

What’s next?

With 6.2 out the door we’ve finally got bandwidth to work on 7.0, which will focus on some very exciting performance improvements, especially for incremental builds. It’s a little bit too early to share any of the preliminary results we’re seeing, but rest assured — if you thought Accelerator was fast before, well… you ain’t seen nothing yet! Stay tuned for more information.

ElectricAccelerator 6.2 is available immediately. If you are already an Accelerator user, contact support@electric-cloud.com to upgrade. If you are not currently a user, you can download a free evaluation version of ElectricAccelerator Developer Edition, or contact sales@electric-cloud.com.

Fixing recursive make

Recursive make is one of those things that everybody loves to hate. It’s even been the subject of one of those tired “… Considered Harmful” diatribes. According to popular opinion, recursive make will sap performance from your build, make it nigh impossible to ensure correctness in parallel builds, and may render the user sterile. OK, maybe not that last one. But seriously, the arguments against recursive make are legion, and deeply entrenched. The problem? They’re flawed. That’s because they assume there’s only one way to implement recursive make — when the submake is invoked, the parent make is blocked until the submake completes. That’s how almost everybody does it. But in Electric Make, part of ElectricAccelerator, we developed a novel new approach called non-blocking recursive make. This design eliminates the biggest problems attributed to recursive make, without requiring a painful and costly conversion of your build system to non-recursive make.

The problem with traditional recursive make

There’s really just two problems at the heart of complaints with traditional recursive make: first, there’s no way to ensure correctness of a parallel recursive make based build without overserializing the submakes, because there’s no way to articulate dependencies between individual targets in different submakes. That means you can’t have a dependency graph that is both correct and precise. Instead you either leave out the critical dependency entirely, which makes parallel (ie, fast) builds unreliable; or you serialize submakes in their entirety, which shackles build performance because no part of a submake with even a single dependency on some portion of an earlier submake can begin until the entire ealier submake completes. Second, even if there were a way to specify precise dependencies between targets in different submakes, most versions of make have implemented recursive make such that the parent make is blocked from proceeding until the submake has completed. Consider a typical use of recursive make with implicit serializations between submakes:

1
2
3
4
all:
@for dir in util client server ; do \
$(MAKE) -C $$dir; \
done

Each submake compiles a bunch of source files, then links them together into a library (util) or an executable (client and server). The only actual dependency between the work in the three make instances is that the client and server programs need the util library. Everything else is parallelizable, but with traditional recursive make, gmake is unable to exploit that parallelism: all of the work in the util submake must finish before any part of the client submake begins!

Conflict detection and non-blocking recursive make

If you’re familiar with Electric Make, you already know how it solves the first half of the recursive make problem: conflict detection and correction. I’ve written about conflict detection before, but here’s a quick recap: using the explicit dependencies given in the makefiles and information about the files accessed as each target is built, emake is able to dynamically determine when targets have been built too early due to missing explicit dependencies, and rerun those targets to generate the correct output. Electric Make can ensure the correctness of parallel builds even in the face of incomplete dependencies, even if the missing dependencies are between targets in different submakes. That means you need not serialize entire submakes to ensure the build will run correctly in parallel.

Like an acrobat’s safety net, conflict detection allows us to consider solutions to the other half of the problem that would otherwise be considered risky, if not outright madness. In fact, our solution would not be possible without conflict detection: non-blocking recursive make. This is analogous to the difference between blocking and non-blocking I/O: rather than waiting for a recursive make to finish, emake carries on executing subsequent commands in the build immediately, including other recursive makes. Conflict detection ensures that only the commands in each submake which require serialization are executed sequentially, so the build runs as quickly as possible, but the final build output is identical to a serial build.

The impact of this change is dramatic. Here I’ve plotted the execution of the simple build defined above on four cores, using both gmake (normal recursive make) and emake (non-blocking recursive make):

Recursive make build with gmake


Recursive make build with emake

Electric Make is able to execute this build about 20% faster than gmake, with no changes to the Makefiles or the execution environment. emake is literally able to squeeze more parallelism out of recursive-make-based builds than gmake. In fact, we can precisely quantify just how much more parallelism emake gets through an application of Amdahl’s law. First, we compute the best possible speedup for the build — that’s just the serial runtime divided by the best possible parallel runtime, which we can figure out through analysis of the depedency graph and runtime of individual jobs in the build (the Longest Serial Chain report in ElectricInsight can do this for you). Then we can compute the parallelizable portion P of the build by plugging the speedup S into this equation: P = 1 – (1 / S). Here’s how that works out for gmake and emake:

gmake emake
Serial baseline 65s 65s
Best build time 13.5s 7.5s
Best speedup 4.8x 8.7x
Parallel portion 79% 89%

On this build, non-blocking recursive make increases the parallel portion of the build by 10%. That may not seem like much, but Amdahl’s law shows how dramatically that difference affects the speedup you can expect as you apply more cores:

Implementation

On the backend, non-blocking recursive make is handled by conflict detection — the jobs from the recursive make are checked for conflicts in the serial order defined by the makefile structure. Any issues caused by aggressively running recursive makes early are detected during the conflict check, and the target that ran too early is rerun to generate the correct result.

On the frontend, emake uses a strategy that is at once both brilliant in its simplicity, and diabolical in its trickery. It starts with an environment variable. When emake is invoked recursively, it checks the value of EMAKE_BUILD_MODE. If it is set to node, emake runs in so-called stub mode: rather than executing the submake (parsing the makefile and building targets), emake captures the invocation context (working directory, command-line and environment) in a file on disk, prints a “magic” string and exits with a zero status code.

The file containing the invocation context is identified by a second environment variable, ECLOUD_RECURSIVE_COMMAND_FILE. The Accelerator agent (which handles invoking commands on behalf of emake) checks for the presence of that file after every command that is run. If it is found, the agent relays the content to the toplevel emake invocation, where a new make instance is created to represent the submake invocation. That instance comes with it’s own parse job of course, which gets inserted into the queue of jobs. Some (short) time later, the parse job will run, discover whatever work must be run by the submake, and create additional rule jobs.

The magic string — EMAKE_FNORD — serves as a placeholder in the stdout stream for the jobs, so emake can figure out which portion of the output text comes before and which portion comes after the submake. This ensures that the build output log is identical to that generated by a serialized gmake build. For example, given the following rule that invokes a submake, you’d expect to see the “Before” and “After” messages printed before and after the output generated by commands in the submake itself:

1
2
3
4
all:
@echo Before util ; \
@$(MAKE) -C util ; \
@echo After util

With non-blocking recursive make, the submake has not actually executed when the “echo After util” command runs. If emake doesn’t account for that reordering, both the “Before” and “After” messages will appear before any of the output from the submake. EMAKE_FNORD allows emake to “stitch” the output together so the build log matches a serial log.

Limitations

Conflict detection and non-blocking recursive make together solve the main problems associated with recursive make. But there are a couple scenarios where non-blocking recursive make does not work well. Fortunately, these are uncommon in practice and easily addressed.

Capturing recursive make stdout

The first scenario is when the build captures the output of the recursive make invocation, rather than letting it print to stdout as normal. Since emake defers the execution of the submake and prints only EMAKE_FNORD to stdout, this will not work. There are two reasons you might do this: first, you might want to have separate build logs for each submake, to simplify error detection and management. In this situation, the simplest workaround is to remove the redirection and instead us emake’s annotated build log, an XML version of the build output log which can be easily processed using standard tools. Second, you may be using make as a text-processing tool (sort of a “poor man’s” Perl), rather than for building per se:

1
2
3
all:
@$(MAKE) -f genlist.mk > objects.txt
@cat objects.txt | xargs rm

In this case, the workaround is to explicitly force emake to run in so-called “local” mode, which means emake will handle the recursive make invocation as a blocking invocation, just like traditional make would. You can force emake into local mode by adding EMAKE_BUILD_MODE=local to the environment before the recursive make invocation.

Immediate consumption of build products

The second scenario is when the build consumes the product of the submake in the same command that contains the invocation. For example:

1
2
all:
@$(MAKE) -C sub foo && cp sub/foo ./foo

Here the build assumes that the output files generated by the submake will be available for use immediately after the submake completes. Obviously this is not the case with non-blocking recursive make — when the invocation of $(MAKE) -C sub foo completes, only the submake stub has actually finished. The build products will not be available until after the submake is actually processed later. Note that in this build both the recursive make invocation and the commands that use the build products from that invocation are treated as a single command from the perspective of make: make actually invokes the shell, and the shell then runs the recursive make and cp commands.

The workaround is simple: split the consumer into a distinct command, from the perspective of make:

1
2
3
all:
@$(MAKE) -C sub foo
@cp sub/foo ./foo

With that trivial change, emake is able to treat the cp as a continuation job, which can be serialized against the completion of the recursive make as needed.

A fix for recursive make

For years, people have heaped scorn and criticism on recursive make. They’ve nearly convinced everybody that even considering its use is automatically wrong — you probably can’t help feeling a little bit guilty when you use recursive make. But the reality is that recursive make is a reasonable way to structure a large build. You just need a better make. With conflict detection and non-blocking recursive make, Electric Make has fixed the problems usually associated with recursive make, so you can get parallel builds that are both fast and correct. Give it a try!