post

#pragma multi and rules with multiple outputs in GNU make

Recently we released ElectricAccelerator 6.2, which introduced a new bit of makefile syntax — #pragma multi — which allows you to indicate that a single rule produces multiple outputs. Although this is a relatively minor enhancement, I’m really excited about it because this it represents a new direction for emake development: instead of waiting for the GNU make project to add syntactic features and then following some time later with our emulation, we’re adding features that GNU make doesn’t have — and hopefully they will have to follow us for a change!

Unfortunately I haven’t done a good job articulating the value of #pragma multi. Unless you’re a pretty hardcore makefile developer, you probably look at this and think, “So what?” So let’s take a look at the problem that #pragma multi solves, and why #pragma multi matters.

Rules with multiple outputs in GNU make

The problem we set out to solve is simply stated: how can you specify to GNU make that one rule produces two or more output files? The obvious — but wrong — answer is the following:

1
2
foo bar: baz
touch foo bar

Unfortunately, this fragment is interpreted by GNU make as declaring two rules, one for foo and one for bar — it just so happens that the command for each rule creates both files. That will do more-or-less the right thing if you run a from-scratch, serial build:

$ gmake foo bar
touch foo bar
gmake: `bar' is up to date.

By the time GNU make goes to update bar, it’s already up-to-date thanks to the execution of the rule for foo. But look what happens when you run this same build in parallel:

$ gmake -j 2 foo bar
touch foo bar
touch foo bar

Oops! — the files were updated twice. No big deal in this trivial example, but it’s not hard to imagine a build where running the commands to update a file twice would produce bogus output, particularly if those updates could be happening simultaneously.

So what’s a makefile developer to do? In standard GNU make syntax, there’s only one truly correct way to create a rule with multiple outputs: pattern rules:

1
2
%.x %.y: %.in
touch $*.x $*.y

In contrast with explicit rules, GNU make interprets this fragment as declaring a single rule that produces two output files. Sounds perfect, but there’s a significant limitation to this solution: all of the output files must share a common sequence in the filenames (called the stem in GNU make parlance). That is, if your rule produces foo.x and foo.y, then pattern rules will work for you because the outputs both have foo in their names.

If your output files do not adhere to that naming limitation, then pattern rules can’t help you. In that case, you’re pretty much out of luck: there is no way to correctly indicate to GNU make that a single rule produces multiple output files. There are a variety of hacks you can try to coerce GNU make to behave properly, but each has its own limitations. The most common is to nominate one of the targets as the “primary”, and declare that the others depend on that target:

1
2
3
bar: foo
foo: baz
touch foo bar

Watch what happens when you run this build serially from scratch:

$ gmake foo bar
touch foo bar
gmake: Nothing to be done for `bar'.

Not bad, other than the odd “nothing to be done” message. At least the files weren’t generated twice. How about running it in parallel, from scratch?

$ gmake -j 2 foo bar
touch foo bar
gmake: Nothing to be done for `bar'.

Awesome! We still have the odd “nothing to be done” message, but just as in the serial build, the command was only invoked one time. Problem solved? Nope. What happens in an incremental build? If you’re lucky, GNU make happens to do the right thing and regenerate the files. But in one incremental build scenario, GNU make utterly fails to do the right thing. Check out what happens if the secondary output is deleted, but the primary is not:

$ rm -f bar && gmake foo bar
gmake: `foo' is up to date.
gmake: Nothing to be done for `bar'.

That’s right: GNU make failed to regenerate bar. If you’re very familiar with the build system, you might realize what had happened and think to either delete foo as well, or touch baz so that foo appears out-of-date (which would cause the next run to regenerate both outputs). But more likely at this point you just throw your hands up and do a full clean rebuild.

Note that all of the alternatives in vanilla GNU make have similar deficiencies. This kind of nonsense is why incremental builds have a bad reputation. This is why we created #pragma multi.

Rules with multiple outputs in Electric Make

By default Electric Make emulates GNU make, so it inherits all of GNU make’s limitations regarding rules with multiple outputs — with one critical exception. Even when running a build in parallel, Electric Make ensures that the output matches that produced by a serial GNU make build, which means that even the original, naive attempt will “work” for full builds regardless of whether the build is serial (single agent) or parallel (multiple agents).

Given that foundation, why did we bother with #pragma multi? There are a couple reasons:

  1. Correct incremental builds: with #pragma multi you can correctly articulate the relationships between inputs and outputs and thus ensure that all the outputs get rebuilt in incremental builds, rather than using kludges and hoping for the best.
  2. Out-of-the-box performance: although Electric Make guarantees correct output of the build, if you don’t have an up-to-date history file for the build you may waste time and compute resources running commands that don’t need to be run (work that will eventually be discarded when Electric Make detects the error). In the examples shown here the cost is negligible, but in real builds it could be significant.

Using #pragma multi is easy: just add the directive before the rule that will generate multiple outputs:

1
2
3
#pragma multi
foo bar: baz
touch foo bar

Watch what happens when this makefile is executed with Electric Make:

$ emake foo bar
touch foo bar

Note that there is no odd “is up to date” or “nothing to be done” message for bar — because Electric Make understands that both outputs are created by a single rule. Let’s verify that the build works as desired in the tricky incremental case that foiled GNU make — deleting bar without deleting foo:

$ rm -f bar && emake foo bar
touch foo bar

As expected, both outputs are regenerated: even though foo existed, bar did not, so the commands were executed.

Summary: rules with multiple outputs

Let’s do a quick review of the strategies for creating rules with multiple outputs. For simplicity we can group them into three categories:

  • #pragma multi
  • The naive approach, which does not actually create a single rule with multiple outputs at all.
  • Any of the various hacks for approximating rules with multiple outputs.

Here’s how each strategy fares across a variety of build modes:

Electric Make GNU make
Full (serial) Full (parallel) Incremental Full (serial) Full (parallel) Incremental
#pragma multi N/A
Naive
Hacks


The table paints a grim picture for GNU make: there is no way to implement rules with multiple outputs using standard GNU make which reliably gives both correct results and good performance across all types of builds. The naive approach generates the output files correctly in serial builds, but may fail in parallel builds. The various hacks work for full builds, but may fail in incremental builds. Even in cases where the output files are generated correctly, the build is marred by spurious “is up to date” or “nothing to be done for” messages — which is why most of the entries in the GNU make side are yellow rather than green.

In contrast, #pragma multi allows you to correctly generate multiple outputs from a single rule, for both full and incremental builds, in serial and in parallel. The naive approach also “works” with Electric Make, in that it will produce correct output files, but like GNU make the build is cluttered with spurious warnings. And, unless you have a good history file, the naive approach can trigger conflicts which may negatively impact build performance. Finally, despite its sophisticated conflict detection and correction smarts, even Electric Make cannot ensure correct incremental builds when you’ve implemented one of the multiple output hacks.

So there you have it. This is why we created #pragma multi: without it, there’s just no way to get the job done quickly and reliably. You should give ElectricAccelerator a try.

try_eade_button2

Comments

  1. Soren Soe says:

    Hi Eric,

    I liked your max_args function trick. Works great with gnu make! But my company is using ElectricAccelerator and I get an EC error when using the trick. It appears appending to _args in the foreach loop doesn’t work with emake. Is that true? Is there a work-around? We are using 5.4.2.38725.

    Thanks,
    Soren

  2. Can you achieve this with .INTERMEDIATE in GNU Make? That is, if targets A B C are built at the same time from dependencies D E F, it seems to me that if you write:

    A B C: intermediate
    .INTERMEDIATE: intermediate
    intermediate: D E F
            # recipe that outputs A B C
    

    then this runs the recipe if any of D E F are out of date with respect to any of A B C, and runs the recipe at most once.

    • @Gareth: Great question. Your strategy is a variation on the use of a sentinel file to work around the limitations in gnu make. Unfortunately this strategy is not completely bulletproof. What happens if you delete A, B and C, then “touch intermediate” before invoking the build? None of A, B and C will get rebuilt.

      • I agree: if an actual file intermediate somehow gets created exists, then this approach will break.

      • There’s another subtle problem with this approach: suppose that A, B and C are themselves inputs for other build targets, like this:

        all: thing1 thing2
        
        thing1: A
                @echo building thing1
                @touch thing1
        
        thing2: B
                @echo building thing2
                @touch thing2
        
        A B C: intermediate
        .INTERMEDIATE: intermediate
        intermediate: D E F
                @echo building A B C!
                @touch A B C
        

        Now what happens if you run the build, then delete B, then run the build again? A, B and C will all be regenerated, but only thing2 will be subsequently rebuilt — thing1 will not! Depending on exactly what goes into A, B and C that may be OK, or it may leave you with inconsistent outputs.

        Of course if you then run a third build without otherwise affecting the state of the system, thing1 will be rebuilt, since it is in fact out-of-date relative to its inputs. Confusing stuff!

  3. I have tried to trace all the email threads about this for gmake, but it does not seem like this type of support has ever been added to gmake?

    • @jimsearle Unfortunately no, this functionality has not been added to gmake. There have been a variety of proposals over the years but it seems that nobody can agree on the syntax. But “#pragma multi” is already available in Electric Make, which you can get for free in ElectricAccelerator Huddle!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: