I cooked up this wallpaper for my iPhone the other day, and thought some of my readers might like to use it too. It’s sized to fit the iPhone 4 retina display, but it scales well to the iPhone 3 display too.
Author: Eric Melski
The ElectricAccelerator 6.1 “Ship It!” Award
Having shipped ElectricAccelerator 6.1, I thought you might like to see the LEGO-based “Ship It!” award that I gave each member of the development team. I started this tradition with the 6.0 release last fall. Here’s the baseball card that accompanied the detective minifig I chose for this release:
I picked the detective minifig for the 6.1 release in recognition of the significant improvements to Accelerator’s diagnostic capabilities (like cyclic redundancy checks to detect faulty networks, and MD5 checksums to detect faulty disks). Compared to the 6.0 award not much has changed in the design, although I did get my hands on the “official” corporate font this time. It strikes me that there’s a lot of wasted space on the back of the card though. Next time I’ll make better use of the space by incorporating statistics about the release. I actually have the design all ready to go, but you’ll have to wait until after the release to see it. Don’t fret though, the 6.2 release is expected soon!
Fixing recursive make
Recursive make is one of those things that everybody loves to hate. It’s even been the subject of one of those tired “… Considered Harmful” diatribes. According to popular opinion, recursive make will sap performance from your build, make it nigh impossible to ensure correctness in parallel builds, and may render the user sterile. OK, maybe not that last one. But seriously, the arguments against recursive make are legion, and deeply entrenched. The problem? They’re flawed. That’s because they assume there’s only one way to implement recursive make — when the submake is invoked, the parent make is blocked until the submake completes. That’s how almost everybody does it. But in Electric Make, part of ElectricAccelerator, we developed a novel approach called non-blocking recursive make. This design eliminates the biggest problems attributed to recursive make, without requiring a painful and costly conversion of your build system to non-recursive make.
The problem with traditional recursive make
There are really just two problems at the heart of the complaints about traditional recursive make. First, there’s no way to ensure correctness of a parallel recursive-make-based build without overserializing the submakes, because there’s no way to articulate dependencies between individual targets in different submakes. That means you can’t have a dependency graph that is both correct and precise. Instead you either leave out the critical dependency entirely, which makes parallel (i.e., fast) builds unreliable; or you serialize submakes in their entirety, which shackles build performance, because no part of a submake with even a single dependency on some portion of an earlier submake can begin until the entire earlier submake completes. Second, even if there were a way to specify precise dependencies between targets in different submakes, most versions of make have implemented recursive make such that the parent make is blocked from proceeding until the submake has completed. Consider a typical use of recursive make with implicit serializations between submakes:
all:
	@for dir in util client server ; do \
		$(MAKE) -C $$dir; \
	done
Each submake compiles a bunch of source files, then links them together into a library (util) or an executable (client and server). The only actual dependency between the work in the three make instances is that the client and server programs need the util library. Everything else is parallelizable, but with traditional recursive make, gmake is unable to exploit that parallelism: all of the work in the util submake must finish before any part of the client submake begins!
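To make the example concrete, here's a minimal sketch of what the three submakefiles might look like (the file names and layout are assumptions for illustration, not taken from any real build):

# util/Makefile (sketch)
libutil.a: util.o
	ar r libutil.a util.o

# client/Makefile (sketch): needs the util library
client: client.o ../util/libutil.a
	$(CC) -o client client.o -L../util -lutil

# server/Makefile (sketch): also needs the util library
server: server.o ../util/libutil.a
	$(CC) -o server server.o -L../util -lutil

Everything except the two link steps could proceed in parallel; only the links truly need to wait for libutil.a.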
Conflict detection and non-blocking recursive make
If you’re familiar with Electric Make, you already know how it solves the first half of the recursive make problem: conflict detection and correction. I’ve written about conflict detection before, but here’s a quick recap: using the explicit dependencies given in the makefiles and information about the files accessed as each target is built, emake is able to dynamically determine when targets have been built too early due to missing explicit dependencies, and rerun those targets to generate the correct output. Electric Make can ensure the correctness of parallel builds even in the face of incomplete dependencies, even if the missing dependencies are between targets in different submakes. That means you need not serialize entire submakes to ensure the build will run correctly in parallel.
Like an acrobat’s safety net, conflict detection allows us to consider solutions to the other half of the problem that would otherwise be considered risky, if not outright madness. In fact, our solution would not be possible without conflict detection: non-blocking recursive make. This is analogous to the difference between blocking and non-blocking I/O: rather than waiting for a recursive make to finish, emake carries on executing subsequent commands in the build immediately, including other recursive makes. Conflict detection ensures that only the commands in each submake which require serialization are executed sequentially, so the build runs as quickly as possible, but the final build output is identical to a serial build.
The impact of this change is dramatic. Here I’ve plotted the execution of the simple build defined above on four cores, using both gmake (normal recursive make) and emake (non-blocking recursive make):
Electric Make is able to execute this build about 20% faster than gmake, with no changes to the Makefiles or the execution environment. emake is literally able to squeeze more parallelism out of recursive-make-based builds than gmake. In fact, we can precisely quantify just how much more parallelism emake gets through an application of Amdahl’s law. First, we compute the best possible speedup for the build — that’s just the serial runtime divided by the best possible parallel runtime, which we can figure out through analysis of the dependency graph and the runtimes of individual jobs in the build (the Longest Serial Chain report in ElectricInsight can do this for you). Then we can compute the parallelizable portion P of the build by plugging the speedup S into this equation: P = 1 - (1 / S). Here’s how that works out for gmake and emake:
|                  | gmake | emake |
|------------------|-------|-------|
| Serial baseline  | 65s   | 65s   |
| Best build time  | 13.5s | 7.5s  |
| Best speedup     | 4.8x  | 8.7x  |
| Parallel portion | 79%   | 89%   |
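As a quick check of the arithmetic, the parallel portion in each case follows directly from the best speedup:

\[ S_{\mathrm{gmake}} = \tfrac{65}{13.5} \approx 4.8, \qquad P = 1 - \tfrac{1}{4.8} \approx 79\% \]
\[ S_{\mathrm{emake}} = \tfrac{65}{7.5} \approx 8.7, \qquad P = 1 - \tfrac{1}{8.7} \approx 89\% \]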
On this build, non-blocking recursive make increases the parallel portion of the build by ten percentage points, from 79% to 89%. That may not seem like much, but Amdahl’s law shows how dramatically that difference affects the speedup you can expect as you apply more cores:
Implementation
On the backend, non-blocking recursive make is handled by conflict detection — the jobs from the recursive make are checked for conflicts in the serial order defined by the makefile structure. Any issues caused by aggressively running recursive makes early are detected during the conflict check, and the target that ran too early is rerun to generate the correct result.
On the frontend, emake uses a strategy that is at once both brilliant in its simplicity, and diabolical in its trickery. It starts with an environment variable. When emake is invoked recursively, it checks the value of EMAKE_BUILD_MODE. If it is set to node, emake runs in so-called stub mode: rather than executing the submake (parsing the makefile and building targets), emake captures the invocation context (working directory, command-line and environment) in a file on disk, prints a “magic” string and exits with a zero status code.
The file containing the invocation context is identified by a second environment variable, ECLOUD_RECURSIVE_COMMAND_FILE. The Accelerator agent (which handles invoking commands on behalf of emake) checks for the presence of that file after every command that is run. If it is found, the agent relays the content to the top-level emake invocation, where a new make instance is created to represent the submake invocation. That instance comes with its own parse job of course, which gets inserted into the queue of jobs. Some (short) time later, the parse job will run, discover whatever work must be run by the submake, and create additional rule jobs.
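Here's a rough sketch of the stub-mode flow in shell pseudocode; this is just an illustration of the behavior described above, not the actual emake implementation:

# Illustrative sketch only -- not real emake source.
if [ "$EMAKE_BUILD_MODE" = "node" ]; then
    # Capture the invocation context for the top-level emake...
    {
        echo "cwd: $PWD"
        echo "argv: $0 $*"
        env
    } > "$ECLOUD_RECURSIVE_COMMAND_FILE"
    # ...print the magic placeholder string (described next)...
    echo "EMAKE_FNORD"
    # ...and exit successfully without parsing or building anything.
    exit 0
fi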
The magic string — EMAKE_FNORD — serves as a placeholder in the stdout stream for the jobs, so emake can figure out which portion of the output text comes before and which portion comes after the submake. This ensures that the build output log is identical to that generated by a serialized gmake build. For example, given the following rule that invokes a submake, you’d expect to see the “Before” and “After” messages printed before and after the output generated by commands in the submake itself:
all:
	@echo Before util ; \
	$(MAKE) -C util ; \
	echo After util
With non-blocking recursive make, the submake has not actually executed when the “echo After util” command runs. If emake doesn’t account for that reordering, both the “Before” and “After” messages will appear before any of the output from the submake. EMAKE_FNORD allows emake to “stitch” the output together so the build log matches a serial log.
Limitations
Conflict detection and non-blocking recursive make together solve the main problems associated with recursive make. But there are a couple scenarios where non-blocking recursive make does not work well. Fortunately, these are uncommon in practice and easily addressed.
Capturing recursive make stdout
The first scenario is when the build captures the output of the recursive make invocation, rather than letting it print to stdout as normal. Since emake defers the execution of the submake and prints only EMAKE_FNORD to stdout, this will not work. There are two reasons you might do this: first, you might want to have separate build logs for each submake, to simplify error detection and management. In this situation, the simplest workaround is to remove the redirection and instead use emake’s annotated build log, an XML version of the build output log which can be easily processed using standard tools. Second, you may be using make as a text-processing tool (sort of a “poor man’s” Perl), rather than for building per se:
all:
	@$(MAKE) -f genlist.mk > objects.txt
	@cat objects.txt | xargs rm
In this case, the workaround is to explicitly force emake to run in so-called “local” mode, which means emake will handle the recursive make invocation as a blocking invocation, just like traditional make would. You can force emake into local mode by adding EMAKE_BUILD_MODE=local to the environment before the recursive make invocation.
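For example, a sketch based on the makefile above:

all:
	@EMAKE_BUILD_MODE=local $(MAKE) -f genlist.mk > objects.txt
	@cat objects.txt | xargs rm

Setting the variable inline like this scopes it to just that one recursive make invocation.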
Immediate consumption of build products
The second scenario is when the build consumes the product of the submake in the same command that contains the invocation. For example:
all:
	@$(MAKE) -C sub foo && cp sub/foo ./foo
Here the build assumes that the output files generated by the submake will be available for use immediately after the submake completes. Obviously this is not the case with non-blocking recursive make — when the invocation of $(MAKE) -C sub foo completes, only the submake stub has actually finished. The build products will not be available until after the submake is actually processed later. Note that in this build both the recursive make invocation and the commands that use the build products from that invocation are treated as a single command from the perspective of make: make actually invokes the shell, and the shell then runs the recursive make and cp commands.
The workaround is simple: split the consumer into a distinct command, from the perspective of make:
all:
	@$(MAKE) -C sub foo
	@cp sub/foo ./foo
With that trivial change, emake is able to treat the cp as a continuation job, which can be serialized against the completion of the recursive make as needed.
A fix for recursive make
For years, people have heaped scorn and criticism on recursive make. They’ve nearly convinced everybody that even considering its use is automatically wrong — you probably can’t help feeling a little bit guilty when you use recursive make. But the reality is that recursive make is a reasonable way to structure a large build. You just need a better make. With conflict detection and non-blocking recursive make, Electric Make has fixed the problems usually associated with recursive make, so you can get parallel builds that are both fast and correct. Give it a try!
What’s new in ElectricAccelerator 6.1
Electric Cloud announced the release of ElectricAccelerator 6.1 a few weeks ago, the 24th feature release of ElectricAccelerator since the company was founded in 2002. This release incorporates several enhancements that make the product more robust and more flexible. Here are the key additions:
Workload reporting
In many large organizations, a single massive Accelerator cluster is shared across multiple development teams, in order to reduce administration overhead and improve hardware utilization — naturally each team has “their” agents, but if those are not in use, they can be easily made available to other teams. Often, the maintenance cost is shared by the teams using the cluster, according to the amount they use the cluster. In order to facilitate this use case, Accelerator 6.1 adds workload reporting. At the conclusion of each build, emake reports the total CPU usage for the build to the cluster manager. The administrator can see a summary of usage by running the “Build Users” report from the cluster manager web interface:
Multi-interface network support
In data centers it’s not uncommon to find servers configured with two network interfaces: for example, a gigabit connection for communication with systems outside the data center, and a 10 GigE, fiber-optic or Infiniband connection for extremely high-bandwidth communication with other systems inside the data center. Accelerator 6.1 adds explicit support for this configuration, so data transfers between agents in the cluster utilize the high-bandwidth secondary interface for increased performance.
Strong checksums for data integrity
For years Accelerator has relied on the checksum built into the TCP standard to ensure integrity of network data transfers. Unfortunately that checksum is relatively weak, and in rare cases we found it was insufficient to detect data errors introduced by faulty network hardware. In this release, we added an application-layer checksum to further guard against such problems. Accelerator 6.1 uses CRC32-c, chosen for its robust error detection capabilities and high performance.
Expanded platform support
Accelerator 6.1 adds support for a few new platforms and third-party tools, including:
- Ubuntu 11.04
- ClearCase 8
What’s next?
It feels great to have wrapped up Accelerator 6.1, but we’re not resting on our laurels. We’ve already done a lot of work for 6.2, which includes some key robustness improvements that have been literally years in the making; and we’re getting started on 7.0, which currently is planned to include some really exciting new performance enhancements around incremental builds — more on that soon.
Accelerator 6.1 is available immediately for current customers. New customers should contact sales@electric-cloud.com.
Measuring the Electric File System
Somebody asked me the other day what portion of the Electric File System (EFS) is shared code versus platform-specific code. The EFS, if you don’t know, is a custom filesystem driver and part of ElectricAccelerator. It enables us to virtualize the filesystem, so that each build job sees the correct view of the filesystem according to its logical position in the build, even if jobs are run out of order. It also provides the file usage information that powers our conflict detection algorithm. As a filesystem driver, the EFS is obviously tightly coupled to the platforms it’s used on, and the effort of porting it is one reason we don’t support more platforms than we do now — not that a dozen variants of Windows, 16 flavors of Linux and several versions of Solaris is anything to be ashamed of!
Anyway, the question intrigued me, and I found the answer quite surprising:
Note that here I’m measuring only actual code lines, exclusive of comments and whitespace, as of the upcoming 6.1 release. In total, the Windows version of the EFS has about 2x the lines of code that either the Solaris or Linux ports have. The platform-specific portion of the code is more than 6x greater!
Why is the Windows port so much bigger? To answer that question, I started looking at the historical size of the three ports, which led to the following graph showing the total lines of code for (almost) every release we’ve made. Unfortunately our first two releases, 1.0 and 2.0, have been lost to the ether at some point over the past ten years, but I was able to collect data for every release starting from 2.1:
Here you can see that the Windows port has always been larger than the others, but it’s really just a handful of Windows-specific features that blew up the footprint so much. The first of those was support for the Windows FastIO filesystem interface, an alternative “fast path” through the kernel that in theory enables higher throughput to and from the filesystem. It took us two tries to get that feature implemented, as shown on the graph, and all-told that contributed about 7,000 lines of code. The addition of FastIO to the filesystem means that the Windows driver has essentially three I/O interfaces: standard I/O, memory-mapped I/O and FastIO. In comparison, on Linux and Solaris you have only two: standard I/O and memory-mapped I/O.
The second significant difference between the platforms is that on Windows the EFS has to virtualize the registry in addition to the filesystem. In the 4.3 release we significantly enhanced that portion of the driver to allow full versioning of the registry along the same lines that the filesystem has always done. That feature added about 1,000 lines of code.
I marked a couple other points of interest on this graph as well. First, the addition of the “lightweight EFS” feature, which is when we switched from using RAM to store the contents of all files in the EFS to using temporary storage on the local filesystem for large files. That enabled Accelerator to handle significantly larger files, but added a fair amount of code. Finally, in the most recent release you can actually see a small decrease in the total lines of code on Solaris and Linux, which reflects the long-overdue removal of code that was needed to support legacy versions of those platforms (such as the 2.4.x Linux kernel).
I was surprised as well by the low rate of growth in the Solaris and Linux ports. Originally the Linux port supported only one version of the Linux kernel, but today it supports about sixteen. I guess this result reveals that the difficulty in porting to each new Linux version is not so much the amount of code to be added, but in figuring out exactly which few lines to add!
In fact, after the 4.4 release in early 2009, the growth has been relatively slow on all platforms, despite the addition of major new features to Accelerator as a whole over the last several releases. The reason is simply that most feature development involves changes to other components (primarily emake), rather than to the filesystem driver.
One last metric to look at is the number of unit tests for the code in the EFS. We don’t practice test-driven development for the most part, but we do place a strong emphasis on unit testing. Developers are expected to write unit tests for every change, and we strive to keep the tests isomorphic to the code to help ensure we have good coverage. Here’s how the total number of unit tests has grown over the years:
Note that this is looking only at unit tests for the EFS! We have thousands more unit tests for the other components of Accelerator, as well as thousands of integration tests.
Thankfully for my credibility, the growth in number of tests roughly matches the growth in lines of code! But I’m surprised that the ratio of tests to code is not more consistent across the platforms — that suggests a possible area for improvement. Rather than being discouraged by that though, I’m pleased with this result. After all, what’s the point of measuring something if you don’t use that data to improve?
Another confusing conflict in ElectricAccelerator
After solving the case of the confounding conflict, my user came back with another scenario where ElectricAccelerator produced an unexpected (to him) conflict:
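A minimal sketch of this kind of build, assuming a submake sub.mk whose default target creates foo:

all:
	@$(MAKE) -f sub.mk && cp foo bar

# sub.mk (sketch)
foo:
	@touch foo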
If you run this build without a history file, using at least two agents, you will see a conflict on the continuation job that executes the cp foo bar command, because that job is allowed to run before the job that creates foo in the recursive make invocation. After one run of course, emake records the dependency in history, so later builds don’t make the same mistake.
This situation is a bit different from the symlink conflict I showed you previously. In that case, it was not obvious what caused the usage that triggered the conflict (the GNU make stat cache). In this case, it’s readily apparent: the continuation job reads (or attempts to read) foo before foo has been created. That’s pretty much a text-book example of the sort of thing that causes conflicts.
What’s surprising in this example is that the continuation job is not automatically serialized with the recursive make that precedes it. In a very real sense, a continuation job is an artificial construct that we created for bookkeeping reasons internal to the implementation of emake. Logically we know that the commands in the continuation job should follow the commands in the recursive make. In fact it would be absolutely trivial for emake to just go ahead and stick in a dependency to ensure that the continuation is not allowed to start until after the recursive make finishes, thereby avoiding this conflict even when you have no history file.
Absolutely trivial to do, yes — but also absolutely wrong. Not for correctness reasons, this time, but for performance. Remember, emake is all about maximizing performance across a broad range of builds. Given a choice between two strategies that both produce correct output, emake uses the strategy that produces the best performance in the general case. For continuation jobs, that means not automatically serializing the continuation against the preceding recursive make. I could give you a wordy, theoretical explanation, but it’s easier to just show you. Suppose that your makefile looked like this instead of the original — the difference here is that the continuation job itself launches another recursive make, rather than just doing a simple cp:
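A sketch of that variant, again with assumed submakefiles; now the continuation launches a second recursive make that creates bar:

all:
	@$(MAKE) -f sub.mk && $(MAKE) -f sub2.mk

# sub.mk (sketch)
foo:
	@touch foo

# sub2.mk (sketch)
bar:
	@touch bar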
Hopefully you agree that the ideal execution of this build would have both foo and bar running in parallel. Forcing the continuation job to be serialized with the preceding recursive make would choke the performance of this build. And just in case you’re thinking that emake could be really clever by looking at the commands to be executed in the continuation job, and only serializing “when it needs to”: it can’t. First, that would require emake to implement an entire shell syntax parser (or several, really, since you can override SHELL in your makefile). Second, even if emake had that ability, it would be thwarted the instant the command is something like my_custom_script.pl — there’s no way to tell what will happen when that gets invoked. It could be a simple filesystem access. It could be a recursive make. It could be a whole series of recursive makes. Even when the command is something you think you recognize, can emake really be sure? Maybe cp is not our trustworthy standard Unix cp, but something else entirely.
Again, all is not lost for this user. If you want to avoid this conflict, you have a couple options:
- Use a good history file from a previous build. This is the simplest solution. You’ll only get conflicts in this build if you run without a history file.
- Refactor the makefile. You can explicitly describe the dependency between the commands in the continuation job and the recursive make by refactoring the makefile so that the stuff in the continuation is instead its own target, thus taking the decision out of emake’s hands. Here’s one way to do that:
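A minimal sketch of that refactoring, continuing with the assumed sub.mk from above:

all: bar

# bar is now a first-class target, with an explicit dependency on
# the target that runs the submake that produces foo.
bar: submake
	@cp foo bar

.PHONY: submake
submake:
	@$(MAKE) -f sub.mk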
Either of these will eliminate the conflict from your build.
ElectricAccelerator and the Case of the Confounding Conflict
A user recently asked me why ElectricAccelerator reports a conflict in this simple build, when executed without a history file from a previous run:
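Something along these lines, assuming foo is created by a simple touch; the essential ingredients are a file and a symlink to it, built as independent targets:

all: foo symlink_to_foo

foo:
	@touch foo

symlink_to_foo:
	@ln -s foo symlink_to_foo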
Specifically, if you have at least two agents, emake will report a conflict between symlink_to_foo and foo, indicating that symlink_to_foo somehow read or otherwise accessed foo during execution! But ln does not access the target of a symlink when creating the symlink — in fact, you can even create a symlink to a non-existent file if you like. It seems obvious that there should be no conflict. What’s going on?
To understand why this conflict occurs, you have to wrap your head around two things. First, there’s more going on during a gmake-driven build than just the commands you see gmake invoke. That causes the usage that provokes the conflict. Second, emake considers a serial gmake build the “gold standard” — if a serial gmake build produces a particular result, so too must emake. That’s why the additional usage must result in a conflict.
In this case, the usage that triggers the conflict comes from management of the gmake stat cache. This is a gmake feature that was added to improve performance by avoiding redundant calls to stat() — once you’ve stat()‘d a file once, you don’t need to do it again. Unless the file is changed of course, which happens quite a lot during a build. To keep the stat cache up-to-date as the build progresses, gmake re-stat()‘s each target after it finishes running the commands for the target. So after the commands for symlink_to_foo complete, gmake stat()‘s symlink_to_foo again, using the standard stat() system call, which follows the symlink (in contrast to lstat(), which does not follow the symlink). That means gmake will actually cache the attributes of foo for symlink_to_foo.
To ensure compatibility with gmake, emake has to do the same. In Accelerator parlance, that means we get read usage on symlink_to_foo (because you have to read the symlink itself to determine the target of the symlink), and lookup usage on foo. The lookup on foo causes the conflict, because, of course, you will get a different result if you lookup foo before the job that creates it than you would get if you do the lookup after that job. Before the job, you’ll find that foo does not exist, obviously; after, you’ll find that it does.
But what difference does that make, really? In truth, if there’s no detectable difference in behavior, then it doesn’t matter at all. And in the example build there is no detectable difference — the build output is the same regardless of when exactly you stat() symlink_to_foo relative to when foo is created. But with a small modification to the build, it suddenly becomes possible to see the impact:
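A sketch of such a modification, adding a reader target that prints its out-of-date prerequisites via $? (the target name and commands here are assumptions for illustration):

all: foo symlink_to_foo reader

foo:
	@touch foo

symlink_to_foo:
	@ln -s foo symlink_to_foo

reader: symlink_to_foo
	@echo Prereqs newer than reader: $?
	@touch reader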
Compare the output when this build is run serially with the output when the build is run in parallel — and note that I’m using gmake, so you can be certain I’m not trying to trick you with some peculiarity of emake’s implementation:
You can plainly see the difference: in the parallel build gmake stat()‘s symlink_to_foo before foo exists, so the stat cache records symlink_to_foo as non-existent. Then when gmake generates the value of $? for reader, symlink_to_foo is excluded, because non-existent files are never considered newer than existing files. In the serial build, gmake stat()‘s symlink_to_foo after foo has been created, so the stat cache indicates that symlink_to_foo exists and is newer than reader, so it is included in $?.
Hopefully you see now both what causes the conflict, and why it is necessary. The conflict occurs because of lookup usage generated when updating the stat cache. The conflict is necessary to ensure that the build output matches that produced by a serial gmake — the “gold standard” for build correctness. If no conflict is declared, there is the possibility for a detectable difference in build output compared to serial gmake.
However, you might be thinking that although it makes sense to treat this as a conflict in the general case, isn’t it possible to do something smarter in this specific case? After all, the original example build does not use $?, and without that there isn’t any detectable difference in the build output. So why not skip the conflict?
The answer is simple, if a bit disappointing. In theory it may be possible to elide the conflict by checking to see if the symlink is used by a later job in a manner that would produce a detectable difference (for example, by scanning the commands for subsequent targets for references to $?), but in reality the logistics of that check are daunting, and I’m not confident that we could guarantee correct behavior in all cases.
Fortunately all is not lost. If you wish to avoid this conflict, you have several options:
- Use a good history file from a previous build. This is the most obvious solution. You’ll only get conflicts if you run without a history file.
- Add an explicit dependency. If you make foo an explicit prereq of symlink_to_foo, then you will avoid the conflict. Here’s how that would look (first sketch below).
- Change the serial order. If you reorder the makefile so that symlink_to_foo has an earlier serial order than foo you will avoid the conflict. That just requires a reordering of the prereqs of all (second sketch below).
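Here is the explicit-dependency version, building on the reconstructed makefile above:

symlink_to_foo: foo
	@ln -s foo symlink_to_foo

And here is the reordering version; listing symlink_to_foo before foo gives it the earlier serial order:

all: symlink_to_foo foo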
Any one of these will eliminate the conflict from your build, and you’ll enjoy fast and correct parallel builds.
Case closed.
ElectricMake debug log levels
Often when analyzing builds executed with ElectricMake, all the information you need is in the annotation file — an easily digested XML file containing data such as the relationships between the jobs, the commands run, and the timing of each job. But sometimes you need more detail, and that’s where the emake debug log comes in.
To enable emake debug logging, you specify a pair of command-line arguments: --emake-debug=value, which specifies the types of debug logging to enable as a set of single-letter values, such as “jng”; and --emake-logfile=path, which specifies the location of the debug log. In this article I’ll explain each of emake’s debug log levels. Use this index to jump to the definition of a specific log level:
- a: agent allocation
- c: cache
- e: environment
- f: filesystem
- g: profiling
- h: history
- j: job
- L: nmake lexer
- l: ledger
- m: memory
- n: node
- o: parse output
- p: parse
- r: parse relocation
- s: subbuild
- Y: authentication
DISCLAIMER: emake debug logs are intended for use by Electric Cloud engineering and support staff. Debug logging contents and availability are subject to change in any release, for any or no reason. Enter at your own risk, your mileage may vary, etc. etc. The information in this article refers to ElectricAccelerator 6.0.
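For example, to enable job, node, and profiling logging and write the log to a file (the path here is illustrative):

emake --emake-debug=jng --emake-logfile=/tmp/build.dlog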
a: agent allocation
Agent allocation logging provides detailed information about emake’s attempts to procure agents from the cluster manager during the build. If you think emake may be stalled trying to acquire agents, allocation logging will help to understand what’s happening.
c: cache
Cache logging records details about the filesystem cache used by emake to accelerate parse jobs in cluster builds. For example, it logs when a directory’s contents are added to the cache, and the result of lookups in the cache. Since it is only used during remote parse jobs, you’ll have to use it with the --emake-rdebug=value option. Use cache logging if you suspect a problem with the cached local filesystem.
e: environment
Environment logging augments node logging with a dump of the entire environment block for every job as it is sent to an agent. Normally this is omitted because it’s quite verbose (could be as much as 32KB per job). Usually you’re better off using env-level annotation, which is more compact and easier to parse.
f: filesystem
Filesystem logging records numerous details about emake’s interaction with its versioned filesystem data structure. In particular, it logs every time that emake looks up a file (when doing up-to-date checks, for example), and it logs every update to the versioned file system caused by file usage during the build’s execution. This level of logging is very verbose, so you shouldn’t enable it as a general rule. It’s most often used when diagnosing issues related to the versioned filesystem and conflicts.
g: profiling
Profiling logging is one of the easiest to interpret and most useful types of debug logging. When enabled, emake will emit hundreds of performance metrics at the end of the build. This is a very lightweight logging level, and is safe (even advisable) to enable for all builds.
h: history
History logging prints messages related to the data tracked in the emake history file — both filesystem dependencies and autodep information. When history logging is enabled, emake will print a message every time a dependency is added to the history file, and it will print information about the files checked during up-to-date checks based on autodep data. Enable history logging if you suspect a problem with autodep behavior.
j: job
Job logging prints minimal messages related to the creation and execution of jobs. For each job you’ll see a message when it starts running, when it finishes running, and when emake checks the job for conflicts. If there is a conflict in the job, you’ll see a message about that too. If you just want a general overview of how the build is progressing, j-level logging is a good choice.
L: nmake lexer
emake uses a generated parser to process portions of nmake makefiles. Lexer debug logging enables the debug logging in that generated code. This is generally not useful to end users as it is too low-level.
l: ledger
Ledger debug logging prints information about build decisions based on data in the ledger file, as well as updates made to the ledger file. Enable it if you believe the ledger is not functioning correctly.
m: memory
When memory logging is enabled, emake will print memory usage metrics to the debug log once per second. This includes the total process memory usage as well as current and peak memory usage grouped into several “buckets” corresponding to various types of data in emake. For example, the “Operation” bucket indicates the amount of memory used to store file operations; the “Variable” bucket is the amount of memory used for makefile variables. This is most useful when you are experiencing an out-of-memory failure in emake, as it can provide guidance as to how memory is being utilized during the build, and how quickly it is growing.
n: node
Node logging prints detailed information about all messages between emake and the agents, including filesystem data and commands executed. Together with job logging, this can give a very comprehensive picture of the behavior of a build. However, node logging is extremely verbose, so you should enable it only when you are chasing a specific problem.
o: parse output
When parse output logging is enabled, emake will preserve the raw result of parsing a makefile. The result is a binary file containing information about all the targets, rules, dependencies and variables extracted from makefiles read during a parse job. This can be useful when investigating parser incompatibility issues and scheduling issues (for example, if a rule is not being scheduled for execution when you expect). Note that this debug level only makes sense when parsing, which means you have to specify it in the --emake-rdebug option. The parse results will be saved in the --emake-rlogdir directory, named as parse_jobid.out. Note that the directory may be on the local disk of the remote nodes, depending on the value you specify!
p: parse
Parse debug logging prints extremely detailed information about the reading and interpretation of makefiles during a parse job. This is most useful when investigating parser compatibility issues. This output is very verbose, so you should only enable it when pursuing a specific problem. Note that like parse output logging, this debug level only makes sense during parsing, which means you have to specify it in the --emake-rdebug option. The parse log files will be saved in the --emake-rlogdir directory, named as parse_jobid.dlog. Note that the directory may be on the local disk of the remote nodes, depending on the value you specify!
r: parse relocation
Parse relocation logging prints low-level information about the process of transmitting parse result data to emake at the end of a parse job. It’s only used internally when we are extending the parse result format, and so is unlikely to be of interest to end users.
s: subbuild
Subbuild logging prints details about decisions made while using the emake subbuild feature. You should enable it if you believe that the subbuild feature is not working correctly.
Y: authentication
Authentication logging is a subset of node logging that prints only those messages related to authenticating emake to agents and vice-versa. If you are having problems using the authentication feature, you should enable this debug level.
Best of blog.melski.net – December 2011
Here are the posts that got the most views in December 2011:
- Shell commands in GNU Make: 18% of views
- LEGO Ship It! Awards: 13% of views
- How ElectricMake Guarantees Reliable Parallel Builds: 13% of views
- 6 reasons your development team should be using instant messaging: 12% of views
- HOWTO: use Gource with Perforce: 10% of views
Makefile hacks: automatically split long command lines
If you’ve worked on a large build system you’ve probably bumped into this error, or one like it:
gmake: execvp: /bin/sh: Argument list too long
This error means the length of some command-line in your makefile has grown past the system limit, which is typically in the 32 to 256 kilobyte range. It’s surprisingly easy to hit that limit. You start with a small list of object files to be linked together. Over time you add more, and the command-line gets a little longer. Add a few more and it gets longer still. Before you know it you have a monster command-line and your build starts failing.
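On most Unix-like systems you can check the limit for yourself with getconf:

getconf ARG_MAX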
The solution to this problem is simple: split the long command-line into several shorter command-lines. For example, ar r libraries/lib.a objects/foo.o objects/bar.o objects/baz.o objects/boo.o objects/bang.o becomes something like this:
ar r libraries/lib.a objects/foo.o objects/bar.o
ar r libraries/lib.a objects/baz.o objects/boo.o
ar r libraries/lib.a objects/bang.o
Simple in theory, but tedious to do by hand. And doing it manually is like putting a ticking time-bomb into your makefile — it’s only a matter of time before your build grows enough that you have to go through this exercise again.
I recently ran across a clever solution that exploits the $(eval) function in GNU make to split long command-lines automatically, eliminating the tedium and the time-bomb. After I show you the solution, I’ll explain it piece-by-piece.
The max_args function
The solution is a user-defined function called max_args that splits long command-lines into equal-length chunks:
define max_args
$(eval _args:=)
$(foreach obj,$3,$(eval _args+=$(obj))$(if $(word $2,$(_args)),$1$(_args)$(EOL)$(eval _args:=)))
$(if $(_args),$1$(_args))
endef

# EOL expands to a single newline character; the blank lines between
# "define EOL" and "endef" are what produce it.
define EOL


endef
And an example of its use:
OBJS:=a b c d e f g h

all:
	@$(call max_args,echo,2,$(OBJS))
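With this example, the eight arguments are split into chunks of two, so the build should print:

a b
c d
e f
g h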
The max_args function takes three parameters: the base command-line, the number of arguments per “chunk”, and the complete list of arguments. It expands to a series of command-lines — one for each chunk of arguments.
The trick behind max_args is the use of $(eval) to update a variable as a side-effect of gmake’s regular variable expansion activity. If you’re not familiar with gmake variable expansion, here’s a quick rundown: when gmake finds a variable or function reference, like $(something), it replaces the entire reference with an expanded value. In the case of a variable, that’s just the value of the variable. Most variables in gmake are recursive, which means that if the variable value itself contains embedded variable references, those will be expanded as well, recursively. In the case of a function, gmake evaluates the function and replaces the reference with the computed value.
The meat of max_args is on line 3. It starts with the $(foreach) function, which evaluates its third argument, the body of the loop, once for each word in its second argument — in this case, the list of objects passed in the call to max_args.
In max_args, the loop body has two components. The first is a call to $(eval), which simply appends the current value of the loop variable to an accumulator called _args.
The second component of the loop body uses $(if) and $(word) to check the length of _args. The $(word) function returns the nth word from a list, or an empty string if there are fewer than n words in the list. The $(if) function expands its second argument (the then clause) only if its first argument (the condition) expands to a non-empty string, so together these functions check if _args has the desired number of words, and if so the then clause of the $(if) is expanded.
The then clause of this $(if) has two components. The first constructs a completed command-line by concatenating the base command-line (given by $1, the first argument to the original max_args call), the accumulated arguments, and a newline character. Thanks to the rules of gmake expansion, this command-line is added to the overall expansion result for the max_args function. The second part of the then clause uses $(eval) to reset the accumulator.
If the chunk size does not evenly divide the number of arguments, the stragglers are emitted in a final command-line on the last line of max_args.
Limitations
max_args is handy but it has one significant limitation: command-line length limits are based on the number of bytes in the command-line, not the number of words in it. Unfortunately, gmake has no built-in way to count the number of characters in a string, but it does provide the $(words) built-in, so that’s what max_args uses. That means that to use it effectively, you have to guess at the number of arguments that will fit in a single command-line, for example by dividing the length limit by the average number of characters in each argument, then subtracting a bit to leave buffer for outliers.