post

The ElectricAccelerator 6.2 “Ship It!” Award

Obviously with ElectricAccelerator 6.2 out the door, it’s time for a new “Ship It!” award. I picked the mechanic figure for this release because the main thrust of the release was to add some long-desired robustness improvements. Here’s the trading card that accompanied the figure:

Greased Lighting — it’s electrifyin’!

Loaded with metrics and analysis goodness!

As promised this iteration of the award includes some metrics comparing this release to previous releases:

  • Number of days in development. We spent 112 days working on the 6.2 release; the range of all feature releases is 80 days on the low end to 249 days at the high end.
  • JIRA issues closed. We closed out 92 issues with this release, including both defects and enhancement requests. The fewest we’ve done was 9 issues; the most was 740 issues.
  • Composition of issues. Of the 92 issues, about 55% were classified as “defects”, and the remaining 45% were “features” of varying magnitude.

Again, a big “Thank You!” goes out to the ElectricAccelerator team! I’m really excited to be working with such a talented group, and I can’t wait to show the world what we’re doing next!

post

What’s new in ElectricAccelerator 6.2?

We released ElectricAccelerator 6.2 a couples weeks ago, our 25th feature release. 6.2 was a quick interim release primarily intended to address a couple long-standing stability issues, but we managed to squeeze in some really interesting feature enhancements as well. Here’s what’s new:

Rules with multiple outputs? Yeah, we can do that.

Every now and then, makefile authors need to write a single makefile rule that produces more than one output file, to accomodate tools that don’t fit gmake’s rigid one-command-one-output model. The classic example is bison, which produces both a C file and a header file from a single invocation of the tool.

Unfortunately in regular gmake the only way to write a rule with multiple outputs is to use a pattern rule. That’s great — if your needs happens to dovetail with the caveats and limitations of pattern rules (chiefly, that the output files share a common base name). If not, the answer has been basically that you’re out of luck. There are a variety of kludges that approximate the behavior, but despite numerous requests over the last decade (1, 2, 3, 4, 5, 6, 7, 8) and at least one patch implementing the feature, GNU make (as of 3.82) still has no way to create an explicit rule that produces multiple outputs.

When it comes to enhancements to the fundamental operation of GNU make, we’ve historically let the GNU make team take the lead, rather than risk introducing potentially incompatible changes. But after so many years it seems clear that this feature is not going to show up in GNU make — so we decided to forge ahead on our own. Enter #pragma multi:

1
2
3
#pragma multi
foo bar:
@touch foo bar

GNU make interprets this construct as two independent rules, one for foo and one for bar, which happen to each create both files. Thanks to the #pragma multi designation, Electric Make will interpret this as a single rule which produces both foo and bar. Using a #pragma to flag the rule is perfect, because it sidesteps any questions about syntax changes. And since #pragma starts with a #, GNU make will treat it as a comment, so this makefile will still be usable with GNU make — you’ll just get correct behavior and better performance with Electric Make.

New platforms and a faster installer

Accelerator 6.2 adds support for Linux kernels up to 3.5.x, which means that Accelerator now supports the following platforms:

  • Ubuntu 11.10
  • Ubuntu 12.04
  • SUSE Linux Enterprise Server 11 SP2

In addition, Accelerator 6.2 is expected to work correctly on both Ubuntu 12.10 and Windows 8, although we cannot officially claim support for those platforms since they were themselves not finalized at the time Accelerator 6.2 was released. This release also incorporates enhancements to the Linux installer which make the installation process about 25% faster compared to previous releases.

A complete list of platforms supported by ElectricAccelerator 6.2 can be found in the Electric Cloud Knowledge Base.

Key robustness improvements

Raise your hand if you’ve ever seen this error on your Linux Accelerator agent hosts:

unable to unmount EFS at “/some/path”: EBUSY

That error shows up sometimes when your build starts background processes — kind of a distributed build anti-pattern itself, but unfortunately it’s not always something you can control thanks to some third-party toolchains. Or rather, that error used to show up sometimes, because in Accelerator 6.2 we’ve bulletproofed the system against such rogue background processes, so that error is a thing of the past (nota bene: this enhancement is not available on Solaris).

In addition, we bulletproofed the system against external processes (any process running on an agent host which is not part of your build) accessing the EFS. In certain rare circumstances, such accesses could lead to agent host instability.

What’s next?

With 6.2 out the door we’ve finally got bandwidth to work on 7.0, which will focus on some very exciting performance improvements, especially for incremental builds. It’s a little bit too early to share any of the preliminary results we’re seeing, but rest assured — if you thought Accelerator was fast before, well… you ain’t seen nothing yet! Stay tuned for more information.

ElectricAccelerator 6.2 is available immediately. If you are already an Accelerator user, contact support@electric-cloud.com to upgrade. If you are not currently a user, you can download a free evaluation version of ElectricAccelerator Developer Edition, or contact sales@electric-cloud.com.

post

Rapidly detecting Linux kernel source features for driver deployment

A while back I wrote about genconfig.sh, a technique for auto-detecting Linux kernel features, in order to improve portability of an open source, out-of-tree filesystem driver I developed as part of ElectricAccelerator. genconfig.sh has enabled me to maintain compatibility across a wide range of Linux kernels with relative ease, but recently I noticed that the script was unacceptably slow. On the virtual machines in our test lab, genconfig.sh required nearly 65 seconds to execute. For my 11-person team, a conservative estimate of time wasted waiting for genconfig.sh is nearly an entire person-month per year. With a little effort, I was able to reduce execution time to about 7 seconds, nearly 10x faster! Here’s how I did it.

A brief review of genconfig.sh

genconfig.sh is a technique for detecting the source code features of a Linux kernel. Like autoconf configure scripts, genconfig.sh uses a series of trivial C source files to probe for various kernel source features, like the presence or absence of the big kernel lock. For example, here’s the code used to determine whether the kernel has the set_nlink() helper function:

1
2
3
4
5
#include <linux/fs.h>
void dummy(struct inode *i)
{
set_nlink(i, 0);
}

If a particular test file compiles successfully, the feature is present; if the compile fails, the feature is absent. The test results are used to set a series of C preprocessor #define directives, which in turn are used to conditionally compile driver code suitable for the host kernel.

Reaching the breaking point

When I first implemented genconfig.sh in 2009 we only supported a few versions of the Linux kernel. Since then our support matrix has swollen to include every variant from 2.6.9 through 3.5.0, including quirky “enterprise” distributions that habitually backport advanced features without changing the reported kernel version. But platform support tends to be a mostly one-way street: once something is in the matrix, it’s very hard to pull it out. As a consequence, the number of feature tests in genconfig.sh has grown too, from about a dozen in the original implementation to over 50 in the latest version. Here’s a real-time recording of a recent version of genconfig.sh on one of the virtual machines in our test lab:

genconfig.sh executing on a test system; click for full size

Accelerator actually has two instances of genconfig.sh, one for each of the two kernel drivers we bundle, which means every time we install Accelerator we burn about 2 minutes waiting for genconfig.sh — 25% of the 8 minutes total it takes to run the install. All told I think a conservative estimate is that this costs my team nearly one full person-month of time per year, between time waiting for CI builds (which do automated installs), time waiting for manual installs (for testing and verification) and my own time spent waiting when I’m making updates to support new kernel versions.

genconfig.sh: The Next Generation

I had a hunch about the source of the performance problem: the Linux kernel build system. Remember, the pattern repeated for each feature test in the original genconfig.sh is as follows:

  1. Emit a simple C source file, called test.c.
  2. Invoke the kernel build system, using make and a trivial kernel module makefile:
    1
    2
    3
    conftest-objs := test.o
    obj-m := conftest.o
    EXTRA_CFLAGS += -Werror
  3. Check the exit status from make to decide whether the test succeeded or failed.

The C sources used to probe for features are trivial, so it seemed unlikely that the compilation itself would take so long. But we don’t have to speculate — if we use Electric Make instead of GNU make to run the test, we can use the annotated build log and ElectricInsight to see exactly what’s going on:

ElectricInsight visualization of Linux kernel module build, click for full size.

Overall, using the kernel build system to compile this single file takes nearly 2 seconds — not a big deal with only a few tests, but it adds up quickly. To be clear, the only part we actually care about is the box labeled /root/__conftest__/test.o, which is about 1/4 second. The remaining 1 1/2 seconds? Pure overhead. Perhaps most surprising is the amount of time burned just parsing the kernel makefiles — the huge bright cyan box on the left edge, as well as the smaller bright cyan boxes in the middle. Nearly 50% of the total time is just spent parsing!

At this point an idea struck me: there’s no particular reason genconfig.sh must invoke the kernel build system separately for each probe. Why not write out all the probe files upfront and invoke the kernel build system just once to compile them all in a single pass? In fact, with this strategy you can even use parallel make (eg, make -j 4) to eke out a little bit more speed.

Of course, you can’t use the exit status from make with this approach, since there’s only one invocation for many tests. Instead, genconfig.sh can give each test file a descriptive name, and then check for the existence of the corresponding .o file after make finishes. If the file is present, the feature is present; otherwise the feature is absent. Here’s the revised algorithm:

  1. Emit a distinct C source file for each kernel feature test. For example, the sample shown above might be created as set_nlink.c. Another might be write_begin.c.
  2. Invoke the kernel build system, using make -j 4 and a slightly less trivial kernel module makefile:
    1
    2
    3
    conftest-objs := set_nlink.o write_begin.o ...
    obj-m := conftest.o
    EXTRA_CFLAGS += -Werror
  3. Check for the existence of each .o file, using something like if [ -f set_nlink.o ] ; then … ; fi to decide whether the test succeeded or failed.

The net result? After an afternoon of refactoring, genconfig.sh now completes in about 7 seconds, nearly 10x faster than the original:

Updated genconfig.sh executing on a test system, click for full size.

The only drawback I can see is that the script no longer has that satisfying step-by-step output, instead dumping everything out at once after a brief pause. I’m perfectly happy to trade that for the improved performance!

Future Work and Availability

This new strategy has significantly improved the usability of genconfig.sh. After I finished the conversion, I wondered if the same technique could be applied to autoconf configure scripts. Unfortunately I find autoconf nearly impossible to work with, so I didn’t make much progress exploring that idea. Perhaps one of my more daring (or stubborn) readers will take the ball and run with it there. If you do, please comment below to let us know the results!

The new version of genconfig.sh is used in ElectricAccelerator 6.2, and can also be seen in the open source Loopback File System (lofs) on my Github repo.

post

The ElectricAccelerator 6.1 “Ship It!” Award

Having shipped ElectricAccelerator 6.1, I thought you might like to see the LEGO-based “Ship It!” award that I gave each member of the development team. I started this tradition with the 6.0 release last fall. Here’s the baseball card that accompanied the detective minifig I chose for this release:

The great detective is on the case!

The Accelerator 6.1 team

I picked the detective minifig for the 6.1 release in recognition of the significant improvements to Accelerator’s diagnostic capabilities (like cyclic redundancy checks to detect faulty networks, and MD5 checksums to detect faulty disks). Compared to the 6.0 award not much has changed in the design, although I did get my hands on the “official” corporate font this time. It strikes me that there’s a lot of wasted space on the back of the card though. Next time I’ll make better use of the space by incorporating statistics about the release. I actually have the design all ready to go, but you’ll have to wait until after the release to see it. Don’t fret though, the 6.2 release is expected soon!

post

Fixing recursive make

Recursive make is one of those things that everybody loves to hate. It’s even been the subject of one of those tired “… Considered Harmful” diatribes. According to popular opinion, recursive make will sap performance from your build, make it nigh impossible to ensure correctness in parallel builds, and may render the user sterile. OK, maybe not that last one. But seriously, the arguments against recursive make are legion, and deeply entrenched. The problem? They’re flawed. That’s because they assume there’s only one way to implement recursive make — when the submake is invoked, the parent make is blocked until the submake completes. That’s how almost everybody does it. But in Electric Make, part of ElectricAccelerator, we developed a novel new approach called non-blocking recursive make. This design eliminates the biggest problems attributed to recursive make, without requiring a painful and costly conversion of your build system to non-recursive make.

The problem with traditional recursive make

There’s really just two problems at the heart of complaints with traditional recursive make: first, there’s no way to ensure correctness of a parallel recursive make based build without overserializing the submakes, because there’s no way to articulate dependencies between individual targets in different submakes. That means you can’t have a dependency graph that is both correct and precise. Instead you either leave out the critical dependency entirely, which makes parallel (ie, fast) builds unreliable; or you serialize submakes in their entirety, which shackles build performance because no part of a submake with even a single dependency on some portion of an earlier submake can begin until the entire ealier submake completes. Second, even if there were a way to specify precise dependencies between targets in different submakes, most versions of make have implemented recursive make such that the parent make is blocked from proceeding until the submake has completed. Consider a typical use of recursive make with implicit serializations between submakes:

1
2
3
4
all:
@for dir in util client server ; do \
$(MAKE) -C $$dir; \
done

Each submake compiles a bunch of source files, then links them together into a library (util) or an executable (client and server). The only actual dependency between the work in the three make instances is that the client and server programs need the util library. Everything else is parallelizable, but with traditional recursive make, gmake is unable to exploit that parallelism: all of the work in the util submake must finish before any part of the client submake begins!

Conflict detection and non-blocking recursive make

If you’re familiar with Electric Make, you already know how it solves the first half of the recursive make problem: conflict detection and correction. I’ve written about conflict detection before, but here’s a quick recap: using the explicit dependencies given in the makefiles and information about the files accessed as each target is built, emake is able to dynamically determine when targets have been built too early due to missing explicit dependencies, and rerun those targets to generate the correct output. Electric Make can ensure the correctness of parallel builds even in the face of incomplete dependencies, even if the missing dependencies are between targets in different submakes. That means you need not serialize entire submakes to ensure the build will run correctly in parallel.

Like an acrobat’s safety net, conflict detection allows us to consider solutions to the other half of the problem that would otherwise be considered risky, if not outright madness. In fact, our solution would not be possible without conflict detection: non-blocking recursive make. This is analogous to the difference between blocking and non-blocking I/O: rather than waiting for a recursive make to finish, emake carries on executing subsequent commands in the build immediately, including other recursive makes. Conflict detection ensures that only the commands in each submake which require serialization are executed sequentially, so the build runs as quickly as possible, but the final build output is identical to a serial build.

The impact of this change is dramatic. Here I’ve plotted the execution of the simple build defined above on four cores, using both gmake (normal recursive make) and emake (non-blocking recursive make):

Recursive make build with gmake


Recursive make build with emake

Electric Make is able to execute this build about 20% faster than gmake, with no changes to the Makefiles or the execution environment. emake is literally able to squeeze more parallelism out of recursive-make-based builds than gmake. In fact, we can precisely quantify just how much more parallelism emake gets through an application of Amdahl’s law. First, we compute the best possible speedup for the build — that’s just the serial runtime divided by the best possible parallel runtime, which we can figure out through analysis of the depedency graph and runtime of individual jobs in the build (the Longest Serial Chain report in ElectricInsight can do this for you). Then we can compute the parallelizable portion P of the build by plugging the speedup S into this equation: P = 1 – (1 / S). Here’s how that works out for gmake and emake:

gmake emake
Serial baseline 65s 65s
Best build time 13.5s 7.5s
Best speedup 4.8x 8.7x
Parallel portion 79% 89%

On this build, non-blocking recursive make increases the parallel portion of the build by 10%. That may not seem like much, but Amdahl’s law shows how dramatically that difference affects the speedup you can expect as you apply more cores:

Implementation

On the backend, non-blocking recursive make is handled by conflict detection — the jobs from the recursive make are checked for conflicts in the serial order defined by the makefile structure. Any issues caused by aggressively running recursive makes early are detected during the conflict check, and the target that ran too early is rerun to generate the correct result.

On the frontend, emake uses a strategy that is at once both brilliant in its simplicity, and diabolical in its trickery. It starts with an environment variable. When emake is invoked recursively, it checks the value of EMAKE_BUILD_MODE. If it is set to node, emake runs in so-called stub mode: rather than executing the submake (parsing the makefile and building targets), emake captures the invocation context (working directory, command-line and environment) in a file on disk, prints a “magic” string and exits with a zero status code.

The file containing the invocation context is identified by a second environment variable, ECLOUD_RECURSIVE_COMMAND_FILE. The Accelerator agent (which handles invoking commands on behalf of emake) checks for the presence of that file after every command that is run. If it is found, the agent relays the content to the toplevel emake invocation, where a new make instance is created to represent the submake invocation. That instance comes with it’s own parse job of course, which gets inserted into the queue of jobs. Some (short) time later, the parse job will run, discover whatever work must be run by the submake, and create additional rule jobs.

The magic string — EMAKE_FNORD — serves as a placeholder in the stdout stream for the jobs, so emake can figure out which portion of the output text comes before and which portion comes after the submake. This ensures that the build output log is identical to that generated by a serialized gmake build. For example, given the following rule that invokes a submake, you’d expect to see the “Before” and “After” messages printed before and after the output generated by commands in the submake itself:

1
2
3
4
all:
@echo Before util ; \
@$(MAKE) -C util ; \
@echo After util

With non-blocking recursive make, the submake has not actually executed when the “echo After util” command runs. If emake doesn’t account for that reordering, both the “Before” and “After” messages will appear before any of the output from the submake. EMAKE_FNORD allows emake to “stitch” the output together so the build log matches a serial log.

Limitations

Conflict detection and non-blocking recursive make together solve the main problems associated with recursive make. But there are a couple scenarios where non-blocking recursive make does not work well. Fortunately, these are uncommon in practice and easily addressed.

Capturing recursive make stdout

The first scenario is when the build captures the output of the recursive make invocation, rather than letting it print to stdout as normal. Since emake defers the execution of the submake and prints only EMAKE_FNORD to stdout, this will not work. There are two reasons you might do this: first, you might want to have separate build logs for each submake, to simplify error detection and management. In this situation, the simplest workaround is to remove the redirection and instead us emake’s annotated build log, an XML version of the build output log which can be easily processed using standard tools. Second, you may be using make as a text-processing tool (sort of a “poor man’s” Perl), rather than for building per se:

1
2
3
all:
@$(MAKE) -f genlist.mk > objects.txt
@cat objects.txt | xargs rm

In this case, the workaround is to explicitly force emake to run in so-called “local” mode, which means emake will handle the recursive make invocation as a blocking invocation, just like traditional make would. You can force emake into local mode by adding EMAKE_BUILD_MODE=local to the environment before the recursive make invocation.

Immediate consumption of build products

The second scenario is when the build consumes the product of the submake in the same command that contains the invocation. For example:

1
2
all:
@$(MAKE) -C sub foo && cp sub/foo ./foo

Here the build assumes that the output files generated by the submake will be available for use immediately after the submake completes. Obviously this is not the case with non-blocking recursive make — when the invocation of $(MAKE) -C sub foo completes, only the submake stub has actually finished. The build products will not be available until after the submake is actually processed later. Note that in this build both the recursive make invocation and the commands that use the build products from that invocation are treated as a single command from the perspective of make: make actually invokes the shell, and the shell then runs the recursive make and cp commands.

The workaround is simple: split the consumer into a distinct command, from the perspective of make:

1
2
3
all:
@$(MAKE) -C sub foo
@cp sub/foo ./foo

With that trivial change, emake is able to treat the cp as a continuation job, which can be serialized against the completion of the recursive make as needed.

A fix for recursive make

For years, people have heaped scorn and criticism on recursive make. They’ve nearly convinced everybody that even considering its use is automatically wrong — you probably can’t help feeling a little bit guilty when you use recursive make. But the reality is that recursive make is a reasonable way to structure a large build. You just need a better make. With conflict detection and non-blocking recursive make, Electric Make has fixed the problems usually associated with recursive make, so you can get parallel builds that are both fast and correct. Give it a try!

post

What’s new in ElectricAccelerator 6.1

Electric Cloud announced the release of ElectricAccelerator 6.1 a few weeks ago, the 24th feature release of ElectricAccelerator since the company was founded in 2002. This release incorporates several enhancements that make the product more robust and more flexible. Here are the key additions:

Workload reporting

In many large organizations, a single massive Accelerator cluster is shared across multiple development teams, in order to reduce administration overhead and improve hardware utilization — naturally each team has “their” agents, but if those are not in use, they can be easily made available to other teams. Often, the maintenance cost is shared by the teams using the cluster, according to the amount they use the cluster. In order to facilitate this use case, Accelerator 6.1 adds workload reporting. At the conclusion of each build, emake reports the total CPU usage for the build to the cluster manager. The administrator can see a summary of usage by running the “Build Users” report from the cluster manager web interface:

Example build users report; click for full size

Example build users report. Click to view full size.

Multi-interface network support

In data centers it’s not uncommon to find servers configured with two network interfaces: for example, a gigabit connection for communication with systems outside the data center, and a 10 GigE, fiber-optic or Infiniband connection for extremely high-bandwidth communication with other systems inside the data center. Accelerator 6.1 adds explicit support for this configuration, so data transfers between agents in the cluster utilize the high-bandwidth secondary interface for increased performance.

Strong checksums for data integrity

For years Accelerator has relied on the checksum built into the TCP standard to ensure integrity of network data transfers. Unfortunately that checksum is relatively weak, and in rare cases we found it was insufficient to detect data errors introduced by faulty network hardware. In this release, we added an application-layer checksum to further guard against such problems. Accelerator 6.1 uses CRC32-c, chosen for its robust error detection capabilities and high performance.

Expanded platform support

Accelerator 6.1 adds support for a few new platforms and third-party tools, including:

  • Ubuntu 11.04
  • ClearCase 8

What’s next?

It feels great to have wrapped up Accelerator 6.1, but we’re not resting on our laurels. We’ve already done a lot of work for 6.2, which includes some key robustness improvements that have been literally years in the making; and we’re getting started on 7.0, which currently is planned to include some really exciting new performance enhancements around incremental builds — more on that soon.

Accelerator 6.1 is available immediately for current customers. New customers should contact sales@electric-cloud.com.

post

Measuring the Electric File System

Somebody asked me the other day what portion of the Electric File System (EFS) is shared code versus platform-specific code. The EFS, if you don’t know, is a custom filesystem driver and part of ElectricAccelerator. It enables us to virtualize the filesystem, so that each build job sees the correct view of the filesystem according to its logical position in the build, even if jobs are run out of order. It also provides the file usage information that powers our conflict detection algorithm. As a filesystem driver, the EFS is obviously tightly coupled to the platforms it’s used on, and the effort of porting it is one reason we don’t support more platforms than we do now — not that a dozen variants of Windows, 16 flavors of Linux and several versions of Solaris is anything to be ashamed of!

Anyway, the question intrigued me, and I found the answer quite surprising (click for full-size):

Note that here I’m measuring only actual code lines, exclusive of comments and whitespace, as of the upcoming 6.1 release. In total, the Windows version of the EFS has about 2x the lines of code that either the Solaris or Linux ports have. The platform-specific portion of the code is more than 6x greater!

Why is the Windows port so much bigger? To answer that question, I started looking at the historical size of the three ports, which lead to the following graph showing the total lines of code for (almost) every release we’ve made. Unfortunately our first two releases, 1.0 and 2.0, have been lost to the ether at some point over the past ten years, but I was able to collect data for every release starting from 2.1 (click for full-size):

Here you can see that the Windows port has always been larger than the others, but it’s really just a handful of Windows-specific features that blew up the footprint so much. The first of those was support for the Windows FastIO filesystem interface, an alternative “fast path” through the kernel that in theory enables higher throughput to and from the filesystem. It took us two tries to get that feature implemented, as shown on the graph, and all-told that contributed about 7,000 lines of code. The addition of FastIO to the filesystem means that the Windows driver has essentially three I/O interfaces: standard I/O, memory-mapped I/O and FastIO. In comparison, on Linux and Solaris you have only two: standard I/O and memory-mapped I/O.

The second significant difference between the platforms is that on Windows the EFS has to virtualize the registry in addition to the filesystem. In the 4.3 release we significantly enhanced that portion of the driver to allow full versioning of the registry along the same lines that the filesystem has always done. That feature added about 1,000 lines of code.

I marked a couple other points of interest on this graph as well. First, the addition of the “lightweight EFS” feature, which is when we switched from using RAM to store the contents of all files in the EFS to using temporary storage on the local filesystem for large files. That enabled Accelerator to handle significantly larger files, but added a fair amount of code. Finally, in the most recent release you can actually see a small decrease in the total lines of code on Solaris and Linux, which reflects the long-overdue removal of code that was needed to support legacy versions of those platforms (such as the 2.4.x Linux kernel).

I was surprised as well by the low rate of growth in the Solaris and Linux ports. Originally the Linux port supported only one version of the Linux kernel, but today it supports about sixteen. I guess this result reveals that the difficulty in porting to each new Linux version is not so much the amount of code to be added, but in figuring out exactly which few lines to add!

In fact, after the 4.4 release in early 2009, the growth has been relatively slow on all platforms, despite the addition of major new features to Accelerator as a whole over the last several releases. The reason is simply that most feature development involves changes to other components (primarily emake), rather than to the filesystem driver.

One last metric to look at is the number of unit tests for the code in the EFS. We don’t practice test-driven development for the most part, but we do place a strong emphasis on unit testing. Developers are expected to write unit tests for every change, and we strive to keep the tests isomorphic to the code to help ensure we have good coverage. Here’s how the total number of unit tests has grown over the years (click for full-size):

Note that this is looking only at unit tests for the EFS! We have thousands more unit tests for the other components of Accelerator, as well as thousands of integration tests.

Thankfully for my credibility, the growth in number of tests roughly matches the growth in lines of code! But I’m surprised that the ratio of tests to code is not more consistent across the platforms — that suggests a possible area for improvement. Rather than being discouraged by that though, I’m pleased with this result. After all, what’s the point of measuring something if you don’t use that data to improve?

post

Another confusing conflict in ElectricAccelerator

After solving the case of the confounding conflict, my user came back with another scenario where ElectricAccelerator produced an unexpected (to him) conflict:

1
2
3
4
5
6
all:
@$(MAKE) foo
@cp foo bar
foo:
@sleep 2 && echo hello world > foo

If you run this build without a history file, using at least two agents, you will see a conflict on the continuation job that executes the cp foo bar command, because that job is allowed to run before the job that creates foo in the recursive make invocation. After one run of course, emake records the dependency in history, so later builds don’t make the same mistake.

This situation is a bit different from the symlink conflict I showed you previously. In that case, it was not obvious what caused the usage that triggered the conflict (the GNU make stat cache). In this case, it’s readily apparent: the continuation job reads (or attempts to read) foo before foo has been created. That’s pretty much a text-book example of the sort of thing that causes conflicts.

What’s surprising in this example is that the continuation job is not automatically serialized with the recursive make that precedes it. In a very real sense, a continuation job is an artificial construct that we created for bookkeeping reasons internal to the implementation of emake. Logically we know that the commands in the continuation job should follow the commands in the recursive make. In fact it would be absolutely trivial for emake to just go ahead and stick in a dependency to ensure that the continuation is not allowed to start until after the recursive make finishes, thereby avoiding this conflict even when you have no history file.

Given a choice between two strategies that both produce correct output, emake uses the strategy that produces the best performance in the general case.

Absolutely trivial to do, yes — but also absolutely wrong. Not for correctness reasons, this time, but for performance. Remember, emake is all about maximizing performance across a broad range of builds. Given a choice between two strategies that both produce correct output, emake uses the strategy that produces the best performance in the general case. For continuation jobs, that means not automatically serializing the continuation against the preceding recursive make. I could give you a wordy, theoretical explanation, but it’s easier to just show you. Suppose that your makefile looked like this instead of the original — the difference here is that the continuation job itself launches another recursive make, rather than just doing a simple cp:

1
2
3
4
5
6
7
8
9
all:
@$(MAKE) foo
@$(MAKE) bar
foo:
@sleep 2 && echo hello world > foo
bar:
@sleep 2 && echo goodbye > bar

Hopefully you agree that the ideal execution of this build would have both foo and bar running in parallel. Forcing the continuation job to be serialized with the preceding recursive make would choke the performance of this build. And just in case you’re thinking that emake could be really clever by looking at the commands to be executed in the continuation job, and only serializing “when it needs to”: it can’t. First, that would require emake to implement an entire shell syntax parser (or several, really, since you can override SHELL in your makefile). Second, even if emake had that ability, it would be thwarted the instant the command is something like my_custom_script.pl — there’s no way to tell what will happen when that gets invoked. It could be a simple filesystem access. It could be a recursive make. It could be a whole series of recursive makes. Even when the command is something you think you recognize, can emake really be sure? Maybe cp is not our trustworthy standard Unix cp, but something else entirely.

Again, all is not lost for this user. If you want to avoid this conflict, you have a couple options:

  1. Use a good history file from a previous build. This is the simplest solution. You’ll only get conflicts in this build if you run without a history file.
  2. Refactor the makefile. You can explicitly describe the dependency between the commands in the continuation job and the recursive make by refactoring the makefile so that the stuff in the continuation is instead its own target, thus taking the decision out of emake’s hands. Here’s one way to do that:
    1
    2
    3
    4
    5
    6
    7
    8
    all: do_foo
    @cp foo bar
    do_foo:
    @$(MAKE) foo
    foo:
    @sleep 2 && echo hello world > foo

Either of these will eliminate the conflict from your build.

post

ElectricAccelerator and the Case of the Confounding Conflict

A user recently asked me why ElectricAccelerator reports a conflict in this simple build, when executed without a history file from a previous run:

1
2
3
4
5
6
7
all: foo symlink_to_foo
foo:
@sleep 2 && echo hello world > foo
symlink_to_foo:
@ln -s foo symlink_to_foo

Specifically, if you have at least two agents, emake will report a conflict between symlink_to_foo and foo, indicating that symlink_to_foo somehow read or otherwise accessed foo during execution! But ln does not access the target of a symlink when creating the symlink — in fact, you can even create a symlink to a non-existent file if you like. It seems obvious that there should be no conflict. What’s going on?

To understand why this conflict occurs, you have to wrap your head around two things. First, there’s more going on during a gmake-driven build than just the commands you see gmake invoke. That causes the usage that provokes the conflict. Second, emake considers a serial gmake build the “gold standard” — if a serial gmake build produces a particular result, so too must emake. That’s why the additional usage must result in a conflict.

In this case, the usage that triggers the conflict comes from management of the gmake stat cache. This is a gmake feature that was added to improve performance by avoiding redundant calls to stat() — once you’ve stat()‘d a file once, you don’t need to do it again. Unless the file is changed of course, which happens quite a lot during a build. To keep the stat cache up-to-date as the build progresses, gmake re-stat()‘s each target after it finishes running the commands for the target. So after the commands for symlink_to_foo complete, gmake stat()‘s symlink_to_foo again, using the standard stat() system call, which follows the symlink (in contrast to lstat(), which does not follow the symlink). That means gmake will actually cache the attributes of foo for symlink_to_foo.

To ensure compatibility with gmake, emake has to do the same. In Accelerator parlance, that means we get read usage on symlink_to_foo (because you have to read the symlink itself to determine the target of the symlink), and lookup usage on foo. The lookup on foo causes the conflict, because, of course, you will get a different result if you lookup foo before the job that creates it than you would get if you do the lookup after that job. Before the job, you’ll find that foo does not exist, obviously; after, you’ll find that it does.

But what difference does that make, really? In truth, if there’s no detectable difference in behavior, then it doesn’t matter at all. And in the example build there is no detectable difference — the build output is the same regardless of when exactly you stat() symlink_to_foo relative to when foo is created. But with a small modification to the build, it is suddenly becomes possible to see the impact:

1
2
3
4
5
6
7
8
9
10
all: foo symlink_to_foo reader
foo:
@sleep 2 && echo hello world > foo
symlink_to_foo:
@ln -s foo symlink_to_foo
reader: foo symlink_to_foo
@echo newer prereqs are: $?

Compare the output when this build is run serially with the output when the build is run in parallel — and note that I’m using gmake, so you can be certain I’m not trying to trick you with some peculiarity of emake’s implementation:

You can plainly see the difference: in the parallel build gmake stat()‘s symlink_to_foo before foo exists, so the stat cache records symlink_to_foo as non-existent. Then when gmake generates the value of $? for reader, symlink_to_foo is excluded, because non-existent files are never considered newer than existing files. In the serial build, gmake stat()‘s symlink_to_foo after foo has been created, so the stat cache indicates that symlink_to_foo exists and is newer than reader, so it is included in $?.

Hopefully you see now both what causes the conflict, and why it is necessary. The conflict occurs because of lookup usage generated when updating the stat cache. The conflict is necessary to ensure that the build output matches that produced by a serial gmake — the “gold standard” for build correctness. If no conflict is declared, there is the possibility for a detectable difference in build output compared to serial gmake.

However, you might be thinking that although it makes sense to treat this as a conflict in the general case, isn’t it possible to do something smarter in this specific case? After all, the orignal example build does not use $?, and without that there isn’t any detectable difference in the build output. So why not skip the conflict?

The answer is simple, if a bit disappointing. In theory it may be possible to elide the conflict by checking to see if the symlink is used by a later job in a manner that would produce a detectable difference (for example, by scanning the commands for subsequent targets for references to $?), but in reality the logistics of that check are daunting, and I’m not confident that we could guarantee correct behavior in all cases.

Fortunately all is not lost. If you wish to avoid this conflict, you have several options:

  1. Use a good history file from a previous build. This is the most obvious solution. You’ll only get conflicts if you run without a history file.
  2. Add an explicit dependency. If you make foo an explicit prereq of symlink_to_foo, then you will avoid the conflict. Here’s how that would look:
    1
    symlink_to_foo: foo
  3. Change the serial order. If you reorder the makefile so that symlink_to_foo has an earlier serial order than foo you will avoid the conflict. That just requires a reordering of the prereqs of all:
    1
    all: symlink_to_foo foo

Any one of these will eliminate the conflict from your build, and you’ll enjoy fast and correct parallel builds.

Case closed.

post

ElectricMake debug log levels

Often when analyzing builds executed with ElectricMake, all the information you need is in the annotation file — an easily digested XML file containing data such as the relationships between the jobs, the commands run, and the timing of each job. But sometimes you need more detail, and that’s where the emake debug log comes in.

To enable emake debug logging, you specify a pair of command-line arguments: ––emake-debug=value, which specifies the types of debug logging to enable as a set of single-letter values, such as “jng”; and ––emake-logfile=path, which specifies the location of the debug log. In this article I’ll explain each of emake’s debug log levels. Use this index to jump to the definition of a specific log level:

DISCLAIMER: emake debug logs are intended for use by Electric Cloud engineering and support staff. Debug logging contents and availability are subject to change in any release, for any or no reason. Enter at your own risk, your mileage may vary, etc. etc. The information in this article refers to ElectricAccelerator 6.0.

a: agent allocation

Agent allocation logging provides detailed information about emake’s attempts to procure agents from the cluster manager during the build. If you think emake may be stalled trying to acquire agents, allocation logging will help to understand what’s happening.

c: cache

Cache logging records details about the filesystem cache used by emake to accelerate parse jobs in cluster builds. For example, it logs when a directory’s contents are added to the cache, and the result of lookups in the cache. Since it is only used during remote parse jobs, you’ll have to use it with the ––emake-rdebug=value option. Use cache logging if you suspect a problem with the cached local filesystem.

e: environment

Environment logging augments node logging with a dump of the entire environment block for every job as it is sent to an agent. Normally this is omitted because it’s quite verbose (could be as much as 32KB per job). Usually you’re better off using env-level annotation, which is more compact and easier to parse.

f: filesystem

Filesystem logging records numerous details about emake’s interaction with its versioned filesystem data structure. In particular, it logs every time that emake looks up a file (when doing up-to-date checks, for example), and it logs every update to the versioned file system caused by file usage during the build’s execution. This level of logging is very verbose, so you shouldn’t enable it as a general rule. It’s most often used when diagnosing issues related to the versioned filesystem and conflicts.

g: profiling

Profiling logging is one of the easiest to interpret and most useful types of debug logging. When enabled, emake will emit hundreds of performance metrics at the end of the build. This is a very lightweight logging level, and is safe (even advisable) to enable for all builds.

h: history

History logging prints messages related to the data tracked in the emake history file — both filesystem dependencies and autodep information. When history logging is enabled, emake will print a message every time a dependency is added to the history file, and it will print information about the files checked during up-to-date checks based on autodep data. Enable history logging if you suspect a problem with autodep behavior.

j: job

Job logging prints minimal messages related to the creation and execution of jobs. For each job you’ll see a message when it starts running, when it finishes running, and when emake checks the job for conflicts. If there is a conflict in the job, you’ll see a message about that too. If you just want a general overview of how the build is progressing, j-level logging is a good choice.

L: nmake lexer

emake uses a generated parser to process portions of nmake makefiles. Lexer debug logging enables the debug logging in that generated code. This is generally not useful to end users as it is too low-level.

l: ledger

Ledger debug logging prints information about build decisions based on data in the ledger file, as well as updates made to the ledger file. Enable it if you believe the ledger is not functioning correctly.

m: memory

When memory logging is enabled, emake will print memory usage metrics to the debug log once per second. This includes the total process memory usage as well as current and peak memory usage grouped into several “buckets” corresponding to various types of data in emake. For example, the “Operation” bucket indicates the amount of memory used to store file operations; the “Variable” bucket is the amount of memory used for makefile variables. This is most useful when you are experiencing an out-of-memory failure in emake, as it can provide guidance as to how memory is being utilitized during the build, and how quickly it is growing.

n: node

Node logging prints detailed information about all messages between emake and the agents, including filesystem data and commands executed. Together with job logging, this can give a very comprehensive picture of the behavior of a build. However, node logging is extremely verbose, so you should enable it only when you are chasing a specific problem.

o: parse output

When parse output logging is enabled, emake will preserve the raw result of parsing a makefile. The result is a binary file containing information about all the targets, rules, dependencies and variables extracted from makefiles read during a parse job. This can be useful when investigating parser incompatibility issues and scheduling issues (for example, if a rule is not being scheduled for execution when you expect). Note that this debug level only makes sense when parsing, which means you have to specify it in the ––emake-rdebug option. The parse results will be saved in the ––emake-rlogdir directory, named as parse_jobid.out. Note that the directory may be on the local disk of the remote nodes, depending on the value you specify!

p: parse

Parse debug logging prints extremely detailed information about the reading and interpretation of makefiles during a parse job. This is most useful when investigating parser compatibility issues. This output is very verbose, so you should only enable this when pursuing a specific problem. Note that like parse output logging, this debug level only makes sense during parsing, which means you have to specify it in the ––emake-rdebug option. The parse log files will be saved in the ––emake-rlogdir directory, named as parse_jobid.dlog. Note that the directory may be on the local disk of the remote nodes, depending on the value you specify!

r: parse relocation

Parse relocation logging prints low-level information about the process of transmitting parse result data to emake at the end of a parse job. It’s only used internally when we are extending the parse result format, and so is unlikely to be of interest to end users.

s: subbuild

Subbuild logging prints details about decisions made while using the emake subbuild feature. You should enable it if you believe that the subbuild feature is not working correctly.

Y: authentication

Authentication logging is a subset of node logging that prints only those messages related to authenticating emake to agents and vice-versa. If you are having problems using the authentication feature, you should enable this debug level.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: