post

The ElectricAccelerator 7.0 “Ship It!” Award

With ElectricAccelerator 7.0 out the door, it’s finally time for the moment you’ve all been waiting for: the unveiling of the Accelerator 7.0 “Ship It!” award. This time I picked the Clockwork Android, in light of our emphasis on Android build performance. Here’s the trading card that accompanied the figure:

BEEP BOP BOOP

BEEP BOP BOOP

metrics metrics metrics metrics

metrics metrics metrics metrics

As with the 6.2 award, I included some metrics about the release:

  • Number of days in development. This release was relatively long compared to our other releases — not quite our longest development cycle, but close. That’s partly because this release encompassed the Thanksgiving and Christmas seasons, which typically costs us 3-4 weeks of development and testing time. We also deliberately pushed out the release date about 2 weeks to incorporate feedback from beta testers.
  • JIRA issues closed. We resolved 185 issues in this release. That’s double what we had in 6.2, and it includes some really cool new features.
  • Performance improvement. Since this release was all about performance, it made sense to include the data that proves our success. I had some trouble finding a good way to visualize the improvement, but I’m happy with the finished product.

Of course, none of the achievements in Accelerator 7.0 would have been possible without the hard work and dedication of the incredibly talented Accelerator team. Thank you all!

post

“Playing” with agile

Recently we invited a Scrum coach to Electric Cloud to teach us how to get started with the Scrum model of agile development. On the first day we played a game intended to introduce us to the core elements of Scrum: plan, do, inspect, adapt (or “plan, do, check, act”; or “the Deming cycle”). Without getting into a deeper discussion of Scrum itself, I thought I would share my team’s performance in this fun little game. If you’re familiar with ElectricAccelerator, our game strategy will come as no surprise: it exploits parallel processing and horizontal scalability to improve performance.

The game was simple: we were given a bucket of rubber bouncy balls and instructed to pass balls from person to person, until every member of the team had touched the ball. For each ball that completed the circuit we earned one point; for each drop we were penalized three points. A few rules made the game more interesting. First, it was forbidden for two people to touch a ball at the same time — there had to be “air time” between individuals. Second, we could not pass balls to the person directly to our left or to our right. Finally, there was a time limit (just like a sprint): we had only 2 minutes to pass balls in each round.

At the start of the game, we were given 5 minutes to plan our strategy and make a prediction of how many balls we would pass. Between each round we had 3 minutes more to modify our strategy based on our experience in the previous round and make a new prediction for the next round. If you are familiar with Scrum you’ll recognize the analogy to story points.

In total we had 12 players plus one scribe (me) that was tasked with counting the number of balls passed and dropped.

Round 1 (plan: 0; actual: 29)

Our first planning phase was best described as chaotic. It wasn’t actually clear who was on our team or not, due to some stragglers to the activity. We weren’t sure about the constraints. Everybody had ideas about how best to pass the balls, so everybody was talking at once. It seemed simple, but in fact we had barely gotten everybody in place when the 5 minute prep time elapsed. We did manage to agree on the three key elements of our strategy though:

  • Dropping balls into the cupped hands of the receiver, rather than throwing them, to minimize the risk of dropping balls.
  • Two rings of players, one inner and one outer, facing each other. Balls would be passed in a zig-zag between the rings.
  • Parallel passing. Everybody would be either passing or receiving at all times.

This diagram shows the positions of the players, as well as which players had a ball at the start of the round:

Scrum game, round 1

As you can see, we had too many balls “in play” when we started, given our strategy — a direct consequence of unclear communication during the planning phase. The surplus balls were dropped almost immediately. Our final score for this round was 29: 1 point for each of 35 balls passed, minus 6 points for 2 balls dropped.

Round 2 (plan: 50; actual: 72)

Round 1 demonstrated that our core strategy was sound, but to improve performance we decided to make a couple tweaks. First, we made certain that we were in agreement about which players would start with balls: only those in the outside ring. Second, we realized we could improve throughput by passing two balls at a time, instead of just one. With our drop-into-cupped-hands strategy this was hardly more risky than one ball at a time. We predicted that we would pass 50 balls, about 60% more than we did in round 1. Here’s the updated diagram showing the starting positions of the players and balls for round 2:

Scrum game, round 2

Our score in round 2 was 72: 72 balls passed, with zero dropped.

Round 3 (plan: 120; actual: 60)

At this point we believed we had everything worked out. We increased the balls-per-pass to three and predicted that this would result in about 120 balls passed. But during the planning phase one of our players abruptly left — to be honest I’m not even sure who it was or why they stepped away. All I know is that suddenly we had only 11 players instead of 12, which left us with 6 on the outer ring but only 5 on the inner ring. We didn’t realize the problem until people started lining up in position near the end of the planning period. With the clock ticking we made an exceptionally poor decision about how to handle the mismatch: one of the inner ring players would serve as receiver for two of the outer ring players. First they would receive from player A, then pass to player B; then immediately receive the balls back from player B before sending them on to player C. Sounds complicated, right? It was. Here’s the updated diagram:

Scrum game, round 3

This proved was disastrous for our performance. At speed, it was (understandably) hard for the player serving double-duty to efficiently execute the elaborate sequence of exchanges. In addition, we were careless when we grabbed the extra balls we needed: although most were consistently round, a few were those oddly shaped rubber rocks which move in unpredictable ways. These misshapen lumps of rubber are just a bit harder to catch than regular balls, and that slowed us down. Our final score in this round was just 57: 60 balls passed, one dropped.

Round 4 (plan: 120; actual: 123)

The obvious problem in round three was the mismatch in the sizes of the inner and outer rings. The solution was obvious too: remove one player from the outer ring to restore equilibrium. There was just one problem. According to the rules of the game, a ball had to be touched by every player in order to count as having been passed. What could we do? We pled our case to the coach, who agreed to let us have one person sit out this round — a demonstration of another fact of agile development: sometimes a team can be made more productive by having fewer people on it. With 5 players on each ring, we again predicted that we would pass 120 balls. Here’s how the layout looked for this round:

Scrum game, round 4

This was our best round yet with a final score of 123: 135 balls passed, with only four drops.

Review

Overall I was really pleased with our performance in this game — granted, the point of the exercise was not actually to see how many balls we could pass around, but to experience the plan-do-inspect-adapt cycle directly. And we certainly did that too. But come on! How can you not be excited by a more than 4x improvement in throughput from round 1 to round 4? I’m not surprised though. After all, speed is the name of the game for ElectricAccelerator. This is what we do. That we got there by applying the same strategies to this game that we use in Accelerator itself — icing on the cake.

Later that night I realized an error in our execution on round 4 though. We chose to even out the rings by dropping one player from the outer ring, when we could just as easily have added a player to the inner ring: me. As scribe, I did not actively participate in the ball passing, only the planning and review. But there was no particular reason I couldn’t have stepped in. That would have increased our throughput by 20% (by increasing the number of balls in play from 15 to 18). I think we could have exceeded 150 balls passed with that configuration. So in the end, the game was a great demonstration of what is probably the most important concept from Scrum: there’s always room for improvement.

post

What is the fastest way to find non-zero bits in an MD5 hash?

Microbenchmarks, as a general rule, are a waste of time. Let’s just get that out of the way up front. They are also, as a general rule, totally inaccurate, measuring the execution time of some snippet of code in a context that is completely divorced from the reality in which that code will actually be used. So if after reading this article you think, “I should tell Eric what a waste of time this was!” — don’t bother. I already know.

But… microbenchmarks are also fun, and sometimes interesting, and often vastly easier to implement than a real benchmark of the same code in a production system. So a couple weeks ago, when my colleague proposed using an MD5 hash with value zero as a sentinal indicating that the checksum had not yet been calculated, I wondered: what is the fastest way to test if an MD5 hash has any non-zero bits? I had some time to kill so I wrote a microbenchmark comparing several implementations. The results are presented here for your amusement and edification.

The benchmark

The goal of the benchmark is to determine which of several methods can most quickly determine whether an MD5 hash is all zeroes. An MD5 hash is 128 bits long, so in essence this problem boils down to simply checking for non-zero bits in an arbitrary sequence of 16 bytes. You can find the benchmark source code in my Github repo.

For sample data I simply allocated about 100,000 17 byte arrays, then set one byte in each to a non-zero value. This structure made it wasy to easily test the effect of memory alignment on performance, by using either the first 16 or the last 16 bytes of each 17 byte array as the value under test. The total size was significant as well: smaller than the L1 cache on a typical modern CPU, so we avoid measuring memory bandwidth performance.

I tested the following methods for determining whether a hash is all zero, in both aligned and unaligned varieties:

  1. Naive loop over bytes: the most obvious approach simply loops over the bytes, testing each in turn.
  2. Unrolled loop over bytes: loop unrolling is a common optimization for loops with a fixed number of iterations.
  3. Bitwise OR of bytes: OR all bytes together, then compare the result to zero.
  4. Slice by four: treat the 16 bytes as an array of four 32-bit integers, testing each for equality with zero.
  5. Slice by eight: treat the 16 bytes as an array of two 64-bit integers, testing each for equality with zero.
  6. Find first set: use the GCC builtin function __builtin_ffs, which finds the first non-zero bit in a 32-bit integer.

The code was compiled to 64-bit binaries using GCC 4.7.2 with -O2 optimization and no debugging symbols.

The results

I ran the benchmarks on two systems. First I tried a quad-core hyperthreaded 1.73GHz Intel Core i7 laptop with 16GB RAM and an SSD hard drive. All cores were put into performance mode to ensure no CPU frequency scaling was enabled, using (for example) echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. Here are the results (longer is better):

Comparison of strategies for finding zero-valued MD5 hashes (Intel)

The results are a bit erratic, to be sure — for example, it makes no sense that the unaligned version of the unrolled naive loop should be faster than the aligned version. This is likely just because the operation being measured is so fast that it’s hard to get a “pure” measurement: even tiny fluctuations in system load perturb the tests. You’ll see that if you run the benchmark a few times, you’ll get slightly different results each time. That just means that we shouldn’t make any hard-and-fast decisions based on the exact numbers here.

Nevertheless, the difference between slice-by-four or slice-by-eight and the other strategies is substantial enough that I trust the overall result, if not the exact numbers. That is, slice-by-four and slice-by-eight are clearly significantly faster than any other approach. But — and here’s where we discover the degree of yak shaving we’ve been up to — even the slowest strategy is still pretty damn fast. In all honesty, it is not going to make a lick of difference in overall application performance, unless you really do need to do billions of these checks. A realistic upper bound for my application is maybe ten million, which would consume a tiny fraction of a second using even the naive loop.

One final surprise in this data is the difference in performance between aligned and unaligned memory access — or rather, the lack therof. Conventional wisdom is that you pay a performance penalty for accessing unaligned memory, at least when you try to treat it as 32- or 64-bit blocks. In fact, this result supports other tests which indicate that on the Intel Core i7 there is effectively no penalty for unaligned memory access.

If you’re only working on x86 architectures, you may consider this exercise concluded. But we actually run our software on SPARC architecture as well, so before committing to an implementation let’s take a look at how the benchmark behaves there. This time I used a 1GHz SPARCv9 CPU:

Comparison of strategies for finding zero-valued MD5 hashes (SPARC)

Slice-by-four and slice-by-eight are fastest here too, as long as the data is aligned. If not — BOOM! The application crashes with a bus error, because the SPARC architecture is actually quite sensitive to data alignment. If you want to treat a piece of memory as an integer, it had better be properly aligned.

Conclusion

Informed by these results, we opted to use the slice-by-four strategy. That required a modification of our code, which previously did not guarantee alignment of MD5 hashes. Fortunately that modification was trivial, so it cost us little time and did not make the code any less clear. But you can see hints of the real danger of microbenchmarks: it’s often difficult for a developer to ignore the existence of a faster-but-more-complex strategy, despite evidence that the simple implementation is more than adequately performant. In this case the cost of enabling the faster implementation was negligable, but I’ve seen developers (including myself) needlessly contort code in the name of performance, doggedly defending their choices with microbenchmarks like these. Don’t let yourself become another statistic: use microbenchmarks, sure, but always evaluate the results in the larger context of overall application performance.

With that, I invite you to sound off in the comments: what did I overlook in my microbenchmark? How were the tests flawed? What other strategies do you know for testing whether a series of 16 bytes contains any non-zero bits?

post

6 tips for writing robust, maintainable unit tests

Unit testing is one of the cornerstones of modern software development, but there’s a surprising lack of advice about how to write good unit tests. That’s a shame, because bad unit tests are worse than no unit tests at all. Over the past decade at Electric Cloud, I’ve written thousands of tests — a full build across all platforms runs over 100,000 tests. In this post I’ll share some tips for writing robust, maintainable tests. I learned these the hard way, but hopefully you can learn from my mistakes.

A key focus here is on eliminating so-called “flaky tests”: those that work almost all the time, but fail once-in-a-great-while for reasons unrelated to the code under test. Such unreliable tests erode confidence in the test suite and even in the value of unit testing itself. In the worst case, a history of failures due to flaky tests can cause people to ignore sporadic-but-genuine test failures, allowing rarely seen but legitimate bugs into the wild.

If that’s not enough to convince you to eliminate flaky tests, consider this: suppose each of your flaky tests fails just one time out of every 10,000 runs, and that you have a thousand such tests overall. At that rate, about 10% of your CI builds will fail due to flaky tests.

Now that I’ve got your attention, let’s see how to write better tests. Got some tips of your own? Add them in the comments!

1. Never use hardcoded network port numbers

The software I write involves network communication, which means that many of the tests create network sockets. In pseudo-code, a test for the server component looks something like this:

  1. Cleanup the previous server instance (if any).
  2. Instantiate the server on port 6003.
  3. Connect to the server via port 6003.
  4. Send some message to the server via the socket.
  5. Assert that the response received is correct.

At first glance this seems pretty safe: no standard service uses port number 6003, so there won’t be any contention with third parties for that port, and by front-loading the cleanup of the server we ensure that there won’t be any contention with our other tests for the port. And, of course, it (seems to) work! In fact, it probably works almost every time you run it. But rest assured: one day, seemingly at random, this test will fail.

For us the failures started after literally years and tens of thousands of executions. We never saw the same test fail twice, but the failure mode was always the same: “Only one usage of each socket address (protocol/network address/port) is normally permitted.” I wish I could say with certainty why the failures started happening — I suspect that our anti-virus software transiently grabs a dynamic port, and sometimes it just happens to grab the port that we planned to use.

Fortunately the solution is simple: instead of using a hardcoded port number, use a dynamic port assigned by the operating system. Usually that means binding to port number zero. This is slightly less convenient than a hardcoded port number, because you’ll have to query the socket after its bound to determine the actual port number, but that’s a small price to pay to be confident you’ll never have a spurious test failure due to this mistake.

2. Never use “sleep()” to synchronize test threads.

A common mistake when writing tests that involve multiple threads is to attempt to use sleep() to synchronize events in different threads. For example, you may have a server thread which opens a socket and waits for a connection. You want to test the behavior of the server thread, but you have to be sure that the socket is opened and ready to accept connections before you try to connect to it. A quick-and-dirty approach might look like this:

  1. Start server thread.
  2. Sleep a bit to give the thread time to get started.
  3. Open a connection to the server port.

There are some problems with this strategy. If you set the delay too short, occasionally the test will fail because the server socket won’t be ready — when you try to open the client-side connection you’ll get a connection refused error. Conversely, you can make the test fairly reliable by setting the delay very long — multiple seconds — but then the test will take at least that long to run, every time you run it.

Instead you should use a condition variable to synchronize the two threads. If that is not practical or not possible, a retry loop is a decent alternative. In that case, make sure that your loop doesn’t retry so frantically that it starves the other thread of CPU cycles to make progress. I like to use an exponential backoff, so initially I get a few retries at very short intervals, then the intervals become progressively longer to give the other thread more time to work:

1
2
3
4
5
6
7
delay = 20
total = 0
while total < 5000:
# break if socket can be opened, else sleep/retry
time.sleep(delay / 1000.0)
total += delay
delay += delay

3. Never rely on timing data to verify behavior.

Another mistake is using the observed duration of an operation to assert that a particular behavior has been implemented. For example, to test that a socket read operation times out if no data is received within a set period of time, you might write test code like this:

  1. Set socket timeout to 200ms
  2. Mark start time
  3. Attempt to read from socket
  4. Mark end time
  5. Assert that end minus start is between 200ms and 250ms

The problem is that your test runner might get put to sleep between step 2 and 3, and again between step 3 and step 4 — there’s just no way to control what else might be happening on the computer executing the tests. That means that the delta between the times could be significantly higher than the 200ms you expect. Depending on your hardware and timers, it could even be less than 200ms, despite your code being implemented correctly!

There are at least two robust alternatives to implement this test. The first is to add a counter to the code under test, to be incremented whenever the read operation times out. In the test you could check the value before and after the attempted read, and reasonably expect the value to increased by exactly one. The second approach is to forgo the explicit validation altogether — if the test completes, then the timeout must be operating correctly. If the test hangs, then the timeout must not be operating.

4. Never rely on implicit file timestamps.

If your software is sensitive to file timestamps, such as for determining whether file X is newer than file Y, then avoid trusting implicit timestamps in your tests. For example, you might write test code like this:

  1. Create “old” file Y.
  2. Create “new” file X.
  3. Verify that the unit under test behaves correctly given that X is newer than y.

Superficially this looks sound: X is created after Y, so it should have a timestamp later than Y. Unless, of course, it doesn’t, which can happen for a variety of odd reasons. For example, the system clock might get adjusted underneath your feet, due to NTP or another time-synchronization system. I’ve even seen cases where the file creations happen so quickly that the two files effectively have identical timestamps!

The fix is simple: explicitly set the timestamps on the files to ensure that the relationship is as expected. Generally you should specify a difference of at least 2 full seconds, to accommodate filesystems that have very low resolution timestamps. You should also avoid relying on subsecond timestamp resolution, again to accommodate filesystems that don’t support very high-resolution timestamps.

5. Include diagnostic information in test assertions.

You can save yourself a lot of trouble by arranging your test assertions so that they provide detailed information about any failures, rather than simple telling you yes or no, the assertion passed. For example, here’s an unhelpful assertion:

1
CPPUNIT_ASSERT(errors.empty());

If the errors variable contains error text, the assertion will fail — but you’ll have no idea what the errors were, and thus you’ll have no idea what went wrong in the test. In contrast, here’s an informative assertion:

1
CPPUNIT_ASSERT_EQUAL(string(""), errors);

Now if the assertion fails the test harness will show you value of errors, so you’ll have some useful information to start your debugging.

6. Include an explanation of the test in the comments.

Finally, don’t forget to put comments in your test code! Explain how the test works, and why you believe the test actually exercises the feature that you think it tests. It may seem a bit tedious, but remember that the rest of your team may not be as familiar with the code as you are, and they may not know what steps are needed to elicit a particular response from the code. For that matter, in a few months or years you may not remember how to test what you’re trying to test. Such comments are invaluable when updating tests after a refactoring, to understand how the test should be adjusted, as well as when debugging — to understand why a test failed to expose a defect. Here’s an example:

1
2
3
4
5
# Test methodology: create a Foo object, then try to set
# the Froznitz attribute to 5. This should produce an error
# because 5 is not a valid Froznitz value. See that the right
# type of exception is thrown, and that the error text is
# correct.

Summary

Everybody knows it’s important to write unit tests. Following the suggestions here will help make sure that your tests are reliable and maintainable. If you have tips of your own, add them in the comments!

post

The ElectricAccelerator 6.2 “Ship It!” Award

Obviously with ElectricAccelerator 6.2 out the door, it’s time for a new “Ship It!” award. I picked the mechanic figure for this release because the main thrust of the release was to add some long-desired robustness improvements. Here’s the trading card that accompanied the figure:

Greased Lighting — it’s electrifyin’!

Loaded with metrics and analysis goodness!

As promised this iteration of the award includes some metrics comparing this release to previous releases:

  • Number of days in development. We spent 112 days working on the 6.2 release; the range of all feature releases is 80 days on the low end to 249 days at the high end.
  • JIRA issues closed. We closed out 92 issues with this release, including both defects and enhancement requests. The fewest we’ve done was 9 issues; the most was 740 issues.
  • Composition of issues. Of the 92 issues, about 55% were classified as “defects”, and the remaining 45% were “features” of varying magnitude.

Again, a big “Thank You!” goes out to the ElectricAccelerator team! I’m really excited to be working with such a talented group, and I can’t wait to show the world what we’re doing next!

post

The ElectricAccelerator 6.1 “Ship It!” Award

Having shipped ElectricAccelerator 6.1, I thought you might like to see the LEGO-based “Ship It!” award that I gave each member of the development team. I started this tradition with the 6.0 release last fall. Here’s the baseball card that accompanied the detective minifig I chose for this release:

The great detective is on the case!

The Accelerator 6.1 team

I picked the detective minifig for the 6.1 release in recognition of the significant improvements to Accelerator’s diagnostic capabilities (like cyclic redundancy checks to detect faulty networks, and MD5 checksums to detect faulty disks). Compared to the 6.0 award not much has changed in the design, although I did get my hands on the “official” corporate font this time. It strikes me that there’s a lot of wasted space on the back of the card though. Next time I’ll make better use of the space by incorporating statistics about the release. I actually have the design all ready to go, but you’ll have to wait until after the release to see it. Don’t fret though, the 6.2 release is expected soon!

post

Makefile hacks: automatically split long command lines

If you’ve worked on a large build system you’ve probably bumped into this error, or one like this:

gmake: execvp: /bin/sh: Argument list too long

This error means the length of some command-line in your makefile has grown past the system limit, which is typically in the 32 to 256 kilobyte range. It’s surprisingly easy to hit that limit. You start with a small list of object files to be linked together. Over time you add more, and the command-line gets a little longer. Add a few more and it gets longer still. Before you know it you have a monster command-line and your build starts failing.

The solution to this problem is simple: split the long command-line into several shorter command-lines. For example, ar r libraries/lib.a objects/foo.o objects/bar.o objects/baz.o objects/boo.o objects/bang.o becomes something like this:

ar r libraries/lib.a objects/foo.o objects/bar.o
ar r libraries/lib.a objects/baz.o objects/boo.o
ar r libraries/lib.a objects/bang.o

Simple in theory, but tedious to do by hand. And doing it manually is like putting a ticking time-bomb into your makefile — it’s only a matter of time before your build grows enough that you have to go through this exercise again.

I recently ran across a clever solution that exploits the $(eval) function in GNU make to split long command-lines automatically, eliminating the tedium and the time-bomb. After I show you the solution, I’ll explain it piece-by-piece.

The max_args function

The solution is a user-defined function called max_args that splits long command-lines into equal-length chunks:

1
2
3
4
5
6
7
8
9
define max_args
$(eval _args:=)
$(foreach obj,$3,$(eval _args+=$(obj))$(if $(word $2,$(_args)),$1$(_args)$(EOL)$(eval _args:=)))
$(if $(_args),$1$(_args))
endef
define EOL
endef

And an example of its use:

1
2
3
OBJS:=a b c d e f g h
all:
@$(call max_args,echo,2,$(OBJS))

The max_args function takes three parameters: the base command-line, the number of arguments per “chunk”, and the complete list of arguments. It expands to a series of command-lines — one for each chunk of arguments.

The trick behind max_args is the use of $(eval) to update a variable as a side-effect of gmake’s regular variable expansion activity. If you’re not familiar with gmake variable expansion, here’s a quick rundown: when gmake finds a variable or function reference, like $(something), it replace the entire reference with an expanded value. In the case of a variable that’s just the value of the variable. Most variables in gmake are recursive which means that if the variable value itself contains embedded variable references, those will be expanded as well, recursively. In the case of a function, gmake evaluates the function, and replaces the reference with the computed value.

The meat of max_args is on line 3. It starts with the $(foreach) function, which evaluates its third argument, the body of the loop, once for each word in its second argument — in this case, the list of objects passed in the call to max_args.

In max_args, the loop body has two components. The first is a call to $(eval), which simply appends the current value of the loop variable to an accumulator called _args.

The second component of the loop body uses $(if) and $(word) to check the length of _args. The $(word) function returns the nth word from a list, or an empty string if there are fewer than n words in the list. The $(if) function expands its second argument (the then clause) only if its first argument (the condition) expands to a non-empty string, so together these functions check if _args has the desired number of words, and if so the then clause of the $(if) is expanded.

The then clause of this $(if) has two components. The first constructs a completed command-line by concatenating the base command-line, here given by $1, the first argument to the original max_args call; the accumulated arguments; and a newline character. Thanks to the rules of gmake expansion, this command-line is added to the overall expansion result for the max_args function. The second part of the then clause uses $(eval) to reset the accumulator

If the chunk size does not evenly divide the number of arguments, the stragglers are emitted in a final command-line on the last line of max_args.

Limitations

max_args is handy but it has one significant limitation: command-line length limits are based on the number of bytes in the command-line, not the number of words, in it. Unfortunately, gmake has no built-in way to count the number of characters in a string. gmake does provide the $(words) built-in, so that’s what max_args uses. That just means that to use it effectively you have to take a guess at the number of arguments that will fit in a single command-line, for example by dividing the length limit by the average number of characters in each argument, then subtracting some to allow some buffer for outliers.

post

6 reasons your development team should be using instant messaging

The ElectricAccelerator development team sits at desks less than 30 feet apart, but despite our close proximity, we don’t often speak to one another. To an outside observer this may seem to be a sign of disfunction in the team — after all, developers have to communicate to work effectively. Some people think we’re obviously not communicating, but the truth is that we’re not obviously communicating! That’s because we use instant messaging for most of our communications, including status updates, technical collaboration and even code reviews, rather than face-to-face conversations. I believe this has made my team more connected and more productive. Here are six reasons why instant messaging trumps face-to-face conversations for software teams.

1. Logging

The key advantage of instant messaging is that all conversations are logged automatically. As a result I’ve got records of every conversation with every member of my team for the past two years. That’s proven invaluable on a few occasions, to provide additional context for decisions made weeks or months earlier. Obviously this is not a replacement for other types of project documentation, but it is a fantastic supplement.

2. Non-intrusive

The second most important advantage of instant messaging is that it’s relatively non-intrusive, at least compared to a face-to-face conversation. We all know how important it is to get into and preserve a state of flow when programming. Spoken conversations, by social convention, command your immediate attention — effectively an interrupt of the highest order. When somebody comes to my desk to ask me something in person, they are implicitly saying, “What I have to say to you is more important than anything else you might be doing right now.” Sometimes that’s true, but many times it’s not. And yet every time somebody initiates a face-to-face conversation with me, it destroys whatever flow I might have developed.

In contrast, instant messaging allows me to defer a response until I reach a good breaking point, so people can ask questions without interrupting me.

3. Non-disruptive

Our office has an open floor plan, which means that instead of individual offices or cubicles, we have a single big room. This layout worked very well when the company had only 6 people, who were all working on the same project. Now the company employs over 100 people, with two separate development teams working on completely different products, so the open layout doesn’t work quite so well. Conversations between other people can be very distracting when you’re heads down on a tricky technical problem. By using instant messaging instead of face-to-face conversations, we significantly reduce the distraction for our collegues.

4. Simultaneous conversations

Carrying on multiple face-to-face conversations on disparate topics is practically impossible, but doing the same via instant messenger is simple. Every IM client I’ve seen displays the last several messages of each active conversation, so you have context when a new message arrives. That signficiantly reduces the mental burden associated with each conversation, so it becomes possible to sustain several simultaneously. I often have five conversations “active” during the work day, and sometimes even more.

5. Consistency

Unlike face-to-face conversations, IM works well regardless of the relative locations of the conversants. That means that it doesn’t matter if my colleague is in the office with me, or working from home, or working from a customer site, or halfway around the world. I can use the same tool to communicate with them, which in turn means I don’t have to change the way I work to accomodate changes in the way they are working.

6. Versatility

One final advantage of instant messaging compared to face-to-face conversation is the versatility of the medium. I can trivially share a code fragment with somebody via IM, or a link to an online resource. Try doing that in a face-to-face conversation: “Yeah, you should check out the STL reference docs, at aich tee tee pee colon slash slash double you double you double you dot …”.

Instant messaging: give it a try

If you’re not already using instant messaging in your development team, give it a try. There are multiple free IM services out there, and there are good free IM clients on every platform, including smart phones, so you’ve really got nothing to lose — but you might gain a more efficient, productive team. It worked for us.

post

LEGO “Ship It!” Awards

Scriptics Connect 1.1 "Ship It!" Award

What am I supposed to do with this?


When we wrapped up the ElectricAccelerator 6.0 release recently, I wanted to give my teammates something to commemorate the release. Traditionally these are called “Ship It!” awards, and they often take the form of a Lucite plaque or trophy, or even a physical copy of the product (on DVD or CD, for example) locked inside an acrylic block. I’ve gotten a couple of those over the years, and honestly I think they’re kind of a waste. The last one I got went on a shelf to collect dust for a few years before being relocated to the trash heap, which is a shame because those things are expensive. Really expensive. I can’t even imagine the cost of the monster awards that Microsoft gives out.

So I don’t really like the usual embodiment of the “Ship It” award, but I do really like the underlying idea. After all, shipping a software release is a significant accomplishment, the culmination of months or even years of effort by a team of smart individuals. And unlike many other human endeavors, there’s nothing tangible when you’re finished — no bridge spanning the bay nor tower reaching to the heavens. Having something to commemorate the accomplishment seems fitting, and it’s another small way that I can show my appreciation for everybody’s contributions.

The Ideal “Ship It!” Award

To me, the ideal “Ship It!” award has the following attributes:

  • Themeable: I wanted something I could customize for each release, while maintaining consistency across releases. I plan to make this a tradition.
  • Inexpensive: I wanted something I could bankroll myself, so I could retain complete creative control.
  • Compact: I wanted something that wouldn’t take up much space, so it would be portable and easy to display.
  • Geek appeal: I wanted something that my teammates would think is cool. Chunks of Lucite just don’t cut it.

LEGO “Ship It!” Awards

LEGO race car driver minifig

The winner!


After a few days of idle brainstorming and bouncing ideas off my manager and co-conspirator, I had what seemed like a great idea: LEGO minifigs. I could get a bunch of a specific LEGO minifig and give one to each person on the team. It fit all my criteria. There have been over 4,000 different minifigs released since 1978, according to The Cult of LEGO. In the last two years alone LEGO has release five minifig packs, each with 16 completely new figures, so I can count on having a unique character for every feature release for the next several years. Minifigs are cheap, too — the majority can be bought for as little as a couple dollars each on Amazon or ebay. They’re obviously small. And of course, minifigs are dripping with geek appeal. What techie doesn’t like LEGO?

There was just one small problem. Minifigs are a little bit too small. There’s nowhere to put the information that would identify what it represented — the product name, release version and date, and so on. A couple more days of brainstorming gave me the solution: custom baseball cards. There are several companies that will print custom baseball cards. These outfits are obviously intended for children’s sports teams, but they will happily print cards with whatever graphic you want. You just have to create images of the front and back of your card and upload to their website. And like the minifigs themselves, the cards are inexpensive, at about $1 per card.

The ElectricAccelerator 6.0 “Ship It!” Award

For the ElectricAccelerator 6.0 “Ship It!” Award, I chose the race car driver shown above (because Accelerator is all about performance, of course!). I bought the minifigs on ebay. I spent a couple hours designing the card, then ordered them from CustomSportsProducts.com. The front shows the minifig, the product name, version, and release date, and the major new features; the back lists the names of everybody on the team. Total cost for awards for the entire team was about $40 for materials — about the cost of just one traditional Lucite-based “Ship It” award.

I was a little nervous when I presented the awards to my team a couple weeks ago, but as it turns out I needn’t have been! The reception was overwhelmingly positive. Although I hadn’t explicitly planned it this way, the minifigs actually arrived unassembled, in individual pouches. Immediately upon getting theirs, each person dumped out the pieces and started assembly — it was practically instinctive! Several people commented out loud that the award was “Awesome!” or “Really cool,” and, of course, “Kinda nerdy, but cool!” With that kind of reaction, you can bet that I’m already planning for the next release.

And finally, here’s a picture of the ElectricAccelerator 6.0 “Ship It!” Award, as it is proudly displayed on my desk:

ElectricAccelerator 6.0 "Ship It!" Award

Who doesn't love LEGO?

post

ElectricAccelerator Job Compendium

The fundamental unit of work in ElectricAccelerator is the job. Most of the time, you can think of a job as all the commands that must be run in order to create or update a single build output, but in truth that describes only one type of job. There are actually several different job types, each with a distinct purpose in the structure of a build and the way Accelerator executes the build. You can determine the type of a job from the type attribute on the <job> tag in Accelerator annotation files.

Having some familiarity with the job types and their use will make it easier for you to understand Accelerator performance and behavior, so I wrote this guide to introduce them. First I’ll describe the jobs used by ElectricMake (emake), and then the jobs used by Electrify.

ElectricMake jobs

In order to make the descriptions more concrete, I created a simple reference build that uses each of the most common job types, so you can see exactly how the jobs relate to a real build. Here’s the reference build makefile:

prog: 
        @$(MAKE) sub/prog
        @cp sub/prog ./prog

sub/prog: sub/main.c
        @cat $< > $@

setup:
        rm -rf prog sub
        mkdir sub
        echo "int main() { return 0; }" > sub/main.c
        touch prog2

To run the build, first run emake setup, which will create the files and directory structure needed by the build, then run emake –emake-maxagents=1 –emake-annodetail=basic –emake-annofile=emake.xml prog prog2. That will produce a build with thirteen different jobs, in two different make instances. Here’s an overview of how those jobs fit together (click for a larger image):


parse

The first job in any make invocation (and therefore the first job of any emake build) is a parse job, during which emake reads and interprets the makefiles used in that make instance. The output of a parse job is a list of all the jobs in the make instance, along with a list of targets that must be built, the commands to build those targets, and the relationships between them.


exist

Existence jobs (marked as type “exist” in annotation) are used to check for the existence of makefiles and command-line goals for which no rule was found. In our reference build, you can see an existence job in the top-level make for the makefile itself, as well as one for the file prog2, which has no rule in the makefile.


remake

Remake jobs are at the center of emake’s emulation of GNU make makefile remaking feature. In a remake job, emake checks every makefile that was read during the parse job for two things:

  1. Is there a rule to rebuild the makefile; and
  2. Was the makefile actually rebuilt (because it was out-of-date).

If any makefile was rebuilt, then emake restarts the make instance — all the way back to the parse job.

Because makefile remaking is a gmake-specific feature, you will only see remake jobs when emake is emulating gmake — not when it’s emulating NMAKE.

One last note about remake jobs: emake overloads the use of the <failed> tag inside a remake job to indicate not failure, but whether or not the job determined that the make instance should be restarted. If yes, then the remake job will include a <failed> tag with the code attribute set to 1; if not, then the remake job will have no <failed> tag.


rule

Rule jobs are the real workhorses of a build. Each rule job encapsulates the commands needed to update one target in the build — literally the body of a rule from a makefile — and a rule always has an associated output target (or targets), which is identified by the name attribute of the job tag in annotation. In our reference build, the rule job in the toplevel make instance corresponds to this rule in the makefile:

prog: 
        @$(MAKE) sub/prog
        @cp sub/prog ./prog

If you look at the annotation for the job you’ll notice that only $(MAKE) sub/prog is actually in the rule job — because the cp sub/prog ./prog was automatically split into a continuation job when emake detected the recursive make invocation.


follow

A follow job serves two purposes. First, it is a connection point that allows emake to tie the end job of a recursive make invocation back into the ordered list of jobs in the parent make. Second, it is the means by which emake propagates the error status of a recursive make to the parent make.

Follow jobs always have an associated rule or continuation job, identified by the partof attribute on the job tag in annotation. That is the job that spawned the recursive make which the follow job is associated with. Of course, not all rule and continuation jobs have an associated follow job — follow jobs only show up when a recursive make was invoked.


continuation

A continuation job represents the “leftover” commands that follow a recursive make invocation in a rule (or continuation) job. In our reference build, emake creates a continuation job for cp sub/prog ./prog, because that command follows a recursive make invocation. Continuation jobs, like follow jobs, are always associated with a rule (or another continuation) job, identified by the partof attribute on the job tag in the annotation.

The reason for splitting the extra commands into a separate job is simple: that allows emake to easily specify the serial order of those commands relative to the jobs in the recursive make — the commands in the continuation come after the jobs in the recursive make.


end

The last job in every make instance (and therefore the last job of every emake build) is an end job. End jobs exist primarily to handle end-of-the-make cleanup, such as removing intermediate targets or temporary inline files that were created while executing the other jobs in the make.


(Not pictured) subbuild

Subbuild jobs, which were not used in the reference build, are part of emake’s subbuild feature. They are simply a mechanism to inject a recursive make invocation into the build, just as you would get if you had a rule job with a $(MAKE) command, but without the tedium of actually running that command.

Electrify jobs

Electrify uses much of the same underlying machinery that emake does, but it has its own set of job types for its particular needs.


alpha

Every electrify build starts with an alpha job, which is a trivial placeholder marking the start of the list of jobs.


external

External jobs represent commands invoked by the electrified build tool which are distributed to the cluster. These are similar to emake’s rule jobs, except they will only ever run a single command, while rule jobs may run any number of commands.


update

Update jobs represent filesystem modifications made by commands invoked by the electrified build tool but not distributed to the cluster. These modifications are detected with the aid of electrifymon.


omega

Every electrify build ends with an omega job, which, like the alpha job, is a trivial placeholder.

Conclusion

I hope you’ve found this post informative. Questions or feedback? Please feel free to comment below!

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: