Top posts for September 2010

The following articles generated the most traffic in September. These are all articles that I wrote for the Electric Cloud Blog, because (for now!) those articles get more views than those on

  1. Makefile performance: $(shell): 28% of page views
  2. A second look at SCons performance: 18% of page views
  3. The last word on SCons performance: 13% of page views
  4. What’s new in GNU make 3.82: 7% of page views
  5. How scalable is SCons?: 6% of page views

The popularity of the $(shell) article is a bit surprising — it’s just one of several articles about makefile performance, after all, and none of the others show up on this list. Also, it’s a really old article (from March 2009). What’s going on there? Look at the search queries that bring people to that post:

  1. makefile shell
  2. makefile shell command
  3. make shell
  4. shell command in makefile
  5. makefile $(shell

Conspicuously absent from these queries: any mention of performance. I think people are just looking for help understanding how gmake and the shell fit together in general, rather than specifically how to get the best performance. Maybe I should write a tutorial — the definitive guide to gmake and the shell?

An Agent Utilization Report for ElectricInsight

A few weeks ago I showed how to determine the number of agents used during an ElectricAccelerator build, using some simple analysis of the annotation file it generates. But, I made the unfortunate choice of a pie chart to display the results, and a couple of readers called me to task for that decision. Pie charts, of course, are notoriously hard to use effectively. So, it was back to the drawing board. After some more experimentation, this is what I came up with:


Some readers have said that this graph is confusing. Blast! OK, here’s how I read it:

The y-axis is number of agents in use. The x-axis is cumulative build time, so a point at x-coordinate 3000 means that for a total of 3000 seconds the build used that many agents or more. Therefore in the graph above, I can see that this build used 48 agents for about 2200 seconds; it used 47 or more agents for about 2207 seconds; etc.

Similarly, you can determine how long the build ran with N agents by finding the line at y-coordinate N and comparing the x-coordinates of the start and end of that line. For example, in the graph above the line for 1 agent starts at about 3100 seconds and ends at about 4100 seconds, so the build used just one agent for a total of about 1000 seconds.

Here’s what I like about this version:

  • At a glance we can see that this build used 48 agents most of the time, but that it used only one agent for a good chunk of time too.
  • We can get a sense of the health of the build, or it’s parallel-friendliness, by the shape of the curve— a perfect build will have a steep drop-off far to the right; anything less than that indicates an opportunity for improvement.
  • We can see all data points, even those of little significance (for example, this build used exactly 35 agents for several seconds). The pie chart stripped out such data points to avoid cluttering the display.
  • We can plot multiple builds on a single graph.
  • It’s easier to implement than the pie chart.

Here are some more examples:

Example of a build with great parallelismExample of a build with good parallelism
Example of a build with OK parallelismExample of a graph showing two builds at once

A glitch in the matrix

While I was generating these graphs, I ran into an interesting problem: in some cases, the algorithm reported that more agents were in use than there were agents on the cluster! Besides being impossible, this skewed my graphs by needlessly inflating the range of the y-axis. Upon further investigation, I found instances of back-to-back jobs on a single agent with start and end times that overlapped, like this:

<job id="J00000001">
  <timing invoked="1.0000" completed="2.0002" node="linbuild1-1"/>
<job id="J00000002">
  <timing invoked="2.0000" completed="3.0000" node="linbuild1-1"/>

Based on this data, it appears that there were two jobs running simultaneously on a single agent. This is obviously incorrect, but the naive algorithm I used last time cannot handle this inconsistency — it will erroneously consider this build to have had two agents in use for the brief interval between 2.0000 seconds and 2.0002 seconds, when in reality there was only one agent in use.

There is a logical explanation for how this can happen — and no, it’s not a bug — but it’s beyond the scope of this article. For now, suffice to say that it is to do with making high-resolution measurements of time on a multi-core system. The more pressing question at the moment is, how do we deal with this inconsistency?

Refining the algorithm

To compensate for overlapping timestamps, I added a preprocessing phase that looks for places where the start time of a job on a given agent is earlier than the end time of the previous job to run on that agent. Any time the algorithm detects this situation, it combines the two jobs into a single “pseudo-job” with the start time of the first job, and the end time of the last job:

    $anno indexagents
    foreach agent [$anno agents] {
        set pseudo(start)  -1
        set pseudo(finish) -1
        foreach job [$anno agent jobs $agent] {
            set start  [$anno job start  $job]
            set finish [$anno job finish $job]
            if { $pseudo(start) == -1 } {
                set pseudo(start)  $start
                set pseudo(finish) $finish
            } else {
                if { int($start * 100) <= int($pseudo(finish) * 100) } {
                    set pseudo(finish) $finish
                } else {
                    lappend events \
                        [list $pseudo(start)  $JOB_START_EVENT] \
                        [list $pseudo(finish) $JOB_END_EVENT]
                    set pseudo(start)  $start
                    set pseudo(finish) $finish

With the data thus triaged, we can continue with the original algorithm: sort the list of start and end events by time, then scan the list, incrementing the count of agents in use for each start event, and decrementing it for each end event.


You can find the updated code here at GitHub. One comment on packaging: I wrote this version of the code as an ElectricInsight report, rather than as a stand-alone script. The installation instructions are simple:

  1. Download AgentUtilization.tcl
  2. Copy the file to one of the following locations:
    • <install dir>/ElectricInsight/reports
    • (Unix only) $HOME/.ecloud/ElectricInsight/reports
    • (Windows only) %USERPROFILE%/Electric Cloud/ElectricInsight/reports
  3. Restart ElectricInsight.

Give it a try!

Blinkenlights for ElectricAcclerator

Watching builds run is boring. I mean, there’s not really much to look at, besides the build log scrolling by. And the “bursty” nature of the output with ElectricAccelerator makes things even worse, since you’ll get a long pause with no apparent progress, followed by a blast of more output than you can handle — like drinking from a fire hose. Obviously stuff is going on during that long pause, but there’s nothing externally visible. Wouldn’t it be nice to see some kind of indication of the build progressing? Something like this:

I put together this visualization to satisfy my desire for a blinkenlights display for my build. Each light represents an agent used by the build, and it lights up every time a new job is dispatched to that agent. There’s no correlation between the amount of time it takes for the light to fade and the duration of the job, since there’s no way to know a priori how long a job will take. But if the build consists primarily of jobs that are about the same length (and most builds do), then you should see a steady stream of flashes throughout.


This visualization is powered by a relative new feature in ElectricAccelerator: add –emake-monitor=host:port to the emake command-line, and emake will broadcast status messages to the specified destination using UDP. As of Accelerator 5.2.0, emake generates four types of status messages. Each message is transmitted in plain text, as a space-separated list of words. The first word indicates the type of message; the remaining words are the parameters of the message:

  • ADD_JOB jobId jobType targetName: a new job has been added to the work queue.
  • START_JOB jobId time agent: a job has started running on the specified agent.
  • FINISH_JOB jobId time: a job has finished running.
  • FINISH_BUILD: the build has completed.

All you need is a program that listens for these messages and does something interesting with them. ElectricInsight is one such program: select the File -> Monitor live build… menu option, enter the same host:port information, and Insight will render the jobs in the build in real time as they run. Not bad, but not as glitzy as I’d like.

Writing blinkenlights

My blinkenlights visualization uses just one of the messages: START_JOB. Each time it receives the message, it maps the agent named in the message to one of the lights, illuminates it, and then fades it at a fixed rate. It’s written in Tcl/Tk, naturally, using a couple great third-party extensions, so the implementation is less than 100 lines of code.

The first extension is Tkpath, which I’ve mentioned previously. I used prect items to create the “lights”, and handled the fading effect by just progressively decreasing the alpha from fully opaque to fully transparent with a series of timer events firing at a predetermined rate.

The second extension is TclUDP, which makes it trivial to connect to a UDP socket from Tcl. Once I have that socket, I can use all the regular Tcl magic like fileevent to make my script automatically respond to the arrival of a new message.

Here’s the code in full:

package require tkpath
package require udp

# fade - update the opacity of the given item to the given value.  Afterwards,
# schedules another event to update the opacity again, to a slightly smaller
# value, until the value reaches zero.

proc fade {id {count 100}} {
    global events
    .c itemconfigure a$id -fillopacity [expr {double($count) / 100}]
    incr count -5
    catch {after cancel $events($id)}
    if { $count >= 0 } {
        set events($id) [after 5 [list fade $id $count]]

# next - called whenever there is another message awaiting on the socket.

proc next {sock} {
    global ids
    set msg [read $sock]
    if { [lindex $msg 0] eq "START_JOB" } {
        set agent [lindex $msg 3]
        if { ![info exists ids($agent)] } {
            set ids($agent) [array size ids]
        fade $ids($agent)

# Set the dimensions; my test cluster has 16 agents, so I did a 4x4 layout.

set rows 4
set cols 4
set boxx 60
set boxy 60

# Set up the tkpath canvas and the "lights".

set c [::tkp::canvas .c -background black \
           -height [expr {($boxy * $rows) + 5}] \
           -width  [expr {($boxx * $cols) + 5}]]
wm geometry . [expr {($boxx * $cols) + 27}]x[expr {($boxy * $rows) + 27}]

for {set x 0} {$x < $cols} {incr x} {
    for {set y 0} {$y < $rows} {incr y} {
        set x1 [expr {($x * ($boxx + 5)) + 5}]
        set x2 [expr {$x1 + $boxx}]
        set y1 [expr {($y * ($boxy + 5)) + 5}]
        set y2 [expr {$y1 + $boxy}]
        set id [expr {($x * $rows) + $y}]
        .c create prect $x1 $y1 $x2 $y2 -rx 5 -fill #3399cc -tags a$id \
            -fillopacity 0
pack .c -expand yes -fill both
wm title . "Cluster Blinkenlights"

# Get the host and port number from the command-line.

set host [lindex [split $argv :] 0]
set port [lindex [split $argv :] 1]

# Create the udp socket, set it to non-blocking mode, then set up a fileevent
# that will trigger anytime there's data available on the socket.

set sock [udp_open $port]
fconfigure $sock -buffering none -blocking 0 -remote [list $host $port]
fileevent $sock readable [list next $sock]

# Common idiom to keep the app running indefinitely.

set forever 0
vwait forever

Future work

This is a pretty fun way to monitor the status of a build in progress, but I think there are two things that could make it even better:

  • Watch the entire cluster, instead of just one build. Because this visualization is driven by data streaming from emake, for all practical purposes it’s limited to showing the activity in a single build. I would love to instead be able to view a single display showing the entire cluster, with concurrently running builds flickering in different colors. I think that would be a really interesting display, and might provide some insight into the cluster sharing behaviors of the entire system. I think to really do that properly, we’d need to be intercepting events from every agent, but unfortunately the agent doesn’t have a feature like –emake-monitor.
  • Make it an actual physical gadget. It might be fun to wire together some LED’s, maybe controlled by an arduino or something, to make a tangible device that could sit on my desk. It’s been a long, long time since I’ve done anything like that though. Plus, if there are a lot of agents in the cluster, it may be costly and impractical to manufacture.

What do you think?

How to create arcs with Tkpath

If you use Tcl/Tk for GUI programming, you should know about Tkpath, a replacement for the built in canvas widget that adds antialiasing, full alpha transparencies and more. The project is still in its infancy, but it’s already quite usable. If you’re trying to make good-looking charts or pictures with Tk, you owe it to yourself to check it out. Seriously, take a quick look at the demos. It’s OK, I’ll wait.

One area where Tkpath shows its immaturity is in the API, which is still pretty rough around the edges. For some picture elements, like arcs, you actually have to use SVG syntax to describe the path you want. That’s obviously possible, but I don’t think that anybody would call it a particularly easy to use interface. In fact, it took me the better part of a weekend to figure out how to make arcs with Tkpath (starting with absolutely no knowledge of SVG syntax). In hopes that it will save somebody else some pain, here’s what I learned.

The path item

Unlike the built-in Tk canvas, Tkpath does not have a dedicated arc item, at least not yet. Instead you use the generic path item, which is just an interface for specifying raw SVG paths. The Tcl syntax for creating a path is deceptively simple: pathName create path pathSpec ?options?. The magic is all in pathSpec, which is where you stuff an SVG path description.

SVG path syntax

SVG path syntax is itself a rudimentary graphics programming language, somewhat reminiscent of Logo’s “turtle graphics”. Here’s what you need to know:

  • A path description is a space-separated list of drawing commands, such as moveto or line
  • Each drawing command consists of a single letter to identify the type of command, and some number of arguments, determined by the type of command.
  • Commands are processed left-to-right.
  • Each command updates the current location of the drawing pen; this location is an implicit argument for the next command. For example, the line command draws a line from the current location to the location explicitly given as an argument.
  • Commands are case-sensitive. For example, “M 100 100” is not the same as “m 100 100”. When the command name is upper-case, then coordinate arguments are treated as absolute; when the command name is lowercase, then the coordinates are interpreted relative to the current pen location.
  • The moveto command must be first in any sequence of commands.

SVG drawing commands for arcs

We only need a couple of SVG drawing commands to place arcs anywhere we like: moveto, and of course arc:

Command Syntax Description
moveto M x y The moveto moves the pen from the current location to the specified location without drawing any line or curve between the points.
arc A rx ry rot large direction x y The arc command draws an elliptical arc from the current location to the specified location x,y. rx and ry give the radius of the ellipse along the x- and y-axes. rot gives the x-axis rotation. large indicates whether the arc is at least 180 degrees. If so, it has value one; otherwise it has value zero. direction indicates the direction in which to draw the arc: one for clockwise, zero for counter-clockwise.

A simple example

For example, here’s how to create a simple image. It consists of three paths, with two arcs each:

The mouth

The mouth starts at 40,60, so we need to move the pen to that point with the moveto command:

M 40 60

The lower arc is the bottom half of a circle of radius 60, ending at 160,60. Note that we don’t start a new path yet, just tack another command onto the one we’ve started:

M 40 60 A 60 60 0 1 0 160 60

The upper arc is half of an ellipse with x-radius 60 and y-radius 40, ending where the first arc began, at 40,60. Again we can just append another command to our existing path:

M 40 60 A 60 60 0 1 0 160 60 A 60 40 0 1 1 40 60

The left eye

The left eye starts at 40,40, so again we start with a moveto command:

M 40 40

Now, we want to create the upper arc — the top half of a circle with radius 20, ending at 80,40:

M 40 40 A 20 20 0 1 1 80 40

The lower arc is the top half of an ellipse with x-radius 20 and y-radius 10, ending where the first arc began, at 40,40:

M 40 40 A 20 20 0 1 1 80 40 A 20 10 0 1 0 40 40

The right eye

The right eye has the same shape as the left, but starts at 120,40 and ends at 160,40:

M 120 40 A 20 20 0 1 1 160 40 A 20 10 0 1 0 120 40

Here’s how it looks in Tcl:

package require tkpath
wm title . "arcs demo"
pack [tkp::canvas .c -background white -width 200 -height 140]
.c create path "M 40 60 A 60 60 0 1 0 160 60 A 60 40 0 1 1 40 60"
.c create path "M 40 40 A 20 20 0 1 1 80 40 A 20 10 0 1 0 40 40"
.c create path "M 120 40 A 20 20 0 1 1 160 40 A 20 10 0 1 0 120 40"

set done 0
vwait done

Final word

Once you know how to do it, making arcs with Tkpath (or SVG, for that matter) is not too hard, although I think there’s room for a dedicated arc item, even if that’s just a simple wrapper around SVG paths.

If you want more information about Tkpath, you can try the home page, or the user manual.