The goto sin

This is the best rant on the demise of the goto statement I have ever heard. It comes from Mikola Lysenko's Fibers talk at the Tango conference 2008. If you fast-forward to 23 min 05 secs, you will hear this:

One way you can think about states [in a state machine] is that they’re kind of a label. And you put this label here for where you want the code to goto after you’re done with this other, sort of, state. A state machine is kind of an indirect goto and so these states and switch statements are just like nested gotos within gotos.

Now if you use coroutines, you can then use structured programming to represent the states. I mean this is a debate that was you know, played out years ago in a more limited context of structured programming versus gotos and ultimately, structured programming won out.

Nowadays I mean if you go into entry level programming course, they’ll just fail you on your projects if you even use a goto statement because those goto statements are that toxic to the integrity of programming code. You’re much better off using structured programming. And yet despite that, people still advocate using these state machines which are basically a really indirect horrible obfuscated goto mess, just split across multiple source files and larger projects.

So it’s like if you make the goto sin big enough, then no one is going to call you on how bad it is! But the thing is if you use coroutines, not a problem! So why have we been using this all along? Oh once again I’d probably appeal to the fact that, ah, you know, ignorance, right, people just don’t know about coroutines.

Now I want my goto back… er… coroutines!

Overriding GNU Make SHELL variable for massive parallel make

If you use GNU Make in your verification environment, maybe you have dreamed of typing make -j 120. It turns out it is possible, through a very interesting use of the GNU Make SHELL variable. You know that make -j N causes GNU Make to run up to N recipes in parallel. In the makefile below, prereq1, prereq2 and prereq3 would be built at the same time if -j 3 were used:

# Use our own wrapper script (defined below) instead of the default /bin/sh
SHELL=./myshell
prereq1:
        very_long_processing
prereq2:
        more_very_long_processing
prereq3:
        more_and_more_processing

target: prereq1 prereq2 prereq3
        very_long_compile

This may look like nothing, but it gets more interesting when you define myshell as follows, as explained on the GNU mailing list:

#!/bin/bash
# GNU Make invokes $SHELL as: $SHELL -c 'recipe line', so $1 is "-c" and
# $2 is the command to run. This assumes $PWD is shared (e.g. over NFS)
# and mounted at the same path on the remote host.
ssh `avail_host` "cd $PWD || exit; $2"

Now just type make -j 3 and your build is distributed to 3 available hosts, as returned by the avail_host script (writing this script is an exercise left to the reader ;-)).
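If you would rather see one possible shape before taking up that exercise, here is a minimal, hypothetical sketch (written as a shell function for illustration): it simply round-robins over a hard-coded host list. The host names, the counter file and the whole scheme are my assumptions; a real version would check load or reachability before answering.

```shell
#!/bin/bash
# Hypothetical avail_host sketch: round-robin over a static host list.
# Host names, the counter-file location and the scheme itself are assumptions;
# a real version would probe load or reachability before answering.

avail_host() {
  local hosts=(hostA hostB hostC)                 # assumed static host list
  local counter_file=${COUNTER_FILE:-/tmp/avail_host_counter}
  local count
  count=$(cat "$counter_file" 2>/dev/null || echo 0)
  # Persist the next index; a production version would flock this file,
  # since a parallel make means parallel avail_host calls.
  echo $(( (count + 1) % ${#hosts[@]} )) > "$counter_file"
  echo "${hosts[count % ${#hosts[@]}]}"
}

avail_host   # prints the next host in the rotation
```

Each recipe dispatched by myshell then lands on the next host in the rotation.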

Shortly after coming up with this, a bit of searching revealed that this was nothing new really. The article Distributed processing by make explains that by using the GNU Make SHELL variable, the number of jobs you can dispatch with make is no longer limited to how many cores or local CPUs you have.

The article also shows how to extend the makefile syntax to control the dispatching of commands. Once you realize that GNU Make effectively calls $SHELL by passing “-c” followed by your entire command, you also realize that you can intercept anything you want in your own implementation of $SHELL to control job submission, such as the single, double and triple equal sign syntax in the aforementioned article. All without changing anything in the GNU Make source code. Wow!
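The interception idea can be sketched as follows (written as a shell function for illustration; the `==` marker and the `dispatch:` action are hypothetical stand-ins, not the article's actual implementation). Since make hands $SHELL exactly two arguments, `-c` and the recipe line, the wrapper only has to pattern-match on `$2`:

```shell
#!/bin/bash
# Sketch of a $SHELL replacement that intercepts make's "-c <command>" calls.
# GNU Make invokes: $SHELL -c 'recipe line', so $2 is the full recipe line.
# The "==" marker and the "dispatch:" action are hypothetical stand-ins for
# the article's equal-sign syntax, not its actual implementation.

myshell() {
  [ "$1" = "-c" ] || { echo "expected -c, got: $1" >&2; return 2; }
  local cmd=$2
  case $cmd in
    ==*)
      # Marked command: strip the marker; a real version would submit the
      # command to a remote host or a queue instead of just reporting it.
      echo "dispatch: ${cmd#==}"
      ;;
    *)
      # Unmarked command: run locally, exactly as the default shell would.
      /bin/sh -c "$cmd"
      ;;
  esac
}

myshell -c "==long_simulation"   # prints: dispatch: long_simulation
myshell -c "echo local step"     # prints: local step
```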

Exploring a bit further on the GXP website, I have found that the GXP flow for dispatching jobs to multiple machines relies on daemons being spawned at the user space level using the ssh command. I still don’t understand why it would be necessary to spawn daemons though. I would be interested in knowing this.