Use ALL your cores with autoson!

No matter how fast computers get, I always find myself waiting on various programs to finish.  I guess it’s only natural to push the search or enumeration or whatever just one step, one rank, one dimension further, in case something interesting or illuminating is hiding just out of view.

Most machines these days have multiple cores or threads, so waiting on a single job to finish means using only half (or 1/4, or 1/8 or whatever) of the computer. In this case it’s tempting to break the task up into lots of little pieces and run 2 (or 4 or 8 or whatever) simultaneously. The trouble is that monitoring which sub-jobs have finished, starting new ones and generally managing the process is both time-consuming and error-prone.

But there’s a simple solution to this: a beautiful, easy-to-use tool written by Brendan McKay, called autoson. It dates back many years, to when we used to distribute large tasks over networks of Sun workstations overnight in the student computer labs, but it still works well today on pretty much any Unix-based system (e.g. Linux or Mac OS X). The basic principle is this:

  • A text file in your home directory, which is created and managed by the component programs of autoson, is used to maintain a queue of jobs that remain to be run.
  • A number of small programs run permanently on your computer (or multiple computers) waking up every minute or so and checking the queue to see if anything is waiting to be run.
  • If a job is waiting on the queue, the client program starts that job, checks it periodically until it finishes and only then reverts to examining the queue for more jobs to run.
  • There are tools for checking, manipulating and adding/removing jobs from the queue, including commands for adding hundreds or thousands of parameterized jobs in one step.
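
To make the poll-and-run idea concrete, here is a toy sketch of a single worker in plain shell. This is emphatically not how autoson is implemented: the function names take_job and worker are made up for illustration, the queue is just one command per line, and there is no locking, so two of these workers sharing a queue could grab the same job (which is exactly the sort of thing the real tool takes care of for you).

```shell
#!/bin/sh
# Toy illustration only: NOT autoson's implementation.
# All names here (take_job, worker) are hypothetical.

# take_job QUEUE
# Print the first line of QUEUE and remove it from the file.
# Fails (returns 1) if the queue file is empty or missing.
take_job() {
    queue=$1
    [ -s "$queue" ] || return 1
    head -n 1 "$queue"
    tail -n +2 "$queue" > "$queue.tmp" && mv "$queue.tmp" "$queue"
}

# worker QUEUE INTERVAL MAX_IDLE_POLLS
# Run queued jobs one at a time; when the queue is empty, sleep for
# INTERVAL seconds and look again, giving up after MAX_IDLE_POLLS
# empty checks (a real client would simply keep going).
worker() {
    queue=$1
    interval=$2
    polls=$3
    while [ "$polls" -gt 0 ]; do
        if job=$(take_job "$queue"); then
            sh -c "$job"              # blocks until this job finishes
        else
            sleep "$interval"         # nothing waiting: doze, then re-check
            polls=$((polls - 1))
        fi
    done
}
```

Calling worker "$HOME/toy.queue" 60 1000 would then behave roughly like one of the bullet points above: wake up every minute, run whatever is at the head of the queue, and go back to checking only once that job has finished.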

There are probably many other tools for distributing jobs over multiple machines and/or multiple cores, with hugely sophisticated mechanisms for load-balancing and even shifting jobs from one machine to another. Maybe some of them are even as free, simple and easy-to-use as autoson, but seeing as it does all I want it to do, I’ve not investigated further.

There are a few minor wrinkles in setting it up for the first time, mostly in knowing which parts of the extensive documentation you can safely defer till you need them, so I thought I’d write this post as a sort of “Autoson in Ten Minutes” introduction. It assumes a minimum amount of Unix command-line knowledge (though you do need to know what the command-line IS and how to start up a terminal). Also, this post covers how to set it up to run multiple simultaneous processes on one machine, rather than using multiple machines, though the multi-machine setup is pretty much just as easy as long as there’s some shared disk area to store the queue.

Step One (Download and Install)

  1. Download the gzipped file autoson146.tar.gz from Brendan’s autoson page at http://cs.anu.edu.au/~bdm/autoson and save it somewhere convenient (your home directory is fine).
  2. Uncompress the file, creating a directory autoson146, change into that directory and make the executables from source (as is conventional, anything following the $ is what you type into the Unix-like command line).
$ tar zxf autoson146.tar.gz
$ cd autoson146
$ make

Step Two (Set Up)

  1. Add the autoson146 directory to your path so that you can run any of the commands aulook, auadd etc. just by typing their names into the command line.
  2. Create as many copies of the file aurun0 as you want to have simultaneous processes on the one machine, and name them aurun1, aurun2, aurun3 etc.
  3. Create an empty file in your home directory called autoson.queue.
  4. Use the command hostname to find out the name of your machine, and then create a file in your home directory called startaurun.cmds that contains a single line with the name of your machine followed by the numbers 0 1 2 3, etc. one number for each aurunX file that you created. (For example, verticordia.maths.uwa.edu.au 0 1 2 3)
  5. Run the command startaurun_singleuser – this should start each of the processes aurun0, aurun1, etc. If you have a problem here it is almost certainly your executable path (the list of directories where a Unix-like system searches for an executable when you type a command) that is incorrectly set.
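
If you’d rather not do steps 2 to 4 by hand, they can be sketched as a small shell function. The function name setup_autoson and its three arguments are my own invention, not part of autoson, and I’m assuming that aurun0 sits in the autoson146 directory after make and that the startaurun.cmds line has exactly the format described in step 4.

```shell
#!/bin/sh
# Hypothetical helper, not part of autoson.
# setup_autoson AUTOSON_DIR HOME_DIR NPROCS
#   AUTOSON_DIR  directory containing aurun0 (assumed: where make left it)
#   HOME_DIR     directory for autoson.queue and startaurun.cmds
#   NPROCS       number of simultaneous processes wanted
setup_autoson() {
    autoson_dir=$1
    home_dir=$2
    nprocs=$3

    # Step 2: aurun0 already exists; copy it to aurun1, aurun2, ...
    indices="0"
    i=1
    while [ "$i" -lt "$nprocs" ]; do
        cp "$autoson_dir/aurun0" "$autoson_dir/aurun$i" || return 1
        indices="$indices $i"
        i=$((i + 1))
    done

    # Step 3: create the queue file (touch leaves an existing one untouched).
    touch "$home_dir/autoson.queue"

    # Step 4: one line, machine name followed by one number per aurunX file.
    host=$(hostname 2>/dev/null || uname -n)
    echo "$host $indices" > "$home_dir/startaurun.cmds"
}
```

With autoson unpacked in your home directory, steps 1 to 5 for four processes would then look something like this (put the PATH line in your shell start-up file if you want it to persist):

$ export PATH="$PATH:$HOME/autoson146"
$ setup_autoson "$HOME/autoson146" "$HOME" 4
$ startaurun_singleuser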

Step Three (Test and Run)

Now we’re ready to check it out. Type the command aulook and the system should simply respond The queue is empty.

To add jobs to the queue, use the command auadd -noyawn followed by the name of the command, including any command-line arguments. (The default setting of yawn means that the job only runs when nobody is interactively using the machine, which is good if you’re using someone else’s machine, but mostly not needed when using your own.)

The real power, however, comes when a job is broken up into a large number of sub-jobs. For example, suppose you break up the input to some process into a few hundred parts called part000, part001, and so on, and you have a script called process that takes a single integer argument and processes the part identified by that argument. Then the command

$ auadd -noyawn -cyc 0 -lim 235 process \###

will queue up 236 jobs called process 000, process 001, and so on all the way up to process 235. In my case, four jobs at a time will run, each using one core, and the entire process will take (roughly) 1/4 of the elapsed time that it would otherwise. If any of the parts crashes for some reason, I’ll discover this the next time I use aulook. When my new 6-chip, 24-core, 48-thread machine arrives, I’ll scale up accordingly.
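
For anyone wondering what the pieces around that command might look like, here is a minimal sketch. Everything in it is hypothetical: split_input is one way of producing part000, part001, etc. from a line-oriented input file, and process_part is a stand-in for the body of the process script, where counting lines takes the place of the real computation. The partNNN.out and partNNN.done file names are my own convention for this example, not anything autoson requires.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither is part of autoson.

# split_input FILE LINES_PER_PART
# Write consecutive chunks of FILE to part000, part001, ... in the
# current directory.
split_input() {
    awk -v n="$2" '
        (NR - 1) % n == 0 {
            if (out != "") close(out)     # finished with the previous part
            out = sprintf("part%03d", int((NR - 1) / n))
        }
        { print > out }
    ' "$1"
}

# process_part NNN
# Stand-in for the "process" script: handle partNNN, write the result
# to partNNN.out and leave a partNNN.done marker on success.
process_part() {
    part="part$1"
    [ -f "$part" ] || { echo "missing $part" >&2; return 1; }
    # Placeholder work: count the lines. Replace with the real computation.
    wc -l < "$part" > "$part.out" && touch "$part.done"
}
```

In a real setup the body of process_part would live in an executable file called process somewhere on your path, ending with a line such as process_part "$1", so that each queued job process 000, process 001, etc. picks up its own part.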

Of course, this just scratches the surface of the benefits of distributing a large computation: there is immediate feedback on how long the entire thing is likely to take, there are partial results to examine as each sub-job completes, other tasks can be interleaved with the jobs on the queue, or take priority over them etc. If the system has to be rebooted when 170 out of the 236 jobs have completed, then you only need to re-run the sub-jobs that had not yet finished rather than starting from scratch. If your application is suited to it, distributed computation really gives you (almost) all the power of parallel computing with (almost) none of the costs of parallel programming. And fortunately, combinatorial computing is one application that is almost always very well suited to distribution.
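
As a rough sketch of that restart scenario, suppose (and this is purely an assumption for the example) that each sub-job leaves behind a marker file partNNN.done when it completes successfully. Then a few lines of shell can work out which sub-jobs still need to run and print the corresponding auadd commands. The function name print_requeue_commands is hypothetical; the only autoson command it mentions is the plain auadd -noyawn form described above.

```shell
#!/bin/sh
# Hypothetical helper, not part of autoson.
# print_requeue_commands LAST
# For every sub-job 000..LAST with no partNNN.done marker in the current
# directory, print the command that would put it back on the queue.
# Nothing is actually queued; the commands are only printed.
print_requeue_commands() {
    last=$1
    i=0
    while [ "$i" -le "$last" ]; do
        id=$(printf '%03d' "$i")
        [ -e "part$id.done" ] || echo "auadd -noyawn process $id"
        i=$((i + 1))
    done
}
```

Running print_requeue_commands 235 lists the unfinished sub-jobs so you can look them over first; piping the output into sh would then add them to the queue.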

I’ve also only barely scratched the surface of what autoson can do, particularly in a multi-machine, multi-user environment, but hopefully this will encourage a few more people to use one of Brendan’s less-well-known, but still excellent programs.
