Running many short tasks.

Recommendations on how to run many short tasks on the system. The overhead of job startup and cleanup makes it impractical to run thousands of short tasks as individual jobs on Stallo.

Background

The queueing setup on Stallo, or rather the accounting system, adds an overhead of about 1 second at the start and at the finish of each job. This overhead is insignificant for large parallel jobs, but it creates scaling problems when running a very large number of short jobs. One can view a collection of independent tasks as one large parallel job, in which the aforementioned overhead becomes the serial, unparallelizable part, because the queueing system can only start and account one job at a time. This scaling problem is described by Amdahl's law.
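To get a feeling for the numbers, here is a rough back-of-the-envelope estimate in shell arithmetic. It assumes the overhead is strictly serialised across jobs, as described above; the task count and the per-task runtime are hypothetical values chosen only for illustration.

```shell
#!/bin/bash
# Rough estimate of the serial overhead when every task is its own job.
# ntasks and task_runtime are made-up illustration values, not Stallo measurements.
ntasks=10000          # hypothetical number of short tasks
task_runtime=10       # hypothetical runtime of one task, in seconds
overhead_per_job=2    # about 1 s at job start plus about 1 s at job finish

# Total time spent in start/finish overhead, which cannot be parallelised.
serial_overhead=$(( ntasks * overhead_per_job ))
echo "serial overhead: ${serial_overhead} s (about $(( serial_overhead / 60 )) min)"

# Amdahl-style upper bound on the speedup, no matter how many cores are used:
# (overhead + runtime) / overhead.
max_speedup=$(( (overhead_per_job + task_runtime) / overhead_per_job ))
echo "speedup can never exceed: ${max_speedup}x"
```

With these made-up numbers the overhead alone adds up to 20000 seconds, and the speedup is capped at 6 however many cores are available. Packing the same tasks into one job pays the overhead only once.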

Without going into any more details, let's look at the solution.

Running tasks in parallel within one job.

With a little shell scripting, one can spawn and load-balance multiple independent tasks running in parallel within one node: start each task in the background, then poll until a running task has finished before spawning the next one:

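# Assumes $tasks (list of task arguments) and $maxpartasks (max concurrent tasks) are set.
for t in $tasks; do
  ./dowork.sh "$t" &                      # start the task in the background
  activetasks=$(jobs -r | wc -l)          # count running background tasks only
  while [ "$activetasks" -ge "$maxpartasks" ]; do
    sleep 1                               # all slots busy: poll once per second
    activetasks=$(jobs -r | wc -l)
  done
done
wait                                      # wait for the remaining tasks to finish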
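For trying the pattern out without the real scripts, the following is a minimal self-contained sketch of the same background-and-poll technique. The dowork function is a hypothetical stand-in for ./dowork.sh, and run_tasks, logfile and the task list 1..10 are likewise made up for this illustration. It assumes bash and a sleep that accepts fractional seconds.

```shell
#!/bin/bash
# Demo of the background-and-poll pattern with a stand-in workload.
# dowork is a hypothetical placeholder for the real ./dowork.sh.
dowork() {
  sleep 0.2                            # pretend to compute something
  echo "task $1 done" >> "$logfile"    # record completion
}

# run_tasks MAXPAR TASK... : run all tasks, at most MAXPAR at a time.
run_tasks() {
  local maxpartasks=$1; shift
  local t
  for t in "$@"; do
    dowork "$t" &                      # spawn the task in the background
    # Poll while all slots are occupied by running tasks.
    while [ "$(jobs -r | wc -l)" -ge "$maxpartasks" ]; do
      sleep 0.1
    done
  done
  wait                                 # let the last tasks finish
}

logfile=$(mktemp)
run_tasks 4 1 2 3 4 5 6 7 8 9 10
finished=$(( $(wc -l < "$logfile") ))  # number of completed tasks
echo "finished $finished tasks"
rm -f "$logfile"
```

The polling interval is a trade-off: a shorter sleep fills free slots sooner but wakes the shell more often, so it should be small compared to the typical task runtime.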

Complete examples with descriptive comments can be found here: partasks.sh, dowork.sh.

by Roy Dragseth last modified Aug 26, 2010 11:59 AM Notur