Running many short tasks.
Background
The queueing setup on Stallo, or more precisely the accounting system, adds about 1 second of overhead at the start and again at the finish of each job. This overhead is insignificant for large parallel jobs, but it creates scaling issues when running a massive number of short jobs. One can consider a collection of independent tasks as one large parallel job, in which the aforementioned overhead becomes the serial, unparallelizable part. This is because the queueing system can only start and account for one job at a time. This scaling problem is described by Amdahl's law.
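To make the limit concrete, here is a rough back-of-the-envelope estimate written as a small bash function. It assumes the roughly 2 seconds of serialized overhead per job mentioned above; the task lengths, the core count, and the function name `max_speedup` are made up for illustration and are not part of any Stallo tooling.

```shell
#!/bin/bash
# Rough estimate only. Assumes ~2 s of serialized overhead per job
# (about 1 s at the start plus about 1 s at the finish).
# max_speedup is a hypothetical helper name used for this illustration.
max_speedup() {
    local task_seconds=$1 overhead_seconds=$2 workers=$3
    # Amdahl's law: S(P) = 1 / (s + (1 - s) / P),
    # where s is the serial fraction of the total work.
    awk -v t="$task_seconds" -v o="$overhead_seconds" -v p="$workers" 'BEGIN {
        s = o / (t + o)
        printf "%.1f\n", 1 / (s + (1 - s) / p)
    }'
}

max_speedup 60 2 16   # 60 s tasks on 16 cores: prints 10.8
max_speedup 2 2 16    # 2 s tasks on 16 cores:  prints 1.9
```

Under these assumptions the speedup can never exceed (task time + overhead) / overhead, so tasks of 2 seconds submitted as separate jobs gain at most a factor of 2, no matter how many cores are available.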
Without going into any more details, let's look at the solution.
Running tasks in parallel within one job.
With some shell trickery one can spawn and load-balance multiple independent tasks running in parallel within one node: background each task, then poll until one of the running tasks has finished before spawning the next:
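# Requires bash; $tasks and $maxpartasks must be set earlier in the script.
for t in $tasks; do
    ./dowork.sh "$t" &    # start the task in the background
    # Throttle: poll while the number of running tasks is at the limit.
    # "jobs -rp" lists only running jobs, one PID per line.
    while [ "$(jobs -rp | wc -l)" -ge "$maxpartasks" ]; do
        sleep 1
    done
done
wait    # block until the remaining tasks have finished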
Complete examples with descriptive comments can be found here: partasks.sh, dowork.sh.
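As an alternative to polling, some versions of `xargs` can do the same load balancing. The sketch below is not the method used in partasks.sh. It assumes an `xargs` that supports the `-P` option (GNU and BSD versions do, but POSIX does not require it). The task names, the limit of 4, and the `sh -c` stand-in for `./dowork.sh` are placeholders for illustration.

```shell
#!/bin/bash
# Alternative sketch using xargs -P instead of a polling loop.
tasks="a b c d e f g h"   # placeholder task names
maxpartasks=4             # placeholder limit on concurrent tasks

# Feed one task name per line to xargs, which keeps up to $maxpartasks
# processes running and starts the next one as soon as a slot frees up.
# The sh -c command is a stand-in; a real job would call ./dowork.sh here.
results=$(printf '%s\n' $tasks |
    xargs -n 1 -P "$maxpartasks" sh -c 'sleep 1; echo "finished $1"' _)
echo "$results"
```

The order of the output lines depends on which task finishes first, so it should not be relied upon.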

