Job Queues

by Bjørn Lindi last modified Nov 01, 2010 10:29 AM

Job scheduling is implemented with OpenPBS, Maui and Gold. OpenPBS is the batch system, Maui is the job scheduler and Gold does the accounting. The scheduling and accounting mechanisms are designed so that all parties can use their fair share of the resources, while underutilized resources remain available to all users.

 

Queue     Resources                 Who can use it?                              Wall clock limit  Priority
express   24 GB or 48 GB per node   All users                                    1 hour            High
default   24 GB per node            All users                                    35 days           Medium
bigmem    48 GB per node            Users from IPT, SFI IO and Sintef Petroleum  7 days            High
optimist  24 GB or 48 GB per node   All users                                    unlimited         Low
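
A queue is normally chosen with the -q option, either on the qsub command line or as a "#PBS -q" directive in the job script. The helper below is only a hypothetical sketch of how the wall clock limits in the table could guide that choice. The function name pick_queue, its decision rule and the script name job.sh are assumptions made for illustration, not part of the kongull setup. It considers only the queues open to all users and prints the submit command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: suggest a queue from the wall clock limits in the table.
# bigmem is left out because access to it is restricted.
# 35 days = 840 hours.
pick_queue() {
    local hours=$1
    if [ "$hours" -le 1 ]; then
        echo express
    elif [ "$hours" -le 840 ]; then
        echo default
    else
        echo optimist
    fi
}

# Example: a 12 hour job fits within the limit of the default queue.
queue=$(pick_queue 12)
echo "qsub -q $queue job.sh"
```

Jobs that need more than 35 days end up in optimist here, which means they must be able to restart after being killed.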

 

Note that a job in the optimist queue may be killed to make room for a job submitted to default or bigmem. This happens when no free resources are available for the incoming job. A job running in the optimist queue must therefore implement check-pointing, so that it can restart from the point where it was killed.
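
A minimal sketch of such a restart pattern, assuming the work can be split into numbered steps and that progress can be recorded in a small state file. The file name checkpoint.state, the step count and the run_step function are hypothetical placeholders. A real application would save and reload its own data in its own way.

```shell
#!/bin/bash
# Hypothetical checkpoint/restart loop for a job in the optimist queue.
CHECKPOINT=${CHECKPOINT:-checkpoint.state}   # assumed state file name
TOTAL_STEPS=${TOTAL_STEPS:-5}                # assumed number of work units

# Placeholder for one unit of real work.
run_step() {
    echo "running step $1"
}

run_job() {
    local last_done=0
    local step

    # Resume after the last completed step if a checkpoint exists.
    if [ -f "$CHECKPOINT" ]; then
        last_done=$(cat "$CHECKPOINT")
    fi

    step=$((last_done + 1))
    while [ "$step" -le "$TOTAL_STEPS" ]; do
        run_step "$step"
        # Write to a temporary file and rename it, so that a job killed in
        # the middle of the write does not leave a truncated checkpoint.
        echo "$step" > "$CHECKPOINT.tmp"
        mv "$CHECKPOINT.tmp" "$CHECKPOINT"
        step=$((step + 1))
    done
}

run_job
```

If the job is killed and resubmitted, run_job reads the state file and continues with the first step that was not completed. Remove the state file before starting a fresh run.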

 

#!/bin/bash
#  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#  April 27, 2010
#
#  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#   ___________________________________________________________________
#  |                                                                   | 
#  | Set initial information for the Queuing system                    | 
#  | ==============================================                    | 
#  |                                                                   | 
#  | All PBS directives (the lines starting with #PBS) below           |
#  | are optional and can be omitted, resulting in the use of          | 
#  | the system defaults.                                              | 
#  |                                                                   | 
#  | Please send comments and questions to support-kongull@hpc.ntnu.no | 
#  |                                                                   | 
#  |___________________________________________________________________|
#
#  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#PBS -lnodes=2:ppn=12
#
#    Number of nodes and processes per node (ppn) requested. Here we ask for a
#    total of 24 processes, since there are 2 CPUs on each node, each with 6
#    cores. This is only necessary for parallel jobs. The queuing system
#    creates a file, accessible through the environment variable
#    $PBS_NODEFILE in this script, that can be used by mpirun etc., see below.
#
#   
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#PBS -lwalltime=12:00:00
#
#    Expecting to run for up to 12 hours.
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#PBS -lpmem=2000MB
#
#    Expecting to use 2000 megabytes of memory per process. This is
#    different from what we use on SMPs, where we supply the total memory
#    for the whole job.  Remark: According to the queuing system, most of the
#    nodes have 16000MB of memory, which is slightly less than 16GB, so using
#    2000MB will let the queuing system pack 8 processes on one node,
#    while asking for 2GB will get you only 7 processes per node.
#
#    The pmem parameter is not enforced. It only helps the scheduler
#    to select nodes with enough free memory.
#    If you want a hard limit, use pvmem=XMB instead. Your application
#    will then be killed if it exceeds the limit.
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#(not used)
###PBS -lfile=10gb
#
#    Specifies the amount of disk space that needs to be available on a node.
#    Only necessary if you are using the local work area on the nodes: /local/work
#    REMARK: Do not activate this (by removing two #) if you do not use the local
#    work area.
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#PBS -m abe
#
#    The queuing system will send an email on job Abort, Begin and End.
#    Remember to create a .forward file in your home directory with
#    your actual e-mail address. Otherwise use the following option:
#PBS -M << email address >>
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

#   ______________________________________________________________
#  |                                                              | 
#  |                                                              |
#  |  Setting up and running your job on kongull                  |
#  |  ========================================                    |
#  |                                                              |
#  |  We are now ready to begin running commands.                 |
#  |                                                              |
#  |  This job script is run as a regular shell script on the     |
#  |  first node assigned to this job, hereafter called Mother    |
#  |  Superior.                                                   |
#  |______________________________________________________________|

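# Quote the variables so that paths containing spaces work, and stop
# if the submit directory cannot be entered.
cd "$PBS_O_WORKDIR" || exit 1

mpirun -hostfile "$PBS_NODEFILE" ./mpi_test.x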

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
#    This runs your MPI job by launching it with "mpirun". This works
#    when using OpenMPI. If you are using some other MPI distribution you may
#    have to use a different launcher.
#
#    Don't put the job commands in the background, e.g. by adding & at
#    the end.  Doing so will make the job escape the queuing system, 
#    create havoc with the scheduling and result in you getting angry
#    email from the sysadmins.
#
#    The job ends when the script exits.
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

# This batch script is copied from stallo.uit.no and modified for use on kongull.hpc.ntnu.no
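
The arithmetic behind the pmem remark and the node request in the script can be checked with shell integer arithmetic. This is only a sketch. It takes the 16000 MB figure from the comment in the script, and it assumes that the queuing system reads 2GB as 2048 MB and packs processes by simple integer division, which may not match the scheduler's exact accounting. The function name procs_per_node is hypothetical.

```shell
#!/bin/bash
# Processes that fit on one node for a given per-process memory request,
# assuming plain integer division of the node memory (both values in MB).
procs_per_node() {
    local node_mb=$1
    local pmem_mb=$2
    echo $(( node_mb / pmem_mb ))
}

procs_per_node 16000 2000   # pmem=2000MB
procs_per_node 16000 2048   # pmem=2GB, taken as 2048 MB

# Total number of processes requested by "-lnodes=2:ppn=12".
echo $(( 2 * 12 ))
```

Under these assumptions the three lines printed are 8, 7 and 24, which agrees with the figures quoted in the script comments.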