
Acquiring nodes from the PBS system

The cluster uses the Portable Batch System (PBS) to manage its resources effectively. To run a parallel program, the user first needs to request nodes from the PBS system. The master node is a shared resource and is always allocated to the user, while the compute nodes are allocated in exclusive mode. Currently there is a limit of 24 hours on how long the compute nodes can be held at a time.

To check the status of the nodes in the PBS system, use the qstat -n command; the standard pbsnodes command also gives a per-node view.
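
For example, from the master node (a quick sketch: qstat -n also appears in the session below, while pbsnodes -a is the standard PBS command for per-node status, so its exact output may look slightly different on this cluster):

[amit@onyx ~]$ qstat -n        # list jobs and the nodes each one currently holds
[amit@onyx ~]$ pbsnodes -a     # list every node along with its state (free, job-exclusive, down, ...)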

To request n nodes, use the command pbsget -n on the master node. Here is a sample session that requests four nodes.

[amit@onyx ~]$ pbsget -4

#####################################################################
Allocate cluster nodes via PBS for running interactive parallel jobs.
#####################################################################

Trying for 4 nodes

**********************************************************************
 Scheduling an interactive cluster session with PBS.
 Please end session by typing in exit.

 Use qstat -n to see nodes allocated by PBS.

 You may now run MPI, pvm, xpvm or pvmrun. They will automatically use
 only the nodes allocated by PBS.
 If you are using pvm or xpvm, then  please always halt the pvm
 system before exiting the PBS cluster session.
 If you are using MPI, then  please always halt the MPI daemons
 system before exiting the PBS cluster session using mpdallexit command.

 For running LAM MPI programs use the following command:
     mpirun -np <#copies> [options]  <program> [<prog args>]
 For running MPICH2 MPI programs use the following command:
     mpiexec -n <#copies> [options]  <program> [<prog args>]
 For running PVM programs use the following command:
 Usage: pvmrun -np <#copies> <executable> {<args>,...}
**********************************************************************

qsub: waiting for job 3613.onyx.boisestate.edu to start
qsub: job 3613.onyx.boisestate.edu ready

[amit@onyx PBS ~]:qstat -n

onyx.boisestate.edu:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
3613.onyx.boise amit     interact STDIN       17742   5  --    --  00:30 R    --
   node00/0+node20/0+node19/0+node18/0+node17/0
[amit@onyx PBS ~]:echo $PBS_NODEFILE
/var/spool/pbs/aux/3613.onyx.boisestate.edu
[amit@onyx PBS ~]:cat $PBS_NODEFILE
node00
node20
node19
node18
node17
[amit@onyx PBS ~]:exit
logout

qsub: job 3613.onyx.boisestate.edu completed
[amit@onyx ~]$

The command pbsget attempts to allocate the requested number of nodes from the PBS system. If it succeeds, it starts a new shell whose prompt includes PBS. Note that the environment variable PBS_NODEFILE contains the name of a file listing the nodes that PBS has allocated to the user. The user can now run either PVM or MPI parallel programs. When done, the user types exit to end the interactive PBS session.
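
For example, inside the PBS session the allocated node count can be taken directly from $PBS_NODEFILE when launching a program. This is only a sketch: ./hello is a placeholder program name, and it assumes that any MPI daemons needed by MPICH2 have already been started as noted in the banner above.

[amit@onyx PBS ~]:NODES=$(wc -l < $PBS_NODEFILE)   # number of nodes PBS allocated
[amit@onyx PBS ~]:mpiexec -n $NODES ./hello        # run one copy per allocated node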

If the required number of nodes is not available, then pbsget will wait. The user can cancel the request by typing Ctrl-C and try again later. Remember to use qstat -n to check the status of the nodes.
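
One quick way to see how many compute nodes are currently free before trying again is to count them in the pbsnodes listing (a sketch; it assumes the usual "state = free" wording of pbsnodes -a output, which may vary by PBS version):

[amit@onyx ~]$ pbsnodes -a | grep -c "state = free"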


Amit Jain 2010-09-02