To run a parallel program, the user needs to request nodes from the PBS system. The master node is a shared resource and is always allocated to the user. The compute nodes are allocated in exclusive mode. In interactive mode, compute nodes can currently be held for at most one hour at a time.
To check the status of the nodes in the PBS system, use qstat -n (or cnodes).
To request n nodes, run the command pbsget on the master node, passing the node count as an option (pbsget -8 requests eight nodes). Here is a sample session.
    [amit@onyx ~]$ pbsget -8
    #####################################################################
    Allocate cluster nodes via PBS for running interactive parallel jobs.
    #####################################################################
    Trying for 8 nodes
    *****************************************************
    Scheduling an interactive cluster session with PBS.
    Please end session by typing in exit.
    Use qstat -n to see nodes allocated by PBS.
    You may now run your mpi programs. They will automatically use only the nodes allocated by PBS.
    For running MPI programs use the following commands:
      mpiexec [options] <program> [<prog args>]
    *****************************************************
    qsub: waiting for job 2608.onyx.boisestate.edu to start
    qsub: job 2608.onyx.boisestate.edu ready
    [amit@node14 PBS ~]:qstat -n
    onyx.boisestate.edu:
                                                                   Req'd     Elap
    Job ID                  Username    Queue    Jobname  SessID NDS TSK    Memory Time     S Time
    ----------------------- ----------- -------- ---------------- ------ --------- - ---------
    2608.onyx.boisestate.e  amit        interact STDIN    0      8   8:node --     00:30:00 R 00:00:04
       node14/0+node21/0+node20/0+node19/0+node18/0+node17/0+node16/0+node15/0
    [amit@node14 PBS ~]:cat $PBS_NODEFILE
    node14
    node21
    node20
    node19
    node18
    node17
    node16
    node15
    [amit@node14 PBS ~]:exit
    logout
    qsub: job 2608.onyx.boisestate.edu completed
    [amit@onyx ~]$
The command pbsget attempts to allocate the requested number of nodes from the PBS system. If it succeeds, it starts a new shell whose prompt is modified to include PBS. Note that the environment variable PBS_NODEFILE contains the name of a file that lists the nodes allocated by PBS to the user. The user can now run MPI parallel programs. When done, the user types exit to end the interactive PBS session.
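Inside the PBS shell, the node file can also be used to size an MPI run. The following is a minimal sketch, not part of the cluster's tooling: count_nodes is a helper name made up for illustration, ./my_mpi_prog is a placeholder for your own MPI executable, and the script assumes the installed mpiexec accepts the standard -n option for the process count. Since the session banner says MPI programs automatically use only the allocated nodes, passing -n explicitly may not be required.

```shell
#!/bin/bash
# Sketch: size an MPI run from the PBS node file.
# count_nodes is a hypothetical helper, not a PBS command.

# Print the number of distinct hosts listed in the given node file.
# sort -u collapses repeated entries in case a host is listed more than once.
count_nodes() {
    sort -u "$1" | grep -c .
}

# Only attempt a launch inside an interactive PBS session, where
# PBS_NODEFILE is set and points at a readable file.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nodes=$(count_nodes "$PBS_NODEFILE")
    echo "PBS allocated $nodes node(s)"
    # ./my_mpi_prog is a placeholder; replace it with your own program.
    if [ -x ./my_mpi_prog ]; then
        mpiexec -n "$nodes" ./my_mpi_prog
    fi
else
    echo "Not inside a PBS session (PBS_NODEFILE is unset)."
fi
```

Run on the master node outside a PBS session, the script only prints the "Not inside a PBS session" message and launches nothing.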
If the required number of nodes is not available, pbsget will wait. The user can cancel the request by typing Ctrl-c and try again later. Remember to use qstat -n (or cnodes) to check the status of the nodes.
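When reading qstat -n output, the allocated nodes appear on one line joined by + signs, as in the sample session (node14/0+node21/0+...). The sketch below splits such a line into one host name per line. The function name hosts_from_exec_line is made up for illustration, and the line format is assumed to match the sample session.

```shell
#!/bin/bash
# Sketch: turn the allocated-node line from `qstat -n` into one host per line.
# hosts_from_exec_line is a hypothetical helper, not a PBS command.
hosts_from_exec_line() {
    # Split on '+', drop the "/<index>" suffix, and remove duplicate hosts.
    tr '+' '\n' | sed 's|/.*$||' | sort -u
}

# Example input copied from the sample session (shortened).
echo "node14/0+node21/0+node20/0" | hosts_from_exec_line
```

Because sort -u is used, the hosts are printed in sorted order rather than in the order PBS listed them.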