Run Jobs

Login Nodes

Saguaro is a cluster computer composed of many separate compute nodes linked through high-speed InfiniBand and Ethernet interconnects.

Interactive access to Saguaro is through the internet-facing login nodes (saguaro.fulton.asu.edu). Use these nodes only for scheduling and launching compute jobs on other nodes through the queue. There are a limited number of login nodes, and they are shared by all users of the system, so please do not run compute-intensive tasks on them. To minimize disruptions to other users, our administrators will kill jobs running on the login nodes that consume too many resources for too long.

Compute-intensive jobs should instead be run through our queue, either interactively or in batch (scripted) mode. Typically, initial runs for compiling, testing, debugging, and tuning should use interactive mode, while longer production runs should use batch job scripts.

Interactive Jobs

To launch an interactive job with a single CPU, ssh into the login nodes and use
qsub -I
A parallel interactive job with N processors can be launched with qsub -I -l nodes=N. For example, for 32 CPUs, type:
qsub -I -l nodes=32
To launch an interactive job with X forwarding:
qsub -I -X
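The interactive commands above differ only in their resource requests, so a small wrapper can assemble them. The following is a minimal sketch, not a site-provided tool: the function name interactive_cmd is hypothetical, and the optional walltime argument assumes that the -l walltime=HH:MM:SS request used in batch scripts is also accepted for interactive jobs. It only prints the command so that you can check it before running it.

```shell
#!/bin/bash
# Hypothetical helper: print the qsub command for an interactive job.
# Usage: interactive_cmd [NPROCS] [WALLTIME]
#   NPROCS    number of processors (default 1)
#   WALLTIME  optional runtime limit in HH:MM:SS form
interactive_cmd() {
    local nprocs="${1:-1}"
    local walltime="${2:-}"
    local cmd="qsub -I"
    # Only add a nodes request for parallel jobs.
    if [ "$nprocs" -gt 1 ]; then
        cmd="$cmd -l nodes=$nprocs"
    fi
    # Only add a walltime request when one was given.
    if [ -n "$walltime" ]; then
        cmd="$cmd -l walltime=$walltime"
    fi
    echo "$cmd"
}
```

For example, interactive_cmd 32 prints the 32-CPU command shown above, which you can then paste at the login node prompt.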

Job Scripts

To launch a batch job, you must write a batch script. Saguaro uses Moab together with TORQUE/PBS (Terascale Open-source Resource and QUEue Manager / Portable Batch System) for scheduling and launching batch jobs.

A minimal job script looks something like the following:

myFirstJobscript.pbs
#!/bin/bash
#PBS -l nodes=8
#PBS -j oe
#PBS -o $PBS_JOBID.output
#PBS -l walltime=00:10:00

cd $PBS_O_WORKDIR
doSomething

The first directive, -l nodes=8, tells the scheduler that you want to use 8 CPUs. The second, -j oe, joins the output from STDOUT and STDERR into a single output file. The third, -o $PBS_JOBID.output, specifies the filename for the output file (something like 2045259.moab.local.output). The last, -l walltime=00:10:00, requests 0 hours, 10 minutes, and 0 seconds of runtime.

The remaining lines of the script contain the actual run script, which is executed on the compute node(s) as if you had logged in to them interactively and typed those lines by hand. cd $PBS_O_WORKDIR changes to the directory stored in the variable $PBS_O_WORKDIR, which is the directory you were in when you submitted the job. Replace doSomething with the command(s) required to run your program(s).
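If you try out a job script by running it directly in a shell that was not started by the scheduler, $PBS_O_WORKDIR is not set, and a bare cd $PBS_O_WORKDIR then moves you to your home directory. A more defensive variant is sketched below. The function names job_workdir and enter_job_workdir are hypothetical, and falling back to the current directory is an assumption about what you want during testing.

```shell
#!/bin/bash
# Print the directory the run script should work in: the submission
# directory when the scheduler set PBS_O_WORKDIR, otherwise the
# current directory.
job_workdir() {
    echo "${PBS_O_WORKDIR:-$PWD}"
}

# Change into that directory and report an error if that fails.
enter_job_workdir() {
    cd "$(job_workdir)" || {
        echo "cannot change to $(job_workdir)" >&2
        return 1
    }
}
```

In a job script, calling enter_job_workdir would take the place of the cd $PBS_O_WORKDIR line.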

If this were stored in a file named myFirstJobscript.pbs, you could submit the job by typing the following on a login node:
qsub myFirstJobscript.pbs
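If you submit many similar jobs, the minimal job script shown earlier can be generated instead of edited by hand. The sketch below is one way to do that under stated assumptions: the function name make_jobscript is hypothetical, the runtime is given in whole minutes and converted to the HH:MM:SS form that -l walltime expects, and the directives are exactly those of the minimal script.

```shell
#!/bin/bash
# Hypothetical helper: print a minimal job script to standard output.
# Usage: make_jobscript NPROCS MINUTES COMMAND...
make_jobscript() {
    local nprocs="$1"
    local minutes="$2"
    shift 2
    local walltime
    # Convert whole minutes to HH:MM:SS for the walltime request.
    walltime=$(printf '%02d:%02d:00' "$((minutes / 60))" "$((minutes % 60))")
    # The escaped \$ keeps the PBS variables literal in the generated
    # script so that they are expanded when the job runs, not now.
    cat <<EOF
#!/bin/bash
#PBS -l nodes=$nprocs
#PBS -j oe
#PBS -o \$PBS_JOBID.output
#PBS -l walltime=$walltime

cd \$PBS_O_WORKDIR
$*
EOF
}
```

For example, make_jobscript 8 10 doSomething > myFirstJobscript.pbs writes a script equivalent to the one above, which can then be submitted with qsub as usual.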
Once you have submitted your job, you can monitor its status in the queue by typing:
watch qstat
You can also get an estimate of when your job will start with
showstart jobid
where jobid is the unique identifier assigned to your job.
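Because qsub prints the identifier of the new job on standard output, the submit and estimate steps can be chained. The following is a sketch only: the function name submit_and_estimate is hypothetical, and it assumes that showstart accepts the identifier exactly as qsub prints it. If it does not on your system, pass the identifier by hand as described above.

```shell
#!/bin/bash
# Hypothetical helper: submit a job script, then ask the scheduler
# for an estimate of when the job will start.
# Usage: submit_and_estimate SCRIPT
submit_and_estimate() {
    local script="$1"
    local jobid
    # Capture the job identifier that qsub prints; stop if submission fails.
    jobid=$(qsub "$script") || return 1
    echo "submitted $jobid"
    showstart "$jobid"
}
```

Run it on a login node as submit_and_estimate myFirstJobscript.pbs.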