
Run jobs on the supercomputer

  1. Sign up for an account on the supercomputer here.
  2. Let's walk through an example of submitting a job to the supercomputer. Here is a Bash script that performs a trivial task we can run there. It takes two parameters: the first is the file to write its results to, and the second is the number of seconds to sleep. Of course, you will want to do something more interesting, but this task will suffice for the demo. I call this file "run.bash":
    #!/bin/bash
    set -u -e
    results=${1}
    duration=${2}
    
    # Quote the parameters so that paths containing spaces still work
    echo "I will now sleep for ${duration} seconds." >> "${results}"
    sleep "${duration}"
    echo "Good morning." >> "${results}"
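A quick local check of run.bash, before anything goes to the cluster, might look like the sketch below. It assumes nothing beyond Bash itself: it works in a scratch directory and writes its own stand-in copy of run.bash there, so the file name test_results.txt and the one-second duration are just illustrative choices.

```shell
#!/bin/bash
set -u -e

# Work in a scratch directory so the check leaves nothing behind in your project
scratch=$(mktemp -d)
cd "${scratch}"

# Stand-in for run.bash with the same interface: results file, then duration
cat > run.bash <<'EOF'
#!/bin/bash
set -u -e
results=${1}
duration=${2}
echo "I will now sleep for ${duration} seconds." >> "${results}"
sleep "${duration}"
echo "Good morning." >> "${results}"
EOF
chmod 755 run.bash

# Run it with a short duration and look at what it wrote
./run.bash test_results.txt 1
cat test_results.txt
```

If the results file contains both messages, the script behaves the same way it will inside a job.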
    
  3. Before you proceed, I recommend that you test your job on your own computer. It will be much easier to debug it on your own machine. When you are ready to submit to the supercomputer, here is a script that makes it easy to submit a job. I call this file "submit.bash":
    #!/bin/bash
    set -u -e
    
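    # Show usage
    if [ "$#" -lt 2 ]; then
    	echo "Usage: $0 JOB_ID APP_NAME [APP_ARGS...]" >&2
    	exit 1
    fi
    
    # Arguments, in the order launch.bash passes them:
    # the job ID, the program to run, then that program's arguments
    JOB_ID=${1}
    APP_NAME=${2}
    shift 2
    APP_ARGS=""
    for arg in "$@"; do
    	APP_ARGS="${APP_ARGS} ${arg}"
    done
    
    # Generate the PBS job script
    cat <<ENDTXT >${JOB_ID}.pbs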
    # Job script generated by $0
    #PBS -N ${JOB_ID}
    #PBS -q serial8core
    #PBS -j oe
    #PBS -o ${JOB_ID}.txt
    #PBS -l nodes=1
    #PBS -l walltime=12:00:00
    
    set -u -e
    echo -n "hostname: ";hostname
    cd \$PBS_O_WORKDIR
    echo "Job ID: \$PBS_JOBID"
    numprocs=\$(wc -l < \$PBS_NODEFILE)
    echo "Number of processors: \$numprocs"
    grep MHz /proc/cpuinfo
    date
    echo "here goes..."
    ./${APP_NAME##*/}${APP_ARGS}
    date
    echo "done"
    ENDTXT
    
    # Submit the job
    chmod 755 "${JOB_ID}.pbs"
    qsub "${JOB_ID}.pbs"
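The "#PBS" lines in the generated job script are scheduler directives, and they are the part you are most likely to change. As a hedged illustration, here is a header that asks for eight processors on one node and a two-hour limit. The job name "myjob" is hypothetical, and "ppn=8" assumes a Torque-style PBS scheduler and nodes with at least eight cores; check the cluster's own documentation for the queues and limits that actually apply.

```shell
# Job name shown in the queue (hypothetical name)
#PBS -N myjob
# Queue to submit to; queue names are site-specific
#PBS -q serial8core
# Merge the error stream into the output stream
#PBS -j oe
# File that receives the job's output
#PBS -o myjob.txt
# One node with eight processors on it
#PBS -l nodes=1:ppn=8
# Wall-clock limit of two hours (HH:MM:SS)
#PBS -l walltime=02:00:00
```

A job that exceeds its walltime is stopped by the scheduler, so leave some margin over the run time you measured locally.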
    
  4. Presumably, you have a lot of jobs to run, so many that it would be very painful to submit them all by hand. So we also need a script that submits them all. I call this "launch.bash":
    #!/bin/bash
    set -u -e
    
    for someparam in 3 5 7
    #for filename in data/*.arff
    #for (( i=0; i < 5; i++ ))
    do
    	jobname=${someparam}
    	results=${jobname}_results.txt
    
    	# Submit each job if the results file does not already exist
    	if [ ! -f ${results} ]; then
    		echo "Submitting ${jobname}..."
    
    		# Run the jobs on your local machine
    		#./run.bash ${results} ${someparam}
    
    		# Just print the submit commands
    		echo ./submit.bash ${jobname} ./run.bash\
    			${results} ${someparam}
    
    		# Actually submit the jobs to the supercomputer
    		# (kept on one line: a trailing backslash does not continue a comment)
    		#./submit.bash ${jobname} ./run.bash ${results} ${someparam}
    	fi
    done
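The commented-out "for filename in data/*.arff" variant, combined with the skip-if-results-exist check, can be sketched as follows. This is only an illustration under stated assumptions: the dataset names iris.arff and wine.arff are made up, the helper print_pending is hypothetical, and passing the dataset path as the second argument presumes a run.bash that accepts a file there instead of a sleep duration.

```shell
#!/bin/bash
set -u -e

# Scratch directory with two made-up datasets; one already has results
scratch=$(mktemp -d)
cd "${scratch}"
mkdir data
touch data/iris.arff data/wine.arff
touch iris_results.txt

# print_pending FILE... : print a submit command for each dataset without results
print_pending() {
	local filename jobname results
	for filename in "$@"; do
		# Derive the job name from the file name, e.g. data/wine.arff -> wine
		jobname=$(basename "${filename}" .arff)
		results="${jobname}_results.txt"
		if [ ! -f "${results}" ]; then
			echo "./submit.bash ${jobname} ./run.bash ${results} ${filename}"
		fi
	done
}

print_pending data/*.arff
```

Only the wine dataset produces a submit command here, because iris_results.txt already exists. The same check is what lets you rerun launch.bash after a partial failure without resubmitting finished jobs.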
    
  5. So, touch up run.bash and launch.bash to do what you want them to do.
  6. Also, take a look at the lines in submit.bash that begin with "#PBS". These lines specify which queue your jobs will be submitted to, how many processors they require to run, how much time they will be allowed, and so forth. You will definitely want to customize these values before you launch a big job.
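  7. Now, scp your scripts, datasets, and code over to razor.uark.edu. Then ssh into razor.uark.edu and run launch.bash. (As written, launch.bash only prints the submit commands, so first uncomment the line that actually submits the jobs.)
  8. Use the "showq" command to see your jobs in the queue. Use the "-u [username]" flag to see only your jobs.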
  9. Update these instructions. (They will only continue to be helpful if we maintain them.)
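For steps 7 and 8, the copy, launch, and monitor commands can be reviewed before you run them with a small dry-run sketch like the one below, in the same spirit as launch.bash printing its submit commands. The user name "yourname" and the remote directory "myproject" are hypothetical placeholders, the sketch assumes that directory already exists on the cluster, and deploy_commands is an illustrative helper rather than part of the lab's scripts.

```shell
#!/bin/bash
set -u -e

# Hypothetical values; replace them with your own account and project directory
user="yourname"
host="razor.uark.edu"
remote_dir="myproject"

# Print the commands instead of running them, so they can be checked first
deploy_commands() {
	echo "scp run.bash submit.bash launch.bash ${user}@${host}:${remote_dir}/"
	echo "ssh ${user}@${host} 'cd ${remote_dir} && ./launch.bash'"
	echo "ssh ${user}@${host} 'showq -u ${user}'"
}

deploy_commands
```

Once the printed commands look right, paste them into your terminal one at a time. If showq is not found over a non-interactive ssh session, log in normally and run it from the login shell.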