Spothineni (talk | contribs) No edit summary |
XDS can be run in cluster mode using any command-line job scheduling software such as Grid Engine, Condor, Torque/PBS, LSF or SLURM. We implemented Grid Engine. It is a distributed resource management system that monitors the CPU and memory usage of the available computing resources and schedules each job to the least-used computer. Grid Engine was chosen for its high scalability, cost effectiveness, ease of maintenance and high throughput. Grid Engine was developed by Sun Microsystems (Sun Grid Engine, SGE), later acquired by Oracle and subsequently by Univa. The latest versions became closed source, but the older ones are open source and are supplied with many Linux distributions, including Red Hat/CentOS 6.x. There are also the open-source Open Grid Scheduler [[http://gridscheduler.sourceforge.net/]] and Son of Grid Engine [[https://arc.liv.ac.uk/trac/SGE]].
Grid Engine consists of a master node daemon, sge_qmaster (started by the sgemaster script), which schedules jobs to execution nodes. On each execution node, a daemon named sge_execd runs the job and reports its completion back to the master. Jobs are submitted to the master using commands such as qsub, or through the DRMAA C, Java or IDL bindings from any application that needs to run XDS.
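As a minimal sketch of command-line submission, the helper below wraps qsub with the options used by the example scripts on this page: -sync y (qsub waits until the job has finished), -V (export the current environment), -cwd (run in the current directory) and -l h_rt (hard run-time limit). The function name submit, the variables DRY_RUN and H_RT, and the script name job.sh are hypothetical and only serve the illustration; with DRY_RUN=1 the command line is printed instead of being executed.

```shell
#!/bin/bash
# Hypothetical helper: submit one job script to Grid Engine and wait for it.
# DRY_RUN=1 (the default in this sketch) only prints the qsub command line.
# H_RT is an assumed variable holding the run-time limit (default 20 minutes).
submit() {
    local cmd=(qsub -sync y -V -cwd -l "h_rt=${H_RT:-0:20:00}" "$@")
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        echo "${cmd[*]}"      # show what would be submitted
    else
        "${cmd[@]}"           # real submission; needs a Grid Engine submit host
    fi
}

# job.sh is a placeholder for a job script; 1 is its first argument
submit job.sh 1
```

Setting DRY_RUN=0 on a submit host would run the same command for real; several such calls can be started in the background and collected with wait.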
'''XDS Cluster setup'''
To set up XDS in cluster mode, the forkcolspot and forkintegrate scripts need to be changed so that they access the Grid Engine environment and send jobs to different machines. Example scripts are given below; they need to be adapted to the local environment.
<code>
#!/bin/bash
#forkcolspot
ntask=$1   #total number of jobs
maxcpu=$2  #maximum number of processors used by each job
           #maxcpu=1: use 'mcolspot' (single processor)
           #maxcpu>1: use 'mcolspot_par' (openmp version)
pids=""    #list of background process ID's
itask=1
echo "MAX CPU $maxcpu"
#Sudhir check for gridengine submit host
#(qconf -sh lists the submit hosts; a name must match the output of hostname)
submitnodes=`qconf -sh 2> /dev/null`
thishost=`hostname`
isgrid=0
for node in $submitnodes ; do
  if [ "$node" = "$thishost" ]
  then
    isgrid=1
    echo "Grid Engine environment detected"
  fi
done
while test $itask -le $ntask
do
  #original XDS lines, replaced by the grid-aware branch below:
  # then echo "$itask" | mcolspot_par &
  # else echo "$itask" | mcolspot &
  if [ $maxcpu -gt 1 ]
  then
    if [ $isgrid -eq 1 ]
    then
      #-sync y makes qsub block until the job ends, so 'wait' below works
      qsub -sync y -V -l h_rt=0:20:00 -cwd \
        forkcolspot_job \
        $itask &
    #else echo "$itask" | qrsh -V -cwd "mcolspot" &
    else echo "$itask" | mcolspot_par &
    fi
  else echo "$itask" | mcolspot &
  fi
  pids="$pids $!" #append id of the background process just started
  itask=`expr $itask + 1`
done
trap "kill -15 $pids" 2 15 # 2:Control-C; 15:kill
wait #wait for all background processes issued by this shell
rm -f mcolspot.tmp #this temporary file was generated by xds
rm -rf fork*job* #remove job output files such as forkcolspot_job.o*
</code>
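The submit-host check in the script compares the names printed by qconf -sh with the output of hostname literally, so it fails silently when one side is a fully qualified name and the other a short name. A minimal sketch of a more tolerant check follows; the function name is_submit_host is hypothetical, and matching on the short name (the text before the first dot) is an assumption that only holds when short host names are unique on the cluster.

```shell
#!/bin/bash
# is_submit_host HOSTLIST NAME   (hypothetical helper)
# Succeeds if NAME matches an entry of the whitespace-separated HOSTLIST,
# comparing full names first and then short names (text before the first dot).
is_submit_host() {
    local name=$2 node
    for node in $1; do
        if [ "$node" = "$name" ] || [ "${node%%.*}" = "${name%%.*}" ]; then
            return 0
        fi
    done
    return 1
}

# If qconf is not installed, the host list is empty and the check fails.
if is_submit_host "$(qconf -sh 2>/dev/null)" "$(hostname)"; then
    echo "Grid Engine environment detected"
fi
```

In forkcolspot or forkintegrate the result could be used to set isgrid=1 in place of the for loop.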
----
<code>
#!/bin/csh
#forkcolspot_job
#job script submitted by forkcolspot; $1 is the task number
echo $1
set itask=$1
echo $itask | mcolspot_par
</code>
----
<code>
#!/bin/bash
#forkintegrate
fframe=$1  #id number of the first image
ni=$2      #number of images in the data set
ntask=$3   #total number of jobs
niba0=$4   #minimum number of images in a batch
maxcpu=$5  #maximum number of processors used by each job
           #maxcpu=1: use 'mintegrate' (single processor)
           #maxcpu>1: use 'mintegrate_par' (openmp version)
minitask=$(($ni / $ntask)) #minimum number of images in a job
mtask=$(($ni % $ntask))    #number of jobs with minitask+1 images
pids=""    #list of background process ID's
nba=0
litask=0
itask=1
#Sudhir check for gridengine submit host
#(qconf -sh lists the submit hosts; a name must match the output of hostname)
submitnodes=`qconf -sh 2> /dev/null`
thishost=`hostname`
isgrid=0
for node in $submitnodes ; do
  if [ "$node" = "$thishost" ]
  then
    isgrid=1
    echo "Grid Engine environment detected"
  fi
done
while test $itask -le $ntask
do
  #the first mtask jobs get one image more than the others
  if [ $itask -gt $mtask ]
  then nitask=$minitask
  else nitask=$(($minitask + 1))
  fi
  fitask=`expr $litask + 1`       #first image of this job (1-based)
  litask=`expr $litask + $nitask` #last image of this job (1-based)
  if [ $nitask -lt $niba0 ]
  then n=$nitask
  else n=$niba0
  fi
  if [ $n -lt 1 ]
  then n=1
  fi
  nbatask=$(($nitask / $n))       #number of batches in this job
  nba=`expr $nba + $nbatask`
  image1=$(($fframe + $fitask - 1)) #id number of the first image
  if [ $maxcpu -gt 1 ]
  then
    if [ $isgrid -eq 1 ]
    then
      #-sync y makes qsub block until the job ends, so 'wait' below works
      qsub -sync y -V -l h_rt=0:20:00 -cwd \
        forkintegrate_job \
        $image1 $nitask $itask $nbatask &
    #else echo "$image1 $nitask $itask $nbatask" | qrsh -V -cwd "mintegrate" &
    else echo "$image1 $nitask $itask $nbatask" | mintegrate_par &
    fi
  else echo "$image1 $nitask $itask $nbatask" | mintegrate &
  fi
  pids="$pids $!" #append id of the background process just started
  itask=`expr $itask + 1`
done
trap "kill -15 $pids" 2 15 # 2:Control-C; 15:kill
wait #wait for all background processes issued by this shell
rm -f mintegrate.tmp #this temporary file was generated by mintegrate
rm -rf fork*job* #remove job output files such as forkintegrate_job.o*
</code>
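To make the image bookkeeping in forkintegrate easier to follow, the sketch below isolates the arithmetic that divides ni images among ntask jobs and prints the four values each job receives (first image, number of images, job number, number of batches). The function name split_images is hypothetical; the formulas are the ones used in the script above.

```shell
#!/bin/bash
# split_images FFRAME NI NTASK NIBA0   (hypothetical helper)
# Prints one line per job: image1 nitask itask nbatask
split_images() {
    local fframe=$1 ni=$2 ntask=$3 niba0=$4
    local minitask=$((ni / ntask))   # minimum number of images in a job
    local mtask=$((ni % ntask))      # number of jobs with minitask+1 images
    local litask=0 itask nitask fitask n
    for ((itask = 1; itask <= ntask; itask++)); do
        if [ "$itask" -gt "$mtask" ]; then
            nitask=$minitask
        else
            nitask=$((minitask + 1))
        fi
        fitask=$((litask + 1))       # first image of this job (1-based)
        litask=$((litask + nitask))  # last image of this job (1-based)
        n=$((nitask < niba0 ? nitask : niba0))
        if [ "$n" -lt 1 ]; then n=1; fi
        echo "$((fframe + fitask - 1)) $nitask $itask $((nitask / n))"
    done
}

# 10 images starting at frame 1, 3 jobs, at least 2 images per batch
split_images 1 10 3 2
# prints:
# 1 4 1 2
# 5 3 2 1
# 8 3 3 1
```

The remainder of the division is spread over the first jobs, so job sizes differ by at most one image.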
----
<code>
#!/bin/csh
#forkintegrate_job
#job script submitted by forkintegrate; uses csh syntax ('set var=value')
set image1=$1
set nitask=$2
set itask=$3
set nbatask=$4
set host=`uname -a | awk '{print $2}'`
#record which host processed which images
echo $image1 $nitask $itask $nbatask $host >> jobs.log
echo $image1 $nitask $itask $nbatask | mintegrate_par
</code>
'''Grid Engine Installation'''