Cluster Installation
The following does not refer to the CLUSTER_NODES= setup (http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#CLUSTER_NODES=). The latter does not require a queueing system!
XDS can be run on a cluster using any batch job scheduling software such as Grid Engine, Condor, Torque/PBS, LSF, SLURM, etc. These are distributed resource management systems that monitor the CPU and memory usage of the available computing resources and schedule jobs to the least loaded computers.
Setup of XDS for a batch queue system
To set up XDS for a queueing system, the forkxds script needs to be changed to use qsub instead of ssh. Example scripts used with Univa Grid Engine (UGE) at Diamond (from https://github.com/DiamondLightSource/fast_dp/tree/master/etc/uge_array - thanks to Graeme Winter!) are below; they may need to be adapted to the specific environment and queueing system.
#!/bin/bash
# forkxds Version DLS-2017/08
#
# enables multi-tasking by splitting the COLSPOT and INTEGRATE
# steps of xds into independent jobs. Each job is carried out by
# a Fortran main program (mcolspot, mcolspot_par, mintegrate, or
# mintegrate_par). The jobs are distributed among the processor
# nodes of the NFS cluster network.
#
# 'forkxds' is called by xds or xds_par by the Fortran instruction
#    CALL SYSTEM('forkxds ntask maxcpu main rhosts'),
#    ntask  ::total number of independent jobs (tasks)
#    maxcpu ::maximum number of processors used by each job
#    main   ::name of the main program to be executed; could be
#             mcolspot | mcolspot_par | mintegrate | mintegrate_par
#    rhosts ::names of CPU cluster nodes in the NFS network
#
# DLS UGE port of script to operate nicely with cluster
# scheduling system - will work with any XDS usage but is
# aimed for fast_dp, see fast_dp#3. Options passed through environment:
#
# FORKXDS_PRIORITY - priority within queue, e.g. 1024
# FORKXDS_PROJECT  - UGE project to assign for this
# FORKXDS_QUEUE    - queue to submit to

ntask=$1   # total number of jobs
maxcpu=$2  # maximum number of processors used by each job
main=$3    # name of the main program to be executed

# write one line per task; forkxds_job selects its line by task ID
rm -f forkxds.params
itask=1
while test $itask -le $ntask
do
  echo $main >> forkxds.params
  itask=`expr $itask + 1`
done

# save environment for the job script
echo "PATH=$PATH" > forkxds.env
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> forkxds.env

# check environment for queue, project and priority information
# (options accumulate in qsub_opt)
qsub_opt=""
if [[ -n "$FORKXDS_PRIORITY" ]] ; then
  qsub_opt="$qsub_opt -p $FORKXDS_PRIORITY"
fi
if [[ -n "$FORKXDS_PROJECT" ]] ; then
  qsub_opt="$qsub_opt -P $FORKXDS_PROJECT"
fi
if [[ -n "$FORKXDS_QUEUE" ]] ; then
  qsub_opt="$qsub_opt -q $FORKXDS_QUEUE"
fi

# submit an array job with ntask tasks and wait for it to finish
qsub $qsub_opt -sync y -V -cwd -pe smp $maxcpu -t 1-$ntask `which forkxds_job`
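The three FORKXDS_* variables make the submission tunable without editing the script: since forkxds inherits the environment of xds_par, they can simply be exported beforehand. For example (the queue and project names below are site-specific placeholders):

export FORKXDS_QUEUE=medium.q    # placeholder queue name
export FORKXDS_PROJECT=mx        # placeholder UGE project
export FORKXDS_PRIORITY=1024
xds_par

The matching job script forkxds_job, which each array task runs, is: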
#!/bin/bash
# forkxds_job
#
# executed as one task of the array job submitted by forkxds;
# UGE sets SGE_TASK_ID to the index of this task.
params=$(awk "NR==$SGE_TASK_ID" forkxds.params)
JOB=`echo $params | awk '{print $1}'`
# load the environment saved by forkxds
. forkxds.env
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export PATH=$PATH
# the XDS main programs read their task number from stdin
echo $SGE_TASK_ID | $JOB
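The same pattern carries over to the other schedulers listed above; only the submission command and the task-index variable change. Below is a minimal, untested sketch for SLURM, assuming a job script named forkxds_job_slurm (the script name and the choice of sbatch options are assumptions to adapt locally, e.g. a partition may need to be added):

#!/bin/bash
# forkxds (hypothetical SLURM variant of the UGE script above)
ntask=$1   # total number of independent jobs
maxcpu=$2  # maximum number of processors used by each job
main=$3    # mcolspot | mcolspot_par | mintegrate | mintegrate_par

rm -f forkxds.params
for itask in $(seq 1 $ntask); do
  echo $main >> forkxds.params
done
echo "PATH=$PATH" > forkxds.env
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> forkxds.env

# --wait blocks until the whole array has finished (like "qsub -sync y");
# --array replaces "-t", --cpus-per-task replaces "-pe smp".
sbatch --wait --export=ALL --chdir="$PWD" --cpus-per-task=$maxcpu \
       --array=1-$ntask $(which forkxds_job_slurm)

#!/bin/bash
# forkxds_job_slurm (hypothetical); SLURM sets SLURM_ARRAY_TASK_ID
JOB=$(awk "NR==$SLURM_ARRAY_TASK_ID" forkxds.params)
. forkxds.env
export PATH LD_LIBRARY_PATH
echo $SLURM_ARRAY_TASK_ID | $JOB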
Performance
Cluster nodes may have different numbers of processors. Please note that the output line
number of OpenMP threads used NN
in COLSPOT.LP and INTEGRATE.LP may be incorrect if MAXIMUM_NUMBER_OF_JOBS > 1 and the submitting node (the node that runs xds_par) has a different number of processors than the processing nodes (the nodes that run mcolspot_par and mintegrate_par). The actual number of threads used on the processing nodes can be obtained with
grep PARALLEL COLSPOT.LP
grep USING INTEGRATE.LP | uniq
The algorithm that determines the number of threads used on a processing node is:
NB = DELPHI / OSCILLATION_RANGE   # may be slightly adjusted by XDS if DATA_RANGE / NB is not an integer
NCORE = number of processors in the processing node, obtained by OMP_GET_NUM_PROCS()
if MAXIMUM_NUMBER_OF_PROCESSORS is not specified in XDS.INP then MAXIMUM_NUMBER_OF_PROCESSORS = NCORE
number_of_threads = MIN( NB, NCORE, MAXIMUM_NUMBER_OF_PROCESSORS, 99 )
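As a worked example: with DELPHI=5 and OSCILLATION_RANGE=0.1, NB = 50; on a 28-core processing node with MAXIMUM_NUMBER_OF_PROCESSORS not specified, number_of_threads = MIN(50, 28, 28, 99) = 28. Setting MAXIMUM_NUMBER_OF_PROCESSORS=8 in XDS.INP would reduce this to MIN(50, 28, 8, 99) = 8.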
The algorithm is implemented from BUILT=20191015 onwards.