The following does ''not'' refer to the [http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#CLUSTER_NODES= CLUSTER_NODES=] setup. The latter does ''not'' require a queueing system!
XDS can be run on a cluster using any batch job scheduling software, such as Grid Engine, Condor, Torque/PBS, LSF or SLURM. These are distributed resource management systems that monitor the CPU and memory usage of the available computing resources and schedule jobs to the least-loaded computers.
== Setup of XDS for a batch queue system ==
In order to set up XDS for a queueing system, the ''forkxds'' script needs to be changed to use qsub instead of ssh. Example scripts used for Univa Grid Engine (UGE) at Diamond (from https://github.com/DiamondLightSource/fast_dp/tree/master/etc/uge_array - thanks to Graeme Winter!) are below; they may need to be adapted for the specific environment and queueing system (a hypothetical SLURM variant is sketched after the two scripts).
<pre>
#!/bin/bash
# forkxds Version DLS-2017/08
#
# enables multi-tasking by splitting the COLSPOT and INTEGRATE
# steps of xds into independent jobs. Each job is carried out by
# a Fortran main program (mcolspot, mcolspot_par, mintegrate, or
# mintegrate_par). The jobs are distributed among the processor
# nodes of the NFS cluster network.
#
# 'forkxds' is called by xds or xds_par by the Fortran instruction
# CALL SYSTEM('forkxds ntask maxcpu main rhosts'),
#    ntask  ::total number of independent jobs (tasks)
#    maxcpu ::maximum number of processors used by each job
#    main   ::name of the main program to be executed; could be
#             mcolspot | mcolspot_par | mintegrate | mintegrate_par
#    rhosts ::names of CPU cluster nodes in the NFS network
#
# DLS UGE port of script to operate nicely with cluster
# scheduling system - will work with any XDS usage but is
# aimed for fast_dp see fast_dp#3. Options passed through environment:
#
# FORKXDS_PRIORITY - priority within queue, e.g. 1024
# FORKXDS_PROJECT  - UGE project to assign for this
# FORKXDS_QUEUE    - queue to submit to

ntask=$1   # total number of jobs
maxcpu=$2  # maximum number of processors used by each job
main=$3    # name of the main program to be executed

# write one line per task; forkxds_job selects its line via $SGE_TASK_ID
rm -f forkxds.params
itask=1
while test $itask -le $ntask
do
  echo $main >> forkxds.params
  itask=`expr $itask + 1`
done

# save environment for the worker jobs
echo "PATH=$PATH" > forkxds.env
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> forkxds.env

# check environment for queue; project; priority information
qsub_opt=""
if [[ -n "$FORKXDS_PRIORITY" ]] ; then
  qsub_opt="$qsub_opt -p $FORKXDS_PRIORITY"
fi
if [[ -n "$FORKXDS_PROJECT" ]] ; then
  qsub_opt="$qsub_opt -P $FORKXDS_PROJECT"
fi
if [[ -n "$FORKXDS_QUEUE" ]] ; then
  qsub_opt="$qsub_opt -q $FORKXDS_QUEUE"
fi

# submit all tasks as one array job; -sync y blocks until the array has finished
qsub $qsub_opt -sync y -V -cwd -pe smp $maxcpu -t 1-$ntask `which forkxds_job`
</pre>
<pre>
#!/bin/bash
# forkxds_job: executed once per array task by the qsub call in forkxds

# pick this task's program name from forkxds.params
params=$(awk "NR==$SGE_TASK_ID" forkxds.params)
JOB=`echo $params | awk '{print $1}'`

# load the environment saved by forkxds and export it to the child process
. forkxds.env
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export PATH=$PATH

# the XDS main programs read their task number from stdin
echo $SGE_TASK_ID | $JOB
</pre>
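For other queueing systems, the same two-script pattern should carry over. Below is a minimal, hypothetical sketch of a SLURM variant of ''forkxds'' (it is not one of the Diamond scripts): it assumes a SLURM version with <code>sbatch --wait</code> (17.11 or later), and it requires ''forkxds_job'' to use <code>$SLURM_ARRAY_TASK_ID</code> in place of <code>$SGE_TASK_ID</code>.
<pre>
#!/bin/bash
# hypothetical SLURM variant of forkxds - a sketch, not the Diamond script
ntask=$1   # total number of jobs
maxcpu=$2  # maximum number of processors used by each job
main=$3    # name of the main program to be executed

# write one line per task; the job script selects its line via $SLURM_ARRAY_TASK_ID
rm -f forkxds.params
for itask in $(seq 1 $ntask); do
  echo $main >> forkxds.params
done

# save environment for the worker jobs
echo "PATH=$PATH" > forkxds.env
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> forkxds.env

# --wait blocks until the whole array has finished, mirroring qsub -sync y;
# --export=ALL and --chdir=$PWD mirror qsub -V -cwd
sbatch --wait --export=ALL --chdir=$PWD \
       --cpus-per-task=$maxcpu --array=1-$ntask `which forkxds_job`
</pre>
Whichever queueing system is used, the modified ''forkxds'' and ''forkxds_job'' must be executable and must be found on PATH ahead of the ''forkxds'' shipped with the XDS distribution, since xds_par invokes the script by name.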
== Performance ==
Cluster nodes may have different numbers of processors.
Please note that the output line
 number of OpenMP threads used NN
in COLSPOT.LP and INTEGRATE.LP may be incorrect if MAXIMUM_NUMBER_OF_JOBS > 1 and the submitting node (the node that runs xds_par) has a different number of processors than the processing node(s) (the nodes that run mcolspot_par and mintegrate_par). The actual numbers of threads on the processing nodes may be obtained with
 grep PARALLEL COLSPOT.LP
 grep USING INTEGRATE.LP | uniq
The algorithm that determines the number of threads used on a processing node is:
<pre>
NB = DELPHI / OSCILLATION_RANGE   # may be slightly adjusted by XDS if DATA_RANGE / NB is not an integer
NCORE = number of processors on the processing node, obtained by OMP_GET_NUM_PROCS()
if MAXIMUM_NUMBER_OF_PROCESSORS is not specified in XDS.INP then MAXIMUM_NUMBER_OF_PROCESSORS = NCORE
number_of_threads = MIN( NB, NCORE, MAXIMUM_NUMBER_OF_PROCESSORS, 99 )
</pre>
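As a worked example with made-up values (not defaults of any XDS.INP): DELPHI=5° and OSCILLATION_RANGE=0.1° give NB = 50, so on a 72-core node with MAXIMUM_NUMBER_OF_PROCESSORS unset the result is MIN(50, 72, 72, 99) = 50 threads. A minimal shell sketch of the formula, under the same assumptions:
<pre>
#!/bin/bash
# sketch of the thread-count formula above; all input values are assumed examples
DELPHI=5        # DELPHI= from XDS.INP, in degrees (example value)
OSC=0.1         # OSCILLATION_RANGE= from XDS.INP, in degrees per frame (example value)
NCORE=$(nproc)  # stand-in for OMP_GET_NUM_PROCS() on the processing node
MAXPROC=${MAXPROC:-$NCORE}   # MAXIMUM_NUMBER_OF_PROCESSORS; defaults to NCORE

NB=$(awk -v d=$DELPHI -v o=$OSC 'BEGIN { printf "%d", d/o }')

# number_of_threads = MIN( NB, NCORE, MAXPROC, 99 )
threads=99
for v in $NB $NCORE $MAXPROC; do
  [ "$v" -lt "$threads" ] && threads=$v
done
echo "number of OpenMP threads used $threads"
</pre>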
This is implemented from BUILT=20191015 onwards.