2,684
edits
Line 81: | Line 81: | ||
INTEGRATE: total elapsed wall-clock time 49.6 sec | INTEGRATE: total elapsed wall-clock time 49.6 sec | ||
Conclusions: since INIT benefits from more PROCESSORs, one could run XDS twice for fastest turnaround; the first run with JOBS=XYCORR INIT and a high number of processors (99 is maximum). The second run with JOB=COLSPOT IDXREF DEFPIX INTEGRATE CORRECT, and an optimized JOBS/PROCESSORS combination. The SNC4 mode is | Conclusions: since INIT benefits from more PROCESSORs, one could run XDS twice for fastest turnaround; the first run with JOBS=XYCORR INIT and a high number of processors (99 is maximum). The second run with JOB=COLSPOT IDXREF DEFPIX INTEGRATE CORRECT, and an optimized JOBS/PROCESSORS combination. The SNC4 mode is fastest in this example - to do better than the cache mode of the MCDRAM, one needs to adapt the forkcolspot and forkintegrate script- see [[Performance]]. Other examples (with more frames) confirmed that cache mode is best for quadrant and SNC4, and resulted in quadrant mode being superior to SNC4. To optimally use the latter, one needs to thoroughly understand and properly understand the relevant environment variables, in particular KMP_AFFINITY and KMP_PLACE_THREADS. | ||
For comparison, if these data are stored as CBFs, COLSPOT and INTEGRATE take 34.8 and 45.2 seconds, respectively, in SNC4 mode. However, with a cold cache (i.e. when data are read for the first time), the HDF5 files have an advantage because they are a factor 2.5 smaller, due to the better compression. | For comparison, if these data are stored as CBFs, COLSPOT and INTEGRATE take 34.8 and 45.2 seconds, respectively, in SNC4 mode. However, with a cold cache (i.e. when data are read for the first time), the HDF5 files have an advantage because they are a factor 2.5 smaller, due to the better compression. |