Line 65: | Line 65: | ||
COLSPOT: elapsed wall-clock time 40.0 sec | COLSPOT: elapsed wall-clock time 40.0 sec | ||
INTEGRATE: total elapsed wall-clock time 51.3 sec | INTEGRATE: total elapsed wall-clock time 51.3 sec | ||
This run used an 8GB/8GB split MCDRAM. The same run, but with 8 JOBS and 32 PROCESSORS, takes | This run used an 8GB/8GB split (''hybrid'') MCDRAM. The same run, but with 8 JOBS and 32 PROCESSORS, takes | ||
INIT.LP: elapsed wall-clock time 25.3 sec | INIT.LP: elapsed wall-clock time 25.3 sec | ||
COLSPOT: elapsed wall-clock time 40.1 sec | COLSPOT: elapsed wall-clock time 40.1 sec | ||
INTEGRATE: total elapsed wall-clock time 53.1 sec | INTEGRATE: total elapsed wall-clock time 53.1 sec | ||
Back to 16 JOBS and 16 PROCESSORS, but with MCDRAM in ''flat'' mode and <code>numactl --preferred=1 xds_par</code> (thus using all 16GB for arrays, and nothing for cache): | |||
INIT.LP: elapsed wall-clock time 29.5 sec | |||
COLSPOT: elapsed wall-clock time 38.6 sec | |||
INTEGRATE: total elapsed wall-clock time 53.2 sec | |||
Conclusion: since INIT benefits from more PROCESSORS, the fastest turnaround may come from running XDS twice: first with JOB=XYCORR INIT and a high number of processors (99 is the maximum), then with JOB=COLSPOT IDXREF DEFPIX INTEGRATE CORRECT and an optimized JOBS/PROCESSORS combination. | Conclusion: since INIT benefits from more PROCESSORS, the fastest turnaround may come from running XDS twice: first with JOB=XYCORR INIT and a high number of processors (99 is the maximum), then with JOB=COLSPOT IDXREF DEFPIX INTEGRATE CORRECT and an optimized JOBS/PROCESSORS combination.
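The two-run approach from the conclusion could look roughly like the following XDS.INP settings. This is only a sketch: it assumes that "JOBS" and "PROCESSORS" above are shorthand for the MAXIMUM_NUMBER_OF_JOBS and MAXIMUM_NUMBER_OF_PROCESSORS keywords, and the 16/16 values for the second run are simply taken from one of the timed runs as a placeholder, not as a tuned recommendation. Only one of the two groups should be active in XDS.INP for a given run.

```
! Run 1 (sketch): XYCORR and INIT only, with a high processor count
JOB= XYCORR INIT
MAXIMUM_NUMBER_OF_PROCESSORS= 99

! Run 2 (sketch): remaining steps; replace the Run 1 lines with these.
! 16/16 is a placeholder from the timings above; tune for your machine.
JOB= COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
MAXIMUM_NUMBER_OF_JOBS= 16
MAXIMUM_NUMBER_OF_PROCESSORS= 16
```

After editing XDS.INP for each run, XDS is started in the same way as for the timings above (for example <code>numactl --preferred=1 xds_par</code> in ''flat'' mode).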