Eiger: Difference between revisions

Revision as of 15:26, 21 February 2017

430 bytes added , 21 February 2017

→‎Xeon Phi (Knights Landing, KNL)

Kay

@@ Line 48: / Line 48: @@
 === Xeon Phi (Knights Landing, KNL) ===
-The benchmark was run on a single KNL7210 processor (256 cores) set to quadrant mode and using the MCDRAM as cache. XDS was compiled with the -xMIC-AVX512 option of ifort. This gives
+The benchmark was run on a single KNL 7210 processor (64 cores, 256 hardware threads) set to quadrant mode, with the MCDRAM used as cache. The environment variable OMP_PROC_BIND was set to false; without this, the scheduler seems to place all threads on one core. XDS was compiled with the -xMIC-AVX512 option of ifort. This gives
   COLSPOT:         elapsed wall-clock time       48.3 sec
   INTEGRATE: total elapsed wall-clock time       61.2 sec
@@ Line 65: / Line 65: @@
   COLSPOT:         elapsed wall-clock time       40.0 sec
   INTEGRATE: total elapsed wall-clock time       51.3 sec
-This was running with a 8GB/8GB split MCDRAM. The same run, but with 8 JOBs and 32 PROCESSORS, takes
+This run used an 8GB/8GB split of the MCDRAM. The same run, but with 8 JOBS and 32 PROCESSORS, takes
   INIT.LP:         elapsed wall-clock time       25.3 sec
   COLSPOT:         elapsed wall-clock time       40.1 sec
   INTEGRATE: total elapsed wall-clock time       53.1 sec
+Conclusion: since INIT benefits from more PROCESSORS, XDS could be run twice for the fastest turnaround: a first run with JOB=XYCORR INIT and a high number of processors (99 is the maximum), then a second run with JOB=COLSPOT IDXREF DEFPIX INTEGRATE CORRECT and an optimized JOBS/PROCESSORS combination.
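The two-run scheme from the conclusion could be scripted roughly as follows. This is only a sketch under several assumptions: that JOBS and PROCESSORS refer to the XDS.INP keywords MAXIMUM_NUMBER_OF_JOBS and MAXIMUM_NUMBER_OF_PROCESSORS, that each of these keywords sits on its own line in XDS.INP, that the parallel binary is called xds_par, and that 8 JOBS with 32 PROCESSORS (the combination timed above) is a reasonable second-run setting. The helper names set_keyword, two_pass_inputs and run_two_pass are hypothetical and not part of XDS.

```python
import os
import re
import subprocess


def set_keyword(xds_inp: str, keyword: str, value: str) -> str:
    """Drop every active 'keyword=' line and append a single 'keyword= value' line.

    Assumes the keyword is alone on its line; lines commented out with '!' are kept.
    """
    pattern = re.compile(rf"^\s*{re.escape(keyword)}\s*=")
    kept = [line for line in xds_inp.splitlines() if not pattern.match(line)]
    kept.append(f"{keyword}= {value}")
    return "\n".join(kept) + "\n"


def two_pass_inputs(xds_inp: str, jobs: int = 8, processors: int = 32):
    """Return the XDS.INP text for the first (XYCORR INIT) and second run."""
    # First run: INIT benefits from many processors, so use the maximum of 99.
    first = set_keyword(xds_inp, "JOB", "XYCORR INIT")
    first = set_keyword(first, "MAXIMUM_NUMBER_OF_PROCESSORS", "99")

    # Second run: remaining steps with a tuned JOBS/PROCESSORS combination.
    second = set_keyword(xds_inp, "JOB", "COLSPOT IDXREF DEFPIX INTEGRATE CORRECT")
    second = set_keyword(second, "MAXIMUM_NUMBER_OF_JOBS", str(jobs))
    second = set_keyword(second, "MAXIMUM_NUMBER_OF_PROCESSORS", str(processors))
    return first, second


def run_two_pass(path: str = "XDS.INP", xds_binary: str = "xds_par") -> None:
    """Run XDS twice in the current directory, then restore the original XDS.INP."""
    # OMP_PROC_BIND=false as in the benchmark above, so threads are not pinned.
    env = dict(os.environ, OMP_PROC_BIND="false")
    with open(path) as handle:
        original = handle.read()
    try:
        for text in two_pass_inputs(original):
            with open(path, "w") as handle:
                handle.write(text)
            subprocess.run([xds_binary], env=env, check=True)
    finally:
        with open(path, "w") as handle:
            handle.write(original)
```

Calling run_two_pass() in the directory that holds XDS.INP would perform both runs; the jobs and processors defaults in two_pass_inputs are placeholders to be tuned per machine.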
 == Troubleshooting ==