On multi-socket machines, there are additional considerations having to do with their NUMA architecture - see [[Performance]].
=== KNL ===
The benchmark was run on a single KNL7210 processor (64 cores, 256 hardware threads) set to quadrant mode, with the MCDRAM used as cache. XDS was compiled with the -xMIC-AVX512 option of ifort. This gives
 COLSPOT: elapsed wall-clock time 48.3 sec
 INTEGRATE: total elapsed wall-clock time 61.2 sec
when run with MAXIMUM_NUMBER_OF_JOBS=16 and MAXIMUM_NUMBER_OF_PROCESSORS=16. These parameters, as well as the KNL setup, could still be optimized.
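As a minimal sketch, the two settings quoted above would appear in XDS.INP roughly as follows. Only these two keywords come from the benchmark description; the comments are added for illustration, and all other lines of XDS.INP are assumed unchanged. Note that 16 jobs with 16 threads each gives 16 x 16 = 256, which matches the figure of 256 quoted for the processor.

 ! KNL benchmark settings (sketch; other XDS.INP lines omitted)
 MAXIMUM_NUMBER_OF_JOBS=16        ! number of independent jobs run in parallel
 MAXIMUM_NUMBER_OF_PROCESSORS=16  ! number of threads used by each job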
== Troubleshooting ==