| </table> | | </table> |

== Timings for processing sweep "e" as a function of MAXIMUM_NUMBER_OF_PROCESSORS and MAXIMUM_NUMBER_OF_JOBS ==

The following is rather technical. If you are only interested in crystallography, skip this section.

Using
MAXIMUM_NUMBER_OF_PROCESSORS=2
MAXIMUM_NUMBER_OF_JOBS=8
we observe for the INTEGRATE step:
total cpu time used 2063.6 sec
total elapsed wall-clock time 296.1 sec

Using
MAXIMUM_NUMBER_OF_PROCESSORS=1
MAXIMUM_NUMBER_OF_JOBS=16
the times are
total cpu time used 2077.1 sec
total elapsed wall-clock time 408.2 sec

Using
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=4
the times are
total cpu time used 2102.8 sec
total elapsed wall-clock time 315.6 sec

Using
MAXIMUM_NUMBER_OF_PROCESSORS=16 ! the default for xds_par on a 16-core machine
MAXIMUM_NUMBER_OF_JOBS=1 ! the default
the times are
total cpu time used 2833.4 sec
total elapsed wall-clock time 566.5 sec
Please note that this run actually uses only 10 processors: with the default DELPHI=5 and an OSCILLATION_RANGE of 0.5°, each batch contains only 10 frames.

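The limit of 10 processors mentioned above can be reproduced with a short calculation. This is a minimal sketch, assuming (as the note above implies) that the number of processors a single job can use is capped by the number of frames per batch, DELPHI / OSCILLATION_RANGE. The function name `usable_processors` is hypothetical and not part of XDS; the exact rounding XDS applies internally is not covered here.

```python
def usable_processors(max_processors: int, delphi_deg: float,
                      oscillation_range_deg: float) -> int:
    """Estimate how many processors one job can keep busy.

    Assumption: a job works on one batch of DELPHI degrees at a time,
    so it cannot use more processors than there are frames in a batch.
    """
    frames_per_batch = round(delphi_deg / oscillation_range_deg)
    return min(max_processors, frames_per_batch)


# Values from the run above: 16 requested, DELPHI=5, OSCILLATION_RANGE=0.5
print(usable_processors(16, 5.0, 0.5))
```

With MAXIMUM_NUMBER_OF_PROCESSORS=4 the cap has no effect, since 4 is below the 10 frames in a batch.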
Using
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=8
(thus overcommitting the available cores by a factor of 2) the times are
total cpu time used 2263.5 sec
total elapsed wall-clock time 320.8 sec

Using
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6
(thus overcommitting the available cores, but less severely) the times are
total cpu time used 2367.6 sec
total elapsed wall-clock time 267.2 sec

Thus,
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6
performs best on a machine with two Xeon X5570 CPUs (HT enabled, thus 16 logical cores), 24 GB of memory and a RAID1 consisting of two 1 TB SATA disks. It should be noted that the dataset is 27 GB in size; reading it in 296 seconds (the wall-clock time of the first run above) corresponds to about 92 MB/s of continuous reading. The processing time is thus limited by disk access, not by the CPU. The data are not simply read from RAM: this was tested by running "echo 3 > /proc/sys/vm/drop_caches" before the XDS run.

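The comparison above can be tabulated with a few lines of Python. This is only a sketch for illustration: the timings are copied from the runs listed above, while the names `RUNS`, `summarize` and `fastest` are hypothetical, and the read rate assumes 1 GB = 1000 MB and that the full 27 GB dataset is read once per run.

```python
LOGICAL_CORES = 16   # two Xeon X5570 with HT enabled
DATASET_GB = 27.0    # size of the dataset as stated in the text

# (MAXIMUM_NUMBER_OF_PROCESSORS, MAXIMUM_NUMBER_OF_JOBS, cpu sec, wall-clock sec)
RUNS = [
    (2, 8, 2063.6, 296.1),
    (1, 16, 2077.1, 408.2),
    (4, 4, 2102.8, 315.6),
    (16, 1, 2833.4, 566.5),   # effectively only 10 processors are used
    (4, 8, 2263.5, 320.8),
    (4, 6, 2367.6, 267.2),
]


def summarize(runs):
    """Derive simple figures of merit for each (processors, jobs) setting."""
    rows = []
    for procs, jobs, cpu_s, wall_s in runs:
        rows.append({
            "procs": procs,
            "jobs": jobs,
            # requested threads relative to the available logical cores
            "overcommit": procs * jobs / LOGICAL_CORES,
            # average number of cores kept busy during the run
            "cpu_per_wall": cpu_s / wall_s,
            # average read rate if the whole dataset is read once
            "read_mb_s": DATASET_GB * 1000.0 / wall_s,
            "wall_s": wall_s,
        })
    return rows


def fastest(runs):
    """Return the run with the shortest wall-clock time."""
    return min(runs, key=lambda run: run[3])


for row in summarize(RUNS):
    print("procs={procs:2d} jobs={jobs:2d} overcommit={overcommit:.2f} "
          "cpu/wall={cpu_per_wall:.1f} read={read_mb_s:.0f} MB/s "
          "wall={wall_s:.1f} s".format(**row))

best = fastest(RUNS)
print("fastest: PROCESSORS=%d JOBS=%d" % (best[0], best[1]))
```

The cpu/wall ratio stays well below 16 in every run, which is consistent with the conclusion that disk access rather than the CPU limits the processing time.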