2,684
edits
No edit summary |
|||
Line 169: | Line 169: | ||
</pre> | </pre> | ||
As an alternative to <tt>numactl</tt>, one may use <tt>taskset</tt> or <tt>KMP_AFFINITY</tt>. | |||
If <tt>[https://github.com/RRZE-HPC/likwid/wiki likwid]</tt> would be used instead of <tt>numactl</tt> one could have much better control of affinity groups. | |||
In my tests on a 4-socket machine, the difference between runs with the original scripts and the NUMA-aware ones was a reduction of wallclock time by about 8%. With a 2-socket machine, I saw a <1% effect. But this will depend very much on the specific hardware. | |||
== Multi-socket machines in a cluster == | |||
In that case, I'd suggest to modify e.g. <tt>forkcolspot_cluster</tt> to not run <tt>mcolspot_par</tt> directly on the remote machine, but rather to run a script on that machine that checks the number of nodes, and runs <tt>mcolspot_par</tt> on the right node. |