Eiger: Difference between revisions

1,729 bytes added, 24 March 2020
(29 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Processing of [https://www.dectris.com/EIGER_X_Features.html Eiger] data is different from processing of conventional data, because the frames are wrapped into [http://www.hdfgroup.org HDF5] files (ending with .h5). However, with the [https://github.com/dectris/neggia NEGGIA plugin for XDS], processing is as straightforward as before.
Processing of [https://www.dectris.com/EIGER_X_Features.html Eiger] data is different from processing of conventional data, because the frames are wrapped into [http://www.hdfgroup.org HDF5] files (often ending with .h5). However, with the [[LIB]] feature of XDS and a suitable plugin ([https://github.com/dectris/neggia ''Neggia''] or [https://github.com/DiamondLightSource/durin ''Durin'']), processing is as straightforward as before.


== General aspects ==
== General aspects ==
# The framecache of XDS uses memory to save on I/O; it saves a frame in RAM after reading it for the first time. By default, each XDS (or mcolspot/mintegrate) job stores NUMBER_OF_IMAGES_IN_CACHE=DELPHI/OSCILLATION_RANGE images in memory which corresponds to one DELPHI-sized batch of data. This requires (number of pixels)*(number of jobs)*4 Bytes per frame which amounts to 72 MB in case of the Eiger 16M when running with MAXIMUM_NUMBER_OF_JOBS=1. (If DELPHI=20 and OSCILLATION_RANGE=0.05 your computer thus has to have at least 400*72MB = 29GB of memory for each job). If it has not, the fallback is to the old behaviour of reading each frame three times (instead of once). There is an upper limit (2GB?) to the amount of memory that will be used by default; if the required memory is more than that, a message will be printed and the user must explicitly include a NUMBER_OF_IMAGES_IN_CACHE= line in XDS.INP.
# The framecache of XDS uses memory to save on I/O; it saves a frame in RAM after reading it for the first time. By default, each XDS (or mcolspot/mintegrate) job stores NUMBER_OF_IMAGES_IN_CACHE=DELPHI/OSCILLATION_RANGE images in memory, which corresponds to one DELPHI-sized batch of data. This requires (number of pixels)*(number of jobs)*4 Bytes per frame, which amounts to 72 MB in case of the Eiger 16M when running with MAXIMUM_NUMBER_OF_JOBS=1. (If DELPHI=20 and OSCILLATION_RANGE=0.05 your computer thus has to have at least 400*72MB = 29GB of memory for each job!). If memory allocation fails, the fallback is to the old behaviour of reading each frame three times (instead of once).
# Dectris provides a library [https://github.com/dectris/neggia] for native reading of HDF5 files, which can be loaded into XDS at runtime using the <code>LIB=</code> [http://homes.mpimf-heidelberg.mpg.de/~kabsch/xds/html_doc/xds_parameters.html#LIB= keyword]. With this library, no conversion to CBF or otherwise is necessary. It is therefore just as fast and efficient to read HDF5 files as any other file format.
# Dectris provides the ''Neggia'' library ([https://github.com/dectris/neggia source],[https://www.dectris.com/support/downloads/sign-in binary]) for native reading of HDF5 files, which can be loaded into XDS at runtime using the <code>[[LIB]]=</code> [http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#LIB= keyword]. With this library (which can also be found at https://{{SERVERNAME}}/pub/linux_bin for Linux, and at https://{{SERVERNAME}}/pub/mac_bin for MacOS), no conversion to CBF or otherwise is necessary. It is therefore just as fast and efficient to read HDF5 files as any other file format. At Diamond Light Source, a different HDF5 format was developed, and this requires the [https://github.com/DiamondLightSource/durin/releases/latest ''Durin'' plugin]. The latter can also read the HDF5 files written by the Dectris software.
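The memory estimate in item 1 can be reproduced with a few lines of Python. This is only a sketch: the function name and the rounded pixel count of 18,000,000 (chosen so that one frame takes the 72 MB quoted above) are assumptions for illustration and are not part of XDS.

```python
def framecache_bytes(n_pixels, delphi, oscillation_range, n_jobs=1, bytes_per_pixel=4):
    """Estimate the RAM (in bytes) used by the XDS framecache at its default size."""
    # Default cache size: one DELPHI-sized batch of frames
    images_in_cache = round(delphi / oscillation_range)
    return n_pixels * bytes_per_pixel * images_in_cache * n_jobs


if __name__ == "__main__":
    # Assumed, rounded pixel count that gives the 72 MB per frame quoted in the text
    total = framecache_bytes(n_pixels=18_000_000, delphi=20, oscillation_range=0.05)
    print(f"{total / 1e9:.1f} GB per job")
```

With DELPHI=20 and OSCILLATION_RANGE=0.05 this gives roughly the 29 GB per job mentioned above. If that exceeds the available RAM, a smaller NUMBER_OF_IMAGES_IN_CACHE= can be set in XDS.INP.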


A suitable [[XDS.INP]] may have been written by the data collection (beamline) software. Latest [[generate_XDS.INP]] (<code>generate_XDS.INP xxx_master.h5</code>) or the [[Eiger#XDS_from_H5.py_script_for_generating_XDS.INP_given_a_master_.h5_file|XDS_from_H5.py script]] can be used if XDS.INP is not available.
A suitable [[XDS.INP]] may have been written by the data collection (beamline) software. Latest [[generate_XDS.INP]] (<code>generate_XDS.INP xxx_master.h5</code>) or the [[Eiger#Script_for_generating_XDS.INP_from_master.h5|XDS_from_H5.py script]] can be used if XDS.INP is not available.


== Compression ==
== Compression ==
Line 28: Line 28:
Deviating from the Xeon benchmark setup, BACKGROUND_RANGE was set to a more realistic value of 1 50 (instead of 1 9).  
Deviating from the Xeon benchmark setup, BACKGROUND_RANGE was set to a more realistic value of 1 50 (instead of 1 9).  


Using the Dectris library that makes use of the <code>LIB=</code> [http://homes.mpimf-heidelberg.mpg.de/~kabsch/xds/html_doc/xds_parameters.html#LIB= option] of XDS:
Using the Dectris library that makes use of the <code>[[LIB]]=</code> [http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#LIB= option] of XDS:
  INIT:            elapsed wall-clock time      30.4 sec
  INIT:            elapsed wall-clock time      30.4 sec
  COLSPOT:        elapsed wall-clock time      40.7 sec
  COLSPOT:        elapsed wall-clock time      40.7 sec
Line 57: Line 57:
== Troubleshooting ==
== Troubleshooting ==
* make sure that master.h5 and the corresponding data.h5 files remain together as collected, and '''don't rename the data.h5 files''' - they are referred to from master.h5.  If you change the names of the data.h5 files or copy them somewhere else, that link is broken unless you fix master.h5.
* make sure that master.h5 and the corresponding data.h5 files remain together as collected, and '''don't rename the data.h5 files''' - they are referred to from master.h5.  If you change the names of the data.h5 files or copy them somewhere else, that link is broken unless you fix master.h5.
* the very latest XDS (BUILT=20170215) has a problem with reading Eiger data - the master filename is not correctly constructed. The workaround is to either use the previous [ftp://turn5.biologie.uni-konstanz.de/xds/2016-dec05/ BUILT of 20161205], or to place a symlink e.g. <code>ln -s my_data_master.h5 my_data_000001.h5</code>. The next BUILT will of course fix the problem.
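The first check above (master.h5 and its data.h5 files staying together) can be partly automated. The sketch below assumes the usual naming pattern, <code>xxx_master.h5</code> next to <code>xxx_data_NNNNNN.h5</code>; it only looks at file names in the directory and does not read the links stored inside master.h5. The function name is made up for this example.

```python
import glob
import sys
from pathlib import Path


def data_files_next_to(master):
    """Return the sorted data files that share the master file's name prefix."""
    master = Path(master)
    suffix = "_master.h5"
    if not master.name.endswith(suffix):
        raise ValueError(f"{master.name} does not end with {suffix}")
    prefix = master.name[: -len(suffix)]
    # Escape the prefix so characters such as '[' are matched literally
    pattern = glob.escape(prefix) + "_data_*.h5"
    return sorted(master.parent.glob(pattern))


if __name__ == "__main__" and len(sys.argv) > 1:
    files = data_files_next_to(sys.argv[1])
    if files:
        print(f"found {len(files)} data file(s) next to {sys.argv[1]}")
    else:
        print("no data files found; were they renamed or moved?")
```

An empty result suggests that the data files were renamed or moved. A non-empty result does not prove that every link in master.h5 resolves.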


== XDS_from_H5.py script for generating XDS.INP given a master .h5 file ==
== Script for generating XDS.INP from master.h5 ==
This script could be made executable and put into /usr/local/bin. It requires the [https://www.dectris.com/albula.html#main_head_navigation ALBULA API] to be installed. If you get the error message
<div class="mw-collapsible mw-collapsed">
ImportError: No module named numpy.core.multiarray
Expand code section below (i.e. click on blue <code>[Expand]</code> at the end of this line if there is no code visible), download it and save as XDS_from_H5.py .  
you should
<div class="mw-collapsible-content">
yum -y install numpy
as root.
<pre>
<pre>
#!/usr/bin/python
# -*- coding: utf-8 -*-
# -*- coding: utf-8 -*-


Line 72: Line 68:
__date__ = "2017/03/08"
__date__ = "2017/03/08"
__reviewer__ = ""
__reviewer__ = ""
__version__ = "0.1.0"
__version__ = "0.1.1"


import sys
import sys
Line 101: Line 97:
!    Characters to the right of an exclamation mark are comments.
!    Characters to the right of an exclamation mark are comments.
!
!
!    This file was autogenerated by XDS_from_H5.py (Oct 2015).
!    This file was autogenerated by XDS_from_H5.py (Mar 2017).
!    Please check default values before processing.
!    Please check default values before processing.
!
!
Line 112: Line 108:
!====================== DETECTOR PARAMETERS ==================================
!====================== DETECTOR PARAMETERS ==================================
  DETECTOR=%(family)s
  DETECTOR=%(family)s
!LIB= /usr/local/lib64/dectris-neggia.so
LIB= /usr/local/lib64/dectris-neggia.so
  MINIMUM_VALID_PIXEL_VALUE=0
  MINIMUM_VALID_PIXEL_VALUE=0
  OVERLOAD= %(cutoff)i ! taken from HDF5 header item
  OVERLOAD= %(cutoff)i ! taken from HDF5 header item
Line 653: Line 649:
         exit(-1)
         exit(-1)
</pre>
</pre>
</div>
</div>
Then,
* Make script executable and put into /usr/local/bin.
* Install [https://www.dectris.com/albula.html#main_head_navigation ALBULA API]
* Install numpy (yum -y install numpy) as root if you get the error message
** ImportError: No module named numpy.core.multiarray
Once XDS.INP has been generated,
* Make sure no nonsense has been extracted from master.h5.
* Make sure INCIDENT_BEAM_DIRECTION= corresponds to the experimental geometry.
* Point LIB= to where Neggia is saved (if in current directory, use <code>LIB=./dectris-neggia.so</code> i.e. specify directory!).
** Comment out LIB= if Neggia isn't used (not recommended).
* Set MAXIMUM_NUMBER_OF_JOBS= and MAXIMUM_NUMBER_OF_PROCESSORS= to similar values whose product is slightly smaller than the total number of threads on your system.
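The last point can be turned into a small helper. This is one possible heuristic, not something XDS prescribes: it takes the integer square root of (threads - 1) as the number of jobs and fills up the processors from there. The function name is hypothetical.

```python
import math
import os


def suggest_jobs_and_processors(n_threads):
    """Pick two similar integers whose product stays just below n_threads."""
    budget = max(n_threads - 1, 1)    # leave one thread free where possible
    jobs = max(math.isqrt(budget), 1)
    processors = budget // jobs
    return jobs, processors


if __name__ == "__main__":
    # os.cpu_count() may return None on unusual platforms; fall back to 1
    jobs, processors = suggest_jobs_and_processors(os.cpu_count() or 1)
    print(f"MAXIMUM_NUMBER_OF_JOBS={jobs}")
    print(f"MAXIMUM_NUMBER_OF_PROCESSORS={processors}")
```

For a machine reporting 32 threads this suggests 5 jobs with 6 processors each (30 threads in total). The best split also depends on available memory and I/O, so treat the result as a starting point.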


= Old way of processing Eiger data with XDS i.e. using H5ToXds =  
= Less efficient way of processing Eiger data, using conversion to CBF =


Since the release of NEGGIA, a plugin for XDS that parallelizes the reading of images from HDF5 data, conversion to H5ToXds is not required anymore. The sections below are thus largely obsolete.
Since the release of Neggia, a plugin for XDS that parallelizes the reading of images from HDF5 data, conversion with H5ToXds should no longer be required in most usage scenarios. The sections below nevertheless describe this possibility, since preliminary experience with some less common network file systems (apparently GPFS, but not NFS) seems to indicate low performance of Neggia.


Dectris provides a library [https://www.dectris.com/news.html?page=2 H5ToXds] (Linux only!) which is needed by XDS. That program converts (as the name indicates) the HDF5 files to CBF files; however, it does not write the geometry and other information into the CBF header (therefore, [[generate_XDS.INP]] does not work with these files). As an alternative, one could use GlobalPhasing's hdf2mini-cbf program (needs autoPROC license) or, from http://www.mrc-lmb.cam.ac.uk/harry/imosflm/ver721/downloads, the eiger2cbf-osx or eiger2cbf-linux program written by T. Nakane. These programs do write a useful CBF header.
Conversion program options: Dectris provides [https://www.dectris.com/news.html?page=2 H5ToXds] (Linux only!). That program converts (as the name indicates) the HDF5 files to CBF files; however, it does not write the geometry and other information into the CBF header (therefore, [[generate_XDS.INP]] and MOSFLM do not work with these files). Alternatives are GlobalPhasing's hdf2mini-cbf program (needs autoPROC license) or, from http://www.mrc-lmb.cam.ac.uk/harry/imosflm/ver721/downloads, the eiger2cbf-osx or eiger2cbf-linux program written by T. Nakane. The latter programs do write a useful CBF header.


For faster processing (Linux only; script needs to be adapted for OSX), the [[Eiger#A_script_for_faster_XDS_processing_of_Eiger_data|shell script]] below should be copied to /usr/local/bin/H5ToXds and made executable (<code>chmod a+rx /usr/local/bin/H5ToXds*</code>). The binary H5ToXds then should be named e.g. /usr/local/bin/H5ToXds.bin - note the .bin filename extension! The script ''also'' uses RAM to speed up processing; it uses it for fast storage of the temporary CBF file that H5ToXds/eiger2cbf/hdf2mini-cbf writes, and that each parallel thread ("processor") of XDS reads. The amount of additional RAM this requires is modest (about (number of pixels)*(number of threads) bytes).
For faster processing, the [[Eiger#A_script_for_faster_XDS_processing_of_CBF-converted Eiger data|shell script]] below should be copied to /usr/local/bin/H5ToXds and made executable (<code>chmod a+rx /usr/local/bin/H5ToXds*</code>). The binary H5ToXds then should be named e.g. /usr/local/bin/H5ToXds.bin - note the .bin filename extension! The script ''also'' uses RAM to speed up processing; it uses it for fast storage of the temporary CBF file that H5ToXds/eiger2cbf/hdf2mini-cbf writes, and that each parallel thread ("processor") of XDS reads. The amount of additional RAM this requires is modest (about (number of pixels)*(number of threads) bytes).


== Benchmark using H5ToXds ==
== Benchmark using H5ToXds ==
The numbers below refer to the H5ToXds binary as used in the script below.
This was run on a single unloaded CentOS7.2 64bit machine with dual Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz , HT enabled (showing 32 processors in /proc/cpuinfo), on a local XFS filesystem (all defaults), with four JOBs and 12 PROCESSORS. The numbers below refer to the H5ToXds binary as used in the script below.


The timing, using the XDS (BUILT=20151231), is on the first run
The timing, using the XDS (BUILT=20151231), is on the first run
Line 688: Line 697:
which indicates a 24% overhead due to the HDF5-to-CBF conversion. However, one has to add to this the time for the HDF5-to-CBF conversion, which is (with 18 parallel H5ToXds jobs each converting 50 frames) 34.2 sec, so overall the "on-the-fly" route using the script below is faster than the "pre-conversion" route, at least on this machine.
which indicates a 24% overhead due to the HDF5-to-CBF conversion. However, one has to add to this the time for the HDF5-to-CBF conversion, which is (with 18 parallel H5ToXds jobs each converting 50 frames) 34.2 sec, so overall the "on-the-fly" route using the script below is faster than the "pre-conversion" route, at least on this machine.


== A script for faster XDS processing of Eiger data ==
== A script for faster XDS processing of CBF-converted Eiger data ==
<pre>
<pre>
#!/bin/bash
#!/bin/bash
# Kay Diederichs 10/2015
# Kay Diederichs 10/2015
# 3/2016 adapt for eiger2cbf-linux and hdf2min-cbf
# 3/2017 include RAMdisk creation for MacOS; only lightly tested!
# 3/2016 adapt for eiger2cbf and hdf2mini-cbf
# for the latter see https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;58a4ee1.1603 and
# for the latter see https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;58a4ee1.1603 and
# https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;a048b4e8.1603  
# https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;a048b4e8.1603  
Line 704: Line 714:
# Recommendation:
# Recommendation:
# - for the fast local directory one should use a RAMdisk (one GB size at most)
# - for the fast local directory one should use a RAMdisk (one GB size at most)
# - /dev/shm seems to be set up for that purpose on most distributions
# - /dev/shm seems to be already set up for that purpose on most Linux distributions
# - on MacOS you can easily set this up as described at http://stackoverflow.com/questions/2033362/does-os-x-have-an-equivalent-to-dev-shm
# example on MacOS for 1GB RAMdisk (needs to be repeated after booting):
# diskutil eraseVolume HFS+ RAMdisk $(hdiutil attach -nomount ram://$((2 * 1024 * 1000)))
#
#
# on MacOS the next line should then be:
# tempfile="/Volumes/RAMdisk/H5ToXds${PWD//\//_}.$3"
# and on Linux:
tempfile="/dev/shm/H5ToXds${PWD//\//_}.$3"
tempfile="/dev/shm/H5ToXds${PWD//\//_}.$3"
#
#
Line 711: Line 727:
/usr/local/bin/H5ToXds.bin $1 $2 "$tempfile" || rm "$tempfile"
/usr/local/bin/H5ToXds.bin $1 $2 "$tempfile" || rm "$tempfile"
#/usr/local/bin/eiger2cbf-linux $1 $2 "$tempfile" >& /dev/null  || rm "$tempfile"
#/usr/local/bin/eiger2cbf-linux $1 $2 "$tempfile" >& /dev/null  || rm "$tempfile"
#/usr/local/bin/eiger2cbf-osx $1 $2 "$tempfile" >& /dev/null  || rm "$tempfile"
#/usr/local/bin/hdf2mini-cbf $1 $2 "$tempfile"  || rm "$tempfile"
#/usr/local/bin/hdf2mini-cbf $1 $2 "$tempfile"  || rm "$tempfile"
ln -sf "$tempfile" $3 2>/dev/null
ln -sf "$tempfile" $3 2>/dev/null
</pre>
</pre>


= See also =


[[Performance]]


== See also ==
[https://github.com/keitaroyam/yamtbx/blob/master/doc/eiger-en.md Keitaro Yamashita's Eiger page, with some emphasis on SPring-8]
 
[[Performance]]