Generate XDS.INP

Revision as of 22:57, 15 July 2011 by Kay (talk | contribs)

This script generates XDS.INP based on a list of frame names supplied on the commandline.

Currently works for MarCCD, ADSC and Pilatus 6M detectors. Also supports bzip2-ed frames (when specifying the frame names, leave out the .bz2 extension, as documented for XDS!).

Installation: ask your system adminstrator to cut-and-paste it into e.g. /usr/local/bin/generate_XDS.INP , and to make it "executable". Usage is then just (don't forget the quotation marks!):

generate_XDS.INP "frms/mydata_1_???.img"

But you may also just cut-and-paste the script from this webpage into e.g. your home directory; the filename should be generate_XDS.INP. After creating the file, make it executable - e.g. if it's in your $HOME, use:

chmod +x ~/generate_XDS.INP

After that, you can just run it in the same way:

~/generate_XDS.INP "frms/mydata_1_???.img"

By using your own file, you can easily update to the latest revision, or even change the script, without having to bother the system administrator.

Instead of cut-and-paste, you could try

wget http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/generate_XDS.INP -O - | awk '/bash/,/rm -f tmp1 tmp2/' | \
sed -e s/\&nbsp\;/\ /g -e s/\&gt\;/\>/g -e s/\&lt\;/\</g -e s/amp\;//g -e s/\&quot\;/\"/g > generate_XDS.INP

to obtain generate_XDS.INP in your current directory.

This is the script:

#!/bin/bash
# purpose: generate XDS.INP
#
# tested with some datasets from ALS, SSRL, SLS, ESRF and BESSY; only MARCCD, ADSC/SMV, PILATUS 6M detectors; 
# for other detectors, values marked with XXX must be manually filled in.
#
# revision 0.03 . Kay Diederichs 2/2010 
# revision 0.04 . Kay Diederichs 4/2010 - include alternative ORGX, ORGY calculations for ADSC
# revision 0.05 . Kay Diederichs 5/2010 - grep for "Corrected" in addition to "marccd"; needed for BESSY 
# revision 0.06 . KD 6/2010 - add UNTRUSTED_RECTANGLE and UNTRUSTED_ELLIPSE; use `whereis catmar` and so on 
# revision 0.07 . KD 6/2010 - decide about ORGX/Y info in MAR header being pixels or mm; other fixes
# revision 0.08 . KD 6/2010 - fixes for Pilatus 6M
# revision 0.09 . KD 6/2010 - get rid of requirement for mccd_xdsparams.pl and/or catmar; rather use "od"
# revision 0.10 . Tim Gruene 7/2010 - set link 'images' to image directory if path exceeds 72 characters
# revision 0.11 . KD 7/2010 - for MarCCD: look for distance info at different byte position
# revision 0.12 . KD 7/2010 - fix for negative PHISTART
# revision 0.13 . KD 8/2010 - store correct NX NY QX QY in XDS.INP
# revision 0.14 . KD 1/2011 - SENSOR_THICKNESS for Pilatus; MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3
# revision 0.15 . KD 2/2011 - add comment for -ive sign of APS 19-ID and Australian Synchrotron rotation axis
# revision 0.16 . KD 3/2011 - SENSOR_THICKNESS=0.01 for ADSC and MarCCD. Add comment about SILICON=
# revision 0.17 . KD 3/2011 - make it work for .bz2 frames; improve screen output
# revision 0.18 . KD 4/2011 - faster by doing "strings" only once; revert "images/${1##/*/}" "correction"
# revision 0.19 . KD 6/2011 - bugfix for 0.18
# revision 0.20 . KD 7/2011 - redirect stderr of /bin/ls to /dev/null
REVISION="0.20 (15-Jul-2011)"
#
# usage: e.g. generate_XDS.INP "frms/mydata_1_???.img"
# make sure to have the two quotation marks !
# the ? are wildcards for the frame numbers.
# 
# limitations:
# - frame numbers are assumed to start with 1 and run consecutively
# 
# known problems:
# - for ADSC detectors, there are at least three ways to obtain ORGX and ORGY values from the header (see below);
# - the same might be a problem for MAR headers, too (not sure about this)
#
# notes for debugging of the script:
# - add the -v option to the first line, to see where an error occurs
# - comment out the removal of tmp1 and tmp2 in the last line
#
# ====== Start of script ======
echo generate_XDS.INP version $REVISION . Obtain the latest version from
echo http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/generate_XDS.INP
if [ "$1" == "help" ] || [ "$1" == "-help" ] || [ "$1" == "-h" ]; then
  echo usage: generate_XDS.INP \"frms/mydata_1_???.img\"   \(_with_ the quotation marks!\)
  echo if the frames are compressed with bzip2, leave out the .bz2 extension!
  exit
fi
#
# defaults:
#
DETECTOR="XXX MINIMUM_VALID_PIXEL_VALUE=XXX OVERLOAD=XXX"
ORGX=XXX
ORGY=XXX
DETECTOR_DISTANCE=XXX
OSCILLATION_RANGE=XXX
X_RAY_WAVELENGTH=XXX
QX=XXX
QY=XXX
NX=XXX
NY=XXX
SENSOR_THICKNESS=0
# see how we are called:
NAME_TEMPLATE_OF_DATA_FRAMES="$1"

# check that the image template name does not exceed 72 characters
# and set a link if necessary
if [ ${#1} -gt 72 ]; then
    TMP_PATH=$(/bin/ls -C1 $1 $1.bz2 2>/dev/null | head -1)
    TMP_FILENAME=$(basename ${TMP_PATH})
    if [ ${#TMP_FILENAME} -gt 65 ]; then
	echo "---> Unable to proceed: image filename "
	echo "--->  \"${TMP_FILENAME}\""
        echo "---> exceeds 72 characters."
        echo "---> Please rename files or set links."	
	exit
    else
	echo "---> Warning: Template name exceeds 72 characters. Setting link \"images\""
        echo "              to image directory"
	TMP_DIRNAME=$(dirname ${TMP_PATH})
	if [ -e "images" ]; then
	    echo "***  Error:   The file or directory \"images\" already exists. Please remove"
	    echo "              and re-run."
	    exit
	else
	    ln -s ${TMP_DIRNAME} images
	fi
	NAME_TEMPLATE_OF_DATA_FRAMES="images/${1##/*/}"
# KD 30.3.11 I believe the previous line should be changed to:
#      NAME_TEMPLATE_OF_DATA_FRAMES="images/${TMP_FILENAME}" 
# but I have not had the time to properly test this
	echo "              Using template filename \"${NAME_TEMPLATE_OF_DATA_FRAMES}\""
    fi
fi

# list frames matching the wildcards in NAME_TEMPLATE_OF_DATA_FRAMES
# don't accept the "direct beam" shot at SLS/Pilatus PX-I and PX-II
/bin/ls -C1 $1 $1.bz2 2>/dev/null | egrep -v "_00000.cbf|_000.img" > tmp1 || exit 1

# we can continue - the frames are found

# set upper limit of DATA_RANGE to number of frames (see "limitations" above)
DATA_RANGE=`wc -l tmp1 | awk '{print $1}'`

# set upper limit of SPOT_RANGE to half of DATA_RANGE, but not less than 1
SPOT_RANGE=`echo "scale=0; $DATA_RANGE/2" | bc -l`
SPOT_RANGE=`echo "if ($SPOT_RANGE<1) 1;if ($SPOT_RANGE>1) $SPOT_RANGE" | bc -l`

echo DATA_RANGE=1 $DATA_RANGE

# find out detector type
DET=XXX
FIRSTFRAME=`head -1 tmp1`
echo $FIRSTFRAME | grep -q bz2 && bzcat $FIRSTFRAME > tmp1 && FIRSTFRAME=tmp1
strings $FIRSTFRAME > tmp2
egrep -q 'marccd|Corrected' tmp2 && DET=mccd
grep -q PILATUS tmp2             && DET=pilatus
grep -q BEAM_CENTER_X tmp2       && DET=adsc
# identify other detector types in the same way (MAR IP would be straightforward)

# parse ASCII header of first frame

if [ "$DET" == "XXX" ]; then
  echo "this is not a MAR, ADSC/SMV or PILATUS detector - fill in XXX values manually!"
  DETECTOR="XXX MINIMUM_VALID_PIXEL_VALUE=XXX OVERLOAD=XXX"

# find parameters of first frame
elif [ "$DET" == "mccd" ]; then
  echo Data from a MarCCD detector
  
  DETECTOR="CCDCHESS MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65500"
  SENSOR_THICKNESS=0.01
  # use first frame of dataset to obtain parameters

  # offsets are documented; values can be find in mccd_xdsparams.pl script
  let SKIP=1024+80
  NX=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  let SKIP=$SKIP+4
  NY=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')

  let SKIP=1720
  DETECTOR_DISTANCE=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  DETECTOR_DISTANCE=`echo "scale=3; $DETECTOR_DISTANCE/1000" | bc -l`
    
  let SKIP=1024+256+128+256+4
  ORGX=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  ORGX=`echo "scale=2; $ORGX/1000" | bc -l `
  let SKIP=$SKIP+4
  ORGY=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  ORGY=`echo "scale=2; $ORGY/1000" | bc -l `

  let SKIP=1024+256+128+256+44
  PHISTART=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  let SKIP=1024+256+128+256+76
  PHIEND=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  OSCILLATION_RANGE=`echo "scale=3; ($PHIEND-($PHISTART))/1000" | bc -l`
  
  let SKIP=1024+256+128+256+128+4
  QX=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  QX=`echo "scale=10; $QX/1000000" |bc -l `
  let SKIP=$SKIP+4
  QY=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  QY=`echo "scale=10; $QY/1000000" |bc -l `

  let SKIP=1024+256+128+256+128+128+12
  X_RAY_WAVELENGTH=$(od -t dI --skip-bytes=$SKIP --read-bytes=4 $FIRSTFRAME | head -1 | awk '{print $2}')
  X_RAY_WAVELENGTH=`echo "scale=5; $X_RAY_WAVELENGTH/100000" | bc -l`

# at most BLs, ORGX and ORGY are in pixels, but sometimes in mm ... guess:
  NXBYFOUR=`echo "scale=0; $NX/4" | bc -l `
  ORGXINT=`echo "scale=0; $ORGX/1" | bc -l `
  if [ $ORGXINT -lt $NXBYFOUR ]; then
     ORGX=`echo "scale=1; $ORGX/$QX" | bc -l`
     ORGY=`echo "scale=1; $ORGY/$QY" | bc -l`
     echo MARCCD detector: header ORGX, ORGY seem to be in mm ... converting to pixels
  else
     echo MARCCD detector: header ORGX, ORGY seem to be in pixel units
  fi

elif [ "$DET" == "adsc" ]; then

  DETECTOR="ADSC MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65000"
  echo Data from ADSC detector. Obtaining ORGX, ORGY depends on beamline setup:
  SENSOR_THICKNESS=0.01
  sed s/\;// tmp2 > tmp1
  mv tmp1 tmp2

      # find X_RAY_WAVELENGTH:
      X_RAY_WAVELENGTH=`grep WAVELENGTH tmp2 | head -1 | sed s/WAVELENGTH=//`

      # find NX, QX, ORGX and ORGY:
      NX=`grep SIZE1 tmp2 | tail -1 | sed s/SIZE1=//`
      QX=`grep PIXEL_SIZE tmp2 | sed s/PIXEL_SIZE=//`
# FIXME - next 2 lines should be done properly, from header
      NY=$NX      
      QY=$QX
      BEAM_CENTER_X=`grep BEAM_CENTER_X tmp2 | sed s/BEAM_CENTER_X=//`
      BEAM_CENTER_Y=`grep BEAM_CENTER_Y tmp2 | sed s/BEAM_CENTER_Y=//`
# fix 2010-04-26 - tell user about possible ORGX, ORGY alternatives -  
# at ESRF and ... (pls fill in!) the following should be used:
      ORGX=`echo "scale=1; $BEAM_CENTER_Y/$QX" | bc -l `
      ORGY=`echo "scale=1; $BEAM_CENTER_X/$QX" | bc -l `
      echo - at ESRF BLs use: ORGX=$ORGX ORGY=$ORGY 
# this 2nd alternative convention should be used at the following beamlines (pls complete the list): ALS 5.0.3, ...
      ORGX=`echo "scale=1; $NX-$BEAM_CENTER_X/$QX" | bc -l `
      ORGY=`echo "scale=1; $BEAM_CENTER_Y/$QX" | bc -l `
      echo - at e.g. ALS 5.0.3 use: ORGX=$ORGX ORGY=$ORGY 
# this 3rd alternative convention should be used at the following beamlines (pls complete the list): ALS 8.2.2, ... 
      ORGX=`echo "scale=1; $BEAM_CENTER_X/$QX" | bc -l `
      ORGY=`echo "scale=1; $NX-$BEAM_CENTER_Y/$QX" | bc -l `
      echo - at e.g. ALS 8.2.2 use: ORGX=$ORGX ORGY=$ORGY - this is written to XDS.INP
# the latter alternative is written into the generated XDS.INP ! You have to correct this manually in XDS.INP, or adjust this script.
      # find DETECTOR_DISTANCE and OSCILLATION_RANGE:
      DETECTOR_DISTANCE=`grep DISTANCE tmp2 | sed s/DISTANCE=//`
      OSCILLATION_RANGE=`grep OSC_RANGE tmp2 | sed s/OSC_RANGE=//`

elif [ "$DET" == "pilatus" ]; then
  DETECTOR="PILATUS MINIMUM_VALID_PIXEL_VALUE=0 OVERLOAD= 1048576  !PILATUS 6M"
  SENSOR_THICKNESS=0.32
  NX=2463 NY=2527 QX=0.172 QY=0.172
  echo Data from a Pilatus detector
  sed s/#// tmp2 > tmp1
  mv tmp1 tmp2

      # find X_RAY_WAVELENGTH:
      X_RAY_WAVELENGTH=`grep Wavelength tmp2 | sed -e s/Wavelength// -e s/A// | awk '{print $1}'`

      # find ORGX and ORGY:
      ORGX=`grep Beam_xy tmp2 | sed -e s/\(// -e s/\)// -e s/\,// | awk '{print $2}'`
      ORGY=`grep Beam_xy tmp2 | sed -e s/\(// -e s/\)// -e s/\,// | awk '{print $3}'`

      # find DETECTOR_DISTANCE and OSCILLATION_RANGE:
      DETECTOR_DISTANCE=`awk '/distance/{print $2}' tmp2`
      DETECTOR_DISTANCE=`echo "$DETECTOR_DISTANCE*1000" | bc -l`

      OSCILLATION_RANGE=`awk '/Angle/{print $2}' tmp2`

else
  echo should never come here
  exit 1
fi

echo ORGX= $ORGX ORGY= $ORGY - check these values with adxv !
echo DETECTOR_DISTANCE= $DETECTOR_DISTANCE
echo OSCILLATION_RANGE= $OSCILLATION_RANGE
echo X-RAY_WAVELENGTH= $X_RAY_WAVELENGTH

# now we know everything that is required to generate XDS.INP

cat > XDS.INP << eof
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
ORGX= $ORGX ORGY= $ORGY  ! check these values with adxv !
DETECTOR_DISTANCE= $DETECTOR_DISTANCE
OSCILLATION_RANGE= $OSCILLATION_RANGE
X-RAY_WAVELENGTH= $X_RAY_WAVELENGTH
NAME_TEMPLATE_OF_DATA_FRAMES=$NAME_TEMPLATE_OF_DATA_FRAMES
! REFERENCE_DATA_SET=xxx/XDS_ASCII.HKL ! e.g. to ensure consistent indexing  
DATA_RANGE=1 $DATA_RANGE
SPOT_RANGE=1 $SPOT_RANGE
! BACKGROUND_RANGE=1 10 ! rather use defaults (first 5 degree of rotation)

SPACE_GROUP_NUMBER=0                   ! 0 if unknown
UNIT_CELL_CONSTANTS= 70 80 90 90 90 90 ! put correct values if known
INCLUDE_RESOLUTION_RANGE=50 0  ! after CORRECT, insert high resol limit; re-run CORRECT

FRIEDEL'S_LAW=FALSE     ! This acts only on the CORRECT step
! If the anom signal turns out to be, or is known to be, very low or absent,
! use FRIEDEL'S_LAW=TRUE instead (or comment out the line); re-run CORRECT

! remove the "!" in the following line:
! STRICT_ABSORPTION_CORRECTION=TRUE
! if the anomalous signal is strong: in that case, in CORRECT.LP the three
! "CHI^2-VALUE OF FIT OF CORRECTION FACTORS" values are significantly> 1, e.g. 1.5
!
! exclude (mask) untrusted areas of detector, e.g. beamstop shadow :
! UNTRUSTED_RECTANGLE= 1800 1950 2100 2150 ! x-min x-max y-min y-max ! repeat
! UNTRUSTED_ELLIPSE= 2034 2070 1850 2240 ! x-min x-max y-min y-max ! if needed
!
! parameters with changes wrt default values:
TRUSTED_REGION=0.00 1.2  ! partially use corners of detectors; 1.41421=full use
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=7000. 30000. ! often 8000 is ok
MINIMUM_ZETA=0.05        ! integrate close to the Lorentz zone; 0.15 is default
STRONG_PIXEL=6           ! COLSPOT: only use strong reflections (default is 3)
MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3 ! default of 6 is sometimes too high
REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE 

! parameters specifically for this detector and beamline:
DETECTOR= $DETECTOR
SENSOR_THICKNESS= $SENSOR_THICKNESS 
! attention CCD detectors: for very high resolution (better than 1A) make sure to specify SILICON 
! as about 32* what CORRECT.LP suggests (absorption of phosphor is much higher than that of silicon)
NX= $NX NY= $NY  QX= $QX  QY= $QY ! to make CORRECT happy if frames are unavailable
DIRECTION_OF_DETECTOR_X-AXIS=1 0 0
DIRECTION_OF_DETECTOR_Y-AXIS=0 1 0
INCIDENT_BEAM_DIRECTION=0 0 1
ROTATION_AXIS=1 0 0    ! at e.g. SERCAT ID-22, APS 19-ID, Australian Synchrotron this needs to be -1 0 0
FRACTION_OF_POLARIZATION=0.98   ! better value is provided by beamline staff!
POLARIZATION_PLANE_NORMAL=0 1 0
eof
echo XDS.INP is ready for use. The file has only the most important keywords.
echo Full documentation, including complete detector templates, is at
echo http://www.mpimf-heidelberg.mpg.de/~kabsch/xds . More documentation in XDSwiki
echo After running xds, inspect, using XDS-Viewer, at least the beamstop mask in
echo BKGPIX.cbf, and the agreement of predicted and observed spots in FRAME.cbf!
rm -f tmp1 tmp2