Xscale: Difference between revisions

From XDSwiki
Latest revision as of 11:11, 20 March 2024

Simple and advanced usage

XSCALE is the stand-alone scaling program of the XDS suite. It scales reflection files (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT step of XDS already scales an individual dataset, XSCALE is only needed if several datasets should be scaled relative to one another. However, a dataset does not deteriorate (is not over-fitted) if it is "scaled again" in XSCALE, since the supporting points of the scale factors are at the same positions in detector and batch space.

One advantage of using XSCALE for a single dataset is that the user can specify the number and limits of the resolution shells. Another is that zero-dose extrapolation can be done.

The XDS website provides a short and a long commented example of XSCALE.INP.


A minimal input file to combine two datasets into one file is:

OUTPUT_FILE=fae-native.ahkl 
INPUT_FILE= ../fae-native/xds_1/XDS_ASCII.HKL
INPUT_FILE= ../fae-native/xds_2/XDS_ASCII.HKL
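
When many datasets are to be combined, writing such a file by hand becomes tedious. The following is a minimal sketch of how the file above could be generated with a script. The function name xscale_inp is hypothetical, and the sketch only emits the two keywords used in the example above.

```python
from typing import Iterable


def xscale_inp(output_file: str, input_files: Iterable[str]) -> str:
    """Return XSCALE.INP text with one OUTPUT_FILE and its INPUT_FILE lines."""
    lines = [f"OUTPUT_FILE={output_file}"]
    # One INPUT_FILE line per dataset, in the order given.
    lines += [f"INPUT_FILE= {path}" for path in input_files]
    return "\n".join(lines) + "\n"


text = xscale_inp(
    "fae-native.ahkl",
    ["../fae-native/xds_1/XDS_ASCII.HKL", "../fae-native/xds_2/XDS_ASCII.HKL"],
)
print(text, end="")
```

The returned text can then be written to XSCALE.INP in the directory where XSCALE is run.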

Several output files can be specified (together with their sets of input files) in a single run of XSCALE, simply by concatenating sections like the one above. All output files are then on the same scale, which is recommended for MAD data sets:

OUTPUT_FILE=fae-rh.ahkl 
INPUT_FILE= ../fae-rh/xds_1/XDS_ASCII.HKL
FRIEDEL'S_LAW=FALSE   
STRICT_ABSORPTION_CORRECTION=TRUE         ! see XDSwiki:Tips_and_Tricks
! the star in front of the file name indicates that it is the reference wrt falloff
INPUT_FILE= *../fae-rh/xds_2/XDS_ASCII.HKL
FRIEDEL'S_LAW=FALSE
STRICT_ABSORPTION_CORRECTION=TRUE

OUTPUT_FILE=fae-ip.ahkl 
INPUT_FILE= ../fae-ip/xds_1/XDS_ASCII.HKL
FRIEDEL'S_LAW=FALSE
STRICT_ABSORPTION_CORRECTION=TRUE
INPUT_FILE= ../fae-ip/xds_2/XDS_ASCII.HKL
FRIEDEL'S_LAW=FALSE
STRICT_ABSORPTION_CORRECTION=TRUE

Further keywords

  • RESOLUTION_SHELLS=  ! for the printout of R-factors, completeness, ...
  • SPACE_GROUP_NUMBER=  ! if not given, picked up from first input reflection file
  • UNIT_CELL_CONSTANTS=  ! if not given, picked up from first input reflection file

keywords with the same meaning as in CORRECT

  • REIDX=
  • REFERENCE_DATA_SET=  ! see also REFERENCE_DATA_SET
  • MINIMUM_I/SIGMA=
  • REFLECTIONS/CORRECTION_FACTOR=
  • FRIEDEL'S_LAW=
  • STRICT_ABSORPTION_CORRECTION=
  • INCLUDE_RESOLUTION_RANGE=
  • MAXIMUM_NUMBER_OF_PROCESSORS=
  • CORRECTIONS=
  • NBATCH=

keywords unique to XSCALE

  • REIDX_ISET=  ! re-index data from the most recent INPUT_FILE
  • MERGE=  ! average intensities from all input files, applies to output file
  • WEIGHT=  ! applies to input file
  • CRYSTAL_NAME=  ! switch on radiation damage correction for individual reflections (f.i.r.)
  • STARTING_DOSE=  ! (optional for radiation damage correction f.i.r.)
  • DOSE_RATE=  ! (optional for radiation damage correction f.i.r.)
  • 0-DOSE_SIGNIFICANCE_LEVEL=  ! (optional for radiation damage correction f.i.r.)
  • SAVE_CORRECTION_IMAGES=  ! Default is TRUE. If FALSE, don't write DECAY*.cbf MODPIX*.cbf ABSORP*.cbf

Radiation damage correction

based on resolution shell and frame number

The usual correction (as in AIMLESS and SCALEPACK), based on resolution shell and frame number, is performed in XDS as part of the CORRECT step. It can be switched off by omitting DECAY from the default CORRECTIONS= DECAY MODULATION ABSORP. The DECAY correction is also the default in XSCALE.

It is instructive to inspect DECAY.cbf (using "XDS-Viewer DECAY.cbf"). This visualizes the scale factors employed by the CORRECT step (the equivalent files from XSCALE are called DECAY_*.cbf); the right sidebar gives the mapping between shades of gray and numbers (1000 corresponds to a scale factor of 1). The horizontal axis shows the frame number (or rather the batch number), the vertical axis the resolution shell.

for individual reflections: zero-dose extrapolation

To "switch on" radiation damage correction of individual reflections (K. Diederichs, S. McSweeney and R. B. G. Ravelli (2003) Zero-dose extrapolation as part of macromolecular synchrotron data reduction. Acta Cryst. D59, 903-909) it suffices to use the CRYSTAL_NAME keyword. The CRYSTAL_NAME parameters of different datasets do not have to be different. If they are different, this results in more degrees of freedom (namely, the slopes of the reflection intensity as a function of dose) for the program to fit the observed changes of intensities which are induced by radiation damage. However, if the datasets are based on the same crystal, or the datasets are based on crystals from the same drop, it is reasonable to assume that the slopes are the same. Example:

OUTPUT_FILE=fae-merge.ahkl 
  INPUT_FILE= ../fae-ip/xds_1/XDS_ASCII.HKL  !
  CRYSTAL_NAME=ip
  INPUT_FILE= ../fae-ip/xds_2/XDS_ASCII.HKL  ! same crystal, but translated along z
  CRYSTAL_NAME=ip

This is the recommended way as it reduces overfitting.

If, however, the crystals represent different heavy atom soaks, it is advisable to give a different CRYSTAL_NAME to each dataset. Example:

OUTPUT_FILE=hg.ahkl 
  INPUT_FILE= ../xds-hg/XDS_ASCII.HKL  ! a mercury soak
  CRYSTAL_NAME=Hg

OUTPUT_FILE=pt.ahkl
  INPUT_FILE= ../xds-pt/XDS_ASCII.HKL  ! a platinum soak
  CRYSTAL_NAME=Pt

A word of warning: even if the internal quality indicators (R-factors) are better when using this feature, there is no guarantee that the resulting intensities will actually be better suited for your purposes than those obtained without it. In particular, extrapolating to the ends of the dose interval (0 dose and full dose) decreases the precision of the intensities.

Optimal values of dose, for interpolation

The optimal points for interpolation are near 1/4 and near 3/4 of the total dose. Details are published in Diederichs, K., Junk, M. (2009) Post-processing intensity measurements at favourable dose values. J. Appl. Cryst. 42, 48-57.

To interpolate to 22% of the full dose, one has to give a STARTING_DOSE less than zero:

OUTPUT_FILE=hg.ahkl 
  INPUT_FILE= ../xds-hg/XDS_ASCII.HKL  ! a mercury soak
  CRYSTAL_NAME=Hg
  STARTING_DOSE=-22.*   ! assuming the dataset has 100 frames

Explanation: the interpolation is done towards dose 0. By defining the start of the dataset to be at -22., one tells the program to calculate (by interpolation) the intensity values that would be obtained at dose 0, which in reality is near frame 22.

Another example: by defining STARTING_DOSE=-78.* one would tell the program to calculate, by interpolation, those intensity values that correspond to those that would be obtained near frame 78.
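
The arithmetic behind these two examples can be sketched as a small helper. It assumes, as the examples above do, that dose is counted in frames (one dose unit per frame). The function name starting_dose and its parameter dose_per_frame are hypothetical and only serve the illustration.

```python
def starting_dose(target_percent: float, n_frames: int,
                  dose_per_frame: float = 1.0) -> float:
    """STARTING_DOSE that places dose 0 at target_percent of the full dose."""
    if not 0.0 <= target_percent <= 100.0:
        raise ValueError("target_percent must lie between 0 and 100")
    # Assumption: the full dose is the number of frames times the dose per frame.
    full_dose = n_frames * dose_per_frame
    # Shifting the start of the dataset to a negative dose moves dose 0
    # into the dataset, at the requested fraction of the full dose.
    return -(target_percent * full_dose / 100.0)


print(starting_dose(22, 100))  # -22.0
print(starting_dose(78, 100))  # -78.0
```

For a dataset with a different number of frames, the same percentage gives a proportionally different value, e.g. 200 frames and 22% give -44.0.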


Scaling many datasets

When scaling e.g. hundreds of partial datasets, XSCALE may finish with the error message !!! ERROR !!! INSUFFICIENT NUMBER OF COMMON STRONG REFLECTIONS. This usually indicates that one or more datasets have too few reflections. Please inspect the table

DATA    MEAN       REFLECTIONS        INPUT FILE NAME
 SET# INTENSITY  ACCEPTED REJECTED
 

and check the column "ACCEPTED REFLECTIONS". Then remove the dataset(s) with fewest accepted reflections, and re-run the program. Repeat if necessary.

XSCALE makes it explicit which dataset(s) it cannot scale; it prints out e.g. "no common reflections with data set 197".
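
With hundreds of datasets, finding the one with the fewest accepted reflections by eye is error-prone. The sketch below sorts the rows of the table by the ACCEPTED column. It assumes that each row below the header consists of five whitespace-separated fields (set number, mean intensity, accepted, rejected, file name) and that file names contain no spaces; this layout should be checked against your own log. The function name and the sample rows are invented for illustration.

```python
import re

# Assumed row layout: set number, mean intensity, accepted, rejected, file name.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\S+)\s*$")


def accepted_per_dataset(log_text: str) -> list[tuple[int, int, str]]:
    """Return (accepted, set number, file name), fewest accepted first."""
    rows = []
    in_table = False
    for line in log_text.splitlines():
        if line.split()[:2] == ["SET#", "INTENSITY"]:
            in_table = True  # second header line; rows follow
            continue
        if not in_table:
            continue
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(3)), int(match.group(1)), match.group(5)))
        elif rows and line.strip():
            break  # first non-blank, non-row line after the rows ends the table
    return sorted(rows)


# Invented sample rows, only to show the assumed layout.
sample = """\
 DATA    MEAN       REFLECTIONS        INPUT FILE NAME
 SET# INTENSITY  ACCEPTED REJECTED
    1  0.5123E+03     8450       12  ../set_001/XDS_ASCII.HKL
    2  0.4987E+03       37        0  ../set_002/XDS_ASCII.HKL
    3  0.5050E+03     7911        5  ../set_003/XDS_ASCII.HKL
"""
for accepted, iset, name in accepted_per_dataset(sample):
    print(iset, accepted, name)
```

The first line printed names the dataset to remove first; after removing it from XSCALE.INP, the program is re-run as described above.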

XSCALE may also finish with the error message !!! ERROR !!! INACCURATE SCALING FACTORS. This usually indicates that one or more datasets are linearly dependent on others (this happens if the same data are included more than once as INPUT_FILE), or are pure noise.
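
A quick way to rule out the first cause is to look for INPUT_FILE entries that occur more than once in XSCALE.INP. The following sketch does this under the assumption that each line holds a single keyword, as in the examples on this page. The function name duplicate_inputs is hypothetical. It only detects identical paths, not copies of the same data stored under different names.

```python
import os
from collections import Counter


def duplicate_inputs(inp_text: str) -> list[str]:
    """Return INPUT_FILE paths that occur more than once in XSCALE.INP text."""
    paths = []
    for line in inp_text.splitlines():
        line = line.split("!", 1)[0].strip()  # drop trailing comments
        if not line.startswith("INPUT_FILE="):
            continue
        fields = line[len("INPUT_FILE="):].split()
        if not fields:
            continue
        # The leading star only marks the reference dataset.
        name = fields[0].lstrip("*")
        paths.append(os.path.normpath(name))
    counts = Counter(paths)
    return [path for path, n in counts.items() if n > 1]


example = """\
OUTPUT_FILE=merged.ahkl
INPUT_FILE= ../a/XDS_ASCII.HKL
INPUT_FILE= *../b/XDS_ASCII.HKL   ! reference
INPUT_FILE= ../a/./XDS_ASCII.HKL
"""
print(duplicate_inputs(example))
```

Here the first and third INPUT_FILE lines point to the same file, so that path is reported once.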