Xdscc12: Difference between revisions

(3 intermediate revisions by the same user not shown)
Line 2: Line 2:
XDSCC12 is a program for generating [[CC1/2|delta-CC<sub>1/2</sub>]] and delta-CC<sub>1/2-ano</sub> values for XDS_ASCII.HKL (written by [[XDS]]), or for XSCALE.HKL (written by [[XSCALE]]) containing data from several files of type XDS_ASCII.HKL after scaling (with MERGE=FALSE).  
XDSCC12 is a program for generating [[CC1/2|delta-CC<sub>1/2</sub>]] and delta-CC<sub>1/2-ano</sub> values for XDS_ASCII.HKL (written by [[XDS]]), or for XSCALE.HKL (written by [[XSCALE]]) containing data from several files of type XDS_ASCII.HKL after scaling (with MERGE=FALSE).  


It implements the method described in Assmann, Brehm and Diederichs (2016) Identification of rogue datasets in serial crystallography. J. Appl. Cryst. 49, 1021 [http://journals.iucr.org/j/issues/2016/03/00/zw5005/zw5005.pdf], and it does this not only for the individual datasets in XSCALE.HKL, but also for individual frames, or groups of frames, of a single dataset collected with the rotation method and processed by [[XDS]].
It implements the method described in Assmann, Brehm and Diederichs (2016) Identification of rogue datasets in serial crystallography. J. Appl. Cryst. 49, 1021 [http://journals.iucr.org/j/issues/2016/03/00/zw5005/zw5005.pdf], and it does this not only for the individual datasets in XSCALE.HKL, but also for individual frames, or groups (batches) of frames, of a single dataset collected with the rotation method and processed by [[XDS]].


The program can be downloaded for [https://{{SERVERNAME}}/pub/linux_bin/xdscc12 Linux] or [https://{{SERVERNAME}}/pub/mac_bin/xdscc12 Mac].
The program can be downloaded for [https://{{SERVERNAME}}/pub/linux_bin/xdscc12 Linux] or [https://{{SERVERNAME}}/pub/mac_bin/xdscc12 Mac].
Line 8: Line 8:
Usage (this text can be obtained with <code>xdscc12 -h</code>):
Usage (this text can be obtained with <code>xdscc12 -h</code>):
<pre>
<pre>
xdscc12 KD 2019-04-30. Academic use only; no redistribution. -h option shows options.
xdscc12 KD 2023-01-08. Academic use only; no redistribution. -h option shows options.
Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
running 'xdscc12 -h' on 20190502 at 16:11:46 +0200
running 'xdscc12' on 20240403 at 16:49:03 +0200
usage: xdscc12 [-dmin <lowres>] [-dmax <highres>] [-nbin <nbin>] [-mode <1 or 2>] [-<abcdefstwz>] [-r <ref>] FILE_NAME
usage: xdscc12 [-dmin <lowres>] [-dmax <highres>] [-nbin <nbin>] [-mode <1 or 2>] [-<abcdefstwz>] [-r|-R <ref>] FILE_NAME
dmin (default 999A), dmax (default 1A) and nbin (default 10) have the usual meanings.
dmin (default 999A), dmax (default 1A) and nbin (default 10) have the usual meanings.
mode can be 1 (equal volumes of resolution shells) or 2 (increasing volumes; default).
mode can be 1 (equal volumes of resolution shells) or 2 (increasing volumes; default).
   -r: next parameter: ASCII reference file with lines: h,k,l,Fcalc or h,k,l,Fcalc+,Fcalc-
   -r: <ref> is ASCII reference file with lines: h,k,l,Fref or h,k,l,Fref+,Fref-
       this allows calculation of CC of isomorphous signal with reference
  -R: <ref> is ASCII reference file with lines: h,k,l,Iref or h,k,l,Iref+,Iref-
   -s: read two columns from reference: Fcalc(+), Fcalc(-).  
       -r and -R allow calculation of CC of isomorphous signal with reference
   -s: read two columns from reference: Fref+, Fref- or Iref+, Iref-.  
       this allows calculation of CC of anomalous signal with that of reference
       this allows calculation of CC of anomalous signal with that of reference
  -A: sort INPUT_FILEs in XSCALE.INP.rename_me by anomalous instead of isomorphous delta-CC1/2
   -t: total oscillation (degree) to batch fine-sliced frames into
   -t: total oscillation (degree) to batch fine-sliced frames into
  FILE_NAME can be XDS or XSCALE reflection file
  FILE_NAME can be XDS or XSCALE reflection file
Line 29: Line 31:
   -w: weighting of intensities with their sigmas
   -w: weighting of intensities with their sigmas
   -z: Fisher transformation of delta-CC1/2 values
   -z: Fisher transformation of delta-CC1/2 values
The program writes a commented XSCALE.INP.rename_me that is sorted on delta-CC1/2
</pre>
</pre>
The program output in the terminal window is terse but supposed to be self-explanatory; it can (and most often should) be saved or re-directed to a file.
xdscc12 ... > xdscc12.log  #  or xdscc12 ... | tee xdscc12.log
All statistics (tables) produced by XDSCC12 may be visualized with e.g. gnuplot, after grepping the relevant lines from the output.
If XDSCC12 is used with a XDS_ASCII.HKL reflection file (from XDS), the isomorphous delta-CC<sub>1/2</sub> of a batch of frames (width chosen with the <code>-t</code> option; typically <code>-t 1</code>) relative to all data is most easily visualized via [[XDSGUI]] (Statistics tab). Negative numbers indicate a worsening of the overall signal.
If XDSCC12 is used with a XSCALE.HKL generated from multiple datasets, the output lines show the contribution of each dataset toward the total CC<sub>1/2</sub>. In this case, the program writes a file called XSCALE.INP.rename_me which shows statistics of delta-CC<sub>1/2</sub> and delta-CC<sub>1/2-ano</sub> values, and has a sorted enumeration of the INPUT_FILEs - the first of these provides the best data set, and the last one is the worst one. This XSCALE.INP.rename_me can then be edited (i.e. for deleting a few data sets with very negative delta-CC<sub>1/2</sub>), and renamed to XSCALE.INP.
Statistics are given (in resolution shells) for the isomorphous and the anomalous signal. In case of [[SSX]] data (which have few reflections per data set, compared to complete data sets), we typically use <code>-nbin 1</code> as option.
To find out about the influence of the ''a'' and ''b'' parameters of the XDS/XSCALE-adjusted error model, you may try the <code>-w</code> option; this assigns the same sigma to all reflections. Likewise, the [https://en.wikipedia.org/wiki/Fisher_transformation Fisher transformation], which serves to make changes in CC<sub>1/2</sub> comparable across resolution ranges, may be switched off for testing purposes, with the -z option.


== Example output ==
== Example output ==
Line 283: Line 276:
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0019/xds/XDS_ASCII.HKL
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0019/xds/XDS_ASCII.HKL
</pre>
</pre>
== Explanation of output ==
The program output in the terminal window is terse but meant to be self-explanatory; it can (and most often should) be saved or redirected to a file.
 xdscc12 ... > xdscc12.log  #  or xdscc12 ... | tee xdscc12.log
All statistics (tables) produced by XDSCC12 may be visualized with e.g. gnuplot, after grepping the relevant lines from the output.
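As a minimal sketch of the grepping step, the shell function below prints those lines of a saved log whose first field is a given letter. <code>xdscc12_extract</code> is a hypothetical helper, not part of XDSCC12, and it assumes that the letter is separated from the numbers by whitespace; please check this against your own output.

```shell
# Hypothetical helper (not part of XDSCC12): print the lines of a saved log
# whose first whitespace-separated field equals the given tag (e.g. a, b, d).
# Assumption: the tag is a separate first field; verify this on your own log.
# Lines are printed unchanged, so column numbers stay as in the log.
xdscc12_extract() {
    # $1: tag letter, $2: log file
    awk -v tag="$1" '$1 == tag' "$2"
}

# Example (not run here):
#   xdscc12 ... > xdscc12.log
#   xdscc12_extract a xdscc12.log > a.dat
```

The resulting file can be loaded into gnuplot or any other plotting program; since the first column then holds the letter, plotting should start from the second column.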
If XDSCC12 is used with an XDS_ASCII.HKL reflection file (from XDS), the isomorphous delta-CC<sub>1/2</sub> of a batch of frames (width chosen with the <code>-t</code> option; typically <code>-t 1</code>) relative to all data is most easily visualized via [[XDSGUI]] (Statistics tab). Negative numbers indicate a worsening of the overall signal.
If XDSCC12 is used with an XSCALE.HKL generated from multiple datasets, the output lines show the contribution of each dataset toward the total CC<sub>1/2</sub>. In this case, the program writes a file called XSCALE.INP.rename_me which shows statistics of delta-CC<sub>1/2</sub> and delta-CC<sub>1/2-ano</sub> values, and has a sorted enumeration of the INPUT_FILEs - the first of these is the best data set, and the last one is the worst. This XSCALE.INP.rename_me can then be edited (e.g. to delete a few data sets with strongly negative delta-CC<sub>1/2</sub>), and renamed to XSCALE.INP. Only delete the clearly worst data sets, and not more than 10% of the existing ones! This procedure can be iterated, i.e. after another round of XSCALE, XDSCC12 can be run again.
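The 10% rule can be sketched as a small shell function. This is only an illustration under stated assumptions: <code>propose_removals</code> is a hypothetical helper, its input is a two-column table (delta-CC<sub>1/2</sub>, data set name) that you prepare yourself and <b>not</b> XSCALE.INP.rename_me itself, and the cutoff for "strongly negative" is a value you have to choose.

```shell
# Hypothetical helper (not part of XDSCC12).
# Input table ($1): one data set per line, "delta_cc12 name", prepared by hand;
# this is NOT the format of XSCALE.INP.rename_me.
# Cutoff ($2): user-chosen value below which a delta counts as strongly negative.
# Output: names proposed for removal, worst first, never more than 10% of all.
propose_removals() {
    # count usable lines, then cap removals at 10% (integer division rounds down)
    n=$(awk 'NF >= 2 { c++ } END { print c + 0 }' "$1")
    max=$((n / 10))
    # sort ascending so the most negative delta comes first
    awk 'NF >= 2' "$1" | LC_ALL=C sort -n -k1,1 |
        awk -v cutoff="$2" -v max="$max" \
            'NR <= max + 0 && $1 + 0 < cutoff + 0 { print $2 }'
}

# Example (not run here):
#   propose_removals deltas.txt -0.5
```

Because the cap is rounded down, nothing is proposed for fewer than 10 data sets. The proposal is only a starting point; the decision should still be made by inspecting the statistics.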
Overall statistics are reported in the lines starting with <code>a</code> and <code>d</code> for
* <b>only</b> those unique reflections that are actually present in the frame / batch / dataset. These values are in columns 3-6.
* <b>all</b> unique reflections of the merged dataset (but a frame / batch / dataset may not have all unique reflections, so the "all" values report the mean influence). These values are in columns 7-10.
Typically, it is sensible to disregard the "all" values, and to base decisions on the "only" values, because the latter are not affected by the number of reflections of the particular frame / batch / dataset. The words "all" and "only" are used in this sense throughout the terminal and file output of XDSCC12.
Statistics for "only" the unique reflections of a frame / batch / dataset are given in resolution shells for the isomorphous (in lines starting with <code>b</code> and <code>c</code>) and the anomalous signal (in lines starting with <code>d</code> and <code>e</code>). In case of [[SSX]] data (which have few reflections per data set, compared to complete data sets), we typically use <code>-nbin 1</code> as option, to define only a single resolution shell.
To find out about the influence of the ''a'' and ''b'' parameters of the XDS/XSCALE-adjusted error model, you may try the <code>-w</code> option; this assigns the same sigma to all reflections. Likewise, the [https://en.wikipedia.org/wiki/Fisher_transformation Fisher transformation], which serves to make changes in CC<sub>1/2</sub> comparable across resolution ranges, may be switched off for testing purposes, with the <code>-z</code> option.
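To illustrate why the Fisher transformation makes changes comparable, the sketch below evaluates the textbook formula z = atanh(r) = 0.5 ln((1+r)/(1-r)) for a single correlation coefficient. <code>fisher_z</code> is a hypothetical helper; how XDSCC12 applies the transformation internally is not shown here.

```shell
# Hypothetical helper: Fisher transformation of one correlation coefficient.
# z = atanh(r) = 0.5 * ln((1 + r) / (1 - r)), defined for -1 < r < 1.
fisher_z() {
    # $1: correlation coefficient r
    awk -v r="$1" 'BEGIN { printf "%.4f\n", 0.5 * log((1 + r) / (1 - r)) }'
}

# Example:
#   fisher_z 0.5     # prints 0.5493
#   fisher_z 0.9
#   fisher_z 0.95
```

The same step of 0.05 in CC corresponds to a much larger step in z near CC = 0.9 than near CC = 0.5, which is why untransformed differences in strong (low-resolution) and weak (high-resolution) shells are not directly comparable.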


== Correlation against a reference data set (-r <reference> option) ==
== Correlation against a reference data set (-r <reference> option) ==
Line 288: Line 298:
To prepare a reference data set if the refinement was done with phenix.refine, one could use e.g.
To prepare a reference data set if the refinement was done with phenix.refine, one could use e.g.
<pre>
<pre>
mtz2various hklin 2bn3_refine_001.mtz hklout temp.hkl <<eof
mtz2various hklin 2bn3_refine_001.mtz hklout reference.hkl <<eof
OUTPUT USER *
OUTPUT USER *
LABIN FC=F-model PHIC=PHIF-model
LABIN FC=F-model PHIC=PHIF-model
Line 298: Line 308:
sftools
sftools
read mymodel_001.mtz
read mymodel_001.mtz
write temp.hkl format(3i5,f10.3) col F-model
write reference.hkl format(3i5,f10.3) col F-model
y
y
quit
quit
</pre>
</pre>
For a Refmac-written MTZ file, you would use "col FC_ALL" instead of "col F-model".
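Before passing the reference file to <code>-r</code>, it may be worth checking that every line has the expected number of columns. The function below is a hypothetical sanity check, not part of XDSCC12; it assumes whitespace-separated columns, as written by the commands above, with 4 columns for h,k,l,Fref and 5 columns for h,k,l,Fref+,Fref-.

```shell
# Hypothetical sanity check (not part of XDSCC12) for an ASCII reference file.
# $1: reference file, $2: expected number of columns
#     (4 for h,k,l,Fref; 5 for h,k,l,Fref+,Fref-).
# Assumption: columns are separated by whitespace.
# Returns non-zero if any line has a different number of columns.
check_reference() {
    awk -v ncol="$2" '
        NF != ncol + 0 {
            bad++
            if (bad == 1) printf "first bad line %d: %s\n", NR, $0
        }
        END {
            printf "%d lines, %d malformed\n", NR, bad + 0
            exit (bad > 0)
        }
    ' "$1"
}

# Example (not run here):
#   check_reference reference.hkl 4 && xdscc12 -r reference.hkl XDS_ASCII.HKL
```

With a fixed-width format such as <code>3i5</code>, indices that need five characters would run into the neighbouring column and be reported as malformed by this check.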


=== Reference data with anomalous signal (additional -s option) ===
=== Reference data with anomalous signal (additional -s option) ===