XDSCC12 is a program for generating delta-CC1/2 and delta-CC1/2-ano values for XDS_ASCII.HKL (written by XDS), or for XSCALE.HKL (written by XSCALE) containing data from several files of type XDS_ASCII.HKL after scaling (with MERGE=FALSE).

It implements the method described in Assmann, Brehm and Diederichs (2016) Identification of rogue datasets in serial crystallography. J. Appl. Cryst. 49, 1021 [1], and it does this not only for the individual datasets in XSCALE.HKL, but also for individual frames, or groups of frames, of a single dataset collected with the rotation method and processed by XDS.

The program can be downloaded for Linux or Mac.

Usage (this text can be obtained with xdscc12 -h):

xdscc12 KD 2019-04-30. Academic use only; no redistribution. -h option shows options.
Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
running 'xdscc12 -h' on 20190502 at 16:11:46 +0200
usage: xdscc12 [-dmin <lowres>] [-dmax <highres>] [-nbin <nbin>] [-mode <1 or 2>] [-<abcdefstwz>] [-r <ref>] FILE_NAME
dmin (default 999A), dmax (default 1A) and nbin (default 10) have the usual meanings.
mode can be 1 (equal volumes of resolution shells) or 2 (increasing volumes; default).
  -r: next parameter: ASCII reference file with lines: h,k,l,Fcalc or h,k,l,Fcalc+,Fcalc-
      this allows calculation of CC of isomorphous signal with reference
  -s: read two columns from reference: Fcalc(+), Fcalc(-). 
      this allows calculation of CC of anomalous signal with that of reference
  -t: total oscillation (degree) to batch fine-sliced frames into
 FILE_NAME can be XDS or XSCALE reflection file
 other options can be combined (e.g. -def), and switch the following OFF:
  -a: individual isomorphous summary values
  -b: individual (Fisher-transformed) delta-CC1/2 values
  -c: individual delta-CC1/2 reflection numbers
  -d: individual anomalous summary values
  -e: individual (Fisher-transformed) delta-CC1/2ano values
  -f: individual delta-CC1/2ano reflection numbers
  -w: weighting of intensities with their sigmas
  -z: Fisher transformation of delta-CC1/2 values

The program output in the terminal window is terse but meant to be self-explanatory; it can (and most often should) be saved or redirected to a file.

xdscc12 ... > xdscc12.log  #  or xdscc12 ... | tee xdscc12.log

All statistics (tables) produced by XDSCC12 can be visualized with e.g. gnuplot, after grepping the relevant lines from the output. If XDSCC12 is run on an XDS_ASCII.HKL reflection file (from XDS), the isomorphous delta-CC1/2 of each batch of frames (batch width chosen with the -t option; typically -t 1) relative to all data is most easily visualized via XDSGUI (Statistics tab). Negative numbers indicate that the batch worsens the overall signal.
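As a minimal sketch of the grepping step, the following shell function extracts the dataset number and the two delta columns from the lines starting with "a". It assumes the log layout shown in the example output further down this page (ten whitespace-separated fields per "a" data line); the function name extract_delta and the file names are illustrative and not part of XDSCC12.

```shell
# Hypothetical helper: print "set delta_only delta_all" for every "a" data line.
# Heading lines start with "a:" and are skipped, because their first field
# is not exactly "a".
extract_delta() {
    awk '$1 == "a" && NF == 10 { print $2, $6, $10 }' "$@"
}

# Example use (file names are placeholders):
#   extract_delta xdscc12.log > delta.dat
#   gnuplot -p -e "plot 'delta.dat' using 1:2 with boxes title 'delta-CC1/2 (only)'"
```

The same pattern with "d" instead of "a" would select the anomalous summary lines, provided they keep the same column layout.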

If XDSCC12 is run on an XSCALE.HKL generated from multiple datasets, the output lines show the contribution of each dataset to the total CC1/2. In this case, the program writes a file called XSCALE.INP.rename_me, which reports statistics of the delta-CC1/2 and delta-CC1/2-ano values and lists the INPUT_FILEs in sorted order: the first is the best data set and the last is the worst. XSCALE.INP.rename_me can then be edited (e.g. to delete a few data sets with very negative delta-CC1/2) and renamed to XSCALE.INP.

Statistics are given (in resolution shells) for the isomorphous and the anomalous signal. For SSX data (which have few reflections per data set, compared to complete data sets), we typically use the option -nbin 1.
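The program output describes the summary value <CC1/2> as the frequency-weighted average of CC1/2 over the resolution shells, the weights being the numbers of unique reflections per shell. A minimal sketch of that arithmetic follows; the helper name weighted_mean is hypothetical, and the sketch only reproduces the averaging, not the CC1/2 calculation itself.

```shell
# Frequency-weighted mean: sum(w_i * x_i) / sum(w_i).
# $1 = whitespace-separated per-shell values (e.g. CC1/2 in percent)
# $2 = matching weights (e.g. number of unique reflections per shell)
weighted_mean() {
    awk -v xs="$1" -v ws="$2" 'BEGIN {
        n = split(xs, x, " "); m = split(ws, w, " ")
        if (n == 0 || n != m) { print "need two equally long lists" > "/dev/stderr"; exit 1 }
        for (i = 1; i <= n; i++) { num += w[i] * x[i]; den += w[i] }
        printf "%.3f\n", num / den
    }'
}
```

Fed with the ten shell values and frequencies from the example output further down, this prints 44.468, the <CC1/2> reported there. With a single shell (-nbin 1) the average is simply that shell's value.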

To assess the influence of the a and b parameters of the error model adjusted by XDS/XSCALE, you may try the -w option; this switches off the sigma weighting, i.e. it assigns the same sigma to all reflections. Likewise, the Fisher transformation, which serves to make changes in CC1/2 comparable across resolution ranges, may be switched off for testing purposes with the -z option.
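The standard Fisher transformation of a correlation coefficient is z = atanh(CC) = 0.5*ln((1+CC)/(1-CC)). The sketch below only illustrates this formula. How XDSCC12 scales the transformed values or forms the difference internally is not described on this page, so the helper name fisher_z and its use here are assumptions.

```shell
# Standard Fisher z-transformation: z = atanh(cc) = 0.5*ln((1+cc)/(1-cc)).
# cc must be a fraction strictly between -1 and 1 (not a percentage).
fisher_z() {
    awk -v cc="$1" 'BEGIN {
        if (cc <= -1 || cc >= 1) { print "cc must be in (-1,1)" > "/dev/stderr"; exit 1 }
        printf "%.6f\n", 0.5 * log((1 + cc) / (1 - cc))
    }'
}
```

For example, fisher_z 0.5 prints 0.549306. Comparing the step from 0.90 to 0.95 with the step from 0.30 to 0.35 shows that the same change of 0.05 in CC corresponds to a much larger change in z at high CC, which illustrates why untransformed changes are not directly comparable between strong low-resolution and weak high-resolution shells.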

Example output with explanation

xdscc12 KD 2020-12-9. Academic use only; no redistribution. -h option shows options.
Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
 running 'xdscc12 temp.ahkl' on 20220413 at 12:13:53 +0200
 no option -w found, therefore statistics are weighted by sigma values
 no option -z found, therefore delta-CC1/2 values are Fisher-transformed


 reflection file is temp.ahkl
!SPACE_GROUP_NUMBER=   19
!UNIT_CELL_CONSTANTS=     38.30     79.10     79.10  90.000  90.000  90.000
 # of datasets=          20
 # obs (w/o misfits), unique, misfits =       51918       20213           0
 max and min resolution of data in file =   39.55000       1.801329    
 data between   39.55000      and   1.801329     A will be used
 10 resolution shells (for lines starting with b,c,e,f,r,s):
  5.644  4.011  3.281  2.844  2.545  2.324  2.152  2.013  1.899  1.801


 overall CC1/2:    83.328 nref=   14986 (but the overall CC1/2 is meaningless!)
 <CC1/2>:    44.468 (frequency-weighted average of CC1/2 in resolution shells)
 CC1/2 in resolution shells:
   91.4   81.0   72.0   68.4   42.1   41.4   32.4   29.0   33.2   28.2
 CC* in resolution shells:
   97.7   94.6   91.5   90.1   77.0   76.5   70.0   67.0   70.6   66.3
 frequency, i.e. number of unique reflections in resolution shells:
    493    887   1134   1345   1501   1649   1808   1921   2061   2187


 headings for lines starting with a,b,c:
a:  <CC1/2> of each dataset:
a:   reflections of this dataset only      reflections of all datasets
a: set   nref    with without   delta     nref    with without   delta
b: delta-CC1/2 in resolution shells
c: # reflections for delta-CC1/2
a      1   1241  61.936  42.634  26.227    14470  44.075  42.523   1.910
b 61.432 27.680 8.094 2.942 41.231 44.296 26.090 10.996 30.482 30.456
c     47     73     85    110    135    137    146    170    164    174
a      2   1565  43.941  34.156  11.512    14350  44.978  44.104   1.091
b 36.910 -5.982 -1.113 58.592 46.243 -24.157 -.234 20.068 11.307 -3.709
c     49     90    126    134    158    168    196    203    216    225
a      3   1754  39.551  29.634  11.233    14478  43.683  42.374   1.606
b 13.559 11.752 17.587 17.093 16.730 19.779 3.638 20.348 6.182 4.113
c     43    107    131    160    175    198    218    230    246    246
a      4   1468  41.694  25.840  17.768    14382  44.801  43.256   1.917
b -9.445 36.760 36.335 15.954 2.951 20.264 21.375 19.554 2.513 35.809
c     30     93     99    115    164    174    202    220    188    183
a      5   1412  41.785  39.193   3.100    14292  45.561  45.291   0.340
b 21.421 1.775 7.645 1.361 6.005 22.513 11.859 -16.883 .973 -2.353
c     41     68     82    111    144    173    176    190    210    217
a      6   1363  49.626  42.827   8.634    14293  44.938  44.354   0.728
b 32.175 53.390 25.822 21.623 -42.441 27.299 7.214 12.080 6.573 15.115
c     43     84     99    120    147    161    173    183    166    187
a      7   1686  48.062  37.817  12.521    14407  43.991  42.971   1.258
b 11.830 -4.851 11.505 15.574 67.688 20.009 5.129 -7.014 8.766 1.024
c     59    117    133    153    142    179    183    243    252    225
a      8   1795  46.357  34.049  14.614    14433  44.757  43.282   1.829
b 14.234 6.729 13.626 19.758 7.065 16.373 21.204 11.431 18.082 15.739
c     57    103    137    183    168    202    205    240    237    263
a      9   1483  50.778  46.558   5.526    14363  44.923  44.479   0.554
b 40.845 .431 30.046 -7.049 -1.607 -2.724 -9.486 -.902 20.019 27.538
c     46     88    119    135    154    161    193    192    203    192
a     10   1332  38.220  34.823   3.918    14506  43.988  43.550   0.541
b 33.078 34.752 9.425 25.010 -2.922 -14.118 14.600 12.844 1.512 -8.735
c     51     77     93    118    123    154    166    172    181    197
a     11   1477  45.163  38.543   8.015    14361  45.415  44.577   1.051
b 9.914 8.880 2.467 18.150 5.381 11.874 4.743 29.698 14.834 -17.211
c     65     82    111    133    115    159    174    202    220    216
a     12   1654  45.251  30.457  17.159    14409  44.363  42.437   2.373
b 28.681 6.430 -40.284 36.914 30.008 47.293 29.607 9.512 -1.455 21.488
c     53    109    116    142    154    177    196    229    235    243
a     13   1422  51.281  36.632  18.036    14302  44.261  42.742   1.872
b 23.460 4.559 43.871 13.497 28.091 35.298 -1.493 17.764 17.226 19.655
c     38     73    101    123    153    153    169    192    224    196
a     14   1668  51.749  49.607   2.882    14441  44.642  44.527   0.145
b 5.096 37.325 5.515 16.257 21.437 -7.901 14.955 4.960 -15.485 -2.206
c     52    115    121    161    172    177    201    205    223    241
a     15   1369  46.667  38.742   9.674    14263  44.480  43.770   0.882
b -22.140 29.901 48.902 -7.002 26.620 .483 8.071 13.097 11.535 -5.181
c     56     92    105    115    116    138    151    177    196    223
a     16   1257  33.100  23.275  10.645    14291  44.849  43.826   1.273
b -14.293 -.418 44.139 -1.828 30.203 21.053 5.098 22.834 -4.129 -.948
c     34     60     57    120    133    127    159    188    199    180
a     17   1370  42.421  39.259   3.794    14264  44.877  44.550   0.409
b 30.851 -17.854 11.508 12.578 -18.416 2.277 20.934 12.146 -.761 .881
c     33     72    106    131    134    151    157    189    197    200
a     18   1269  45.248  26.342  21.465    14302  43.295  41.438   2.264
b 23.189 2.571 31.364 37.883 60.830 25.784 13.345 33.459 -3.441 5.251
c     45    101     97    111    127    143    144    164    168    169
a     19   1644  43.200  43.114   0.105    14365  45.169  45.113   0.071
b 23.714 13.389 29.752 38.784 -16.069 6.952 10.177 9.560 -19.196 -16.013
c     56     87    132    138    168    179    183    215    236    250
a     20   1615  54.027  42.202  15.317    14366  45.818  44.496   1.661
b 20.068 20.724 30.458 21.475 33.352 28.157 5.646 16.916 9.398 8.795
c     52     87    131    149    156    181    195    200    230    234
 -------------------------------------------------------------


overall CC1/2ano:   -25.548 nref=    2668 (but the overall value is meaningless!)
<CC1/2ano> :   -21.370 (frequency-weighted average of CC1/2ano in resolution shells)
CC1/2ano in resolution shells:
  -75.0   -1.7  -30.1  -41.3  -38.8  -20.1  -18.2   -8.6  -17.7   -9.6
frequency i.e. number of unique Friedel pairs:
     73    140    168    241    262    320    319    351    397    397


 headings for lines starting with d,e,f:
d: <CC1/2ano> of each dataset:
d: reflections of this dataset only      reflections of all datasets
d: set   nref    with without   delta     nref    with without   delta
e: delta-CC1/2ano in resolution shells
f: # reflections for delta-CC1/2ano
d     1    190 -30.590 -20.348 -10.922     2233 -17.969 -17.817  -0.157
e -42.962 -6.341 4.502 -41.515 -76.721 -36.825 43.947 19.535 21.803 -23.853
f     11     12      9     14     17     28     26     22     26     25
d     2    234 -16.480 -16.515   0.036     2336 -26.125 -26.132   0.008
e -3.116 -2.078 -3.249 -23.820 6.424 9.198 -3.791 1.123 6.966 -4.309
f      5     12     16     12     35     30     40     24     29     31
d     3    343 -40.435 -31.820  -9.887     2245 -22.659 -19.640  -3.160
e -1.713 12.234 -11.942 -63.179 5.125 -4.423 15.291 -30.082 .859 3.714
f      7     16     27     33     42     45     38     54     43     38
d     4    250 -32.340 -29.274  -3.387     2385 -19.104 -18.426  -0.702
e 22.408 37.540 12.328 -13.456 -2.729 1.911 -36.173 3.026 -6.757 12.916
f      3     13     12     25     39     42     41     30     23     22
d     5    179 -22.995 -24.947   2.070     2417 -26.768 -27.230   0.499
e 36.142 13.527 -10.456 -7.251 13.331 -.923 27.239 -4.208 -9.635 -10.625
f      5      4      7      6     19     22     29     30     23     34
d     6    188 -10.090 -13.603   3.562     2412 -21.564 -21.759   0.205
e -43.278 1.987 48.529 -2.536 35.294 -14.994 21.066 -6.607 -.446 -8.161
f      5      9      8     10     21     28     25     30     27     25
d     7    292 -34.206 -35.887   1.915     2316 -26.435 -26.723   0.309
e -19.814 34.972 .715 48.559 -11.683 -4.391 -14.145 5.337 -3.722 2.527
f     12     15     16     23     31     43     28     46     45     33
d     8    305  -6.575  -7.745   1.176     2353 -19.132 -19.241   0.113
e -7.804 -5.876 -10.001 4.878 -37.785 8.976 14.698 -4.716 9.080 11.786
f     11     17     17     26     28     45     34     38     42     47
d     9    283 -35.313 -32.348  -3.347     2398 -20.019 -19.463  -0.578
e -90.495 31.505 6.592 -45.999 -7.955 -11.621 6.013 -6.399 18.725 2.816
f      7     16     14     23     31     32     39     37     52     32
d    10    228 -31.230 -36.231   5.639     2184 -20.991 -21.921   0.975
e -10.975 26.119 -31.629 9.944 10.170 24.916 -19.125 25.874 -12.123 -.504
f      6     23     12     16     17     32     31     33     27     31
d    11    204 -25.646 -21.039  -4.870     2344 -26.292 -25.997  -0.317
e 8.152 1.052 -8.483 4.821 5.281 -.012 -33.636 6.064 -12.304 -7.890
f     13     11     16     14     19     25     24     28     26     28
d    12    314 -20.795 -27.734   7.363     2299 -21.005 -22.984   2.080
e -53.587 19.115 79.436 41.903 -3.288 4.497 22.135 12.845 -21.183 -1.646
f      4     15     16     27     34     37     34     46     54     47
d    13    233  -6.551  -6.622   0.071     2411 -21.747 -22.084   0.354
e 7.268 -17.245 31.471 30.207 .209 7.647 5.126 -61.071 16.081 14.792
f      4     11     15     20     20     34     38     34     27     30
d    14    321 -39.745 -43.950   5.095     2335 -21.004 -21.556   0.578
e 6.661 20.182 -13.578 -14.964 .663 20.951 -5.357 11.106 6.979 4.542
f     13     26     24     29     37     30     31     33     48     50
d    15    211 -25.574 -27.428   1.993     2451 -21.107 -22.106   1.049
e 9.034 24.046 51.709 10.903 25.126 -31.105 -39.654 -43.178 6.990 15.268
f      9     18     23     17     21     28     15     22     28     30
d    16    164 -28.372 -24.930  -3.704     2381 -18.839 -18.689  -0.156
e 20.336 23.610 -96.956 -10.020 7.356 1.404 14.821 -20.025 -4.886 -17.264
f      2      6      2      9     17     20     33     24     20     31
d    17    218 -13.441  -3.184 -10.302     2460 -20.794 -19.807  -1.029
e -13.966 -12.793 5.985 10.830 -5.428 -23.834 17.910 -9.981 -25.620 -22.550
f      2     13     11     19     16     22     26     38     34     37
d    18    136 -32.513 -31.586  -1.034     2379 -21.312 -21.014  -0.312
e 55.859 17.851 -8.273 -57.797 -68.191 27.216 15.219 15.258 -14.424 -10.376
f      4     14     12     16     14     19     18     13     16     10
d    19    295 -26.211 -37.786  12.848     2289 -19.913 -21.583   1.745
e -3.719 -15.248 39.777 -13.377 11.415 8.559 -15.482 8.791 25.058 49.701
f      8     13     17     33     30     35     40     38     38     43
d    20    309 -33.133 -33.400   0.300     2351 -21.683 -22.124   0.463
e -30.012 13.423 -2.376 15.330 -16.239 -2.850 -9.170 5.549 10.005 1.910
f      9      9     15     32     33     36     40     48     48     39


 best delta-CC1/2_only=   26.22747    
 median of delta-cc1/2 ("only" i.e. 6th col of "a" lines) =   10.93907    
 noise= (MAD, median absolute deviation) from this median =   5.816339    
 median of delta-cc1/2 ("all" i.e. 10th col of "a" lines) =   1.174427    
 noise= (MAD, median absolute deviation) from this median =  0.6438574    
 median of delta-cc1/2-ano ("only" i.e. 6th col of "d" lines) =  0.1854570    
 noise= (MAD, median absolute deviation) from this median =   3.552712    
 median of delta-cc1/2-ano ("all" i.e. 10th col of "d" lines) =  0.1588948    
 noise= (MAD, median absolute deviation) from this median =  0.4448406    


Wrote a commented XSCALE.INP.rename_me that is sorted on delta-CC1/2 "only"
You may edit that file, or e.g. add lines after each INPUT_FILE line with sed '/INPUT_FILE/a INCLUDE_RESOLUTION_RANGE=99 3'
normal termination
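
The median and MAD lines near the end of the output above can be cross-checked from the "a" (or "d") lines. The following is a minimal sketch under the assumption that the log has the layout shown above; the function names median and median_mad are hypothetical, and the variables they set are global because plain POSIX shell has no local variables.

```shell
# Median of numbers read one per line from stdin.
median() {
    sort -n | awk '{ v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) printf "%.6f\n", v[(NR + 1) / 2]
            else        printf "%.6f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Median and MAD (median absolute deviation from the median) of one column.
# $1 = line tag (e.g. a or d), $2 = column number counting the tag as
# column 1 (e.g. 6 or 10), $3 = log file.
median_mad() {
    vals=$(awk -v tag="$1" -v col="$2" '$1 == tag && NF >= col { print $col }' "$3")
    [ -n "$vals" ] || return 1
    med=$(printf '%s\n' "$vals" | median) || return 1
    mad=$(printf '%s\n' "$vals" |
          awk -v m="$med" '{ d = $1 - m; if (d < 0) d = -d; print d }' | median)
    echo "median=$med MAD=$mad"
}

# Example use (file name is a placeholder):
#   median_mad a 6 xdscc12.log
```

Because the log prints the delta values with three decimals, the recomputed numbers can be expected to agree with the program's own summary only to roughly that precision.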

Correlation against a reference data set (-r <reference> option)

The correlation of the experimental data set against the user-supplied reference data is shown in the lines starting with r. To prepare a reference data set if the refinement was done with phenix.refine, one could use e.g.

mtz2various hklin 2bn3_refine_001.mtz hklout temp.hkl <<eof
OUTPUT USER *
LABIN FC=F-model PHIC=PHIF-model
END
eof

The column corresponding to PHIC will not be used by xdscc12. Alternatively, with sftools:

sftools
read mymodel_001.mtz
write temp.hkl format(3i5,f10.3) col F-model
y
quit

Reference data with anomalous signal (additional -s option)

The correlation of the anomalous difference of the experimental data set against the anomalous signal of the user-supplied reference data is shown in the lines starting with s. A simple way to obtain Fcalc(+) and Fcalc(-) is to run phenix.refine with the following options (here for S as the anomalous scatterer)

refinement.input.xray_data.labels="F(+),SIGF(+),F(-),SIGF(-),merged"  refinement.refine.anomalous_scatterers.group.selection="element S" strategy=individual_sites+individual_adp+group_anomalous+occupancies

and then

sftools <<eof
read mymodel_001.mtz
write anom-reference.hkl format(3i5,2f10.3) col "F-model(+)" "F-model(-)"
y
quit
eof

In this case sftools outputs only the acentric reflections, since only those have anomalous differences. XDSCC12 then has to be run with the options -s -r anom-reference.hkl.

See also

A complete description of how to process serial crystallography data with XDS/XSCALE is given in SSX.

xscale_isocluster is a program that implements the method of Brehm and Diederichs (2014) and theory of Diederichs (2017). It serves to identify groups of related datasets in a reflection file produced by XSCALE, and should be used before XDSCC12.

To remove bad frames from an XDS_ASCII.HKL file, you can re-INTEGRATE or just re-CORRECT with the keyword EXCLUDE_DATA_RANGE in XDS.INP.
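
As a small sketch of preparing that keyword, the helper below turns pairs of first and last frame numbers into EXCLUDE_DATA_RANGE lines. The function name exclude_lines is hypothetical and the frame numbers in the usage comment are placeholders; the ranges themselves would have to come from your own inspection of the delta-CC1/2 values.

```shell
# Hypothetical helper: read "first last" frame pairs (one pair per line) from
# stdin and print one EXCLUDE_DATA_RANGE line per valid pair.
# Lines that do not hold exactly two fields, or where first > last, are skipped.
exclude_lines() {
    awk 'NF == 2 && $1 <= $2 { printf "EXCLUDE_DATA_RANGE= %d %d\n", $1, $2 }'
}

# Example use (frame numbers are placeholders):
#   printf '101 110\n345 350\n' | exclude_lines >> XDS.INP
# then restrict the JOB= line in XDS.INP to the steps you want to repeat
# (e.g. CORRECT) and run XDS again.
```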