Xdscc12
XDSCC12 is a program for generating delta-CC1/2 and delta-CC1/2-ano values for XDS_ASCII.HKL (written by XDS), or for XSCALE.HKL (written by XSCALE) containing data from several files of type XDS_ASCII.HKL after scaling (with MERGE=FALSE).
It implements the method described in Assmann, Brehm and Diederichs (2016) Identification of rogue datasets in serial crystallography. J. Appl. Cryst. 49, 1021 [1], and it does this not only for the individual datasets in XSCALE.HKL, but also for individual frames, or groups of frames, of a single dataset collected with the rotation method and processed by XDS.
The program can be downloaded for Linux or Mac.
Usage (this text can be obtained with xdscc12 -h):
 xdscc12 KD 2019-04-30. Academic use only; no redistribution. -h option shows options.
 Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
 running 'xdscc12 -h' on 20190502 at 16:11:46 +0200
 usage: xdscc12 [-dmin <lowres>] [-dmax <highres>] [-nbin <nbin>] [-mode <1 or 2>] [-<abcdefstwz>] [-r <ref>] FILE_NAME
 dmin (default 999A), dmax (default 1A) and nbin (default 10) have the usual meanings.
 mode can be 1 (equal volumes of resolution shells) or 2 (increasing volumes; default).
 -r: next parameter: ASCII reference file with lines: h,k,l,Fcalc or h,k,l,Fcalc+,Fcalc-
     this allows calculation of CC of isomorphous signal with reference
 -s: read two columns from reference: Fcalc(+), Fcalc(-).
     this allows calculation of CC of anomalous signal with that of reference
 -t: total oscillation (degree) to batch fine-sliced frames into
 FILE_NAME can be XDS or XSCALE reflection file
 other options can be combined (e.g. -def), and switch the following OFF:
 -a: individual isomorphous summary values
 -b: individual (Fisher-transformed) delta-CC1/2 values
 -c: individual delta-CC1/2 reflection numbers
 -d: individual anomalous summary values
 -e: individual (Fisher-transformed) delta-CC1/2ano values
 -f: individual delta-CC1/2ano reflection numbers
 -w: weighting of intensities with their sigmas
 -z: Fisher transformation of delta-CC1/2 values
The program output in the terminal window is terse but supposed to be self-explanatory; it can (and most often should) be saved or re-directed to a file.
xdscc12 ... > xdscc12.log
# or
xdscc12 ... | tee xdscc12.log
All statistics (tables) produced by XDSCC12 may be visualized with e.g. gnuplot, after grepping the relevant lines from the output.
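As an alternative to grep, the per-dataset summary lines can be extracted with a short script before plotting. The following is a minimal Python sketch, assuming the column layout shown in the example output on this page (tag a, set number, then nref/with/without/delta for "this dataset only" and for "all datasets"). The function name parse_a_lines is hypothetical and not part of XDSCC12.

```python
def parse_a_lines(log_text):
    """Return (set, delta_only, delta_all) tuples from xdscc12 'a' lines."""
    rows = []
    for line in log_text.splitlines():
        tokens = line.split()
        # Data lines carry the tag 'a' plus 9 numeric columns;
        # heading lines start with 'a:' and are skipped.
        if len(tokens) != 10 or tokens[0] != "a":
            continue
        # 6th column: delta-CC1/2 "only"; 10th column: delta-CC1/2 "all".
        rows.append((int(tokens[1]), float(tokens[5]), float(tokens[9])))
    return rows


# Three lines copied from the example output, plus one heading line.
sample = """\
 a: set nref with without delta nref with without delta
 a 1 1241 61.936 42.634 26.227 14470 44.075 42.523 1.910
 b 61.432 27.680 8.094 2.942 41.231 44.296 26.090 10.996 30.482 30.456
 a 2 1565 43.941 34.156 11.512 14350 44.978 44.104 1.091
"""

for dataset, delta_only, delta_all in parse_a_lines(sample):
    print(dataset, delta_only, delta_all)
```

The printed three-column table can be written to a file and plotted with gnuplot or any other plotting tool.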
If XDSCC12 is used with an XDS_ASCII.HKL reflection file (from XDS), the isomorphous delta-CC1/2 of a batch of frames (width chosen with the -t option; typically -t 1) relative to all data is most easily visualized via XDSGUI (Statistics tab). Negative numbers indicate that the batch worsens the overall signal.
If XDSCC12 is used with an XSCALE.HKL generated from multiple datasets, the output lines show the contribution of each dataset toward the total CC1/2. In this case, the program writes a file called XSCALE.INP.rename_me, which lists statistics of the delta-CC1/2 and delta-CC1/2-ano values and enumerates the INPUT_FILEs sorted by delta-CC1/2: the first entry is the best data set and the last entry is the worst. This XSCALE.INP.rename_me can then be edited (e.g. to delete a few data sets with very negative delta-CC1/2) and renamed to XSCALE.INP.
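Editing XSCALE.INP.rename_me can also be scripted. The sketch below is one possible approach in Python, assuming that each INPUT_FILE line is directly preceded by its "! deltacc12 only / all:" comment line, as in the example file on this page. The function name drop_datasets and the cut-off of 1.0 are hypothetical choices for illustration; a sensible cut-off depends on the data.

```python
import re

# Comment line written before each INPUT_FILE line (layout assumed from the example).
DELTA_RE = re.compile(r"!\s*deltacc12 only / all:\s*(\S+)")


def drop_datasets(text, min_delta_only):
    """Comment out INPUT_FILE lines whose delta-CC1/2 'only' is below min_delta_only."""
    out = []
    delta_only = None
    for line in text.splitlines():
        match = DELTA_RE.search(line)
        if match:
            delta_only = float(match.group(1))
        elif line.lstrip().startswith("INPUT_FILE=") and delta_only is not None:
            if delta_only < min_delta_only:
                line = "!" + line  # '!' starts a comment in XSCALE.INP
            delta_only = None
        out.append(line)
    return "\n".join(out) + "\n"


# Last two entries of the example XSCALE.INP.rename_me.
sample = """\
! deltacc12 only / all: 2.8817 0.1446 deltacc12-ano only /all: 5.0954 0.5780
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0014/xds/XDS_ASCII.HKL
! deltacc12 only / all: 0.1054 0.0707 deltacc12-ano only /all: 12.8477 1.7449
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0019/xds/XDS_ASCII.HKL
"""

print(drop_datasets(sample, min_delta_only=1.0), end="")
```

Commenting the lines out instead of deleting them keeps the decision visible and easy to revert.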
Statistics are given (in resolution shells) for the isomorphous and the anomalous signal. In case of SSX data (which have few reflections per data set, compared to complete data sets), we typically use -nbin 1 as option.
To find out about the influence of the a and b parameters of the XDS/XSCALE-adjusted error model, you may try the -w option; this assigns the same sigma to all reflections. Likewise, the Fisher transformation, which serves to make changes in CC1/2 comparable across resolution ranges, may be switched off for testing purposes with the -z option.
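To illustrate what the Fisher transformation does to a difference of two correlation coefficients, here is a small Python sketch. It takes the difference in Fisher z-space (z = atanh(CC)) and maps it back with tanh. This is an inference from the numbers printed in the example output on this page, which it reproduces; it is not taken from the program source, and the function name fisher_delta is hypothetical.

```python
import math


def fisher_delta(cc_with, cc_without):
    """Difference of two CC values (in percent) taken in Fisher z-space."""
    z_with = math.atanh(cc_with / 100.0)
    z_without = math.atanh(cc_without / 100.0)
    # Map the z-difference back to the correlation scale, again in percent.
    return 100.0 * math.tanh(z_with - z_without)


# "with" and "without" values of data set 1 in the example output
# (the program prints delta = 26.227 for this data set).
print(round(fisher_delta(61.936, 42.634), 2))  # prints 26.23
```

The same change in CC1/2 therefore counts for more when CC1/2 is already high, which is what makes values comparable between strong low-resolution and weak high-resolution shells.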
Example output
 xdscc12 KD 2020-12-9. Academic use only; no redistribution. -h option shows options.
 Please cite Assmann, G., Brehm, W., Diederichs, K. (2016) J.Appl.Cryst. 49, 1021-1028
 running 'xdscc12 temp.ahkl' on 20220413 at 12:13:53 +0200
 no option -w found, therefore statistics are weighted by sigma values
 no option -z found, therefore delta-CC1/2 values are Fisher-transformed
 reflection file is temp.ahkl
 !SPACE_GROUP_NUMBER= 19
 !UNIT_CELL_CONSTANTS= 38.30 79.10 79.10 90.000 90.000 90.000
 # of datasets= 20
 # obs (w/o misfits), unique, misfits = 51918 20213 0
 max and min resolution of data in file = 39.55000 1.801329
 data between 39.55000 and 1.801329 A will be used
 10 resolution shells (for lines starting with b,c,e,f,r,s):
 5.644 4.011 3.281 2.844 2.545 2.324 2.152 2.013 1.899 1.801
 overall CC1/2: 83.328 nref= 14986 (but the overall CC1/2 is meaningless!)
 <CC1/2>: 44.468 (frequency-weighted average of CC1/2 in resolution shells)
 CC1/2 in resolution shells:
 91.4 81.0 72.0 68.4 42.1 41.4 32.4 29.0 33.2 28.2
 CC* in resolution shells:
 97.7 94.6 91.5 90.1 77.0 76.5 70.0 67.0 70.6 66.3
 frequency, i.e. number of unique reflections in resolution shells:
 493 887 1134 1345 1501 1649 1808 1921 2061 2187
 headings for lines starting with a,b,c:
 a: <CC1/2> of each dataset:
 a: reflections of this dataset only reflections of all datasets
 a: set nref with without delta nref with without delta
 b: delta-CC1/2 in resolution shells
 c: # reflections for delta-CC1/2
 a 1 1241 61.936 42.634 26.227 14470 44.075 42.523 1.910
 b 61.432 27.680 8.094 2.942 41.231 44.296 26.090 10.996 30.482 30.456
 c 47 73 85 110 135 137 146 170 164 174
 a 2 1565 43.941 34.156 11.512 14350 44.978 44.104 1.091
 b 36.910 -5.982 -1.113 58.592 46.243 -24.157 -.234 20.068 11.307 -3.709
 c 49 90 126 134 158 168 196 203 216 225
 a 3 1754 39.551 29.634 11.233 14478 43.683 42.374 1.606
 b 13.559 11.752 17.587 17.093 16.730 19.779 3.638 20.348 6.182 4.113
 c 43 107 131 160 175 198 218 230 246 246
 a 4 1468 41.694 25.840 17.768 14382 44.801 43.256 1.917
 b -9.445 36.760 36.335 15.954 2.951 20.264 21.375 19.554 2.513 35.809
 c 30 93 99 115 164 174 202 220 188 183
 a 5 1412 41.785 39.193 3.100 14292 45.561 45.291 0.340
 b 21.421 1.775 7.645 1.361 6.005 22.513 11.859 -16.883 .973 -2.353
 c 41 68 82 111 144 173 176 190 210 217
 a 6 1363 49.626 42.827 8.634 14293 44.938 44.354 0.728
 b 32.175 53.390 25.822 21.623 -42.441 27.299 7.214 12.080 6.573 15.115
 c 43 84 99 120 147 161 173 183 166 187
 a 7 1686 48.062 37.817 12.521 14407 43.991 42.971 1.258
 b 11.830 -4.851 11.505 15.574 67.688 20.009 5.129 -7.014 8.766 1.024
 c 59 117 133 153 142 179 183 243 252 225
 a 8 1795 46.357 34.049 14.614 14433 44.757 43.282 1.829
 b 14.234 6.729 13.626 19.758 7.065 16.373 21.204 11.431 18.082 15.739
 c 57 103 137 183 168 202 205 240 237 263
 a 9 1483 50.778 46.558 5.526 14363 44.923 44.479 0.554
 b 40.845 .431 30.046 -7.049 -1.607 -2.724 -9.486 -.902 20.019 27.538
 c 46 88 119 135 154 161 193 192 203 192
 a 10 1332 38.220 34.823 3.918 14506 43.988 43.550 0.541
 b 33.078 34.752 9.425 25.010 -2.922 -14.118 14.600 12.844 1.512 -8.735
 c 51 77 93 118 123 154 166 172 181 197
 a 11 1477 45.163 38.543 8.015 14361 45.415 44.577 1.051
 b 9.914 8.880 2.467 18.150 5.381 11.874 4.743 29.698 14.834 -17.211
 c 65 82 111 133 115 159 174 202 220 216
 a 12 1654 45.251 30.457 17.159 14409 44.363 42.437 2.373
 b 28.681 6.430 -40.284 36.914 30.008 47.293 29.607 9.512 -1.455 21.488
 c 53 109 116 142 154 177 196 229 235 243
 a 13 1422 51.281 36.632 18.036 14302 44.261 42.742 1.872
 b 23.460 4.559 43.871 13.497 28.091 35.298 -1.493 17.764 17.226 19.655
 c 38 73 101 123 153 153 169 192 224 196
 a 14 1668 51.749 49.607 2.882 14441 44.642 44.527 0.145
 b 5.096 37.325 5.515 16.257 21.437 -7.901 14.955 4.960 -15.485 -2.206
 c 52 115 121 161 172 177 201 205 223 241
 a 15 1369 46.667 38.742 9.674 14263 44.480 43.770 0.882
 b -22.140 29.901 48.902 -7.002 26.620 .483 8.071 13.097 11.535 -5.181
 c 56 92 105 115 116 138 151 177 196 223
 a 16 1257 33.100 23.275 10.645 14291 44.849 43.826 1.273
 b -14.293 -.418 44.139 -1.828 30.203 21.053 5.098 22.834 -4.129 -.948
 c 34 60 57 120 133 127 159 188 199 180
 a 17 1370 42.421 39.259 3.794 14264 44.877 44.550 0.409
 b 30.851 -17.854 11.508 12.578 -18.416 2.277 20.934 12.146 -.761 .881
 c 33 72 106 131 134 151 157 189 197 200
 a 18 1269 45.248 26.342 21.465 14302 43.295 41.438 2.264
 b 23.189 2.571 31.364 37.883 60.830 25.784 13.345 33.459 -3.441 5.251
 c 45 101 97 111 127 143 144 164 168 169
 a 19 1644 43.200 43.114 0.105 14365 45.169 45.113 0.071
 b 23.714 13.389 29.752 38.784 -16.069 6.952 10.177 9.560 -19.196 -16.013
 c 56 87 132 138 168 179 183 215 236 250
 a 20 1615 54.027 42.202 15.317 14366 45.818 44.496 1.661
 b 20.068 20.724 30.458 21.475 33.352 28.157 5.646 16.916 9.398 8.795
 c 52 87 131 149 156 181 195 200 230 234
 -------------------------------------------------------------
 overall CC1/2ano: -25.548 nref= 2668 (but the overall value is meaningless!)
 <CC1/2ano> : -21.370 (frequency-weighted average of CC1/2ano in resolution shells)
 CC1/2ano in resolution shells:
 -75.0 -1.7 -30.1 -41.3 -38.8 -20.1 -18.2 -8.6 -17.7 -9.6
 frequency i.e. number of unique Friedel pairs:
 73 140 168 241 262 320 319 351 397 397
 headings for lines starting with d,e,f:
 d: <CC1/2ano> of each dataset:
 d: reflections of this dataset only reflections of all datasets
 d: set nref with without delta nref with without delta
 e: delta-CC1/2ano in resolution shells
 f: # reflections for delta-CC1/2ano
 d 1 190 -30.590 -20.348 -10.922 2233 -17.969 -17.817 -0.157
 e -42.962 -6.341 4.502 -41.515 -76.721 -36.825 43.947 19.535 21.803 -23.853
 f 11 12 9 14 17 28 26 22 26 25
 d 2 234 -16.480 -16.515 0.036 2336 -26.125 -26.132 0.008
 e -3.116 -2.078 -3.249 -23.820 6.424 9.198 -3.791 1.123 6.966 -4.309
 f 5 12 16 12 35 30 40 24 29 31
 d 3 343 -40.435 -31.820 -9.887 2245 -22.659 -19.640 -3.160
 e -1.713 12.234 -11.942 -63.179 5.125 -4.423 15.291 -30.082 .859 3.714
 f 7 16 27 33 42 45 38 54 43 38
 d 4 250 -32.340 -29.274 -3.387 2385 -19.104 -18.426 -0.702
 e 22.408 37.540 12.328 -13.456 -2.729 1.911 -36.173 3.026 -6.757 12.916
 f 3 13 12 25 39 42 41 30 23 22
 d 5 179 -22.995 -24.947 2.070 2417 -26.768 -27.230 0.499
 e 36.142 13.527 -10.456 -7.251 13.331 -.923 27.239 -4.208 -9.635 -10.625
 f 5 4 7 6 19 22 29 30 23 34
 d 6 188 -10.090 -13.603 3.562 2412 -21.564 -21.759 0.205
 e -43.278 1.987 48.529 -2.536 35.294 -14.994 21.066 -6.607 -.446 -8.161
 f 5 9 8 10 21 28 25 30 27 25
 d 7 292 -34.206 -35.887 1.915 2316 -26.435 -26.723 0.309
 e -19.814 34.972 .715 48.559 -11.683 -4.391 -14.145 5.337 -3.722 2.527
 f 12 15 16 23 31 43 28 46 45 33
 d 8 305 -6.575 -7.745 1.176 2353 -19.132 -19.241 0.113
 e -7.804 -5.876 -10.001 4.878 -37.785 8.976 14.698 -4.716 9.080 11.786
 f 11 17 17 26 28 45 34 38 42 47
 d 9 283 -35.313 -32.348 -3.347 2398 -20.019 -19.463 -0.578
 e -90.495 31.505 6.592 -45.999 -7.955 -11.621 6.013 -6.399 18.725 2.816
 f 7 16 14 23 31 32 39 37 52 32
 d 10 228 -31.230 -36.231 5.639 2184 -20.991 -21.921 0.975
 e -10.975 26.119 -31.629 9.944 10.170 24.916 -19.125 25.874 -12.123 -.504
 f 6 23 12 16 17 32 31 33 27 31
 d 11 204 -25.646 -21.039 -4.870 2344 -26.292 -25.997 -0.317
 e 8.152 1.052 -8.483 4.821 5.281 -.012 -33.636 6.064 -12.304 -7.890
 f 13 11 16 14 19 25 24 28 26 28
 d 12 314 -20.795 -27.734 7.363 2299 -21.005 -22.984 2.080
 e -53.587 19.115 79.436 41.903 -3.288 4.497 22.135 12.845 -21.183 -1.646
 f 4 15 16 27 34 37 34 46 54 47
 d 13 233 -6.551 -6.622 0.071 2411 -21.747 -22.084 0.354
 e 7.268 -17.245 31.471 30.207 .209 7.647 5.126 -61.071 16.081 14.792
 f 4 11 15 20 20 34 38 34 27 30
 d 14 321 -39.745 -43.950 5.095 2335 -21.004 -21.556 0.578
 e 6.661 20.182 -13.578 -14.964 .663 20.951 -5.357 11.106 6.979 4.542
 f 13 26 24 29 37 30 31 33 48 50
 d 15 211 -25.574 -27.428 1.993 2451 -21.107 -22.106 1.049
 e 9.034 24.046 51.709 10.903 25.126 -31.105 -39.654 -43.178 6.990 15.268
 f 9 18 23 17 21 28 15 22 28 30
 d 16 164 -28.372 -24.930 -3.704 2381 -18.839 -18.689 -0.156
 e 20.336 23.610 -96.956 -10.020 7.356 1.404 14.821 -20.025 -4.886 -17.264
 f 2 6 2 9 17 20 33 24 20 31
 d 17 218 -13.441 -3.184 -10.302 2460 -20.794 -19.807 -1.029
 e -13.966 -12.793 5.985 10.830 -5.428 -23.834 17.910 -9.981 -25.620 -22.550
 f 2 13 11 19 16 22 26 38 34 37
 d 18 136 -32.513 -31.586 -1.034 2379 -21.312 -21.014 -0.312
 e 55.859 17.851 -8.273 -57.797 -68.191 27.216 15.219 15.258 -14.424 -10.376
 f 4 14 12 16 14 19 18 13 16 10
 d 19 295 -26.211 -37.786 12.848 2289 -19.913 -21.583 1.745
 e -3.719 -15.248 39.777 -13.377 11.415 8.559 -15.482 8.791 25.058 49.701
 f 8 13 17 33 30 35 40 38 38 43
 d 20 309 -33.133 -33.400 0.300 2351 -21.683 -22.124 0.463
 e -30.012 13.423 -2.376 15.330 -16.239 -2.850 -9.170 5.549 10.005 1.910
 f 9 9 15 32 33 36 40 48 48 39
 best delta-CC1/2_only= 26.22747
 median of delta-cc1/2 ("only" i.e. 6th col of "a" lines) = 10.93907
 noise= (MAD, median absolute deviation) from this median = 5.816339
 median of delta-cc1/2 ("all" i.e. 10th col of "a" lines) = 1.174427
 noise= (MAD, median absolute deviation) from this median = 0.6438574
 median of delta-cc1/2-ano ("only" i.e. 6th col of "d" lines) = 0.1854570
 noise= (MAD, median absolute deviation) from this median = 3.552712
 median of delta-cc1/2-ano ("all" i.e. 10th col of "d" lines) = 0.1588948
 noise= (MAD, median absolute deviation) from this median = 0.4448406
 Wrote a commented XSCALE.INP.rename_me that is sorted on delta-CC1/2 "only"
 You may edit that file, or e.g. add lines after each INPUT_FILE line with
 sed '/INPUT_FILE/a INCLUDE_RESOLUTION_RANGE=99 3'
 normal termination
and the resulting file XSCALE.INP.rename_me is:
SPACE_GROUP_NUMBER= 19
UNIT_CELL_CONSTANTS= 38.30 79.10 79.10 90.000 90.000 90.000
OUTPUT_FILE= temp.ahkl
PRINT_CORRELATIONS= FALSE
SAVE_CORRECTION_IMAGES= FALSE
FRIEDEL'S_LAW= FALSE
! median of delta-cc1/2 "only" values= 10.939
! noise (MAD) of these values= 5.816
! median of delta-cc1/2 "all" values= 1.174
! noise (MAD) of these values= 0.644
! median of delta-cc1/2-ano "only" values= 0.185
! noise (MAD) of these values= 3.553
! median of delta-cc1/2-ano "all" values= 0.159
! noise (MAD) of these values= 0.445
! input files sorted by deltacc12_only (highest first):
! deltacc12 only / all: 26.2275 1.9103 deltacc12-ano only /all: -10.9223 -0.1569
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0001/xds/XDS_ASCII.HKL
! deltacc12 only / all: 21.4650 2.2639 deltacc12-ano only /all: -1.0339 -0.3117
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0018/xds/XDS_ASCII.HKL
! deltacc12 only / all: 18.0364 1.8724 deltacc12-ano only /all: 0.0708 0.3545
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0013/xds/XDS_ASCII.HKL
! deltacc12 only / all: 17.7684 1.9166 deltacc12-ano only /all: -3.3870 -0.7021
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0004/xds/XDS_ASCII.HKL
! deltacc12 only / all: 17.1589 2.3726 deltacc12-ano only /all: 7.3634 2.0795
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0012/xds/XDS_ASCII.HKL
! deltacc12 only / all: 15.3174 1.6608 deltacc12-ano only /all: 0.3001 0.4632
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0020/xds/XDS_ASCII.HKL
! deltacc12 only / all: 14.6140 1.8288 deltacc12-ano only /all: 1.1762 0.1129
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0008/xds/XDS_ASCII.HKL
! deltacc12 only / all: 12.5208 1.2582 deltacc12-ano only /all: 1.9155 0.3089
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0007/xds/XDS_ASCII.HKL
! deltacc12 only / all: 11.5122 1.0907 deltacc12-ano only /all: 0.0360 0.0079
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0002/xds/XDS_ASCII.HKL
! deltacc12 only / all: 11.2330 1.6064 deltacc12-ano only /all: -9.8867 -3.1596
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0003/xds/XDS_ASCII.HKL
! deltacc12 only / all: 10.6452 1.2727 deltacc12-ano only /all: -3.7041 -0.1557
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0016/xds/XDS_ASCII.HKL
! deltacc12 only / all: 9.6737 0.8816 deltacc12-ano only /all: 1.9932 1.0487
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0015/xds/XDS_ASCII.HKL
! deltacc12 only / all: 8.6341 0.7283 deltacc12-ano only /all: 3.5620 0.2049
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0006/xds/XDS_ASCII.HKL
! deltacc12 only / all: 8.0151 1.0507 deltacc12-ano only /all: -4.8702 -0.3171
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0011/xds/XDS_ASCII.HKL
! deltacc12 only / all: 5.5262 0.5542 deltacc12-ano only /all: -3.3475 -0.5778
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0009/xds/XDS_ASCII.HKL
! deltacc12 only / all: 3.9182 0.5411 deltacc12-ano only /all: 5.6393 0.9750
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0010/xds/XDS_ASCII.HKL
! deltacc12 only / all: 3.7935 0.4089 deltacc12-ano only /all: -10.3019 -1.0286
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0017/xds/XDS_ASCII.HKL
! deltacc12 only / all: 3.0997 0.3399 deltacc12-ano only /all: 2.0699 0.4987
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0005/xds/XDS_ASCII.HKL
! deltacc12 only / all: 2.8817 0.1446 deltacc12-ano only /all: 5.0954 0.5780
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0014/xds/XDS_ASCII.HKL
! deltacc12 only / all: 0.1054 0.0707 deltacc12-ano only /all: 12.8477 1.7449
INPUT_FILE=/scratch/data/JamesHolton_microfocus/2019/wedge0019/xds/XDS_ASCII.HKL
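Several of the summary numbers in the example can be cross-checked by hand. The Python sketch below recomputes the frequency-weighted average <CC1/2>, CC* for the first resolution shell (using the usual definition CC* = sqrt(2 CC1/2 / (1 + CC1/2))), and the median and MAD of the delta-CC1/2 "only" values. All input numbers are copied from the example output and the example XSCALE.INP.rename_me; the function names are hypothetical, and the code is a check of the printed values, not a description of the program's internals.

```python
import math
from statistics import median

# CC1/2 (percent) and number of unique reflections per resolution shell.
cc_half = [91.4, 81.0, 72.0, 68.4, 42.1, 41.4, 32.4, 29.0, 33.2, 28.2]
n_unique = [493, 887, 1134, 1345, 1501, 1649, 1808, 1921, 2061, 2187]

# delta-CC1/2 "only" of the 20 data sets, as listed in XSCALE.INP.rename_me.
delta_only = [26.2275, 21.4650, 18.0364, 17.7684, 17.1589, 15.3174, 14.6140,
              12.5208, 11.5122, 11.2330, 10.6452, 9.6737, 8.6341, 8.0151,
              5.5262, 3.9182, 3.7935, 3.0997, 2.8817, 0.1054]


def weighted_mean(values, weights):
    """Average of values, each weighted by its frequency."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def cc_star(cc_percent):
    """CC* (percent) from CC1/2 (percent)."""
    cc = cc_percent / 100.0
    return 100.0 * math.sqrt(2.0 * cc / (1.0 + cc))


def mad(values):
    """Unscaled median absolute deviation from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


print(f"<CC1/2> = {weighted_mean(cc_half, n_unique):.3f}")
print(f"CC* (shell 1) = {cc_star(cc_half[0]):.1f}")
print(f"median = {median(delta_only):.3f}")
print(f"MAD = {mad(delta_only):.3f}")
```

These agree with the printed values 44.468, 97.7, 10.939 and 5.816 to the precision shown. Note that the MAD here is the plain median of absolute deviations, without the scale factor sometimes applied to make it comparable to a standard deviation.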
Correlation against a reference data set (-r <reference> option)
The correlation of the experimental data set against the user-supplied reference data is shown in the lines starting with r. To prepare a reference data set if the refinement was done with phenix.refine, one could use e.g.
mtz2various hklin 2bn3_refine_001.mtz hklout temp.hkl <<eof
OUTPUT USER *
LABIN FC=F-model PHIC=PHIF-model
END
eof
The column corresponding to PHIC will not be used by xdscc12. Alternatively,
sftools
read mymodel_001.mtz
write temp.hkl format(3i5,f10.3) col F-model
y
quit
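The Fortran format passed to sftools, format(3i5,f10.3), means three integers of width 5 followed by one real number of width 10 with 3 decimals. If reference amplitudes come from another source, lines in the same layout can be produced with a few lines of Python. This is only a sketch: the function name format_reflection and the sample reflections are hypothetical, and whether xdscc12 also accepts other layouts is not stated on this page.

```python
def format_reflection(h, k, l, fcalc):
    """Format one reference line like Fortran format(3i5,f10.3)."""
    return f"{h:5d}{k:5d}{l:5d}{fcalc:10.3f}"


# Hypothetical reflections: (h, k, l, Fcalc)
reflections = [(1, -2, 3, 123.4567), (0, 0, 4, 987.1)]

with open("temp.hkl", "w") as handle:
    for h, k, l, fcalc in reflections:
        handle.write(format_reflection(h, k, l, fcalc) + "\n")
```

For a reference with anomalous signal, a second f10.3 field for Fcalc(-) would be appended in the same way, matching format(3i5,2f10.3).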
Reference data with anomalous signal (additional -s option)
The correlation of the anomalous difference of the experimental data set against the anomalous signal of the user-supplied reference data is shown in the lines starting with s.
A simple way to obtain Fcalc(+) and Fcalc(-) is to run phenix.refine with options (in case of S as anomalous scatterer)
refinement.input.xray_data.labels="F(+),SIGF(+),F(-),SIGF(-),merged"
refinement.refine.anomalous_scatterers.group.selection="element S"
strategy=individual_sites+individual_adp+group_anomalous+occupancies
and then
sftools <<eof
read mymodel_001.mtz
write anom-reference.hkl format(3i5,2f10.3) col "F-model(+)" "F-model(-)"
y
quit
eof
in which case sftools outputs only the acentric reflections, since only those have anomalous differences. XDSCC12 then has to be run with the -s -r anom-reference.hkl options.
See also
A complete description of how to process serial crystallography data with XDS/XSCALE is given in SSX.
xscale_isocluster is a program that implements the method of Brehm and Diederichs (2014) and theory of Diederichs (2017). It serves to identify groups of related datasets in a reflection file produced by XSCALE, and should be used before XDSCC12.
To remove bad frames from an XDS_ASCII.HKL file, you can re-INTEGRATE or just re-CORRECT with the keyword EXCLUDE_DATA_RANGE in XDS.INP.
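When many frames are to be excluded, the EXCLUDE_DATA_RANGE lines can be generated rather than typed. The following Python sketch merges consecutive frame numbers into ranges and prints one keyword line per range. The function name exclude_lines and the frame numbers are hypothetical; which frames count as bad is a decision based on the delta-CC1/2 values.

```python
def exclude_lines(bad_frames):
    """Turn a list of frame numbers into EXCLUDE_DATA_RANGE= lines for XDS.INP."""
    ranges = []
    for frame in sorted(set(bad_frames)):
        if ranges and frame == ranges[-1][1] + 1:
            ranges[-1][1] = frame  # extend the current run of consecutive frames
        else:
            ranges.append([frame, frame])  # start a new run
    return [f"EXCLUDE_DATA_RANGE= {first} {last}" for first, last in ranges]


# Hypothetical list of frames judged to be bad.
for line in exclude_lines([12, 13, 14, 40, 41, 77]):
    print(line)
```

The printed lines can be appended to XDS.INP before re-running the CORRECT (or INTEGRATE) step.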