Although we could now use these two merged files ("firstparts" and "secondparts") and assume that they are the peak and inflection wavelengths, it appears more reasonable to try to solve the structure with SAD, which means using "firstparts" only.
=== First try ===
Let's look at the XSCALE statistics for "firstparts":
First, the shelxc output, which shows that these data are quite good:
[[File:e1+e2_firstparts-i-sigi-resol.png]] [[File:e1+e2_firstparts-self-anomcc.png]]
And then we show the result of 100 shelxd substructure-solution trials, trying to find 3 Se atoms at 30 - 3.3 Å resolution (I also tried 3.0, 3.1, 3.2, 3.4 and 3.5 Å, but 3.3 Å was best).
[[File:e1+e2_firstparts-ccall-ccweak.png]] [[File:e1+e2_firstparts-occ-vs-peak.png]]
This looks reasonable, although the absolute value of CCall is so low that there is little hope that the structure can be solved with this amount of information. And indeed, SHELXE did not show a difference between the two hands (in fact we even know that the "original hand" is the correct one, since the inverted hand would correspond to space group #92!).
=== Second try: correcting radiation damage at the level of individual reflections ===
Since we noted significant radiation damage, we can try to correct for it. All we have to do is ask XSCALE to do it, by giving both input files the same CRYSTAL_NAME:
<pre>
UNIT_CELL_CONSTANTS=103.316 103.316 131.456 90.000 90.000 90.000
SPACE_GROUP_NUMBER=96
OUTPUT_FILE=temp.ahkl
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
CRYSTAL_NAME=a
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL
CRYSTAL_NAME=a
</pre>
As a result we obtain:
<pre>
******************************************************************************
RESULTS FROM ZERO-DOSE EXTRAPOLATION OF REFLECTION INTENSITIES
for reference on this subject see:
K. Diederichs, S. McSweeney & R.B.G. Ravelli, Acta Cryst. D59, 903-909(2003).
"Zero-dose extrapolation as part of macromolecular synchrotron data reduction"
******************************************************************************
Radiation damage can lead to localized modifications of the structure.
To correct for this effect, XSCALE modifies the intensity measurements
I(h,i) by individual correction factors,
exp{-b(h)*dose(h,i)}
where h,i denotes the i-th observation with unique reflection indices
h, and dose(h,i) the X-ray dose accumulated by the crystal when the
reflection was recorded. Assuming a constant dose for each image
(dose_rate), the accumulated dose when recording image_number(i), on
which I(h,i) was observed, is then
dose(h,i) = starting_dose + dose_rate * (image_number(i)-first_image)
The decay factor b(h) is determined from the assumption that symmetry
related reflections in a data set taken from the same crystal should
have the same intensity after correction. Moreover, b(h) is assumed to
be the same for Friedel-pairs and independent of the X-ray wavelength.
To avoid overfitting the data, XSCALE starts with the hypothesis that
b(h)=0 and rejects this assumption if its probability is below 10.0%.
CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS
-----------------------------------------------------------
First INPUT_FILE= ../e2_1-369/XDS_ASCII.HKL
CRYSTAL_NAME= a
Second INPUT_FILE= ../e1_1-372/XDS_ASCII.HKL
CRYSTAL_NAME= a
RESOLUTION NUMBER CORRELATION
LIMIT OF PAIRS FACTOR
9.40 210 0.955
6.64 441 0.955
5.43 587 0.940
4.70 692 0.969
4.20 750 0.949
3.84 836 0.920
3.55 809 0.942
3.32 775 0.925
3.13 663 0.888
2.97 557 0.837
2.83 375 0.681
2.71 302 0.812
2.61 212 0.625
2.51 163 0.508
2.43 95 0.291
2.35 139 0.722
2.28 110 0.688
2.21 91 0.734
2.16 88 0.561
2.10 54 0.126
total 7949 0.788
X-RAY DOSE PARAMETERS USED FOR EACH INPUT DATA SET
--------------------------------------------------
CRYSTAL_NAME= a
STARTING_DOSE DOSE_RATE NAME OF INPUT FILE
initial refined initial refined
0.000E+00 8.557E+00 1.000E+00 1.000E+00 ../e1_1-372/XDS_ASCII.HKL
0.000E+00 0.000E+00 1.000E+00 1.024E+00 ../e2_1-369/XDS_ASCII.HKL
STATISTICS OF 0-DOSE CORRECTED DATA FROM EACH CRYSTAL
-----------------------------------------------------
NUNIQUE = Number of unique reflections with enough symmetry-
related observations to determine a decay factor b(h)
N0-DOSE = Number of 0-dose extrapolated unique reflections
NERROR = Number of unique extrapolated reflections expected
to be overfitted. A large ratio of N0-DOSE/NERROR
justifies the data correction as carried out here.
S_corr = mean value of Sigma(I) for 0-dose extrapolated data
S_norm = mean value of Sigma(I) for the same data but
without 0-dose extrapolation.
NFREE = degrees of freedom for calculating S_corr
CRYSTAL_NAME= a
RESOLUTION NUNIQUE N0-DOSE N0-DOSE/ S_corr/ NFREE
LIMIT NERROR S_norm
9.40 496 378 68.0 0.543 3180
6.64 908 703 78.9 0.554 6245
5.43 1140 894 77.0 0.574 8064
4.70 1351 1040 77.4 0.599 9671
4.20 1518 1133 69.9 0.620 10585
3.84 1665 1187 73.9 0.630 11129
3.55 1787 1220 65.1 0.671 11917
3.32 1941 1289 58.1 0.690 12728
3.13 2042 1172 49.8 0.717 11877
2.97 2182 1103 48.1 0.750 11498
2.83 2281 911 40.1 0.798 9662
2.71 2352 812 34.2 0.825 8611
2.61 2467 702 34.1 0.848 7383
2.51 2566 627 31.5 0.875 6595
2.43 2624 499 31.2 0.895 5295
2.35 2709 629 31.6 0.888 6240
2.28 2821 603 28.5 0.893 6147
2.21 2880 560 32.4 0.905 5758
2.16 2959 448 30.3 0.907 4394
2.10 2860 413 29.9 0.924 3745
total 41549 16323 46.8 0.739 160724
******************************************************************************
SCALING FACTORS FOR Sigma(I) AS FUNCTION OF RESOLUTION
******************************************************************************
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e1_1-372/XDS_ASCII.HKL
RESOLUTION (ANGSTROM)
10.33 6.12 4.76 4.03 3.56 3.23 2.97 2.76 2.60 2.46 2.34 2.23 2.14
FACTOR 0.94 0.96 0.88 0.93 0.99 0.98 0.99 0.99 0.99 0.98 1.10 1.00 0.99
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e2_1-369/XDS_ASCII.HKL
RESOLUTION (ANGSTROM)
10.32 6.11 4.76 4.03 3.56 3.22 2.97 2.76 2.60 2.46 2.34 2.23 2.14
FACTOR 0.96 0.98 0.89 0.94 1.01 1.01 1.02 1.01 1.00 0.99 1.11 1.02 0.98
******************************************************************************
STATISTICS OF SCALED OUTPUT DATA SET : temp.ahkl
FILE TYPE: XDS_ASCII MERGE=FALSE FRIEDEL'S_LAW=FALSE
1270 OUT OF 607179 REFLECTIONS REJECTED
605909 REFLECTIONS ON OUTPUT FILE
******************************************************************************
DEFINITIONS:
R-FACTOR
observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
expected = expected R-FACTOR derived from Sigma(I)
COMPARED = number of reflections used for calculating R-FACTOR
I/SIGMA = mean of intensity/Sigma(I) of unique reflections
(after merging symmetry-related observations)
Sigma(I) = standard deviation of reflection intensity I
estimated from sample statistics
R-meas = redundancy independent R-factor (intensities)
Rmrgd-F = quality of amplitudes (F) in the scaled data set
For definition of R-meas and Rmrgd-F see
Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
Anomal = mean correlation factor between two random subsets
Corr of anomalous intensity differences
SigAno = mean anomalous difference in units of its estimated
standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)
are structure factor estimates obtained from the
merged intensity observations in each parity class.
Nano = Number of unique reflections used to calculate
Anomal_Corr & SigAno. At least two observations
for each (+ and -) parity are required.
NOTE: Friedel pairs are treated as different reflections.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr
9.40 6095 844 883 95.6% 2.0% 2.6% 6084 73.41 2.1% 0.9% 87% 2.706 313
6.64 12006 1611 1621 99.4% 2.0% 2.8% 12004 68.81 2.1% 1.0% 84% 2.555 684
5.43 15339 2065 2086 99.0% 2.2% 2.8% 15338 63.28 2.4% 1.2% 82% 2.409 908
4.70 18697 2486 2498 99.5% 1.9% 2.6% 18694 70.84 2.1% 1.0% 75% 1.855 1120
4.20 21080 2796 2821 99.1% 2.0% 2.7% 21078 66.87 2.1% 1.1% 67% 1.727 1270
3.84 23300 3094 3117 99.3% 2.5% 3.0% 23297 58.10 2.7% 1.5% 64% 1.551 1420
3.55 25676 3344 3366 99.3% 3.1% 3.6% 25676 48.56 3.4% 1.9% 50% 1.326 1548
3.32 28013 3633 3653 99.5% 3.9% 4.3% 28011 41.76 4.1% 2.8% 37% 1.244 1687
3.13 30254 3841 3848 99.8% 5.7% 6.0% 30252 32.18 6.1% 4.1% 35% 1.125 1796
2.97 32595 4114 4118 99.9% 8.8% 9.1% 32594 23.53 9.4% 6.8% 26% 1.038 1925
2.83 34368 4313 4320 99.8% 12.8% 13.3% 34366 17.65 13.6% 9.5% 21% 0.989 2030
2.71 35627 4472 4478 99.9% 16.9% 17.4% 35625 14.15 18.1% 12.2% 18% 0.965 2108
2.61 37300 4704 4710 99.9% 25.8% 26.4% 37297 9.70 27.6% 19.3% 16% 0.930 2223
2.51 38975 4890 4896 99.9% 33.8% 34.9% 38975 7.68 36.1% 24.1% 14% 0.888 2315
2.43 39971 5019 5027 99.8% 49.1% 50.8% 39967 5.47 52.5% 37.2% 8% 0.810 2380
2.35 39968 5179 5222 99.2% 67.9% 67.5% 39960 4.07 72.7% 50.4% 25% 0.927 2445
2.28 42067 5388 5423 99.4% 89.9% 94.3% 42063 3.03 96.2% 63.5% 16% 0.796 2548
2.21 43011 5538 5541 99.9% 82.3% 83.3% 43010 3.16 88.1% 57.9% 14% 0.871 2644
2.16 42577 5697 5703 99.9% 108.5% 112.2% 42574 2.37 116.6% 83.1% 3% 0.760 2720
2.10 38988 5633 5912 95.3% 142.1% 144.2% 38936 1.67 153.5% 119.2% 6% 0.772 2638
total 605907 78661 79243 99.3% 5.5% 6.1% 605801 21.72 5.9% 11.3% 27% 1.095 36722
</pre>
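The R-factor definition quoted in the DEFINITIONS part of the output can be illustrated with a few lines of Python. This is only a sketch and not XSCALE's code: it assumes that I(h) is the plain unweighted mean of the symmetry-related observations (XSCALE may weight them), it takes the R-meas formula from Diederichs & Karplus (1997), and the function name <code>r_factors</code> and the input layout are made up for the illustration.

```python
import math
from collections import defaultdict

def r_factors(observations):
    """Return (R-factor observed, R-meas) for a list of (hkl, intensity) pairs.

    Assumption: hkl is already reduced to the unique reflection index, and
    I(h) is the unweighted mean of its observations.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    sum_dev = sum_dev_meas = sum_int = 0.0
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue  # a single observation cannot be compared with anything
        mean = sum(intensities) / n
        dev = sum(abs(i - mean) for i in intensities)
        sum_dev += dev
        # R-meas scales each reflection's deviations by sqrt(n/(n-1)),
        # which removes the dependence on multiplicity
        sum_dev_meas += math.sqrt(n / (n - 1)) * dev
        sum_int += sum(intensities)

    if sum_int == 0.0:
        raise ValueError("no reflections with at least two observations")
    return sum_dev / sum_int, sum_dev_meas / sum_int
```

Only reflections with at least two observations contribute, which is what the COMPARED column counts.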
We note that the values under "CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS" are really high (above 0.9 down to about 3.3 Å), which confirms that zero-dose extrapolation is a valid procedure here.
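The dose model described in the XSCALE output can be written down directly. The following Python sketch only restates the two formulas from the log, dose(h,i) and exp{-b(h)*dose(h,i)}. The function names are hypothetical, the decay factor b used in the example is an arbitrary number and not a value from this data set, and <code>first_image=1</code> is an assumption based on the directory names.

```python
import math

def accumulated_dose(image_number, first_image, starting_dose=0.0, dose_rate=1.0):
    """dose(h,i) = starting_dose + dose_rate * (image_number(i) - first_image)."""
    return starting_dose + dose_rate * (image_number - first_image)

def correction_factor(b, dose):
    """Per-observation factor exp{-b(h)*dose(h,i)} as quoted in the log."""
    return math.exp(-b * dose)

# Refined dose parameters as printed in the log for the two input files
# (arbitrary units; first_image=1 is assumed).
dose_e1_first = accumulated_dose(1, 1, starting_dose=8.557, dose_rate=1.0)
dose_e2_last = accumulated_dose(369, 1, starting_dose=0.0, dose_rate=1.024)

# Arbitrary, illustrative decay factor; XSCALE determines b(h) per unique
# reflection from its symmetry-related observations.
b_example = 1.0e-3
factor_e2_last = correction_factor(b_example, dose_e2_last)
```

With b(h)=0, which is XSCALE's starting hypothesis, the factor is exactly 1 and the observation is left unchanged.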
Comparison of the last table with that of the previous paragraph, i.e. without zero-dose extrapolation, shows that I/sigma, the anomalous correlation coefficients and SigAno are significantly higher. Does this translate into better structure solution? It does:
[[File:1y13-raddam-ccall-ccweak-raddam.png]]
[[File:1y13-raddam-site-occ-raddam.png]]
[[File:1y13-raddam-contrast-raddam.png]]