@@ Line 322: / Line 322: @@
 Although we could now think of using these two files ("firstparts" and "secondparts" merged) and assume that they are peak and inflection wavelengths, it appears more reasonable to try and solve the structure with SAD - which means using "firstparts" only.
+=== First try ===
 Let's look at the XSCALE statistics for "firstparts":
@@ Line 359: / Line 360: @@
 First, the shelxc output which shows that these data are quite good:
 [[File:e1+e2_firstparts-i-sigi-resol.png]] [[File:e1+e2_firstparts-self-anomcc.png]]
-And then 100 trials of shelxd, trying to find 3 Se atoms at 30-3.3 resolution (I also tried 3.0 3.1 3.2 3.4 3.5 but 3.3 was best).
+Next, the results of 100 shelxd substructure-solution trials, searching for 3 Se atoms at 30 - 3.3 Å resolution (I also tried high-resolution cutoffs of 3.0, 3.1, 3.2, 3.4 and 3.5 Å, but 3.3 Å was best).
 [[File:e1+e2_firstparts-ccall-ccweak.png]] [[File:e1+e2_firstparts-occ-vs-peak.png]]
 This looks reasonable, although the absolute value of CCall is so low that there is little hope of solving the structure with this amount of information. And indeed, SHELXE did not show a difference between the two hands (in fact we even know that the "original hand" is the correct one, since the inverted hand would correspond to spacegroup #92!).
+=== Second try: correcting radiation damage at the level of individual reflections ===
+Since we noted significant radiation damage, we can try to correct for it. All we have to do is ask XSCALE to do it, by assigning the same CRYSTAL_NAME to both input files:
+<pre>
+UNIT_CELL_CONSTANTS=103.316   103.316   131.456  90.000  90.000  90.000
+SPACE_GROUP_NUMBER=96
+OUTPUT_FILE=temp.ahkl
+INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
+CRYSTAL_NAME=a
+INPUT_FILE=../e2_1-369/XDS_ASCII.HKL
+CRYSTAL_NAME=a
+</pre>
+As a result we obtain:
+<pre>
+ ******************************************************************************
+          RESULTS FROM ZERO-DOSE EXTRAPOLATION OF REFLECTION INTENSITIES
+                       for reference on this subject see:
+ K. Diederichs, S. McSweeney & R.B.G. Ravelli, Acta Cryst. D59, 903-909(2003).
+ "Zero-dose extrapolation as part of macromolecular synchrotron data reduction"
+ ******************************************************************************
+ Radiation damage can lead to localized modifications of the structure.
+ To correct for this effect, XSCALE modifies the intensity measurements
+ I(h,i) by individual correction factors,
+                      exp{-b(h)*dose(h,i)}
+ where h,i denotes the i-th observation with unique reflection indices
+ h, and dose(h,i) the X-ray dose accumulated by the crystal when the
+ reflection was recorded. Assuming a constant dose for each image
+ (dose_rate), the accumulated dose when recording image_number(i), on
+ which I(h,i) was observed, is then
+ dose(h,i) = starting_dose + dose_rate * (image_number(i)-first_image)
+ The decay factor b(h) is determined from the assumption that symmetry
+ related reflections in a data set taken from the same crystal should
+ have the same intensity after correction. Moreover, b(h) is assumed to
+ be the same for Friedel-pairs and independent of the X-ray wavelength.
+ To avoid overfitting the data, XSCALE starts with the hypothesis that
+ b(h)=0 and rejects this assumption if its probability is below 10.0%.
+ CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS
+ -----------------------------------------------------------
+ First  INPUT_FILE= ../e2_1-369/XDS_ASCII.HKL
+      CRYSTAL_NAME= a
+ Second INPUT_FILE= ../e1_1-372/XDS_ASCII.HKL
+      CRYSTAL_NAME= a
+ RESOLUTION    NUMBER    CORRELATION
+   LIMIT      OF PAIRS      FACTOR
+     9.40         210        0.955
+     6.64         441        0.955
+     5.43         587        0.940
+     4.70         692        0.969
+     4.20         750        0.949
+     3.84         836        0.920
+     3.55         809        0.942
+     3.32         775        0.925
+     3.13         663        0.888
+     2.97         557        0.837
+     2.83         375        0.681
+     2.71         302        0.812
+     2.61         212        0.625
+     2.51         163        0.508
+     2.43          95        0.291
+     2.35         139        0.722
+     2.28         110        0.688
+     2.21          91        0.734
+     2.16          88        0.561
+     2.10          54        0.126
+    total        7949        0.788
+           X-RAY DOSE PARAMETERS USED FOR EACH INPUT DATA SET
+           --------------------------------------------------
+ CRYSTAL_NAME= a
+        STARTING_DOSE             DOSE_RATE       NAME OF INPUT FILE
+     initial    refined      initial    refined
+   0.000E+00   8.557E+00   1.000E+00   1.000E+00  ../e1_1-372/XDS_ASCII.HKL
+   0.000E+00   0.000E+00   1.000E+00   1.024E+00  ../e2_1-369/XDS_ASCII.HKL
+           STATISTICS OF 0-DOSE CORRECTED DATA FROM EACH CRYSTAL
+           -----------------------------------------------------
+ NUNIQUE = Number of unique reflections with enough symmetry-
+           related observations to determine a decay factor b(h)
+ N0-DOSE = Number of 0-dose extrapolated unique reflections
+ NERROR  = Number of unique extrapolated reflections expected
+           to be overfitted. A large ratio of N0-DOSE/NERROR
+           justifies the data correction as carried out here.
+ S_corr  = mean value of Sigma(I) for 0-dose extrapolated data
+ S_norm  = mean value of Sigma(I) for the same data but
+           without 0-dose extrapolation.
+ NFREE   = degrees of freedom for calculating S_corr
+ CRYSTAL_NAME= a
+ RESOLUTION  NUNIQUE  N0-DOSE  N0-DOSE/   S_corr/    NFREE
+   LIMIT                        NERROR    S_norm
+     9.40       496     378      68.0       0.543     3180
+     6.64       908     703      78.9       0.554     6245
+     5.43      1140     894      77.0       0.574     8064
+     4.70      1351    1040      77.4       0.599     9671
+     4.20      1518    1133      69.9       0.620    10585
+     3.84      1665    1187      73.9       0.630    11129
+     3.55      1787    1220      65.1       0.671    11917
+     3.32      1941    1289      58.1       0.690    12728
+     3.13      2042    1172      49.8       0.717    11877
+     2.97      2182    1103      48.1       0.750    11498
+     2.83      2281     911      40.1       0.798     9662
+     2.71      2352     812      34.2       0.825     8611
+     2.61      2467     702      34.1       0.848     7383
+     2.51      2566     627      31.5       0.875     6595
+     2.43      2624     499      31.2       0.895     5295
+     2.35      2709     629      31.6       0.888     6240
+     2.28      2821     603      28.5       0.893     6147
+     2.21      2880     560      32.4       0.905     5758
+     2.16      2959     448      30.3       0.907     4394
+     2.10      2860     413      29.9       0.924     3745
+    total     41549   16323      46.8       0.739   160724
+ ******************************************************************************
+              SCALING FACTORS FOR Sigma(I) AS FUNCTION OF RESOLUTION
+ ******************************************************************************
+ SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e1_1-372/XDS_ASCII.HKL
+                                   RESOLUTION (ANGSTROM)
+.33  6.12  4.76  4.03  3.56  3.23  2.97  2.76  2.60  2.46  2.34  2.23  2.14
+ FACTOR   0.94  0.96  0.88  0.93  0.99  0.98  0.99  0.99  0.99  0.98  1.10  1.00  0.99
+ SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e2_1-369/XDS_ASCII.HKL
+                                   RESOLUTION (ANGSTROM)
+.32  6.11  4.76  4.03  3.56  3.22  2.97  2.76  2.60  2.46  2.34  2.23  2.14
+ FACTOR   0.96  0.98  0.89  0.94  1.01  1.01  1.02  1.01  1.00  0.99  1.11  1.02  0.98
+ ******************************************************************************
+  STATISTICS OF SCALED OUTPUT DATA SET : temp.ahkl
+  FILE TYPE:         XDS_ASCII      MERGE=FALSE          FRIEDEL'S_LAW=FALSE
+OUT OF    607179 REFLECTIONS REJECTED
+REFLECTIONS ON OUTPUT FILE
+ ******************************************************************************
+ DEFINITIONS:
+ R-FACTOR
+ observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
+ expected = expected R-FACTOR derived from Sigma(I)
+ COMPARED = number of reflections used for calculating R-FACTOR
+ I/SIGMA  = mean of intensity/Sigma(I) of unique reflections
+            (after merging symmetry-related observations)
+ Sigma(I) = standard deviation of reflection intensity I
+            estimated from sample statistics
+ R-meas   = redundancy independent R-factor (intensities)
+ Rmrgd-F  = quality of amplitudes (F) in the scaled data set
+            For definition of R-meas and Rmrgd-F see
+            Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
+ Anomal   = mean correlation factor between two random subsets
+  Corr      of anomalous intensity differences
+ SigAno   = mean anomalous difference in units of its estimated
+            standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)
+            are structure factor estimates obtained from the
+            merged intensity observations in each parity class.
+  Nano    = Number of unique reflections used to calculate
+            Anomal_Corr & SigAno. At least two observations
+            for each (+ and -) parity are required.
+       NOTE:      Friedel pairs are treated as different reflections.
+ SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
+ RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
+   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr
+     9.40        6095     844       883       95.6%       2.0%      2.6%     6084   73.41     2.1%     0.9%    87%   2.706     313
+     6.64       12006    1611      1621       99.4%       2.0%      2.8%    12004   68.81     2.1%     1.0%    84%   2.555     684
+     5.43       15339    2065      2086       99.0%       2.2%      2.8%    15338   63.28     2.4%     1.2%    82%   2.409     908
+     4.70       18697    2486      2498       99.5%       1.9%      2.6%    18694   70.84     2.1%     1.0%    75%   1.855    1120
+     4.20       21080    2796      2821       99.1%       2.0%      2.7%    21078   66.87     2.1%     1.1%    67%   1.727    1270
+     3.84       23300    3094      3117       99.3%       2.5%      3.0%    23297   58.10     2.7%     1.5%    64%   1.551    1420
+     3.55       25676    3344      3366       99.3%       3.1%      3.6%    25676   48.56     3.4%     1.9%    50%   1.326    1548
+     3.32       28013    3633      3653       99.5%       3.9%      4.3%    28011   41.76     4.1%     2.8%    37%   1.244    1687
+     3.13       30254    3841      3848       99.8%       5.7%      6.0%    30252   32.18     6.1%     4.1%    35%   1.125    1796
+     2.97       32595    4114      4118       99.9%       8.8%      9.1%    32594   23.53     9.4%     6.8%    26%   1.038    1925
+     2.83       34368    4313      4320       99.8%      12.8%     13.3%    34366   17.65    13.6%     9.5%    21%   0.989    2030
+     2.71       35627    4472      4478       99.9%      16.9%     17.4%    35625   14.15    18.1%    12.2%    18%   0.965    2108
+     2.61       37300    4704      4710       99.9%      25.8%     26.4%    37297    9.70    27.6%    19.3%    16%   0.930    2223
+     2.51       38975    4890      4896       99.9%      33.8%     34.9%    38975    7.68    36.1%    24.1%    14%   0.888    2315
+     2.43       39971    5019      5027       99.8%      49.1%     50.8%    39967    5.47    52.5%    37.2%     8%   0.810    2380
+     2.35       39968    5179      5222       99.2%      67.9%     67.5%    39960    4.07    72.7%    50.4%    25%   0.927    2445
+     2.28       42067    5388      5423       99.4%      89.9%     94.3%    42063    3.03    96.2%    63.5%    16%   0.796    2548
+     2.21       43011    5538      5541       99.9%      82.3%     83.3%    43010    3.16    88.1%    57.9%    14%   0.871    2644
+     2.16       42577    5697      5703       99.9%     108.5%    112.2%    42574    2.37   116.6%    83.1%     3%   0.760    2720
+     2.10       38988    5633      5912       95.3%     142.1%    144.2%    38936    1.67   153.5%   119.2%     6%   0.772    2638
+    total      605907   78661     79243       99.3%       5.5%      6.1%   605801   21.72     5.9%    11.3%    27%   1.095   36722
+</pre>
+We note that the values under "CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS" are high (0.788 overall, and above 0.9 in the low-resolution shells), which confirms that zero-dose extrapolation is a valid procedure here.
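The dose model and the correction factor quoted in the XSCALE output can be sketched in a few lines of Python. This is only an illustration of the two printed formulas, not XSCALE's implementation: the function names are made up, first_image is assumed to be 1 (guessed from the directory names e1_1-372 and e2_1-369), the value of b(h) is purely hypothetical, and the factor is assumed to multiply the measured intensity. Only the refined STARTING_DOSE and DOSE_RATE values are taken from the output.

```python
import math

def accumulated_dose(image_number, first_image, starting_dose, dose_rate):
    """dose(h,i) = starting_dose + dose_rate * (image_number(i) - first_image)"""
    return starting_dose + dose_rate * (image_number - first_image)

def correction_factor(b_h, dose):
    """Correction factor exp{-b(h) * dose(h,i)} as printed by XSCALE."""
    return math.exp(-b_h * dose)

# Refined dose parameters as listed in the XSCALE output for the two data sets
DATASETS = {
    "e1": {"starting_dose": 8.557, "dose_rate": 1.000},
    "e2": {"starting_dose": 0.000, "dose_rate": 1.024},
}

FIRST_IMAGE = 1   # assumption: both sweeps start at image 1
B_H = -0.001      # hypothetical decay factor for one unique reflection

if __name__ == "__main__":
    observed_intensity = 1000.0  # hypothetical measurement I(h,i)
    for name, params in DATASETS.items():
        for image in (1, 100, 369):
            dose = accumulated_dose(image, FIRST_IMAGE, **params)
            factor = correction_factor(B_H, dose)
            # Assumption: the factor is applied multiplicatively to I(h,i)
            corrected = observed_intensity * factor
            print(f"{name} image {image:3d}: dose={dose:8.3f} "
                  f"factor={factor:.4f} corrected I={corrected:.1f}")
```

With b(h)=0, which is XSCALE's starting hypothesis, the factor is 1 for every dose and the observation is left unchanged.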
+Comparison of the last table with that of the previous paragraph, i.e. without zero-dose extrapolation, shows that the I/sigma, the anomalous correlation coefficients and the SigAno are significantly higher. Does this translate into better structure solution? It does:
+[[File:1y13-raddam-ccall-ccweak-raddam.png]]
+[[File:1y13-raddam-site-occ-raddam.png]]
+[[File:1y13-raddam-contrast-raddam.png]]
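For readers who want to reproduce the R-factor columns of the XSCALE table on their own unmerged data, here is a minimal Python sketch. The "observed" R-factor follows the definition printed in the output, (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i))). For R-meas, the sketch uses the commonly quoted form with a sqrt(n/(n-1)) weight per unique reflection; this should be checked against Diederichs & Karplus (1997). The function name, the use of an unweighted mean for I(h), and the skipping of singly measured reflections are assumptions of this sketch, not statements about how XSCALE computes its numbers.

```python
import math
from collections import defaultdict

def r_factors(observations):
    """Return (R_observed, R_meas) for an iterable of (hkl, intensity) pairs.

    hkl must already be the unique (symmetry-reduced) reflection index.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    sum_dev = 0.0       # SUM |I(h,i) - I(h)|
    sum_dev_meas = 0.0  # same sum, weighted by sqrt(n/(n-1)) per reflection
    sum_int = 0.0       # SUM I(h,i)
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # assumption: singly measured reflections are not compared
        mean = sum(values) / n  # assumption: unweighted mean as I(h)
        dev = sum(abs(v - mean) for v in values)
        sum_dev += dev
        sum_dev_meas += math.sqrt(n / (n - 1)) * dev
        sum_int += sum(values)

    if sum_int == 0.0:
        raise ValueError("no reflections with at least two observations")
    return sum_dev / sum_int, sum_dev_meas / sum_int

if __name__ == "__main__":
    # Hypothetical toy data: two unique reflections, two observations each
    obs = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
           ((2, 0, 0), 50.0), ((2, 0, 0), 50.0)]
    r_obs, r_meas = r_factors(obs)
    print(f"R-factor observed = {100 * r_obs:.1f}%  R-meas = {100 * r_meas:.1f}%")
```

Because the weight sqrt(n/(n-1)) is always larger than 1, R-meas is never smaller than the observed R-factor, which matches the relation between the two columns in the table.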

1Y13: Difference between revisions

Revision as of 18:31, 17 March 2011