1Y13: Difference between revisions

12,132 bytes added ,  17 March 2011
Although we could now think of using these two files ("firstparts" and "secondparts" merged) and assume that they are peak and inflection wavelengths, it appears more reasonable to try and solve the structure with SAD - which means using "firstparts" only.


=== First try ===
Let's look at the XSCALE statistics for "firstparts":


First, the shelxc output which shows that these data are quite good:
[[File:e1+e2_firstparts-i-sigi-resol.png]] [[File:e1+e2_firstparts-self-anomcc.png]]
And then we show the result of 100 shelxd trials at substructure solution, trying to find 3 Se atoms at 30 - 3.3 Å resolution (I also tried 3.0, 3.1, 3.2, 3.4 and 3.5 Å, but 3.3 Å was best).
[[File:e1+e2_firstparts-ccall-ccweak.png]] [[File:e1+e2_firstparts-occ-vs-peak.png]]
This looks reasonable, although the absolute value of CCall is so low that there is little hope that the structure can be solved with this amount of information. And indeed, SHELXE did not show a difference between the two hands (in fact we even know that the "original hand" is the correct one, since the inverted hand would correspond to space group #92!).
=== Second try: correcting radiation damage at the level of individual reflections ===
Since we noted significant radiation damage, we could try to correct for it. All we have to do is ask XSCALE to do it, by giving both input files the same CRYSTAL_NAME:
<pre>
UNIT_CELL_CONSTANTS=103.316  103.316  131.456  90.000  90.000  90.000
SPACE_GROUP_NUMBER=96
OUTPUT_FILE=temp.ahkl
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
CRYSTAL_NAME=a
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL
CRYSTAL_NAME=a
</pre>
As a result we obtain:
<pre>
******************************************************************************
          RESULTS FROM ZERO-DOSE EXTRAPOLATION OF REFLECTION INTENSITIES
                      for reference on this subject see:
K. Diederichs, S. McSweeney & R.B.G. Ravelli, Acta Cryst. D59, 903-909(2003).
"Zero-dose extrapolation as part of macromolecular synchrotron data reduction"
******************************************************************************
Radiation damage can lead to localized modifications of the structure.
To correct for this effect, XSCALE modifies the intensity measurements
I(h,i) by individual correction factors,
                      exp{-b(h)*dose(h,i)}
where h,i denotes the i-th observation with unique reflection indices
h, and dose(h,i) the X-ray dose accumulated by the crystal when the
reflection was recorded. Assuming a constant dose for each image
(dose_rate), the accumulated dose when recording image_number(i), on
which I(h,i) was observed, is then
dose(h,i) = starting_dose + dose_rate * (image_number(i)-first_image)
The decay factor b(h) is determined from the assumption that symmetry
related reflections in a data set taken from the same crystal should
have the same intensity after correction. Moreover, b(h) is assumed to
be the same for Friedel-pairs and independent of the X-ray wavelength.
To avoid overfitting the data, XSCALE starts with the hypothesis that
b(h)=0 and rejects this assumption if its probability is below 10.0%.
CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS
-----------------------------------------------------------
First  INPUT_FILE= ../e2_1-369/XDS_ASCII.HKL                       
      CRYSTAL_NAME= a                                               
Second INPUT_FILE= ../e1_1-372/XDS_ASCII.HKL                       
      CRYSTAL_NAME= a                                               
RESOLUTION    NUMBER    CORRELATION
  LIMIT      OF PAIRS      FACTOR
    9.40        210        0.955
    6.64        441        0.955
    5.43        587        0.940
    4.70        692        0.969
    4.20        750        0.949
    3.84        836        0.920
    3.55        809        0.942
    3.32        775        0.925
    3.13        663        0.888
    2.97        557        0.837
    2.83        375        0.681
    2.71        302        0.812
    2.61        212        0.625
    2.51        163        0.508
    2.43          95        0.291
    2.35        139        0.722
    2.28        110        0.688
    2.21          91        0.734
    2.16          88        0.561
    2.10          54        0.126
    total        7949        0.788
          X-RAY DOSE PARAMETERS USED FOR EACH INPUT DATA SET
          --------------------------------------------------
CRYSTAL_NAME= a                                               
        STARTING_DOSE            DOSE_RATE      NAME OF INPUT FILE
    initial    refined      initial    refined
  0.000E+00  8.557E+00  1.000E+00  1.000E+00  ../e1_1-372/XDS_ASCII.HKL                       
  0.000E+00  0.000E+00  1.000E+00  1.024E+00  ../e2_1-369/XDS_ASCII.HKL                       
          STATISTICS OF 0-DOSE CORRECTED DATA FROM EACH CRYSTAL
          -----------------------------------------------------
NUNIQUE = Number of unique reflections with enough symmetry-
          related observations to determine a decay factor b(h)
N0-DOSE = Number of 0-dose extrapolated unique reflections
NERROR  = Number of unique extrapolated reflections expected
          to be overfitted. A large ratio of N0-DOSE/NERROR
          justifies the data correction as carried out here.
S_corr  = mean value of Sigma(I) for 0-dose extrapolated data
S_norm  = mean value of Sigma(I) for the same data but
          without 0-dose extrapolation.
NFREE  = degrees of freedom for calculating S_corr
CRYSTAL_NAME= a                                               
RESOLUTION  NUNIQUE  N0-DOSE  N0-DOSE/  S_corr/    NFREE
  LIMIT                        NERROR    S_norm
    9.40      496    378      68.0      0.543    3180
    6.64      908    703      78.9      0.554    6245
    5.43      1140    894      77.0      0.574    8064
    4.70      1351    1040      77.4      0.599    9671
    4.20      1518    1133      69.9      0.620    10585
    3.84      1665    1187      73.9      0.630    11129
    3.55      1787    1220      65.1      0.671    11917
    3.32      1941    1289      58.1      0.690    12728
    3.13      2042    1172      49.8      0.717    11877
    2.97      2182    1103      48.1      0.750    11498
    2.83      2281    911      40.1      0.798    9662
    2.71      2352    812      34.2      0.825    8611
    2.61      2467    702      34.1      0.848    7383
    2.51      2566    627      31.5      0.875    6595
    2.43      2624    499      31.2      0.895    5295
    2.35      2709    629      31.6      0.888    6240
    2.28      2821    603      28.5      0.893    6147
    2.21      2880    560      32.4      0.905    5758
    2.16      2959    448      30.3      0.907    4394
    2.10      2860    413      29.9      0.924    3745
    total    41549  16323      46.8      0.739  160724
******************************************************************************
              SCALING FACTORS FOR Sigma(I) AS FUNCTION OF RESOLUTION
******************************************************************************
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e1_1-372/XDS_ASCII.HKL                       
                                  RESOLUTION (ANGSTROM) 
        10.33  6.12  4.76  4.03  3.56  3.23  2.97  2.76  2.60  2.46  2.34  2.23  2.14
FACTOR  0.94  0.96  0.88  0.93  0.99  0.98  0.99  0.99  0.99  0.98  1.10  1.00  0.99
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e2_1-369/XDS_ASCII.HKL                       
                                  RESOLUTION (ANGSTROM) 
        10.32  6.11  4.76  4.03  3.56  3.22  2.97  2.76  2.60  2.46  2.34  2.23  2.14
FACTOR  0.96  0.98  0.89  0.94  1.01  1.01  1.02  1.01  1.00  0.99  1.11  1.02  0.98
******************************************************************************
  STATISTICS OF SCALED OUTPUT DATA SET : temp.ahkl                                       
  FILE TYPE:        XDS_ASCII      MERGE=FALSE          FRIEDEL'S_LAW=FALSE
      1270 OUT OF    607179 REFLECTIONS REJECTED
    605909 REFLECTIONS ON OUTPUT FILE
******************************************************************************
DEFINITIONS:
R-FACTOR
observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
expected = expected R-FACTOR derived from Sigma(I)
COMPARED = number of reflections used for calculating R-FACTOR
I/SIGMA  = mean of intensity/Sigma(I) of unique reflections
            (after merging symmetry-related observations)
Sigma(I) = standard deviation of reflection intensity I
            estimated from sample statistics
R-meas  = redundancy independent R-factor (intensities)
Rmrgd-F  = quality of amplitudes (F) in the scaled data set
            For definition of R-meas and Rmrgd-F see
            Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
Anomal  = mean correlation factor between two random subsets
  Corr      of anomalous intensity differences
SigAno  = mean anomalous difference in units of its estimated
            standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)
            are structure factor estimates obtained from the
            merged intensity observations in each parity class.
  Nano    = Number of unique reflections used to calculate
            Anomal_Corr & SigAno. At least two observations
            for each (+ and -) parity are required.
      NOTE:      Friedel pairs are treated as different reflections.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION    NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA  R-meas  Rmrgd-F  Anomal  SigAno  Nano
  LIMIT    OBSERVED  UNIQUE  POSSIBLE    OF DATA  observed  expected                                      Corr
    9.40        6095    844      883      95.6%      2.0%      2.6%    6084  73.41    2.1%    0.9%    87%  2.706    313
    6.64      12006    1611      1621      99.4%      2.0%      2.8%    12004  68.81    2.1%    1.0%    84%  2.555    684
    5.43      15339    2065      2086      99.0%      2.2%      2.8%    15338  63.28    2.4%    1.2%    82%  2.409    908
    4.70      18697    2486      2498      99.5%      1.9%      2.6%    18694  70.84    2.1%    1.0%    75%  1.855    1120
    4.20      21080    2796      2821      99.1%      2.0%      2.7%    21078  66.87    2.1%    1.1%    67%  1.727    1270
    3.84      23300    3094      3117      99.3%      2.5%      3.0%    23297  58.10    2.7%    1.5%    64%  1.551    1420
    3.55      25676    3344      3366      99.3%      3.1%      3.6%    25676  48.56    3.4%    1.9%    50%  1.326    1548
    3.32      28013    3633      3653      99.5%      3.9%      4.3%    28011  41.76    4.1%    2.8%    37%  1.244    1687
    3.13      30254    3841      3848      99.8%      5.7%      6.0%    30252  32.18    6.1%    4.1%    35%  1.125    1796
    2.97      32595    4114      4118      99.9%      8.8%      9.1%    32594  23.53    9.4%    6.8%    26%  1.038    1925
    2.83      34368    4313      4320      99.8%      12.8%    13.3%    34366  17.65    13.6%    9.5%    21%  0.989    2030
    2.71      35627    4472      4478      99.9%      16.9%    17.4%    35625  14.15    18.1%    12.2%    18%  0.965    2108
    2.61      37300    4704      4710      99.9%      25.8%    26.4%    37297    9.70    27.6%    19.3%    16%  0.930    2223
    2.51      38975    4890      4896      99.9%      33.8%    34.9%    38975    7.68    36.1%    24.1%    14%  0.888    2315
    2.43      39971    5019      5027      99.8%      49.1%    50.8%    39967    5.47    52.5%    37.2%    8%  0.810    2380
    2.35      39968    5179      5222      99.2%      67.9%    67.5%    39960    4.07    72.7%    50.4%    25%  0.927    2445
    2.28      42067    5388      5423      99.4%      89.9%    94.3%    42063    3.03    96.2%    63.5%    16%  0.796    2548
    2.21      43011    5538      5541      99.9%      82.3%    83.3%    43010    3.16    88.1%    57.9%    14%  0.871    2644
    2.16      42577    5697      5703      99.9%    108.5%    112.2%    42574    2.37  116.6%    83.1%    3%  0.760    2720
    2.10      38988    5633      5912      95.3%    142.1%    144.2%    38936    1.67  153.5%  119.2%    6%  0.772    2638
    total      605907  78661    79243      99.3%      5.5%      6.1%  605801  21.72    5.9%    11.3%    27%  1.095  36722
</pre>
We note that the values in the table "CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS" are really high (above 0.9 in all shells down to 3.32 Å), which confirms that zero-dose extrapolation is a valid procedure for these data.
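The dose model printed in the XSCALE output above can be illustrated with a small Python sketch. It is only an illustration and not XSCALE's actual algorithm: the function names are made up, `first_image = 1` is an assumption based on the directory names, the intensities are synthetic, and the sign convention (zero-dose intensity = I * exp(-b * dose)) is our reading of the correction factor quoted in the output. The significance test that XSCALE applies before accepting b(h) ≠ 0 is not reproduced.

```python
import math

def accumulated_dose(image_number, first_image, starting_dose, dose_rate):
    """dose(h,i) = starting_dose + dose_rate * (image_number(i) - first_image)"""
    return starting_dose + dose_rate * (image_number - first_image)

def fit_decay(doses, intensities):
    """Least-squares fit of ln I = ln I0 + b * dose.

    Returns (I0, b). Only valid for strictly positive intensities;
    this restriction is a simplification of this sketch.
    """
    n = len(doses)
    logs = [math.log(i) for i in intensities]
    mean_d = sum(doses) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in doses)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(doses, logs))
    b = sxy / sxx
    i0 = math.exp(mean_l - b * mean_d)
    return i0, b

def zero_dose_intensity(intensity, b, dose):
    """Apply the correction factor exp{-b(h) * dose(h,i)} to one observation."""
    return intensity * math.exp(-b * dose)

# Refined dose parameters for e1 taken from the table above;
# first_image = 1 is an assumption.
doses = [accumulated_dose(n, 1, 8.557, 1.0) for n in (1, 100, 200, 372)]

# Synthetic observations of one hypothetical reflection (not real data).
observed = [1000.0 * math.exp(-0.001 * d) for d in doses]

i0_fit, b_fit = fit_decay(doses, observed)
corrected = [zero_dose_intensity(i, b_fit, d) for i, d in zip(observed, doses)]
print(i0_fit, b_fit, corrected)
```

After the correction, all symmetry-related observations of a reflection should agree with each other, which is exactly the assumption from which XSCALE determines b(h).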
Comparison of the last table with that of the previous paragraph, i.e. without zero-dose extrapolation, shows that the I/sigma, the anomalous correlation coefficients and the SigAno are significantly higher. Does this translate into better structure solution? It does:
[[File:1y13-raddam-ccall-ccweak-raddam.png]]
[[File:1y13-raddam-site-occ-raddam.png]]
[[File:1y13-raddam-contrast-raddam.png]]
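To compare statistics tables such as the one above shell by shell, the rows can be read into Python. The following is a minimal sketch, assuming whitespace-separated rows with exactly the 14 columns shown above; the column names and the functions `parse_row` and `difference` are made up for this illustration, and real XSCALE.LP files may contain fused columns or extra markers that this simple parser does not handle.

```python
# Hypothetical column names for the 14 fields of one table row.
COLUMNS = ("resolution", "observed", "unique", "possible", "completeness",
           "r_observed", "r_expected", "compared", "i_over_sigma",
           "r_meas", "rmrgd_f", "anom_corr", "sig_ano", "n_ano")

def parse_row(line):
    """Turn one row of the statistics table into a dict of numbers."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(fields)))
    row = {}
    for name, field in zip(COLUMNS, fields):
        if name == "resolution" and field == "total":
            row[name] = field              # keep the label of the summary row
        else:
            row[name] = float(field.rstrip("%*"))
    return row

def difference(before, after, key):
    """Per-shell change of one statistic between two parsed tables."""
    return [(a["resolution"], a[key] - b[key]) for b, a in zip(before, after)]

# Two rows copied from the table above.
rows = [parse_row(line) for line in (
    "3.32  28013  3633  3653  99.5%  3.9%  4.3%  28011  41.76  4.1%  2.8%  37%  1.244  1687",
    "total  605907  78661  79243  99.3%  5.5%  6.1%  605801  21.72  5.9%  11.3%  27%  1.095  36722",
)]
for row in rows:
    print(row["resolution"], row["i_over_sigma"], row["anom_corr"], row["sig_ano"])
```

Calling `difference(rows_without, rows_with, "sig_ano")` on the parsed tables from both XSCALE runs then lists the change in SigAno for each resolution shell.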