1Y13: Difference between revisions

25,056 bytes added ,  24 March 2020
(24 intermediate revisions by 2 users not shown)
Line 1: Line 1:
The structure is [http://www.rcsb.org/pdb/explore/explore.do?structureId=1Y13 deposited] in the PDB, solved with SAD and refined at a resolution of 2.2 Å in spacegroup P4(3)2(1)2 (#96).
The data for this project were provided by Jürgen Bosch (SGPP) and are linked to [http://bl831.als.lbl.gov/example_data_sets/ACA2011/DPWTP-website/index.html the ACA 2011 workshop website].  
The data for this project were provided by Jürgen Bosch (SGPP) and are linked to [http://bl831.als.lbl.gov/example_data_sets/ACA2011/DPWTP-website/index.html the ACA 2011 workshop website] and [https://{{SERVERNAME}}/pub/xds-datared/1y13/ here].  
There are two high-resolution (2 Å) datasets, E1 (wavelength 0.9794 Å) and E2 (0.9174 Å), collected with 0.25° increments at an ALS beamline on June 27, 2004, and a weaker dataset collected earlier at an SSRL beamline. We will only use the former two datasets here.


Line 58: Line 58:
       a        b          ISa
  6.058E+00  3.027E-04  23.35
 
 
  ...
   
   
Line 91: Line 90:
* the number of MISFITS is higher than 1%. From the first long table (fine-grained in resolution) in CORRECT.LP we learn that the misfits are due to faint high-resolution ice rings - so this is a problem intrinsic to the data, and not to their mode of processing.


To my surprise, pointless does not agree with CORRECT's standpoint:
To my surprise, pointless ("pointless xdsin XDS_ASCII.HKL") does not agree with CORRECT's standpoint:
<pre>
<pre>
Scores for each symmetry element
Line 190: Line 189:
The easiest thing one can do is to inspect INTEGRATE.LP - this lists scale factor, beam divergence and mosaicity for every frame. There's a [[jiffies|jiffy]] called "scalefactors" which greps the relevant lines from INTEGRATE.LP ("scalefactors > scales.log"). This shows the scale factor (column 3):
[[File:1y13-e1-scales.png]]
demonstrating that "something happens" between frame 372 and 373 (of course one has to look at the table to find the exact numbers).  
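To find the exact frame numbers without reading the table by eye, one could scan scales.log for the largest change between consecutive frames. The following is only a sketch: it assumes whitespace-separated columns with the frame number in column 1 and the scale factor in column 3 (the column named above), and both the function name and the sample numbers are invented for illustration.

```python
def largest_scale_jump(lines):
    """Return (frame_before, frame_after, jump) for the biggest scale-factor change."""
    frames = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # not a data line
        try:
            # Assumption: column 1 = frame number, column 3 = scale factor
            frames.append((int(fields[0]), float(fields[2])))
        except ValueError:
            continue  # header or other non-numeric line
    best = None
    for (f0, s0), (f1, s1) in zip(frames, frames[1:]):
        jump = abs(s1 - s0)
        if best is None or jump > best[2]:
            best = (f0, f1, jump)
    return best


# Invented sample lines, NOT taken from the real scales.log
sample = [
    "371  0  1.012",
    "372  0  1.010",
    "373  0  0.874",
    "374  0  0.876",
]
jump = largest_scale_jump(sample)
print(jump[0], jump[1])
```

With the real file, the same function can be fed an open file handle, e.g. `largest_scale_jump(open("scales.log"))`.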


Line 229: Line 229:
thus proving that both datasets were interrupted for 20 minutes around frame 370.


The really weird thing here is that both datasets appear to be collected at the same time, but at different wavelengths (E1 at 0.9794 Å, E2 at 0.9184 Å), and yet the individual parts merge as follows: using the following [[XSCALE.INP]]:
Interestingly, both datasets appear to be collected at the same time, but at different wavelengths (E1 at 0.9794 Å, E2 at 0.9184 Å), and yet the individual parts merge as follows: using the following XSCALE.INP:
  UNIT_CELL_CONSTANTS=103.316  103.316  131.456  90.000  90.000  90.000
  SPACE_GROUP_NUMBER=96
Line 277: Line 277:
proving that the second parts of datasets E1 and E2 should be treated separately from the first parts.


Upon inspection of the cell parameters, we find that the cell axes of the second "halves" are shorter by a factor of 0.9908 when compared with the first parts. This suggests that they were collected at a longer wavelength! But then the wavelength values in the headers are most likely completely wrong: we can speculate that the two first parts were collected at the SeMet peak wavelength, and the two second parts at the inflection wavelength. 
Upon inspection of the cell parameters, we find that the cell axes of the second "halves" are shorter by a factor of 0.9908 when compared with the first parts. This suggests that they were collected at a longer wavelength, or that radiation damage changed the cell parameters during the 20-minute break - usually it makes them longer (Ravelli ''et al.'' (2002), J. Synchrotron Rad. 9, 355-360), but this may be the exception to the rule! Maybe the crystal was even exposed to the beam during that time, in an attempt to try radiation-damage induced phasing (see e.g. Ravelli ''et al.'' (2003), Structure 11, 217-220).
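To get a feeling for what the wavelength explanation would imply, here is a small sketch. It assumes that the shrinkage by 0.9908 is caused ''only'' by a wrong wavelength in the header (everything else, such as detector distance, being correct), in which case the refined cell scales linearly with the assumed wavelength by Bragg's law. The function names are mine; the conversion constant 12398.42 eV·Å is the usual hc value.

```python
HC_EV_ANGSTROM = 12398.42  # photon energy (eV) times wavelength (Å)


def implied_wavelength(header_wavelength, cell_scale):
    """Wavelength that would undo the apparent cell shrinkage.

    At a fixed diffraction angle, d = lambda / (2 sin(theta)), so the refined
    cell lengths are proportional to the wavelength assumed during processing.
    Assumption: the wavelength is the only wrong parameter.
    """
    return header_wavelength / cell_scale


def energy_ev(wavelength):
    """Photon energy in eV for a wavelength given in Å."""
    return HC_EV_ANGSTROM / wavelength


header = 0.9794   # Å, E1 header value quoted above
scale = 0.9908    # cell of second parts / cell of first parts

true_wavelength = implied_wavelength(header, scale)
shift = energy_ev(header) - energy_ev(true_wavelength)
print(f"{true_wavelength:.4f} A, {shift:.0f} eV lower in energy")
```

Under this assumption the second parts of E1 would correspond to roughly 0.9885 Å, i.e. an energy about 116 eV below the header value.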
 
The almost-simultaneous DATEs in the headers may be explained by a wavelength-switching measuring strategy which alternately collects 4 frames at one wavelength as E1, then changes the wavelength and collects 4 frames into E2.
 
So this little detective work appears to give us useful information about what happened on the morning of Sunday June 27, 2004 at ALS beamline 821 - but some questions remain.
 
== Further analysis of datasets E1 and E2 ==
 
Here, we try to learn more about the constituents of "firstparts".
 
Running "[[xdsstat]] > XDSSTAT.LP" in the e1_1-372 and e2_1-369 directories, we obtain statistics output not available from CORRECT. We open XDSSTAT.LP with the CCP4 program "loggraph", and take a look at [[misfits.pck]], [[rf.pck]], and the other files produced by [[xdsstat]], using [[VIEW]] or [[XDS-Viewer]]:
 
[[File:e1_1-372-xdsstat1.png]]


The almost-simultaneous DATEs in the headers may be explained by an inverse-beam measuring strategy which alternatingly collects 4 frames in one orientation as E1, then rotates the spindle by 180° and collects 4 frames into E2. The beamline software
Reflections and misfits, by frame - looks normal


So this little detective work appears to tell us what happened in the morning of Sunday June 27, 2004 at ALS beamline 821.
[[File:e1_1-372-xdsstat2.png]]


Intensity and sigma by frame - looks normal


[[File:e1_1-372-xdsstat3.png]]


"partiality" and profile agreement, by frame - looks good but it's clear that the profiles at high frame number agree worse with the average profiles, possibly due to radiation damage
[[File:e1_1-372-xdsstat4.png]]
R_meas, by frame, clearly showing good R_meas in the middle of the dataset
[[File:e1_1-372-xdsstat-raddam.png]]
R_d - an R-factor which directly depends on radiation damage. This is calculated as a function of frame number difference and the linear rise indicates significant radiation damage that should be correctable in [[XSCALE]], using the CRYSTAL_NAME keyword.
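For readers who want to see how such a statistic is accumulated, here is a minimal sketch of an R_d-like calculation. It uses the pairwise form as I understand it (sum of |I_i - I_j| over all pairs of symmetry-related observations recorded d frames apart, divided by the sum of their mean intensities); treat the exact normalisation as an assumption, since [[xdsstat]] is the reference implementation. The toy observations are invented.

```python
from collections import defaultdict
from itertools import combinations


def r_d(observations):
    """observations: iterable of (unique_hkl, frame, intensity).

    Returns {frame_difference: R_d}, pairing only observations that
    share the same unique (symmetry-reduced) reflection index.
    """
    by_hkl = defaultdict(list)
    for hkl, frame, intensity in observations:
        by_hkl[hkl].append((frame, intensity))

    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for obs in by_hkl.values():
        for (f1, i1), (f2, i2) in combinations(obs, 2):
            d = abs(f1 - f2)
            numerator[d] += abs(i1 - i2)
            denominator[d] += 0.5 * (i1 + i2)
    return {d: numerator[d] / denominator[d]
            for d in sorted(numerator) if denominator[d] > 0}


# Invented toy data: intensities that fade with frame number
toy = [
    ((1, 0, 0), 1, 100.0),
    ((1, 0, 0), 11, 90.0),
    ((1, 0, 0), 101, 60.0),
    ((2, 1, 0), 5, 50.0),
    ((2, 1, 0), 15, 46.0),
]
result = r_d(toy)
for d, value in result.items():
    print(d, round(value, 3))
```

In real use the frame differences would be binned, and a value that rises with d (as in the plot above) is what points to radiation damage.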
[[File:e1_1-372-misfits.png]]
misfits mapped on the detector, showing ice rings.
[[File:e1_1-372-rf.png]]
R_meas mapped on the detector, showing elevated R_meas at the location of the ice rings.
== Solving the structure with pseudo-SAD ==
It appears reasonable to discard the "second parts" since they are strongly influenced by radiation damage. Then, we could
# merge together (into one output file) the two first parts of E1 and E2, thus obtaining a single pseudo-SAD dataset. The reason for doing this is that the anomalous signal of both datasets is so strong, and their (isomorphous) difference is weak (after all, the correlation coefficient is 1.000 !)
# keep the first parts of E1 (inflection, according to the documentation) and E2 (high-energy remote) separate, and treat them as MAD (or rather, DAD).
=== First try ===
Let's look at the XSCALE statistics for the merged-together "firstparts":
      NOTE:      Friedel pairs are treated as different reflections.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION    NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA  R-meas  Rmrgd-F  Anomal  SigAno  Nano
  LIMIT    OBSERVED  UNIQUE  POSSIBLE    OF DATA  observed  expected                                      Corr
    9.40        6122    844      883      95.6%      2.9%      3.5%    6111  54.76    3.2%    1.4%    79%  2.137    313
    6.64      12037    1611      1621      99.4%      2.9%      3.6%    12035  51.54    3.1%    1.5%    80%  2.259    684
    5.43      15348    2065      2086      99.0%      3.5%      3.7%    15347  47.79    3.7%    1.7%    78%  2.294    908
    4.70      18714    2487      2498      99.6%      3.0%      3.7%    18711  49.55    3.2%    1.5%    72%  1.712    1120
    4.20      21104    2797      2821      99.1%      3.1%      3.7%    21102  47.24    3.3%    1.7%    72%  1.727    1271
    3.84      23316    3095      3117      99.3%      3.8%      4.0%    23313  42.74    4.1%    2.1%    65%  1.617    1420
    3.55      25693    3345      3366      99.4%      4.4%      4.5%    25693  37.93    4.7%    2.6%    50%  1.411    1548
    3.32      28017    3633      3653      99.5%      5.2%      5.2%    28015  32.89    5.6%    3.6%    40%  1.335    1687
    3.13      30266    3842      3848      99.8%      7.2%      7.2%    30264  25.87    7.7%    4.8%    36%  1.158    1797
    2.97      32595    4114      4118      99.9%      10.4%    10.4%    32594  19.26    11.1%    7.7%    30%  1.068    1925
    2.83      34384    4315      4320      99.9%      14.3%    14.8%    34382  14.88    15.3%    10.3%    20%  0.937    2031
    2.71      35654    4475      4478      99.9%      18.3%    19.1%    35652  12.13    19.5%    13.1%    15%  0.891    2110
    2.61      37307    4705      4710      99.9%      27.5%    28.8%    37304    8.44    29.4%    19.8%    11%  0.834    2224
    2.51      38997    4893      4896      99.9%      35.5%    38.0%    38997    6.78    38.0%    26.0%    10%  0.817    2318
    2.43      40036    5026      5027      100.0%      51.3%    55.1%    40032    4.92    54.8%    38.0%    2%  0.738    2387
    2.35      39975    5180      5222      99.2%      71.3%    68.9%    39967    3.78    76.4%    52.7%    21%  0.887    2446
    2.28      42041    5385      5423      99.3%      93.7%    93.1%    42037    2.90  100.3%    66.7%    11%  0.798    2548
    2.21      43012    5538      5541      99.9%      85.7%    88.3%    43011    2.87    91.8%    58.8%    10%  0.818    2644
    2.16      42610    5701      5703      100.0%    113.6%    120.7%    42607    2.13  122.0%    85.4%    4%  0.722    2724
    2.10      38996    5634      5912      95.3%    146.1%    153.9%    38944    1.50  157.8%  122.7%    3%  0.711    2639
    total      606224  78685    79243      99.3%      6.7%      7.2%  606118  16.88    7.2%    12.0%    29%  1.055  36744
The anomalous correlation is good at low resolution, though not outstanding. At high resolution it rises again but this is presumably due to the ice rings.
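To compare such tables between runs, the "Anomal Corr" column can be pulled out of XSCALE.LP with a few lines of Python. This is a sketch under assumptions: it relies on the 14-column row layout shown above, and the 30% threshold in signal_limit() is just a rule-of-thumb value I chose for illustration, not something XSCALE prescribes.

```python
def anomalous_shells(table_lines):
    """Return [(resolution_limit, anomalous_correlation_percent), ...]."""
    shells = []
    for line in table_lines:
        fields = line.split()
        if len(fields) != 14:
            continue  # not a row of the statistics table
        try:
            resolution = float(fields[0])
        except ValueError:
            continue  # header line or the "total" row
        # Column 12 is "Anomal Corr", printed like "79%"
        shells.append((resolution, int(fields[11].rstrip("%*"))))
    return shells


def signal_limit(shells, min_corr=30):
    """Last shell (rows run from low to high resolution) before the
    anomalous correlation first drops below min_corr (assumed threshold)."""
    limit = None
    for resolution, corr in shells:
        if corr < min_corr:
            break
        limit = resolution
    return limit


# Four rows and the total line copied from the table above
rows = [
    "3.32  28017  3633  3653  99.5%  5.2%  5.2%  28015  32.89  5.6%  3.6%  40%  1.335  1687",
    "3.13  30266  3842  3848  99.8%  7.2%  7.2%  30264  25.87  7.7%  4.8%  36%  1.158  1797",
    "2.97  32595  4114  4118  99.9%  10.4%  10.4%  32594  19.26  11.1%  7.7%  30%  1.068  1925",
    "2.83  34384  4315  4320  99.9%  14.3%  14.8%  34382  14.88  15.3%  10.3%  20%  0.937  2031",
    "total  606224  78685  79243  99.3%  6.7%  7.2%  606118  16.88  7.2%  12.0%  29%  1.055  36744",
]
shells = anomalous_shells(rows)
print(shells, signal_limit(shells))
```

Such a limit is only a starting point; the resolution cutoff for substructure solution is still best found by trying several values.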
I like to use [[ccp4com:hkl2map|hkl2map]] which runs [[ccp4com:SHELX C/D/E|SHELXC]], [[ccp4com:SHELX C/D/E|SHELXD]] and [[ccp4com:SHELX C/D/E|SHELXE]] from its GUI. Before doing so, we have to run XDSCONV with the following XDSCONV.INP:
INPUT_FILE=firstparts.hkl
OUTPUT_FILE=temp.hkl SHELX
First, the shelxc output which shows that these data are quite good:
[[File:e1+e2_firstparts-i-sigi-resol.png]] [[File:e1+e2_firstparts-self-anomcc.png]]
And then we show the result of 100 SHELXD trials at substructure solution, trying to find 3 Se atoms at 30 - 3.3 Å resolution (I also tried 3.0, 3.1, 3.2, 3.4 and 3.5 Å, but 3.3 Å was best).
[[File:e1+e2_firstparts-ccall-ccweak.png]] [[File:e1+e2_firstparts-occ-vs-peak.png]]
This looks reasonable, although the absolute value of CCall is so low that there is little hope that the structure can be solved with this amount of information. And indeed, SHELXE did not show a difference between the two hands (in fact we even know that the "original hand" is the correct one since the inverted hand would correspond to spacegroup #92 !).
=== Second try: correcting radiation damage by 0-dose extrapolation ===
Since we noted significant radiation damage we could try to correct that. All we have to do is ask XSCALE to do zero-dose extrapolation:
<pre>
UNIT_CELL_CONSTANTS=103.316  103.316  131.456  90.000  90.000  90.000
SPACE_GROUP_NUMBER=96
OUTPUT_FILE=temp.ahkl
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
CRYSTAL_NAME=a
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL
CRYSTAL_NAME=a
</pre>
As a result we obtain in XSCALE.LP:
<pre>
******************************************************************************
          RESULTS FROM ZERO-DOSE EXTRAPOLATION OF REFLECTION INTENSITIES
                      for reference on this subject see:
K. Diederichs, S. McSweeney & R.B.G. Ravelli, Acta Cryst. D59, 903-909(2003).
"Zero-dose extrapolation as part of macromolecular synchrotron data reduction"
******************************************************************************
Radiation damage can lead to localized modifications of the structure.
To correct for this effect, XSCALE modifies the intensity measurements
I(h,i) by individual correction factors,
                      exp{-b(h)*dose(h,i)}
where h,i denotes the i-th observation with unique reflection indices
h, and dose(h,i) the X-ray dose accumulated by the crystal when the
reflection was recorded. Assuming a constant dose for each image
(dose_rate), the accumulated dose when recording image_number(i), on
which I(h,i) was observed, is then
dose(h,i) = starting_dose + dose_rate * (image_number(i)-first_image)
The decay factor b(h) is determined from the assumption that symmetry
related reflections in a data set taken from the same crystal should
have the same intensity after correction. Moreover, b(h) is assumed to
be the same for Friedel-pairs and independent of the X-ray wavelength.
To avoid overfitting the data, XSCALE starts with the hypothesis that
b(h)=0 and rejects this assumption if its probability is below 10.0%.
CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS
-----------------------------------------------------------
First  INPUT_FILE= ../e2_1-369/XDS_ASCII.HKL                       
      CRYSTAL_NAME= a                                               
Second INPUT_FILE= ../e1_1-372/XDS_ASCII.HKL                       
      CRYSTAL_NAME= a                                               
RESOLUTION    NUMBER    CORRELATION
  LIMIT      OF PAIRS      FACTOR
    9.40        210        0.955
    6.64        441        0.955
    5.43        587        0.940
    4.70        692        0.969
    4.20        750        0.949
    3.84        836        0.920
    3.55        809        0.942
    3.32        775        0.925
    3.13        663        0.888
    2.97        557        0.837
    2.83        375        0.681
    2.71        302        0.812
    2.61        212        0.625
    2.51        163        0.508
    2.43          95        0.291
    2.35        139        0.722
    2.28        110        0.688
    2.21          91        0.734
    2.16          88        0.561
    2.10          54        0.126
    total        7949        0.788
          X-RAY DOSE PARAMETERS USED FOR EACH INPUT DATA SET
          --------------------------------------------------
CRYSTAL_NAME= a                                               
        STARTING_DOSE            DOSE_RATE      NAME OF INPUT FILE
    initial    refined      initial    refined
  0.000E+00  8.557E+00  1.000E+00  1.000E+00  ../e1_1-372/XDS_ASCII.HKL                       
  0.000E+00  0.000E+00  1.000E+00  1.024E+00  ../e2_1-369/XDS_ASCII.HKL                       
          STATISTICS OF 0-DOSE CORRECTED DATA FROM EACH CRYSTAL
          -----------------------------------------------------
NUNIQUE = Number of unique reflections with enough symmetry-
          related observations to determine a decay factor b(h)
N0-DOSE = Number of 0-dose extrapolated unique reflections
NERROR  = Number of unique extrapolated reflections expected
          to be overfitted. A large ratio of N0-DOSE/NERROR
          justifies the data correction as carried out here.
S_corr  = mean value of Sigma(I) for 0-dose extrapolated data
S_norm  = mean value of Sigma(I) for the same data but
          without 0-dose extrapolation.
NFREE  = degrees of freedom for calculating S_corr
CRYSTAL_NAME= a                                               
RESOLUTION  NUNIQUE  N0-DOSE  N0-DOSE/  S_corr/    NFREE
  LIMIT                        NERROR    S_norm
    9.40      496    378      68.0      0.543    3180
    6.64      908    703      78.9      0.554    6245
    5.43      1140    894      77.0      0.574    8064
    4.70      1351    1040      77.4      0.599    9671
    4.20      1518    1133      69.9      0.620    10585
    3.84      1665    1187      73.9      0.630    11129
    3.55      1787    1220      65.1      0.671    11917
    3.32      1941    1289      58.1      0.690    12728
    3.13      2042    1172      49.8      0.717    11877
    2.97      2182    1103      48.1      0.750    11498
    2.83      2281    911      40.1      0.798    9662
    2.71      2352    812      34.2      0.825    8611
    2.61      2467    702      34.1      0.848    7383
    2.51      2566    627      31.5      0.875    6595
    2.43      2624    499      31.2      0.895    5295
    2.35      2709    629      31.6      0.888    6240
    2.28      2821    603      28.5      0.893    6147
    2.21      2880    560      32.4      0.905    5758
    2.16      2959    448      30.3      0.907    4394
    2.10      2860    413      29.9      0.924    3745
    total    41549  16323      46.8      0.739  160724
******************************************************************************
              SCALING FACTORS FOR Sigma(I) AS FUNCTION OF RESOLUTION
******************************************************************************
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e1_1-372/XDS_ASCII.HKL                       
                                  RESOLUTION (ANGSTROM) 
        10.33  6.12  4.76  4.03  3.56  3.23  2.97  2.76  2.60  2.46  2.34  2.23  2.14
FACTOR  0.94  0.96  0.88  0.93  0.99  0.98  0.99  0.99  0.99  0.98  1.10  1.00  0.99
SCALING FACTORS FOR Sigma(I) FOR DATA SET ../e2_1-369/XDS_ASCII.HKL                       
                                  RESOLUTION (ANGSTROM) 
        10.32  6.11  4.76  4.03  3.56  3.22  2.97  2.76  2.60  2.46  2.34  2.23  2.14
FACTOR  0.96  0.98  0.89  0.94  1.01  1.01  1.02  1.01  1.00  0.99  1.11  1.02  0.98
******************************************************************************
  STATISTICS OF SCALED OUTPUT DATA SET : temp.ahkl                                       
  FILE TYPE:        XDS_ASCII      MERGE=FALSE          FRIEDEL'S_LAW=FALSE
      1270 OUT OF    607179 REFLECTIONS REJECTED
    605909 REFLECTIONS ON OUTPUT FILE
******************************************************************************
DEFINITIONS:
R-FACTOR
observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
expected = expected R-FACTOR derived from Sigma(I)
COMPARED = number of reflections used for calculating R-FACTOR
I/SIGMA  = mean of intensity/Sigma(I) of unique reflections
            (after merging symmetry-related observations)
Sigma(I) = standard deviation of reflection intensity I
            estimated from sample statistics
R-meas  = redundancy independent R-factor (intensities)
Rmrgd-F  = quality of amplitudes (F) in the scaled data set
            For definition of R-meas and Rmrgd-F see
            Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
Anomal  = mean correlation factor between two random subsets
  Corr      of anomalous intensity differences
SigAno  = mean anomalous difference in units of its estimated
            standard deviation (|F(+)-F(-)|/Sigma). F(+), F(-)
            are structure factor estimates obtained from the
            merged intensity observations in each parity class.
  Nano    = Number of unique reflections used to calculate
            Anomal_Corr & SigAno. At least two observations
            for each (+ and -) parity are required.
      NOTE:      Friedel pairs are treated as different reflections.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION    NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA  R-meas  Rmrgd-F  Anomal  SigAno  Nano
  LIMIT    OBSERVED  UNIQUE  POSSIBLE    OF DATA  observed  expected                                      Corr
    9.40        6095    844      883      95.6%      2.0%      2.6%    6084  73.41    2.1%    0.9%    87%  2.706    313
    6.64      12006    1611      1621      99.4%      2.0%      2.8%    12004  68.81    2.1%    1.0%    84%  2.555    684
    5.43      15339    2065      2086      99.0%      2.2%      2.8%    15338  63.28    2.4%    1.2%    82%  2.409    908
    4.70      18697    2486      2498      99.5%      1.9%      2.6%    18694  70.84    2.1%    1.0%    75%  1.855    1120
    4.20      21080    2796      2821      99.1%      2.0%      2.7%    21078  66.87    2.1%    1.1%    67%  1.727    1270
    3.84      23300    3094      3117      99.3%      2.5%      3.0%    23297  58.10    2.7%    1.5%    64%  1.551    1420
    3.55      25676    3344      3366      99.3%      3.1%      3.6%    25676  48.56    3.4%    1.9%    50%  1.326    1548
    3.32      28013    3633      3653      99.5%      3.9%      4.3%    28011  41.76    4.1%    2.8%    37%  1.244    1687
    3.13      30254    3841      3848      99.8%      5.7%      6.0%    30252  32.18    6.1%    4.1%    35%  1.125    1796
    2.97      32595    4114      4118      99.9%      8.8%      9.1%    32594  23.53    9.4%    6.8%    26%  1.038    1925
    2.83      34368    4313      4320      99.8%      12.8%    13.3%    34366  17.65    13.6%    9.5%    21%  0.989    2030
    2.71      35627    4472      4478      99.9%      16.9%    17.4%    35625  14.15    18.1%    12.2%    18%  0.965    2108
    2.61      37300    4704      4710      99.9%      25.8%    26.4%    37297    9.70    27.6%    19.3%    16%  0.930    2223
    2.51      38975    4890      4896      99.9%      33.8%    34.9%    38975    7.68    36.1%    24.1%    14%  0.888    2315
    2.43      39971    5019      5027      99.8%      49.1%    50.8%    39967    5.47    52.5%    37.2%    8%  0.810    2380
    2.35      39968    5179      5222      99.2%      67.9%    67.5%    39960    4.07    72.7%    50.4%    25%  0.927    2445
    2.28      42067    5388      5423      99.4%      89.9%    94.3%    42063    3.03    96.2%    63.5%    16%  0.796    2548
    2.21      43011    5538      5541      99.9%      82.3%    83.3%    43010    3.16    88.1%    57.9%    14%  0.871    2644
    2.16      42577    5697      5703      99.9%    108.5%    112.2%    42574    2.37  116.6%    83.1%    3%  0.760    2720
    2.10      38988    5633      5912      95.3%    142.1%    144.2%    38936    1.67  153.5%  119.2%    6%  0.772    2638
    total      605907  78661    79243      99.3%      5.5%      6.1%  605801  21.72    5.9%    11.3%    27%  1.095  36722
</pre>
We note that the values in the "CORRELATION OF COMMON DECAY-FACTORS BETWEEN INPUT DATA SETS" table are high (above 0.9 down to the 3.32 Å shell), which supports the hypothesis that zero-dose extrapolation is a valid procedure here.
Comparison of the last table with that of the previous paragraph, i.e. without zero-dose extrapolation, shows that the I/sigma, the anomalous correlation coefficients and the SigAno are significantly higher. Does this translate into better structure solution? It does:
[[File:1y13-raddam-ccall-ccweak-raddam.png]]
[[File:1y13-raddam-site-occ-raddam.png]]
[[File:1y13-raddam-contrast-raddam.png]]
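The dose model quoted in the XSCALE.LP output above can be written out in a few lines. This is only an illustration of the two formulas printed by XSCALE, using the refined STARTING_DOSE and DOSE_RATE values from the log; the decay factor b is a made-up number, first_image=1 is my assumption, and the function names are not part of any XDS tool.

```python
import math


def accumulated_dose(starting_dose, dose_rate, image_number, first_image=1):
    """dose(h,i) = starting_dose + dose_rate * (image_number(i) - first_image)"""
    return starting_dose + dose_rate * (image_number - first_image)


def correction_factor(b, dose):
    """Individual correction factor exp{-b(h) * dose(h,i)} from the log."""
    return math.exp(-b * dose)


# Refined values from the "X-RAY DOSE PARAMETERS" table above
e1_last = accumulated_dose(8.557, 1.000, 372)   # last frame of e1_1-372
e2_last = accumulated_dose(0.000, 1.024, 369)   # last frame of e2_1-369

b_example = -1.0e-4   # hypothetical decay factor for one reflection
print(e1_last, e2_last, correction_factor(b_example, e1_last))
```

The two datasets end at nearly the same accumulated dose (about 380 and 377 in these arbitrary units), which is what one would expect if the frames were collected in an interleaved fashion.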
=== Automatically building the main chain of 452 out of 519 residues ===
Based on the sites obtained by SHELXD, we run
shelxe.beta -a -q -h -b -s0.585 -m40 raddam raddam_fa
This already builds a significant number of residues, but also gives an improved list of heavy atom sites - there are actually 6 sites instead of the 5 that SHELXD wrote out (yes, we had asked SHELXD for 3 sites since there are 3 Met residues, but SHELXD as always was smarter than we are). We "mv raddam.hat raddam_fa.res" for another run of SHELXE:
shelxe.beta -a -q -h6 -b -s0.585 -m40 -n3 raddam raddam_fa
and get
<pre>
  452 residues left after pruning, divided into chains as follows:
A:  15  B:  5  C:  22  D:  22  E:  27  F:  62  G: 263  H:  36
CC for partial structure against native data =  39.83 %
------------------------------------------------------------------------------
Global autotracing cycle  4
<wt> = 0.300, Contrast = 0.447, Connect. = 0.705 for dens.mod. cycle 1
<wt> = 0.300, Contrast = 0.660, Connect. = 0.781 for dens.mod. cycle 2
<wt> = 0.300, Contrast = 0.723, Connect. = 0.801 for dens.mod. cycle 3
<wt> = 0.300, Contrast = 0.762, Connect. = 0.807 for dens.mod. cycle 4
Pseudo-free CC = 64.88 %
<wt> = 0.300, Contrast = 0.785, Connect. = 0.810 for dens.mod. cycle 5
<wt> = 0.300, Contrast = 0.806, Connect. = 0.813 for dens.mod. cycle 6
<wt> = 0.300, Contrast = 0.820, Connect. = 0.815 for dens.mod. cycle 7
<wt> = 0.300, Contrast = 0.831, Connect. = 0.817 for dens.mod. cycle 8
<wt> = 0.300, Contrast = 0.839, Connect. = 0.819 for dens.mod. cycle 9
Pseudo-free CC = 69.74 %
<wt> = 0.300, Contrast = 0.845, Connect. = 0.820 for dens.mod. cycle 10
<wt> = 0.300, Contrast = 0.849, Connect. = 0.821 for dens.mod. cycle 11
<wt> = 0.300, Contrast = 0.851, Connect. = 0.822 for dens.mod. cycle 12
<wt> = 0.300, Contrast = 0.853, Connect. = 0.823 for dens.mod. cycle 13
<wt> = 0.300, Contrast = 0.854, Connect. = 0.823 for dens.mod. cycle 14
Pseudo-free CC = 70.80 %
<wt> = 0.300, Contrast = 0.854, Connect. = 0.824 for dens.mod. cycle 15
<wt> = 0.300, Contrast = 0.855, Connect. = 0.824 for dens.mod. cycle 16
<wt> = 0.300, Contrast = 0.855, Connect. = 0.824 for dens.mod. cycle 17
<wt> = 0.300, Contrast = 0.854, Connect. = 0.824 for dens.mod. cycle 18
<wt> = 0.300, Contrast = 0.854, Connect. = 0.824 for dens.mod. cycle 19
Pseudo-free CC = 71.03 %
<wt> = 0.300, Contrast = 0.854, Connect. = 0.824 for dens.mod. cycle 20
Estimated mean FOM and mapCC as a function of resolution
d    inf - 4.62 - 3.64 - 3.17 - 2.88 - 2.67 - 2.51 - 2.38 - 2.27 - 2.18 - 2.11
<FOM>  0.736  0.786  0.768  0.721  0.701  0.681  0.618  0.595  0.587  0.540
<mapCC> 0.862  0.932  0.946  0.934  0.924  0.924  0.922  0.913  0.882  0.858
N        4206  4227  4214  4135  4185  4207  4292  4406  4320  3702
Estimated mean FOM = 0.674  Pseudo-free CC = 71.18 %
Density (in map sigma units) at input heavy atom sites
  Site    x        y        z    occ*Z    density
    1  0.2276  0.7578  0.1189  34.0000    29.98
    2  0.1568  0.6345  0.3049  32.2898    30.44
    3  0.1767  0.5344  0.2160  32.2388    29.67
    4  0.3059  0.4535  0.1297  26.0746    23.51
    5  0.0280  0.8243  0.1410  22.7324    21.02
    6  0.0383  0.9748  0.0492  21.5050    21.18
Site    x      y      z  h(sig) near old  near new
  1  0.1569  0.6345  0.3048  30.4  2/0.02  9/13.36 3/15.73 2/19.52 7/22.13
  2  0.2278  0.7578  0.1188  30.0  1/0.02  1/19.52 6/21.97 7/22.48 9/25.02
  3  0.1767  0.5345  0.2158  29.7  3/0.03  9/2.90 1/15.73 4/19.45 2/26.88
  4  0.3060  0.4536  0.1292  23.5  4/0.07  3/19.45 9/21.16 8/26.49 5/26.83
  5  0.0382  0.9748  0.0490  21.2  6/0.02  8/2.63 8/15.66 5/15.88 6/19.80
  6  0.0278  0.8240  0.1416  21.1  5/0.08  5/19.80 8/21.59 7/21.87 2/21.97
  7  0.1854  0.9571  0.1787  -5.0  5/21.86  6/21.87 1/22.13 2/22.48 8/22.57
  8  0.0427  0.9993  0.0530  -5.0  6/2.62  5/2.63 8/15.31 5/15.66 6/21.59
  9  0.1787  0.5611  0.2228  -4.7  3/2.91  3/2.90 1/13.36 4/21.16 2/25.02
</pre>
At this point the structure is obviously solved, and we could use Buccaneer or ARP/wARP to add side chains and the rest of the model. 3-fold NCS surely helps!
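As a quick sanity check of the numbers in the heading of this section, the chain summary line of the SHELXE output can be totalled with a short script. The regular expression simply matches the "letter: count" pairs as printed above; the total of 519 residues is taken from the section heading, and the function name is my own.

```python
import re


def chain_lengths(summary_line):
    """Parse 'A:  15  B:  5 ...' into {'A': 15, 'B': 5, ...}."""
    return {name: int(count)
            for name, count in re.findall(r"([A-Z]):\s*(\d+)", summary_line)}


line = "A:  15  B:  5  C:  22  D:  22  E:  27  F:  62  G: 263  H:  36"
lengths = chain_lengths(line)
built = sum(lengths.values())
total_residues = 519  # from the section heading

print(built, f"{100 * built / total_residues:.1f}%")
```

The sum reproduces the 452 residues reported by SHELXE, i.e. about 87% of the model, with more than half of them in the single long chain G.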
=== Could we do better? ===
Yes, of course (as always). I can think of three things to try:
* an [[optimization]] round of running XDS for the two datasets
* using a negative offset for STARTING_DOSE in XSCALE.INP, as documented in the [[XSCALE]] wiki article.
* use MERGE=TRUE in XDSCONV.INP. I tried it and this gives 20 solutions with CCall+CCweak > 25 out of 1000 trials, whereas MERGE=FALSE (the default) gives only 4 solutions! Update Sep 2011: the [[ccp4com:SHELX_C/D/E#Obtaining_the_SHELX_programs|beta-test version]] of SHELXC should have a fix for this.
== Better phases from DAD (Double Anomalous Dispersion) ==
Pseudo-SAD is described here first because, historically, I did it first: I thought that the wavelength could not realistically be changed within 3 seconds, and therefore assumed that the headers were wrong and that this was not actually a two-wavelength experiment. Along these lines, I interpreted the correlation coefficient of 1.0 between the E1 and E2 first parts as indicating that no isomorphous difference exists.
In a discussion with Gerard Bricogne and Clemens Vonrhein after the ACA2011 workshop it turned out that my theory, which claims that E1 and E2 are actually the same wavelength, is wrong. This was investigated by looking at the difference map (obtained using phenix.fobs_minus_fobs_map) of E1 and E2 (taking the first parts in each case) phased with the 1y13 model, which shows three strong (14-19 sigma) peaks. The fact that the 1-370 pieces merge so well seems to be a consequence of the fact that the anomalous signal of the two wavelengths is so similar, and the dispersive difference between the wavelengths does not significantly decrease the high correlation coefficient in data scaling.


Although
Thus even better phasing would be obtained by keeping the wavelengths separate and doing MAD (in fact DAD) - but zero-dose extrapolation could and should be done in the same way. I've therefore continued the analysis in [[1Y13-DAD]].