1Y13

Revision as of 15:40, 17 March 2011

The structure is deposited in the PDB, solved with SAD and refined at a resolution of 2.2 Å in spacegroup P4(3)2(1)2 (#96). The data for this project were provided by Jürgen Bosch (SGPP) and are linked to the ACA 2011 workshop website. There are two high-resolution (2 Å) datasets, E1 (wavelength 0.9794 Å) and E2 (wavelength 0.9174 Å), collected with 0.25° increments at an ALS beamline on June 27, 2004, and a weaker dataset collected earlier at an SSRL beamline. We will only use the former two datasets here.

Dataset E1

Use generate_XDS.INP and run xds once. Based on R-factors in the resulting CORRECT.LP, and an inspection of BKGPIX.cbf, I modified XDS.INP to have

INCLUDE_RESOLUTION_RANGE=40 2.1                       ! too weak beyond 2.1 Å
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=8000. 30000.  ! raised from 7000 30000 to mask beamstop

and ran xds again.

What's the problem?

This is the excerpt from CORRECT.LP :

SPACE-GROUP         UNIT CELL CONSTANTS            UNIQUE   Rmeas  COMPARED  LATTICE-
  NUMBER      a      b      c   alpha beta gamma                            CHARACTER

      5     145.8  145.7  131.4  90.0  90.0  90.0    9735    24.5    23176    10 mC
     75     103.1  103.1  131.4  90.0  90.0  90.0    5262    23.4    27649    11 tP
     89     103.1  103.1  131.4  90.0  90.0  90.0    2911    22.8    30000    11 tP
     21     145.7  145.8  131.4  90.0  90.0  90.0    5270    23.2    27641    13 oC
      5     145.7  145.8  131.4  90.0  90.0  90.0    9681    24.2    23230    14 mC
      1     102.9  103.2  131.4  90.0  90.0  89.9   18040     6.9    14871    31 aP
  *  16     102.9  103.2  131.4  90.0  90.0  90.0    5568     9.1    27343    32 oP
      3     103.2  102.9  131.4  90.0  90.0  90.0   10536     9.5    22375    35 mP
      3     102.9  103.2  131.4  90.0  90.0  90.0   10496     8.3    22415    33 mP
      3     102.9  131.4  103.2  90.0  90.1  90.0    9770     7.3    23141    34 mP
      1     102.9  103.2  131.4  90.0  90.0  90.1   18040     6.9    14871    44 aP

...

REFINED PARAMETERS:  DISTANCE BEAM ORIENTATION CELL AXIS                   
USING  219412 INDEXED SPOTS
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     1.01
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.11
CRYSTAL MOSAICITY (DEGREES)     0.191
DIRECT BEAM COORDINATES (REC. ANGSTROEM)  -0.004789  0.003758  1.021015
DETECTOR COORDINATES (PIXELS) OF DIRECT BEAM    1027.25   1064.20
DETECTOR ORIGIN (PIXELS) AT                     1036.84   1056.68
CRYSTAL TO DETECTOR DISTANCE (mm)       209.38
LAB COORDINATES OF DETECTOR X-AXIS  1.000000  0.000000  0.000000
LAB COORDINATES OF DETECTOR Y-AXIS  0.000000  1.000000  0.000000
LAB COORDINATES OF ROTATION AXIS  0.999997  0.000527  0.002187
COORDINATES OF UNIT CELL A-AXIS    21.922    52.895    85.337
COORDINATES OF UNIT CELL B-AXIS     3.771    87.158   -54.992
COORDINATES OF UNIT CELL C-AXIS  -128.130    18.914    21.191
REC. CELL PARAMETERS   0.009731  0.009697  0.007620  90.000  90.000  90.000
UNIT CELL PARAMETERS    102.766   103.125   131.241  90.000  90.000  90.000
E.S.D. OF CELL PARAMETERS  1.3E-01 8.6E-02 9.3E-02 0.0E+00 0.0E+00 0.0E+00
SPACE GROUP NUMBER     16

So CORRECT chooses an orthorhombic spacegroup.

The file continues:

...
     a        b          ISa
6.058E+00  3.027E-04   23.35


...

      NOTE:      Friedel pairs are treated as different reflections.

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    6.23       17389    5807      6045       96.1%       2.4%      2.8%    17277   35.83     3.0%     2.0%    66%   1.553    2434
    4.43       32116   10536     10787       97.7%       2.7%      3.0%    32057   33.78     3.3%     2.4%    55%   1.272    4762
    3.62       41900   13700     13961       98.1%       3.4%      3.4%    41793   27.98     4.1%     3.6%    38%   1.115    6295
    3.14       51146   16371     16513       99.1%       5.4%      5.3%    50967   18.89     6.6%     7.2%    20%   0.961    7625
    2.81       59159   18627     18675       99.7%      12.7%     13.2%    58877    9.82    15.4%    18.0%     8%   0.818    8716
    2.56       65525   20596     20651       99.7%      28.5%     30.2%    65130    5.19    34.5%    40.4%     3%   0.757    9629
    2.37       71579   22491     22533       99.8%      62.6%     67.1%    71068    2.60    75.6%    88.8%     1%   0.694   10498
    2.22       74065   23837     24094       98.9%      97.9%     97.0%    73444    1.59   118.8%   139.8%    11%   0.738   11051
    2.09       65776   24379     25674       95.0%     133.3%    140.6%    63647    0.90   166.4%   216.0%     1%   0.651   10380
   total      478655  156344    158933       98.4%       6.5%      6.8%   474260   10.65     7.9%    22.5%    16%   0.852   71390


NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES  492346
NUMBER OF REJECTED MISFITS                           13342
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
NUMBER OF ACCEPTED OBSERVATIONS                     479004
NUMBER OF UNIQUE ACCEPTED REFLECTIONS               157108

Some comments:

  • the "STANDARD DEVIATION OF SPOT POSITION (PIXELS)" is significantly higher (1.01) than the values reported for the 5°-batches in INTEGRATE.LP (about 0.6). This suggests that the geometry refinement has to deal with inconsistent data.
  • CORRECT clearly indicates an orthorhombic spacegroup.
  • the number of MISFITS is higher than 1% of the reflections. From the first long table (fine-grained in resolution) in CORRECT.LP we learn that the misfits are due to faint high-resolution ice rings, so this is a problem intrinsic to the data and not to their mode of processing.
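
The "higher than 1%" remark can be checked with a few lines of Python. This is only a minimal sketch: the function name is made up for illustration, and the two numbers are copied from the CORRECT.LP excerpt above.

```python
def misfit_fraction(n_rejected: int, n_in_subset: int) -> float:
    """Fraction of reflections that CORRECT rejected as misfits."""
    if n_in_subset <= 0:
        raise ValueError("n_in_subset must be positive")
    return n_rejected / n_in_subset


# NUMBER OF REJECTED MISFITS / NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES
fraction = misfit_fraction(13342, 492346)
print(f"misfits: {100 * fraction:.2f}% of all reflections")
```

For the numbers above this gives about 2.7%, well above the 1% mentioned in the comments.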

To my surprise, pointless does not agree with CORRECT's standpoint:

Scores for each symmetry element
 
Nelmt  Lklhd  Z-cc    CC        N  Rmeas    Symmetry & operator (in Lattice Cell)

  1   0.959   9.91   0.99   65030  0.034     identity
  2   0.959   9.91   0.99  132222  0.035 *** 2-fold l ( 0 0 1)  {-h,-k,+l}
  3   0.958   9.87   0.99  110073  0.044 *** 2-fold h ( 1 0 0)  {+h,-k,-l}
  4   0.942   9.55   0.96  132646  0.109 *** 2-fold   ( 1 1 0)  {+k,+h,-l}
  5   0.958   9.87   0.99  111819  0.043 *** 2-fold k ( 0 1 0)  {-h,+k,-l}
  6   0.941   9.54   0.95  131842  0.109 *** 2-fold   ( 1-1 0)  {-k,-h,-l}
  7   0.937   9.50   0.95  224393  0.107 *** 4-fold l ( 0 0 1)  {-k,+h,+l} {+k,-h,+l}

and

    Laue Group        Lklhd   NetZc  Zc+   Zc-    CC    CC-  Rmeas   R-  Delta ReindexOperator

> 1  P 4/m m m  ***  1.000   9.73  9.73  0.00   0.97  0.00   0.07  0.00   0.2 [h,k,l]
- 2    P m m m       0.000   0.35  9.88  9.53   0.99  0.95   0.04  0.11   0.0 [h,k,l]
  3    C m m m       0.000  -0.02  9.72  9.74   0.97  0.97   0.07  0.07   0.2 [h+k,-h+k,l]
  4      P 4/m       0.000   0.07  9.77  9.70   0.98  0.97   0.06  0.08   0.2 [h,k,l]
  5  P 1 2/m 1       0.000   0.25  9.91  9.66   0.99  0.97   0.03  0.08   0.0 [-h,-l,-k]
  6  P 1 2/m 1       0.000   0.22  9.89  9.67   0.99  0.97   0.04  0.08   0.0 [h,k,l]
  7  P 1 2/m 1       0.000   0.21  9.88  9.67   0.99  0.97   0.04  0.08   0.0 [-k,-h,-l]
  8  C 1 2/m 1       0.000  -0.01  9.72  9.73   0.97  0.97   0.07  0.07   0.2 [h-k,h+k,l]
  9  C 1 2/m 1       0.000  -0.02  9.71  9.73   0.97  0.97   0.07  0.07   0.2 [h+k,-h+k,l]
 10       P -1       0.000   0.21  9.91  9.70   0.99  0.97   0.03  0.08   0.0 [h,k,l]

and

   Spacegroup         TotProb SysAbsProb     Reindex         Conditions
 
   <P 41 21 2> ( 92)    0.823  0.823                         00l: l=4n, h00: h=2n (zones 1,2)
   <P 43 21 2> ( 96)    0.823  0.823                         00l: l=4n, h00: h=2n (zones 1,2)
    ..........
    <P 4 21 2> ( 90)    0.095  0.095                         h00: h=2n (zone 2)
    ..........
   <P 42 21 2> ( 94)    0.077  0.077                         00l: l=2n, h00: h=2n (zones 1,2)

This suggests #92 or #96, the latter of which agrees with the PDB deposition. However, running CORRECT in #96 and specifying 103 103 130 90 90 90 as cell parameters, we obtain:

REFINED PARAMETERS:  DISTANCE BEAM ORIENTATION CELL AXIS                   
USING  220320 INDEXED SPOTS
STANDARD DEVIATION OF SPOT    POSITION (PIXELS)     1.17
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES)    0.14
CRYSTAL MOSAICITY (DEGREES)     0.191
DIRECT BEAM COORDINATES (REC. ANGSTROEM)  -0.004790  0.004009  1.021014
DETECTOR COORDINATES (PIXELS) OF DIRECT BEAM    1027.19   1064.23
DETECTOR ORIGIN (PIXELS) AT                     1036.79   1056.20
CRYSTAL TO DETECTOR DISTANCE (mm)       209.52
LAB COORDINATES OF DETECTOR X-AXIS  1.000000  0.000000  0.000000
LAB COORDINATES OF DETECTOR Y-AXIS  0.000000  1.000000  0.000000
LAB COORDINATES OF ROTATION AXIS  0.999996  0.000901  0.002534
COORDINATES OF UNIT CELL A-AXIS    21.926    53.087    85.553
COORDINATES OF UNIT CELL B-AXIS     3.794    87.060   -54.995
COORDINATES OF UNIT CELL C-AXIS  -128.212    18.926    21.115
REC. CELL PARAMETERS   0.009704  0.009704  0.007616  90.000  90.000  90.000
UNIT CELL PARAMETERS    103.045   103.045   131.310  90.000  90.000  90.000
E.S.D. OF CELL PARAMETERS  2.1E-01 2.1E-01 2.1E-01 0.0E+00 0.0E+00 0.0E+00
SPACE GROUP NUMBER     96

...

    a        b          ISa
7.890E+00  8.793E-04   12.01

...

     NOTE:      Friedel pairs are treated as different reflections.

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    6.23       16770    2983      3017       98.9%       5.2%      6.1%    16752   26.20     5.7%     2.6%    55%   1.247    1223
    4.43       30598    5392      5393      100.0%       5.8%      6.2%    30596   25.25     6.3%     3.0%    50%   1.072    2420
    3.62       39822    6992      6994      100.0%       6.9%      6.6%    39820   22.27     7.6%     4.0%    32%   0.975    3215
    3.14       49620    8240      8242      100.0%       9.2%      8.7%    49619   17.14    10.1%     6.2%    19%   0.876    3847
    2.81       59388    9379      9379      100.0%      17.7%     18.1%    59387   10.44    19.3%    12.3%     0%   0.736    4410
    2.56       65652   10308     10310      100.0%      34.6%     39.1%    65652    6.08    37.7%    23.6%    -1%   0.680    4872
    2.37       71744   11258     11259      100.0%      71.3%     83.8%    71744    3.23    77.6%    52.1%    -2%   0.652    5352
    2.22       74888   12065     12082       99.9%     111.0%    116.9%    74888    1.98   121.2%    86.9%     2%   0.718    5753
    2.09       65727   12386     12874       96.2%     151.3%    176.1%    65517    1.12   168.0%   148.4%    -3%   0.631    5797
   total      474209   79003     79550       99.3%      10.3%     11.0%   473975    9.44    11.3%    17.2%    13%   0.772   36889


NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES  492346
NUMBER OF REJECTED MISFITS                           17898
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                141
NUMBER OF ACCEPTED OBSERVATIONS                     474307
NUMBER OF UNIQUE ACCEPTED REFLECTIONS                79022

which is much worse than the spacegroup 16 statistics (compare the ISa values, 23.35 versus 12.01: they differ by a factor of 2!), so there may be something wrong with some of the assumptions we were making ...
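
The ISa values quoted in both excerpts are consistent with ISa = 1/sqrt(a*b), where a and b are the two error-model parameters printed next to it. The sketch below simply recomputes ISa from the printed a and b; the function name is an illustrative choice, not part of XDS.

```python
import math


def isa(a: float, b: float) -> float:
    """Asymptotic I/sigma implied by the error-model parameters a and b."""
    return 1.0 / math.sqrt(a * b)


# a and b copied from the two CORRECT.LP excerpts above
isa_sg16 = isa(6.058e00, 3.027e-04)  # CORRECT's own choice, space group 16
isa_sg96 = isa(7.890e00, 8.793e-04)  # forced space group 96
print(f"ISa(16) = {isa_sg16:.2f}, ISa(96) = {isa_sg96:.2f}, "
      f"ratio = {isa_sg16 / isa_sg96:.2f}")
```

Since a hardly changes between the two runs, the drop in ISa comes almost entirely from b, which nearly triples.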

Identifying a possible cause

The easiest thing one can do is to inspect INTEGRATE.LP, which lists scale factor, beam divergence and mosaicity for every frame. There's a jiffy called "scalefactors" which greps the relevant lines from INTEGRATE.LP ("scalefactors > scales.log"). This shows the scale factor (column 3): 1y13-e1-scales.png

demonstrating that "something happens" between frame 372 and 373 (of course one has to look at the table to find the exact numbers).
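
Instead of reading the table by eye, the position of the jump can be located with a short script. This is a hedged sketch: it assumes that scales.log is whitespace-separated with the frame number in column 1 and the scale factor in column 3 (only the latter is stated above), and the demo rows are invented purely to show the call, not taken from the real data.

```python
def largest_scale_jump(lines):
    """Return (frame_before, frame_after, change) for the biggest scale change
    between consecutive rows.

    Assumed layout: column 1 = frame number, column 3 = scale factor.
    """
    rows = []
    for line in lines:
        fields = line.split()
        try:
            rows.append((int(fields[0]), float(fields[2])))
        except (IndexError, ValueError):
            continue  # skip headers and malformed lines
    if len(rows) < 2:
        raise ValueError("need at least two frames")
    before, after = max(
        zip(rows, rows[1:]),
        key=lambda pair: abs(pair[1][1] - pair[0][1]),
    )
    return before[0], after[0], after[1] - before[1]


# Invented rows, only to illustrate the expected layout
demo = ["371 0 1.010", "372 0 1.012", "373 0 0.870", "374 0 0.872"]
print(largest_scale_jump(demo))
```

With the real file one would call `largest_scale_jump(open("scales.log"))`.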

It should be noted that any abrupt change in conditions during the experiment is going to spoil the resulting data in one way or another. This is especially true for a SAD experiment, which is supposed to give accurate values for the tiny differences in intensities between Friedel-related reflections.

A solution

At this point it is good to look at the data for experiment E2. Here, we find exactly the same problems of bad ISa and high "STANDARD DEVIATION OF SPOT POSITION (PIXELS)" when reducing frames 1-591 in one run of xds.

With this knowledge, we are led, for E1, to reduce frames 1-372 and 373-592 separately, in spacegroup 96. For E2, we use frames 1-369 and 371-591, respectively. Frame E2-370 has a very high scale factor so we leave it out altogether.

This is also a good time to closely inspect the headers of the frames:

% grep --binary-files=text DATE j1603b3PK_1_E1_37?.img

gives

j1603b3PK_1_E1_370.img:DATE=Sun Jun 27 08:55:51 2004;
j1603b3PK_1_E1_371.img:DATE=Sun Jun 27 08:56:00 2004;
j1603b3PK_1_E1_372.img:DATE=Sun Jun 27 08:56:08 2004;
j1603b3PK_1_E1_373.img:DATE=Sun Jun 27 09:19:45 2004;
j1603b3PK_1_E1_374.img:DATE=Sun Jun 27 09:19:54 2004;
j1603b3PK_1_E1_375.img:DATE=Sun Jun 27 09:20:02 2004;
j1603b3PK_1_E1_376.img:DATE=Sun Jun 27 09:20:10 2004;
j1603b3PK_1_E1_377.img:DATE=Sun Jun 27 09:20:58 2004;
j1603b3PK_1_E1_378.img:DATE=Sun Jun 27 09:21:08 2004;
j1603b3PK_1_E1_379.img:DATE=Sun Jun 27 09:21:17 2004;

and

% grep --binary-files=text DATE j1603b3PK_1_E2_3[67]?.img

gives

j1603b3PK_1_E2_366.img:DATE=Sun Jun 27 08:55:15 2004;
j1603b3PK_1_E2_367.img:DATE=Sun Jun 27 08:55:23 2004;
j1603b3PK_1_E2_368.img:DATE=Sun Jun 27 08:55:32 2004;
j1603b3PK_1_E2_369.img:DATE=Sun Jun 27 08:56:19 2004;
j1603b3PK_1_E2_370.img:DATE=Sun Jun 27 08:56:28 2004;
j1603b3PK_1_E2_371.img:DATE=Sun Jun 27 09:19:26 2004;
j1603b3PK_1_E2_372.img:DATE=Sun Jun 27 09:19:34 2004;
j1603b3PK_1_E2_373.img:DATE=Sun Jun 27 09:20:22 2004;
j1603b3PK_1_E2_374.img:DATE=Sun Jun 27 09:20:30 2004;
j1603b3PK_1_E2_375.img:DATE=Sun Jun 27 09:20:38 2004;
j1603b3PK_1_E2_376.img:DATE=Sun Jun 27 09:20:47 2004;

thus proving that both datasets were interrupted for about 23 minutes around frame 370.
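
The length of the interruption can be read off the grep output programmatically. The following is a minimal sketch, assuming lines of the form "name.img:DATE=Sun Jun 27 08:56:08 2004;" as printed above and an English/C locale for the day and month names; the function names and the 120 s threshold are arbitrary choices for illustration.

```python
from datetime import datetime


def parse_date_line(line):
    """Split 'name.img:DATE=Sun Jun 27 08:56:08 2004;' into (name, datetime)."""
    name, _, stamp = line.partition(":DATE=")
    stamp = stamp.strip().rstrip(";")
    return name, datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def find_gaps(lines, threshold_s=120):
    """List (frame_before, frame_after, seconds) for pauses above threshold_s."""
    stamps = [parse_date_line(line) for line in lines if ":DATE=" in line]
    gaps = []
    for (name0, t0), (name1, t1) in zip(stamps, stamps[1:]):
        delta = (t1 - t0).total_seconds()
        if delta > threshold_s:
            gaps.append((name0, name1, delta))
    return gaps


# Two lines copied from the grep output for E1 above
e1 = [
    "j1603b3PK_1_E1_372.img:DATE=Sun Jun 27 08:56:08 2004;",
    "j1603b3PK_1_E1_373.img:DATE=Sun Jun 27 09:19:45 2004;",
]
for before, after, seconds in find_gaps(e1):
    print(f"{before} -> {after}: {seconds / 60:.1f} min")
```

The default threshold is chosen so that the short pauses of under a minute between blocks of 4 frames are not reported.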

The really weird thing here is that both datasets appear to have been collected at the same time, but at different wavelengths (E1 at 0.9794 Å, E2 at 0.9184 Å). Yet the individual parts merge in an unexpected way. Using the following XSCALE.INP:

UNIT_CELL_CONSTANTS=103.316   103.316   131.456  90.000  90.000  90.000
SPACE_GROUP_NUMBER=96
OUTPUT_FILE=temp.ahkl
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
INPUT_FILE=../e1_373-592/XDS_ASCII.HKL
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL
INPUT_FILE=../e2_371-591/XDS_ASCII.HKL

and running xscale, we obtain in XSCALE.LP:

    CORRELATIONS BETWEEN INPUT DATA SETS AFTER CORRECTIONS

DATA SETS  NUMBER OF COMMON  CORRELATION   RATIO OF COMMON   B-FACTOR
 #i   #j     REFLECTIONS     BETWEEN i,j  INTENSITIES (i/j)  BETWEEN i,j

   1    2       15943           0.978            1.0002         0.0106
   1    3       22366           1.000            1.0012        -0.0008
   2    3       15801           0.977            0.9983         0.0557
   1    4       15648           0.979            0.9988         0.0541
   2    4       14862           0.999            1.0024        -0.0007
   3    4       15524           0.978            0.9999        -0.0015

which means that e1_1-372 correlates well (1.000) with e2_1-369, and e1_373-592 well (0.999) with e2_371-591, but the crosswise correlations are consistently lower (0.978, 0.977, 0.979, 0.978). The adjustment to the error model confirms this:

    a        b          ISa    ISa0   INPUT DATA SET
6.112E+00  1.429E-03   10.70   22.37 ../e1_1-372/XDS_ASCII.HKL                         
1.074E+01  1.825E-03    7.14   23.79 ../e1_373-592/XDS_ASCII.HKL                       
5.707E+00  1.621E-03   10.40   22.82 ../e2_1-369/XDS_ASCII.HKL                         
8.547E+00  1.796E-03    8.07   24.17 ../e2_371-591/XDS_ASCII.HKL                       

telling us that "if we merge these datasets together, their error estimates have to be increased a lot". However, if we switch to

UNIT_CELL_CONSTANTS=103.316   103.316   131.456  90.000  90.000  90.000
SPACE_GROUP_NUMBER=96

OUTPUT_FILE=firstparts.ahkl
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL

OUTPUT_FILE=secondparts.ahkl
INPUT_FILE=../e1_373-592/XDS_ASCII.HKL
INPUT_FILE=../e2_371-591/XDS_ASCII.HKL

we obtain

    a        b          ISa    ISa0   INPUT DATA SET
6.120E+00  3.673E-04   21.09   22.37 ../e1_1-372/XDS_ASCII.HKL                         
5.713E+00  3.819E-04   21.41   22.82 ../e2_1-369/XDS_ASCII.HKL                         
5.639E+00  3.151E-04   23.72   23.79 ../e1_373-592/XDS_ASCII.HKL                       
5.289E+00  3.258E-04   24.09   24.17 ../e2_371-591/XDS_ASCII.HKL                       

proving that the second parts of datasets E1 and E2 should be treated separately from the first parts.
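
The same grouping can be derived mechanically from the correlation table in XSCALE.LP. The sketch below is an illustration only: the 0.99 cutoff is an arbitrary assumption that happens to separate the two clusters here, and the function name is not part of XSCALE.

```python
def group_by_correlation(n_sets, correlations, threshold=0.99):
    """Group data sets whose pairwise correlation reaches the threshold.

    correlations maps (i, j), numbered from 1 as in XSCALE.LP, to the
    correlation between data sets i and j.
    """
    parent = list(range(n_sets + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (i, j), cc in correlations.items():
        if cc >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for k in range(1, n_sets + 1):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


# Correlations copied from the XSCALE.LP table above
# 1 = e1_1-372, 2 = e1_373-592, 3 = e2_1-369, 4 = e2_371-591
cc = {(1, 2): 0.978, (1, 3): 1.000, (2, 3): 0.977,
      (1, 4): 0.979, (2, 4): 0.999, (3, 4): 0.978}
print(group_by_correlation(4, cc))
```

Data sets 1 and 3 (the first parts) end up in one group, and 2 and 4 (the second parts) in the other.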

Upon inspection of the cell parameters, we find that the cell axes of the second "halves" are shorter by a factor of 0.9908 when compared with the first parts. This suggests that they were collected at a longer wavelength! But then the wavelength values in the headers are most likely completely wrong: we can speculate that the two first parts were collected at the SeMet peak wavelength, and the two second parts at the inflection wavelength.
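
The reasoning behind "shorter cell means longer wavelength" follows from Bragg's law, d = lambda / (2 sin(theta)): if the true wavelength is longer than the one assumed during processing, every spacing (and hence the refined cell) comes out too small by the same factor. A rough sketch of the arithmetic, assuming that the true cell and the detector geometry are identical for both parts and that both parts were processed with the same assumed wavelength:

```python
def wavelength_ratio(cell_ratio: float) -> float:
    """Ratio true_wavelength / assumed_wavelength implied by an apparent
    cell shrinkage of cell_ratio (refined cell / true cell).

    Assumes fixed detector geometry, so that d scales linearly with wavelength.
    """
    if cell_ratio <= 0:
        raise ValueError("cell_ratio must be positive")
    return 1.0 / cell_ratio


ratio = wavelength_ratio(0.9908)  # factor quoted in the text above
print(f"second parts: wavelength longer by {100 * (ratio - 1):.2f}%")
```

This only yields a relative shift of just under 1%; since the header wavelengths are suspect, no absolute wavelength is derived here.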

The almost-simultaneous DATEs in the headers may be explained by an inverse-beam measuring strategy which alternately collects 4 frames in one orientation as E1, then rotates the spindle by 180° and collects 4 frames into E2. For some reason, the beamline software did not write the correct wavelength into the headers.

So this little detective work appears to tell us what happened on the morning of Sunday, June 27, 2004 at ALS beamline 8.2.1.

Further analysis of datasets E1 and E2

Here, we try to learn more about the constituents of "firstparts".

Running "xdsstat > XDSSTAT.LP" in the e1_1-372 and e2_1-369 directories, we obtain statistics output not available from CORRECT. We open XDSSTAT.LP with the CCP4 program "loggraph", and take a look at misfits.pck, rf.pck, and the other files produced by xdsstat, using VIEW or XDS-Viewer:

E1 1-372-xdsstat1.png

Reflections and misfits, by frame - looks normal

E1 1-372-xdsstat2.png

Intensity and sigma by frame - looks normal

E1 1-372-xdsstat3.png

"partiality" and profile agreement, by frame - looks good but it's clear that the profiles at high frame number agree worse with the average profiles, possibly due to radiation damage

E1 1-372-xdsstat4.png

R_meas, by frame, clearly showing good R_meas in the middle of the dataset

E1 1-372-xdsstat-raddam.png

R_d - an R-factor which directly depends on radiation damage. This is calculated as a function of frame number difference and the linear rise indicates significant radiation damage that should be correctable in XSCALE, using the CRYSTAL_NAME keyword.

E1 1-372-misfits.png

misfits mapped on the detector, showing ice rings.

E1 1-372-rf.png

R_meas mapped on the detector, showing elevated R_meas at the location of the ice rings.

Solving the structure

Although we could now think of using these two files ("firstparts" and "secondparts" merged) and assume that they are peak and inflection wavelengths, it appears more reasonable to try and solve the structure with SAD - which means using "firstparts" only.

Let's look at the XSCALE statistics for "firstparts":

      NOTE:      Friedel pairs are treated as different reflections.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    9.40        6122     844       883       95.6%       2.9%      3.5%     6111   54.76     3.2%     1.4%    79%   2.137     313
    6.64       12037    1611      1621       99.4%       2.9%      3.6%    12035   51.54     3.1%     1.5%    80%   2.259     684
    5.43       15348    2065      2086       99.0%       3.5%      3.7%    15347   47.79     3.7%     1.7%    78%   2.294     908
    4.70       18714    2487      2498       99.6%       3.0%      3.7%    18711   49.55     3.2%     1.5%    72%   1.712    1120
    4.20       21104    2797      2821       99.1%       3.1%      3.7%    21102   47.24     3.3%     1.7%    72%   1.727    1271
    3.84       23316    3095      3117       99.3%       3.8%      4.0%    23313   42.74     4.1%     2.1%    65%   1.617    1420
    3.55       25693    3345      3366       99.4%       4.4%      4.5%    25693   37.93     4.7%     2.6%    50%   1.411    1548
    3.32       28017    3633      3653       99.5%       5.2%      5.2%    28015   32.89     5.6%     3.6%    40%   1.335    1687
    3.13       30266    3842      3848       99.8%       7.2%      7.2%    30264   25.87     7.7%     4.8%    36%   1.158    1797
    2.97       32595    4114      4118       99.9%      10.4%     10.4%    32594   19.26    11.1%     7.7%    30%   1.068    1925
    2.83       34384    4315      4320       99.9%      14.3%     14.8%    34382   14.88    15.3%    10.3%    20%   0.937    2031
    2.71       35654    4475      4478       99.9%      18.3%     19.1%    35652   12.13    19.5%    13.1%    15%   0.891    2110
    2.61       37307    4705      4710       99.9%      27.5%     28.8%    37304    8.44    29.4%    19.8%    11%   0.834    2224
    2.51       38997    4893      4896       99.9%      35.5%     38.0%    38997    6.78    38.0%    26.0%    10%   0.817    2318
    2.43       40036    5026      5027      100.0%      51.3%     55.1%    40032    4.92    54.8%    38.0%     2%   0.738    2387
    2.35       39975    5180      5222       99.2%      71.3%     68.9%    39967    3.78    76.4%    52.7%    21%   0.887    2446
    2.28       42041    5385      5423       99.3%      93.7%     93.1%    42037    2.90   100.3%    66.7%    11%   0.798    2548
    2.21       43012    5538      5541       99.9%      85.7%     88.3%    43011    2.87    91.8%    58.8%    10%   0.818    2644
    2.16       42610    5701      5703      100.0%     113.6%    120.7%    42607    2.13   122.0%    85.4%     4%   0.722    2724
    2.10       38996    5634      5912       95.3%     146.1%    153.9%    38944    1.50   157.8%   122.7%     3%   0.711    2639
   total      606224   78685     79243       99.3%       6.7%      7.2%   606118   16.88     7.2%    12.0%    29%   1.055   36744

The anomalous correlation is good at low resolution, though not outstanding. At high resolution it rises again but this is presumably due to the ice rings.