Simulated-1g1c: Difference between revisions

no edit summary
(Created page with "This is an exercise, devised by James Holton, which deals with merging of datasets that were obtained in the presence of radiation damage. The datasets are actually simulated us...")
 
No edit summary
Line 1: Line 1:
This is an exercise, devised by James Holton, which deals with merging of datasets that were obtained in the presence of radiation damage.
This is an exercise, devised by James Holton, which deals with merging of datasets that were obtained in the presence of strong radiation damage.


The datasets are actually simulated using his program MLFSOM. There are 100 of them, in random orientations with respect to each other. Each dataset consists of 15 frames of 1 degree rotation. The first frame looks good (diffraction to 2 Å), the last bad (10 Å).
The datasets were actually simulated using his program MLFSOM. There are 100 of them, in random orientations with respect to each other. Each dataset consists of 15 frames of 1 degree rotation.


The goal of data processing is to obtain a good and complete dataset. In this case, it is tempting to use only the first frame of each dataset. This has two advantages:
The goal of data processing is to obtain a good and complete dataset. In this case, it is tempting to use only the first frame of each dataset. This has three advantages:
# the resolution is best
# radiation damage does not lower the resolution
# the completeness should be adequate
# the completeness should be adequate if the symmetry is at least orthorhombic
# this could be a model for processing data from an X-ray Free Electron Laser (see the recent Nature paper at [http://www.nature.com/nature/journal/v470/n7332/abs/nature09750.html])
# a successful procedure could also serve for processing data from an X-ray Free Electron Laser (see the recent Nature paper at [http://www.nature.com/nature/journal/v470/n7332/abs/nature09750.html])


== Preparation ==
== Preparation ==


We have to get some idea about possible spacegroups first. This means that one of the datasets needs to be processed. Let's choose "xtal100", the last one.
Visual inspection (using [[ADXV]]) shows that the first frame of each dataset looks good (diffraction to 2 Å), the last looks bad (10 Å), and the diffraction degrades visibly from each frame to the next.


  generate_XDS.INP "../../Illuin/microfocus/xtal100_1_0??.img"
We have to get some idea about possible spacegroups first. This means processing some of the datasets. Let's choose "xtal100", the last one.
 
  generate_XDS.INP "../../Illuin/microfocus/xtal100_1_001.img"
 
To maximize the number of reflections that should be used for spacegroup determination, the only changes to XDS.INP are:
TEST_RESOLUTION_RANGE= 50 0 ! default is 10 4
DATA_RANGE= 1 1            ! R-factors involving more than 1 frame are meaningless
                            ! with such strong radiation damage
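
Editing XDS.INP by hand is fine for one dataset, but the same two changes have to be repeated for every dataset that is tested. The following is a minimal sketch of how they could be applied from a script. The helper name set_xds_keyword and the example input fragment are hypothetical (they are not part of XDS or generate_XDS.INP), and the sketch assumes that each keyword to be changed sits on a line of its own.

```python
def set_xds_keyword(xds_inp_text, keyword, value):
    """Return XDS.INP text in which `keyword` is set to `value`.

    Hypothetical helper: any existing line that starts with `keyword=`
    is dropped and a fresh "KEYWORD= value" line is appended.
    Assumption: the keyword is alone on its line; a line carrying
    several keywords would be dropped as a whole.
    """
    prefix = keyword + "="
    kept = [line for line in xds_inp_text.splitlines()
            if not line.lstrip().startswith(prefix)]
    kept.append(f"{keyword}= {value}")
    return "\n".join(kept) + "\n"


# Made-up fragment standing in for a generated XDS.INP
example = "DATA_RANGE=1 15\nSPOT_RANGE=1 8\n"

patched = set_xds_keyword(example, "DATA_RANGE", "1 1")
patched = set_xds_keyword(patched, "TEST_RESOLUTION_RANGE", "50 0")
print(patched)
```

In practice one would read XDS.INP from disk, apply the two calls, and write the result back before running "xds".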
 
We run "xds" and, after a few seconds, can inspect IDXREF.LP and CORRECT.LP. It turns out that the primitive cell is a=38.3, b=79.2, c=79.2 Å, α=β=γ=90°, which is compatible with tetragonal spacegroups, or those with lower symmetry:
 
  LATTICE-  BRAVAIS-  QUALITY  UNIT CELL CONSTANTS (ANGSTROEM & DEGREES)    REINDEXING TRANSFORMATION
CHARACTER  LATTICE    OF FIT      a      b      c  alpha  beta gamma
*  31        aP          0.0      38.3  79.2  79.2  90.0  90.0  90.0    1  0  0  0  0  1  0  0  0  0  1  0
*  44        aP          0.1      38.3  79.2  79.2  90.0  90.0  90.0  -1  0  0  0  0 -1  0  0  0  0  1  0
*  35        mP          0.4      79.2  38.3  79.2  90.0  90.0  90.0    0  1  0  0  1  0  0  0  0  0 -1  0
*  33        mP          0.9      38.3  79.2  79.2  90.0  90.0  90.0  -1  0  0  0  0 -1  0  0  0  0  1  0
*  34        mP          1.1      38.3  79.2  79.2  90.0  90.0  90.0    1  0  0  0  0  0 -1  0  0  1  0  0
*  32        oP          1.2      38.3  79.2  79.2  90.0  90.0  90.0  -1  0  0  0  0 -1  0  0  0  0  1  0
*  20        mC          1.2    112.0  111.9  38.3  90.0  90.0  90.0    0  1  1  0  0  1 -1  0 -1  0  0  0
*  23        oC          1.4    111.9  112.0  38.3  90.0  90.0  90.0    0 -1  1  0  0  1  1  0 -1  0  0  0
*  25        mC          1.4    111.9  112.0  38.3  90.0  90.0  90.0    0 -1  1  0  0  1  1  0 -1  0  0  0
*  21        tP          2.2      79.2  79.2  38.3  90.0  90.0  90.0    0 -1  0  0  0  0  1  0 -1  0  0  0
    37        mC        249.8    162.9  38.3  79.2  90.0  90.0  76.4  -1  0  2  0 -1  0  0  0  0 -1  0  0
 
This table exists in both IDXREF.LP and CORRECT.LP. The next table in CORRECT.LP tells us the Rmeas of the starred (*) lattices:
 
SPACE-GROUP        UNIT CELL CONSTANTS            UNIQUE  Rmeas  COMPARED  LATTICE-
  NUMBER      a      b      c  alpha beta gamma                            CHARACTER
      5    112.0  111.9  38.3  90.0  90.0  90.0    973    0.0        0    20 mC
      75      79.2  79.2  38.3  90.0  90.0  90.0    961    93.5      12    21 tP
      89      79.2  79.2  38.3  90.0  90.0  90.0    946    30.9      27    21 tP
      21    111.9  112.0  38.3  90.0  90.0  90.0    965    31.6        8    23 oC
      5    111.9  112.0  38.3  90.0  90.0  90.0    970    77.9        3    25 mC
      1      38.3  79.2  79.2  90.0  90.0  90.0    973    0.0        0    31 aP
      16      38.3  79.2  79.2  90.0  90.0  90.0    954    6.8      19    32 oP
      3      79.2  38.3  79.2  90.0  90.0  90.0    968    5.4        5    35 mP
      3      38.3  79.2  79.2  90.0  90.0  90.0    966    5.2        7    33 mP
      3      38.3  79.2  79.2  90.0  90.0  90.0    966    10.7        7    34 mP
      1      38.3  79.2  79.2  90.0  90.0  90.0    973    0.0        0    44 aP
 
The tetragonal lattices are clearly unfavourable (Rmeas 93.5 and 30.9 for spacegroups 75 and 89), whereas primitive orthorhombic (spacegroup 16, Rmeas 6.8) is good. We repeat this procedure with a few other datasets, and observe that the "orthorhombic hypothesis" is confirmed. E.g. with xtal001 we obtain:
 
SPACE-GROUP        UNIT CELL CONSTANTS            UNIQUE  Rmeas  COMPARED  LATTICE-
  NUMBER      a      b      c  alpha beta gamma                            CHARACTER
      5    111.9  111.9  38.3  90.0  90.0  90.0    939  119.8        5    20 mC
      75      79.1  79.1  38.3  90.0  90.0  90.0    939    47.0        5    21 tP
      89      79.1  79.1  38.3  90.0  90.0  90.0    865    21.6      79    21 tP
      21    111.9  111.9  38.3  90.0  90.0  90.0    939  119.8        5    23 oC
      5    111.9  111.9  38.3  90.0  90.0  90.0    939  119.8        5    25 mC
      1      38.3  79.1  79.1  90.0  90.0  90.0    944    0.0        0    31 aP
      16      38.3  79.1  79.1  90.0  90.0  90.0    875    6.3      69    32 oP
      3      79.1  38.3  79.1  90.0  90.0  90.0    944    0.0        0    35 mP
      3      38.3  79.1  79.1  90.0  90.0  90.0    875    6.3      69    33 mP
      3      38.3  79.1  79.1  90.0  90.0  90.0    944    0.0        0    34 mP
      1      38.3  79.1  79.1  90.0  90.0  90.0    944    0.0        0    44 aP
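
When this check is repeated for several datasets, reading the Rmeas table by eye becomes tedious. The sketch below shows one way the decision could be automated. It assumes the table rows look exactly like those shown above (12 whitespace-separated fields, no leading marker). The function names, the Rmeas cutoff of 10 and the use of the spacegroup number as a rough stand-in for "higher symmetry" are illustrative choices, not something XDS prescribes.

```python
XTAL100 = """
      5    112.0  111.9  38.3  90.0  90.0  90.0    973    0.0        0    20 mC
      75      79.2  79.2  38.3  90.0  90.0  90.0    961    93.5      12    21 tP
      89      79.2  79.2  38.3  90.0  90.0  90.0    946    30.9      27    21 tP
      21    111.9  112.0  38.3  90.0  90.0  90.0    965    31.6        8    23 oC
      5    111.9  112.0  38.3  90.0  90.0  90.0    970    77.9        3    25 mC
      1      38.3  79.2  79.2  90.0  90.0  90.0    973    0.0        0    31 aP
      16      38.3  79.2  79.2  90.0  90.0  90.0    954    6.8      19    32 oP
      3      79.2  38.3  79.2  90.0  90.0  90.0    968    5.4        5    35 mP
      3      38.3  79.2  79.2  90.0  90.0  90.0    966    5.2        7    33 mP
      3      38.3  79.2  79.2  90.0  90.0  90.0    966    10.7        7    34 mP
      1      38.3  79.2  79.2  90.0  90.0  90.0    973    0.0        0    44 aP
"""


def parse_rmeas_table(text):
    """Parse rows of the SPACE-GROUP / Rmeas table into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 12 fields and start with the spacegroup number;
        # header lines and blank lines are skipped.
        if len(fields) != 12 or not fields[0].isdigit():
            continue
        rows.append({
            "space_group": int(fields[0]),
            "cell": tuple(float(x) for x in fields[1:7]),
            "unique": int(fields[7]),
            "rmeas": float(fields[8]),
            "compared": int(fields[9]),
            "lattice": fields[11],
        })
    return rows


def pick_space_group(rows, max_rmeas=10.0, min_compared=1):
    """Pick the highest-numbered spacegroup with an acceptable Rmeas.

    Rows with no compared reflections are ignored, because their
    Rmeas of 0.0 carries no information.
    """
    candidates = [r for r in rows
                  if r["compared"] >= min_compared and r["rmeas"] <= max_rmeas]
    return max(candidates, key=lambda r: r["space_group"], default=None)


rows = parse_rmeas_table(XTAL100)
best = pick_space_group(rows)
print(best["space_group"], best["lattice"], best["rmeas"])
```

With so few compared reflections per lattice, the cutoff is only a guide; the result should be confirmed on several datasets, as done above.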

== Devising a bootstrap procedure ==
 
We have to realize that, since the b and c axes are equal, we can index each dataset in two non-equivalent ways. This is the same situation as would occur e.g. for spacegroups P3(x) and P4(x), and means that we'll have to use a REFERENCE_DATA_SET to