2QVO.xds: Difference between revisions

Revision as of 21:22, 14 March 2011

5,679 bytes added , 14 March 2011

no edit summary

Kay

@@ Line 69: / Line 69: @@
   *  21        tP          7.3      53.5   53.5   41.2  90.1  90.1  90.3    0  1  0  0  0  0 -1  0 -1  0  0  0
         mC        249.8     114.5   41.2   53.5  90.1  90.3  69.0    1 -2  0  0  1  0  0  0  0  0  1  0
-and further down lists
+indicating at most tetragonal symmetry. Shortly after this, it calculates R-factors for these lattices:
   SPACE-GROUP         UNIT CELL CONSTANTS            UNIQUE   Rmeas  COMPARED  LATTICE-
     NUMBER      a      b      c   alpha beta gamma                            CHARACTER
@@ Line 111: / Line 111: @@
 After this comes the table that tells us the quality of our data:
         NOTE:      Friedel pairs are treated as different reflections.
@@ Line 127: / Line 128: @@
 .04        5134    1601      2347       68.2%     274.7%    291.2%     4913    0.40   325.5%   400.7%     1%   0.608     606
      total       91819   13782     14656       94.0%       5.7%      5.9%    91589   20.24     6.2%    15.0%    12%   0.897    6450
+ NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES   93217
+ NUMBER OF REJECTED MISFITS                            1391
+ NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
+ NUMBER OF ACCEPTED OBSERVATIONS                      91826
+ NUMBER OF UNIQUE ACCEPTED REFLECTIONS                13784
 So the anomalous signal goes to about 3.3 A (which is where 30% would be, in the "Anomal Corr" column), and the useful resolution goes to 2.16 A, I'd say (please note that this table treats Friedel pairs separately; merging them increases I/sigma by another factor of 1.41).
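The factor of 1.41 mentioned above is the square root of 2. As a sketch, assuming the two Friedel mates have the same true intensity I (no anomalous difference), the same standard deviation sigma, and independent errors:

```latex
\[
I_{\mathrm{merged}} = \tfrac{1}{2}\left(I^{+} + I^{-}\right) \approx I,
\qquad
\sigma_{\mathrm{merged}} = \tfrac{1}{2}\sqrt{\sigma^{2} + \sigma^{2}} = \frac{\sigma}{\sqrt{2}}
\quad\Longrightarrow\quad
\frac{I_{\mathrm{merged}}}{\sigma_{\mathrm{merged}}} = \sqrt{2}\,\frac{I}{\sigma} \approx 1.41\,\frac{I}{\sigma}
\]
```

In the presence of a real anomalous difference the gain is somewhat smaller, so 1.41 is best read as an upper estimate.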
+For the sake of comparability, from now on we use the same axes (53.03 53.03 40.97) as the deposited PDB id 2QVO.
 We could now modify XDS.INP to have
   JOB=CORRECT  ! not XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
   SPACE_GROUP_NUMBER=   77
-  UNIT_CELL_CONSTANTS=    53.10    53.10    40.90  90.000  90.000  90.000
+  UNIT_CELL_CONSTANTS=    53.03   53.03  40.97  90.000  90.000  90.000
 and run xds again, to obtain the final CORRECT.LP and XDS_ASCII.HKL with the correct space group. The statistics in space groups 75 and 77 are the same for all practical purposes (the 8 reflections known to be extinct do not make much difference).
 Following this, we create XDSCONV.INP with the lines
   SPACE_GROUP_NUMBER=   77  ! can leave out if CORRECT already ran in #77
-  UNIT_CELL_CONSTANTS=  53.10 53.10 40.90 90 90 90 ! same here
+  UNIT_CELL_CONSTANTS=  53.03   53.03  40.97 90 90 90 ! same here
   INPUT_FILE=XDS_ASCII.HKL
   OUTPUT_FILE=temp.hkl CCP4
@@ Line 153: / Line 164: @@
 ===dataset 2===
-This works exactly the same way as dataset 1.
+This works exactly the same way as dataset 1. The table in CORRECT.LP is
+       NOTE:      Friedel pairs are treated as different reflections.
+ SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
+ RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
+   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr
+.06        3925     547       560       97.7%       3.0%      3.3%     3922   56.13     3.3%     1.4%    80%   1.874     242
+.31        7498    1000      1000      100.0%       2.8%      3.4%     7498   56.91     3.0%     1.2%    65%   1.473     469
+.53        9407    1291      1291      100.0%       3.4%      3.5%     9407   52.39     3.7%     1.6%    55%   1.276     616
+.06       11005    1526      1526      100.0%       4.1%      3.9%    11005   42.13     4.4%     2.2%    39%   1.211     732
+.74       12569    1701      1701      100.0%       5.7%      5.7%    12569   28.38     6.1%     3.7%     4%   0.881     822
+.50       14020    1904      1904      100.0%       9.0%      9.9%    14020   17.92     9.7%     6.3%     3%   0.741     921
+.31       15101    2081      2081      100.0%      17.0%     19.0%    15101    9.83    18.3%    12.7%    -5%   0.682    1011
+.16       11693    2080      2202       94.5%      39.4%     40.8%    11682    4.00    43.6%    45.8%    10%   0.791    1003
+.04        5152    1607      2345       68.5%      85.6%     93.5%     4943    1.21   101.3%   129.6%    10%   0.718     615
+    total       90370   13737     14610       94.0%       4.2%      4.5%    90147   24.22     4.6%     7.3%    22%   0.956    6431
+ NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES   92690
+ NUMBER OF REJECTED MISFITS                            2318
+ NUMBER OF SYSTEMATIC ABSENT REFLECTIONS                  0
+ NUMBER OF ACCEPTED OBSERVATIONS                      90372
+ NUMBER OF UNIQUE ACCEPTED REFLECTIONS                13738
+Dataset 2 is definitely better than dataset 1.
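The comparison of the two datasets can be made explicit by parsing the "total" rows of the two CORRECT.LP tables. The following is only a sketch: the helper name parse_total_row and the column keys are hypothetical labels chosen here, and the two input rows are copied from the tables quoted on this page.

```python
# Column keys are hypothetical names for the columns of the CORRECT.LP table.
COLUMNS = [
    "observed", "unique", "possible", "completeness", "r_observed",
    "r_expected", "compared", "i_over_sigma", "r_meas", "rmrgd_f",
    "anomal_corr", "sig_ano", "nano",
]


def parse_total_row(line):
    """Turn the 'total' row of the CORRECT.LP statistics table into a dict."""
    fields = line.split()
    if not fields or fields[0] != "total" or len(fields) != len(COLUMNS) + 1:
        raise ValueError("not a 'total' row: %r" % line)
    # Percent signs are stripped so that every column becomes a plain float.
    values = [float(field.rstrip("%")) for field in fields[1:]]
    return dict(zip(COLUMNS, values))


# The two 'total' rows as quoted on this page.
dataset1 = parse_total_row(
    "total       91819   13782     14656       94.0%       5.7%      5.9%"
    "    91589   20.24     6.2%    15.0%    12%   0.897    6450")
dataset2 = parse_total_row(
    "total       90370   13737     14610       94.0%       4.2%      4.5%"
    "    90147   24.22     4.6%     7.3%    22%   0.956    6431")

for key in ("i_over_sigma", "r_meas", "anomal_corr", "sig_ano"):
    print("%-12s dataset 1: %7.2f   dataset 2: %7.2f"
          % (key, dataset1[key], dataset2[key]))
```

All four indicators printed here favour dataset 2: higher I/sigma, lower R-meas, higher anomalous correlation and higher SigAno.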
 ==SHELXC/D/E structure solution==
-This is done in a subdirectory of the XDS data reduction directory (either dataset "1" or "2", and we can also try it in a xscale subdirectory). Here, we generate XDSCONV.INP (I used MERGE=TRUE, sometimes the results are better that way) and run xdsconv and [[ccp4com:SHELX_C/D/E|SHELXC]]:
+This is done in a subdirectory of the XDS data reduction directory (either dataset "1" or "2", and we can also try it in a xscale subdirectory). Here, we generate XDSCONV.INP (I used MERGE=TRUE, sometimes the results are better that way) and run xdsconv and [[ccp4com:SHELX_C/D/E|SHELXC]].
 <pre>
 #!/bin/csh -f
@@ Line 172: / Line 209: @@
 shelxc j <<end
 SAD   temp.hkl
-CELL 53.10 53.10 40.90 90 90 90
+CELL 53.03 53.03 40.97 90 90 90
 SPAG P42
 MAXM 2
 end
+</pre>
 This writes j.hkl, j_fa.hkl and j_fa.ins. However, we overwrite j_fa.ins now (these lines are just the ones that [[ccp4com:hkl2map|hkl2map]] would write):
 <pre>
 cat > j_fa.ins <<end
 TITL j_fa.ins SAD in P42
-CELL  0.98000   53.10   53.10   40.90   90.00   90.00   90.00
+CELL  0.98000  53.03   53.03  40.97   90.00   90.00   90.00
 LATT  -1
 SYMM -Y, X, 1/2+Z
@@ Line 203: / Line 240: @@
   shelxd j_fa
-This gives best CC All/Weak of 36.74 / 21.55 for dataset 1, and best CC All/Weak of 35.61 / 26.03 for dataset 2, and .
+This gives best CC All/Weak of 37.28 / 21.38 for dataset 1, and best CC All/Weak of 37.89 / 23.80 for dataset 2.
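When several SHELXD runs are compared, the best CC All/Weak pair can be picked out of the output automatically. This is a minimal sketch, assuming that each try is reported on a line containing the fragment "CC All/Weak a / b". Only that fragment is taken from this page; the rest of the sample lines and all values except 37.28 / 21.38 are made up, and best_cc is a hypothetical helper.

```python
import re

# Assumed fragment of a SHELXD progress line: "CC All/Weak 37.28 / 21.38".
CC_PATTERN = re.compile(r"CC All/Weak\s+(-?\d+\.\d+)\s*/\s*(-?\d+\.\d+)")


def best_cc(lines):
    """Return the (CC All, CC Weak) pair with the highest CC All, or None."""
    pairs = []
    for line in lines:
        match = CC_PATTERN.search(line)
        if match:
            pairs.append((float(match.group(1)), float(match.group(2))))
    # Tuples compare by their first element first, i.e. by CC All.
    return max(pairs) if pairs else None


# Made-up sample lines; only the second pair of values appears on this page.
sample = [
    " Try 1, CC All/Weak 21.10 / 9.87",
    " Try 2, CC All/Weak 37.28 / 21.38",
    " Try 3, CC All/Weak 30.02 / 15.40",
]
print(best_cc(sample))
```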
-Next we run G. Sheldrick's beta-Version of [[ccp4com:SHELX_C/D/E|SHELXE]] Version 2009/4:
+Next we run G. Sheldrick's beta-Version of [[ccp4com:SHELX_C/D/E|SHELXE]] Version 2011/1:
   shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b
@@ Line 211: / Line 248: @@
   shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b -i
-One of these solves the structure, the other gives bad statistics.
+One of these (and it's impossible to predict which one!) solves the structure, the other gives bad statistics.
 Some important lines in the output: for dataset 1, I get
+residues left after pruning, divided into chains as follows:
+ A:  78
+ CC for partial structure against native data =  36.54 %
+ ...
+ Estimated mean FOM and mapCC as a function of resolution
+ d    inf - 4.49 - 3.55 - 3.10 - 2.81 - 2.61 - 2.45 - 2.32 - 2.22 - 2.13 - 2.03
+ <FOM>   0.763  0.784  0.743  0.682  0.632  0.620  0.621  0.600  0.519  0.416
+ <mapCC> 0.890  0.936  0.916  0.893  0.838  0.827  0.847  0.858  0.836  0.768
+ N         721    728    722    720    719    738    749    721    674    721
+ Estimated mean FOM = 0.639   Pseudo-free CC = 65.26 %
+ Density (in map sigma units) at input heavy atom sites
+  Site     x        y        z     occ*Z    density
+   0.0293   0.3394   0.3145  16.0000    19.09
+  -0.1598   0.3789   0.3723  12.7456    15.78
+  -0.1413   0.4707   0.3704   9.4720     7.85
+  -0.2238   0.1590   0.4520   9.2176     9.96
+   0.0387   0.4228   0.3134   1.6608     1.28
+ Site    x       y       z  h(sig) near old  near new
+  0.0293  0.3392  0.3148  19.1  1/0.02  2/10.34 4/11.66 4/11.66 5/12.88
+-0.1564  0.3740  0.3757  16.4  2/0.35  5/4.38 4/5.45 1/10.34 3/12.03
+-0.2146  0.1625  0.4495  11.0  4/0.53  2/12.03 5/15.84 1/16.92 4/17.39
+-0.1386  0.4748  0.3671   8.1  3/0.29  5/2.67 2/5.45 1/11.66 1/11.66
+-0.1829  0.4512  0.3605   5.9  3/2.47  4/2.67 2/4.38 1/12.88 1/13.92
+and for dataset 2,
+residues left after pruning, divided into chains as follows:
+ A:  80
+ ...
+ CC for partial structure against native data =  46.31 %
+ Estimated mean FOM and mapCC as a function of resolution
+ d    inf - 4.49 - 3.55 - 3.10 - 2.81 - 2.61 - 2.45 - 2.32 - 2.22 - 2.13 - 2.02
+ <FOM>   0.726  0.703  0.695  0.704  0.706  0.713  0.667  0.572  0.535  0.503
+ <mapCC> 0.850  0.863  0.857  0.899  0.900  0.908  0.866  0.805  0.828  0.814
+ N         719    721    725    719    713    736    755    722    673    705
+ Estimated mean FOM = 0.654   Pseudo-free CC = 67.40 %
+ Density (in map sigma units) at input heavy atom sites
+  Site     x        y        z     occ*Z    density
+   0.1613   0.5298   0.4706  16.0000    22.30
+   0.1266   0.3414   0.5281  14.4576    17.03
+   0.3453   0.2833   0.6078  11.1760    11.69
+   0.0318   0.3665   0.5267   6.6512     8.45
+   0.0499   0.3350   0.5280   5.8208     5.38
+ Site    x       y       z  h(sig) near old  near new
+  0.1605  0.5316  0.4699  22.4  1/0.11  2/10.61 4/11.62 4/11.62 5/12.61
+  0.1258  0.3407  0.5328  17.4  2/0.20  5/3.83 4/5.39 1/10.61 3/12.02
+  0.3367  0.2831  0.6107  13.2  3/0.47  2/12.02 5/15.41 1/17.15 4/17.33
+  0.0269  0.3630  0.5241   9.3  4/0.33  5/2.78 2/5.39 1/11.62 1/11.62
+  0.0575  0.3206  0.5182   8.2  5/0.95  4/2.78 2/3.83 1/12.61 1/14.10
 '''clearly indicating that the structure can be solved with each of the two datasets individually.'''
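Since one hand solves the structure and the other gives bad statistics, it helps to pull the two summary numbers out of each SHELXE output and compare them side by side. The sketch below uses only the two output lines quoted above; the function name shelxe_summary is hypothetical, and it assumes that the last occurrence of each line in the output is the relevant one.

```python
import re

PARTIAL_CC = re.compile(
    r"CC for partial structure against native data =\s*(-?\d+\.\d+)\s*%")
PSEUDO_FREE = re.compile(r"Pseudo-free CC =\s*(-?\d+\.\d+)\s*%")


def shelxe_summary(text):
    """Return (partial-structure CC, pseudo-free CC), using the last match of each."""
    partial = PARTIAL_CC.findall(text)
    pseudo = PSEUDO_FREE.findall(text)
    return (float(partial[-1]) if partial else None,
            float(pseudo[-1]) if pseudo else None)


# Output lines as quoted on this page for the two datasets.
dataset1 = """ CC for partial structure against native data =  36.54 %
 Estimated mean FOM = 0.639   Pseudo-free CC = 65.26 %"""
dataset2 = """ CC for partial structure against native data =  46.31 %
 Estimated mean FOM = 0.654   Pseudo-free CC = 67.40 %"""

for name, text in (("dataset 1", dataset1), ("dataset 2", dataset2)):
    print(name, shelxe_summary(text))
```

The same function can be applied to the outputs of the runs with and without -i, keeping the hand with the clearly higher values.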
-==Optimization of data reduction==
+==Can we do better?==
+===data reduction===
 The safest way to optimize the data reduction is to look at external quality indicators. Internal R-factors, and even the correlation coefficient of the anomalous signal, are of comparatively little value. A readily available external quality indicator is CC All/CC Weak as obtained by [[ccp4com:SHELX_C/D/E|SHELXD]].
@@ Line 229: / Line 325: @@
 [[Optimization]] does improve things as much as it often does: recycling of GXPARM.XDS to use as XPARM.XDS, and thus imposing the lattice symmetry in the geometry refinement in INTEGRATE. These findings may correspond to the fact that in P1 the angles do not refine to 90.0xx or 89.9xx degrees. In other words, the metric symmetry is not as well fulfilled as it should be. This results in fairly large deviations from the ideal P42 positions; the refinement of cell parameters in P1 partly compensates for this. However, I have no idea why this deviation from metric symmetry could occur.
-==Optimization of structure solution==
+===structure solution===
-The resolution limit for SHELXD could be varied. For SHELXE, the solvent content could be varied, and the number of autobuilding cycles, and probably also the high resolution cutoff.
+The resolution limit for SHELXD could be varied. For SHELXE, the solvent content could be varied, and the number of autobuilding cycles, and probably also the high resolution cutoff. Furthermore, it would be advantageous to "re-cycle" the file j.hat to j_fa.res, since the heavy-atom sites from SHELXE are more accurate than those from SHELXD, as the phases derived from the poly-Ala traces are quite good (compare the density columns of the two consecutive heavy-atom lists!).
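The "re-cycling" step can be sketched as a plain file copy that keeps the original SHELXD result as a backup. This assumes that j.hat can be used as j_fa.res without further editing, which is how the sentence above reads. The function name recycle_sites and the backup suffix are hypothetical.

```python
import os
import shutil


def recycle_sites(hat="j.hat", res="j_fa.res", backup_suffix=".shelxd"):
    """Copy the SHELXE heavy-atom sites over the SHELXD ones, keeping a backup."""
    if not os.path.exists(hat):
        raise FileNotFoundError(hat)
    if os.path.exists(res):
        # Keep the SHELXD sites so the original run can be reproduced.
        shutil.copyfile(res, res + backup_suffix)
    shutil.copyfile(hat, res)
    return res


# Usage, in the directory of the SHELXC/D/E run:
# recycle_sites()
```

After the copy, SHELXE would be run again with the same options as before.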
 ==Limits==
 With dataset 2, I tried to use 270 frames but could not solve the structure using the above SHELXC/D/E approach (not even with MAXIMUM_ERROR_OF_SPOT_POSITION=6.0). With 315 frames, it was possible.