Line 69: | Line 69: | ||
* 21 tP 7.3 53.5 53.5 41.2 90.1 90.1 90.3 0 1 0 0 0 0 -1 0 -1 0 0 0 | * 21 tP 7.3 53.5 53.5 41.2 90.1 90.1 90.3 0 1 0 0 0 0 -1 0 -1 0 0 0 | ||
39 mC 249.8 114.5 41.2 53.5 90.1 90.3 69.0 1 -2 0 0 1 0 0 0 0 0 1 0 | 39 mC 249.8 114.5 41.2 53.5 90.1 90.3 69.0 1 -2 0 0 1 0 0 0 0 0 1 0 | ||
indicating at most tetragonal symmetry. Shortly after this, the program calculates R-factors for these lattices: | |||
SPACE-GROUP UNIT CELL CONSTANTS UNIQUE Rmeas COMPARED LATTICE- | SPACE-GROUP UNIT CELL CONSTANTS UNIQUE Rmeas COMPARED LATTICE- | ||
NUMBER a b c alpha beta gamma CHARACTER | NUMBER a b c alpha beta gamma CHARACTER | ||
Line 111: | Line 111: | ||
After this comes the table that tells us the quality of our data: | After this comes the table that tells us the quality of our data: | ||
NOTE: Friedel pairs are treated as different reflections. | NOTE: Friedel pairs are treated as different reflections. | ||
Line 127: | Line 128: | ||
2.04 5134 1601 2347 68.2% 274.7% 291.2% 4913 0.40 325.5% 400.7% 1% 0.608 606 | 2.04 5134 1601 2347 68.2% 274.7% 291.2% 4913 0.40 325.5% 400.7% 1% 0.608 606 | ||
total 91819 13782 14656 94.0% 5.7% 5.9% 91589 20.24 6.2% 15.0% 12% 0.897 6450 | total 91819 13782 14656 94.0% 5.7% 5.9% 91589 20.24 6.2% 15.0% 12% 0.897 6450 | ||
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 93217 | |||
NUMBER OF REJECTED MISFITS 1391 | |||
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 | |||
NUMBER OF ACCEPTED OBSERVATIONS 91826 | |||
NUMBER OF UNIQUE ACCEPTED REFLECTIONS 13784 | |||
So the anomalous signal goes to about 3.3 Å (which is where 30% would be, in the "Anomal Corr" column), and I'd say the useful resolution goes to 2.16 Å (please note that this table treats Friedel pairs separately; merging them increases I/sigma by another factor of 1.41). | So the anomalous signal goes to about 3.3 Å (which is where 30% would be, in the "Anomal Corr" column), and I'd say the useful resolution goes to 2.16 Å (please note that this table treats Friedel pairs separately; merging them increases I/sigma by another factor of 1.41). | ||
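The factor of 1.41 is the square root of 2. A minimal Python sketch of the arithmetic follows, assuming the two Friedel mates have roughly equal intensities and independent errors of equal size (an idealization; the function name is hypothetical). Averaging the two then leaves the intensity unchanged and divides sigma by √2. It is applied here to the overall I/sigma of 20.24 from the "total" row of the table:

```python
import math


def merged_i_over_sigma(i_over_sigma_unmerged: float) -> float:
    """Estimate I/sigma after merging Friedel mates.

    Assumes both mates have the same intensity and independent,
    equal-sized errors, so the averaged sigma shrinks by sqrt(2).
    """
    return i_over_sigma_unmerged * math.sqrt(2.0)


# Overall I/sigma with Friedel pairs kept separate (from the "total" row).
print(f"{merged_i_over_sigma(20.24):.1f}")
```

For real data the gain is usually somewhat smaller, since centric reflections have no Friedel mate to merge with and the anomalous differences themselves are not noise.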
For the sake of comparability, from now on we use the same axes (53.03 53.03 40.97) as the deposited PDB id 2QVO. | |||
We could now modify XDS.INP to have | We could now modify XDS.INP to have | ||
JOB=CORRECT ! not XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT | JOB=CORRECT ! not XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT | ||
SPACE_GROUP_NUMBER= 77 | SPACE_GROUP_NUMBER= 77 | ||
UNIT_CELL_CONSTANTS= 53. | UNIT_CELL_CONSTANTS= 53.03 53.03 40.97 90.000 90.000 90.000 | ||
and run xds again to obtain the final CORRECT.LP and XDS_ASCII.HKL with the correct space group. For all practical purposes, however, the statistics in space groups 75 and 77 are the same (the 8 reflections known to be extinct do not make much difference). | and run xds again to obtain the final CORRECT.LP and XDS_ASCII.HKL with the correct space group. For all practical purposes, however, the statistics in space groups 75 and 77 are the same (the 8 reflections known to be extinct do not make much difference). | ||
Following this, we create XDSCONV.INP with the lines | Following this, we create XDSCONV.INP with the lines | ||
SPACE_GROUP_NUMBER= 77 ! can leave out if CORRECT already ran in #77 | SPACE_GROUP_NUMBER= 77 ! can leave out if CORRECT already ran in #77 | ||
UNIT_CELL_CONSTANTS= 53. | UNIT_CELL_CONSTANTS= 53.03 53.03 40.97 90 90 90 ! same here | ||
INPUT_FILE=XDS_ASCII.HKL | INPUT_FILE=XDS_ASCII.HKL | ||
OUTPUT_FILE=temp.hkl CCP4 | OUTPUT_FILE=temp.hkl CCP4 | ||
Line 153: | Line 164: | ||
===dataset 2=== | ===dataset 2=== | ||
This works exactly the same way as dataset 1. | This works exactly the same way as dataset 1. The table in CORRECT.LP is | ||
NOTE: Friedel pairs are treated as different reflections. | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
6.06 3925 547 560 97.7% 3.0% 3.3% 3922 56.13 3.3% 1.4% 80% 1.874 242 | |||
4.31 7498 1000 1000 100.0% 2.8% 3.4% 7498 56.91 3.0% 1.2% 65% 1.473 469 | |||
3.53 9407 1291 1291 100.0% 3.4% 3.5% 9407 52.39 3.7% 1.6% 55% 1.276 616 | |||
3.06 11005 1526 1526 100.0% 4.1% 3.9% 11005 42.13 4.4% 2.2% 39% 1.211 732 | |||
2.74 12569 1701 1701 100.0% 5.7% 5.7% 12569 28.38 6.1% 3.7% 4% 0.881 822 | |||
2.50 14020 1904 1904 100.0% 9.0% 9.9% 14020 17.92 9.7% 6.3% 3% 0.741 921 | |||
2.31 15101 2081 2081 100.0% 17.0% 19.0% 15101 9.83 18.3% 12.7% -5% 0.682 1011 | |||
2.16 11693 2080 2202 94.5% 39.4% 40.8% 11682 4.00 43.6% 45.8% 10% 0.791 1003 | |||
2.04 5152 1607 2345 68.5% 85.6% 93.5% 4943 1.21 101.3% 129.6% 10% 0.718 615 | |||
total 90370 13737 14610 94.0% 4.2% 4.5% 90147 24.22 4.6% 7.3% 22% 0.956 6431 | |||
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 92690 | |||
NUMBER OF REJECTED MISFITS 2318 | |||
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 | |||
NUMBER OF ACCEPTED OBSERVATIONS 90372 | |||
NUMBER OF UNIQUE ACCEPTED REFLECTIONS 13738 | |||
Dataset 2 is clearly better than dataset 1: the overall R-meas is 4.6% instead of 6.2%, and the overall I/sigma is 24.22 instead of 20.24. | |||
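To read the anomalous-signal limit off such a table without doing it by eye, one could use something like the following Python sketch. It is an illustration only: the function names are hypothetical, the rows are pasted from the dataset 2 table above, and the 30% threshold for "Anomal Corr" is the rule of thumb used for dataset 1. It reports the last resolution shell that still reaches the threshold and does not interpolate between shells.

```python
# Resolution-shell rows copied from the dataset 2 table in CORRECT.LP.
ROWS = """\
6.06 3925 547 560 97.7% 3.0% 3.3% 3922 56.13 3.3% 1.4% 80% 1.874 242
4.31 7498 1000 1000 100.0% 2.8% 3.4% 7498 56.91 3.0% 1.2% 65% 1.473 469
3.53 9407 1291 1291 100.0% 3.4% 3.5% 9407 52.39 3.7% 1.6% 55% 1.276 616
3.06 11005 1526 1526 100.0% 4.1% 3.9% 11005 42.13 4.4% 2.2% 39% 1.211 732
2.74 12569 1701 1701 100.0% 5.7% 5.7% 12569 28.38 6.1% 3.7% 4% 0.881 822
2.50 14020 1904 1904 100.0% 9.0% 9.9% 14020 17.92 9.7% 6.3% 3% 0.741 921
2.31 15101 2081 2081 100.0% 17.0% 19.0% 15101 9.83 18.3% 12.7% -5% 0.682 1011
2.16 11693 2080 2202 94.5% 39.4% 40.8% 11682 4.00 43.6% 45.8% 10% 0.791 1003
2.04 5152 1607 2345 68.5% 85.6% 93.5% 4943 1.21 101.3% 129.6% 10% 0.718 615
"""


def parse_rows(text):
    """Turn 14-column shell rows into dicts; other lines are skipped."""
    shells = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 14:
            continue
        shells.append({
            "d_min": float(fields[0]),                   # resolution limit (A)
            "i_sigma": float(fields[8]),                 # I/SIGMA
            "anom_cc": float(fields[11].rstrip("%*")),   # Anomal Corr (%)
        })
    return shells


def anomalous_limit(shells, threshold=30.0):
    """Return d_min of the last shell (low to high resolution) that
    still has Anomal Corr >= threshold, or None if none does."""
    limit = None
    for shell in shells:
        if shell["anom_cc"] < threshold:
            break
        limit = shell["d_min"]
    return limit


shells = parse_rows(ROWS)
print(anomalous_limit(shells))
```

For these rows the last shell at or above 30% is the one ending at 3.06 Å; the correlation drops to 4% in the next shell.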
==SHELXC/D/E structure solution== | ==SHELXC/D/E structure solution== | ||
This is done in a subdirectory of the XDS data reduction directory (either dataset "1" or "2", and we can also try it in an xscale subdirectory). Here, we generate XDSCONV.INP (I used MERGE=TRUE, sometimes the results are better that way) and run xdsconv and [[ccp4com:SHELX_C/D/E|SHELXC]] | This is done in a subdirectory of the XDS data reduction directory (either dataset "1" or "2", and we can also try it in an xscale subdirectory). Here, we generate XDSCONV.INP (I used MERGE=TRUE, sometimes the results are better that way) and run xdsconv and [[ccp4com:SHELX_C/D/E|SHELXC]]. | ||
<pre> | <pre> | ||
#!/bin/csh -f | #!/bin/csh -f | ||
Line 172: | Line 209: | ||
shelxc j <<end | shelxc j <<end | ||
SAD temp.hkl | SAD temp.hkl | ||
CELL 53. | CELL 53.03 53.03 40.97 90 90 90 | ||
SPAG P42 | SPAG P42 | ||
MAXM 2 | MAXM 2 | ||
end | end | ||
</pre> | |||
This writes j.hkl, j_fa.hkl and j_fa.ins. However, we overwrite j_fa.ins now (these lines are just the ones that [[ccp4com:hkl2map|hkl2map]] would write): | This writes j.hkl, j_fa.hkl and j_fa.ins. However, we overwrite j_fa.ins now (these lines are just the ones that [[ccp4com:hkl2map|hkl2map]] would write): | ||
<pre> | <pre> | ||
cat > j_fa.ins <<end | cat > j_fa.ins <<end | ||
TITL j_fa.ins SAD in P42 | TITL j_fa.ins SAD in P42 | ||
CELL 0.98000 | CELL 0.98000 53.03 53.03 40.97 90.00 90.00 90.00 | ||
LATT -1 | LATT -1 | ||
SYMM -Y, X, 1/2+Z | SYMM -Y, X, 1/2+Z | ||
Line 203: | Line 240: | ||
shelxd j_fa | shelxd j_fa | ||
This gives best CC All/Weak of | This gives best CC All/Weak of 37.28 / 21.38 for dataset 1, and best CC All/Weak of 37.89 / 23.80 for dataset 2. | ||
Next we run G. Sheldrick's beta version of [[ccp4com:SHELX_C/D/E|SHELXE]] Version | Next we run G. Sheldrick's beta version of [[ccp4com:SHELX_C/D/E|SHELXE]], version 2011/1: | ||
shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b | shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b | ||
Line 211: | Line 248: | ||
shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b -i | shelxe.beta j j_fa -a -q -h -s0.55 -m20 -b -i | ||
One of these solves the structure, the other gives bad statistics. | One of these (and it's impossible to predict which one!) solves the structure, the other gives bad statistics. | ||
Some important lines in the output: for dataset 1, I get | Some important lines in the output: for dataset 1, I get | ||
78 residues left after pruning, divided into chains as follows: | |||
A: 78 | |||
CC for partial structure against native data = 36.54 % | |||
... | |||
Estimated mean FOM and mapCC as a function of resolution | |||
d inf - 4.49 - 3.55 - 3.10 - 2.81 - 2.61 - 2.45 - 2.32 - 2.22 - 2.13 - 2.03 | |||
<FOM> 0.763 0.784 0.743 0.682 0.632 0.620 0.621 0.600 0.519 0.416 | |||
<mapCC> 0.890 0.936 0.916 0.893 0.838 0.827 0.847 0.858 0.836 0.768 | |||
N 721 728 722 720 719 738 749 721 674 721 | |||
Estimated mean FOM = 0.639 Pseudo-free CC = 65.26 % | |||
Density (in map sigma units) at input heavy atom sites | |||
Site x y z occ*Z density | |||
1 0.0293 0.3394 0.3145 16.0000 19.09 | |||
2 -0.1598 0.3789 0.3723 12.7456 15.78 | |||
3 -0.1413 0.4707 0.3704 9.4720 7.85 | |||
4 -0.2238 0.1590 0.4520 9.2176 9.96 | |||
5 0.0387 0.4228 0.3134 1.6608 1.28 | |||
Site x y z h(sig) near old near new | |||
1 0.0293 0.3392 0.3148 19.1 1/0.02 2/10.34 4/11.66 4/11.66 5/12.88 | |||
2 -0.1564 0.3740 0.3757 16.4 2/0.35 5/4.38 4/5.45 1/10.34 3/12.03 | |||
3 -0.2146 0.1625 0.4495 11.0 4/0.53 2/12.03 5/15.84 1/16.92 4/17.39 | |||
4 -0.1386 0.4748 0.3671 8.1 3/0.29 5/2.67 2/5.45 1/11.66 1/11.66 | |||
5 -0.1829 0.4512 0.3605 5.9 3/2.47 4/2.67 2/4.38 1/12.88 1/13.92 | |||
and for dataset 2, | |||
80 residues left after pruning, divided into chains as follows: | |||
A: 80 | |||
... | |||
CC for partial structure against native data = 46.31 % | |||
Estimated mean FOM and mapCC as a function of resolution | |||
d inf - 4.49 - 3.55 - 3.10 - 2.81 - 2.61 - 2.45 - 2.32 - 2.22 - 2.13 - 2.02 | |||
<FOM> 0.726 0.703 0.695 0.704 0.706 0.713 0.667 0.572 0.535 0.503 | |||
<mapCC> 0.850 0.863 0.857 0.899 0.900 0.908 0.866 0.805 0.828 0.814 | |||
N 719 721 725 719 713 736 755 722 673 705 | |||
Estimated mean FOM = 0.654 Pseudo-free CC = 67.40 % | |||
Density (in map sigma units) at input heavy atom sites | |||
Site x y z occ*Z density | |||
1 0.1613 0.5298 0.4706 16.0000 22.30 | |||
2 0.1266 0.3414 0.5281 14.4576 17.03 | |||
3 0.3453 0.2833 0.6078 11.1760 11.69 | |||
4 0.0318 0.3665 0.5267 6.6512 8.45 | |||
5 0.0499 0.3350 0.5280 5.8208 5.38 | |||
Site x y z h(sig) near old near new | |||
1 0.1605 0.5316 0.4699 22.4 1/0.11 2/10.61 4/11.62 4/11.62 5/12.61 | |||
2 0.1258 0.3407 0.5328 17.4 2/0.20 5/3.83 4/5.39 1/10.61 3/12.02 | |||
3 0.3367 0.2831 0.6107 13.2 3/0.47 2/12.02 5/15.41 1/17.15 4/17.33 | |||
4 0.0269 0.3630 0.5241 9.3 4/0.33 5/2.78 2/5.39 1/11.62 1/11.62 | |||
5 0.0575 0.3206 0.5182 8.2 5/0.95 4/2.78 2/3.83 1/12.61 1/14.10 | |||
'''clearly indicating that the structure can be solved with each of the two datasets individually.''' | '''clearly indicating that the structure can be solved with each of the two datasets individually.''' | ||
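Since one cannot predict which of the two SHELXE runs (with or without -i) will succeed, it helps to pull the key numbers out of both outputs and compare them. The following Python sketch does this with regular expressions matched against the output lines quoted above. The function name is hypothetical, and the two sample strings are just the lines from this page, not complete SHELXE listings.

```python
import re

# Patterns for the two summary lines quoted from the SHELXE output.
PARTIAL_CC = re.compile(
    r"CC for partial structure against native data =\s*([\d.]+)\s*%")
PSEUDO_FREE_CC = re.compile(r"Pseudo-free CC =\s*([\d.]+)\s*%")


def shelxe_scores(text):
    """Return the last reported partial-structure CC and pseudo-free CC
    (both in %), or None for a value that does not occur in the text."""
    partial = PARTIAL_CC.findall(text)
    pseudo = PSEUDO_FREE_CC.findall(text)
    return {
        "partial_cc": float(partial[-1]) if partial else None,
        "pseudo_free_cc": float(pseudo[-1]) if pseudo else None,
    }


DATASET_1 = """\
CC for partial structure against native data = 36.54 %
Estimated mean FOM = 0.639 Pseudo-free CC = 65.26 %
"""
DATASET_2 = """\
CC for partial structure against native data = 46.31 %
Estimated mean FOM = 0.654 Pseudo-free CC = 67.40 %
"""

for name, text in (("dataset 1", DATASET_1), ("dataset 2", DATASET_2)):
    print(name, shelxe_scores(text))
```

Applied to the outputs of the run with -i and the run without it, the same function would show at a glance which hand gave the good statistics.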
== | ==Can we do better?== | ||
===data reduction=== | |||
The safest way to optimize the data reduction is to look at external quality indicators. Internal R-factors, and even the correlation coefficient of the anomalous signal are of comparatively little value. A readily available external quality indicator is CC All/CC Weak as obtained by [[ccp4com:SHELX_C/D/E|SHELXD]]. | The safest way to optimize the data reduction is to look at external quality indicators. Internal R-factors, and even the correlation coefficient of the anomalous signal are of comparatively little value. A readily available external quality indicator is CC All/CC Weak as obtained by [[ccp4com:SHELX_C/D/E|SHELXD]]. | ||
Line 229: | Line 325: | ||
[[Optimization]] does improve things as much as it often does: recycling of GXPARM.XDS to use as XPARM.XDS, and thus imposing the lattice symmetry in the geometry refinement in INTEGRATE. These findings may correspond to the fact that in P1 the angles do not refine to 90.0xx or 89.9xx degrees. In other words, the metric symmetry is not as well fulfilled as it should be. This results in fairly large deviations from the ideal P42 positions; the refinement of cell parameters in P1 partly compensates for this. However, I have no idea why this deviation from metric symmetry could occur. | [[Optimization]] does improve things as much as it often does: recycling of GXPARM.XDS to use as XPARM.XDS, and thus imposing the lattice symmetry in the geometry refinement in INTEGRATE. These findings may correspond to the fact that in P1 the angles do not refine to 90.0xx or 89.9xx degrees. In other words, the metric symmetry is not as well fulfilled as it should be. This results in fairly large deviations from the ideal P42 positions; the refinement of cell parameters in P1 partly compensates for this. However, I have no idea why this deviation from metric symmetry could occur. | ||
== | ===structure solution=== | ||
The resolution limit for SHELXD could be varied. For SHELXE, the solvent content could be varied, and the number of autobuilding cycles, and probably also the high resolution cutoff. | The resolution limit for SHELXD could be varied. For SHELXE, the solvent content could be varied, and the number of autobuilding cycles, and probably also the high resolution cutoff. Furthermore, it would be advantageous to "re-cycle" the file j.hat to j_fa.res, since the heavy-atom sites from SHELXE are more accurate than those from SHELXD, as the phases derived from the poly-Ala traces are quite good (compare the density columns of the two consecutive heavy-atom lists!). | ||
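How far SHELXE moves the heavy-atom sites can be checked by converting the fractional shifts to Ångström. The Python sketch below assumes the tetragonal cell used in j_fa.ins (53.03 53.03 40.97, all angles 90°), so the distance is a plain scaled Euclidean norm. It deliberately ignores symmetry equivalents and lattice translations, and the function name is hypothetical. The coordinates are the first two sites of dataset 1, before and after SHELXE.

```python
import math

# Cell axes a, b, c in Angstrom; all angles are 90 degrees (tetragonal),
# so fractional differences only need to be scaled by the axis lengths.
CELL = (53.03, 53.03, 40.97)


def site_distance(frac_old, frac_new, cell=CELL):
    """Distance in Angstrom between two fractional positions in an
    orthogonal cell. Symmetry mates and unit-cell shifts are NOT
    considered (simplifying assumption)."""
    return math.sqrt(sum(((p - q) * axis) ** 2
                         for p, q, axis in zip(frac_old, frac_new, cell)))


# Dataset 1, site 1: input position vs. position found by SHELXE.
d1 = site_distance((0.0293, 0.3394, 0.3145), (0.0293, 0.3392, 0.3148))
# Dataset 1, site 2.
d2 = site_distance((-0.1598, 0.3789, 0.3723), (-0.1564, 0.3740, 0.3757))
print(f"{d1:.2f} {d2:.2f}")
```

The two values agree with the "near old" entries 1/0.02 and 2/0.35 in the SHELXE listing for dataset 1.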
==Limits== | ==Limits== | ||
With dataset 2, I tried to use 270 frames but could not solve the structure using the above SHELXC/D/E approach (not even with MAXIMUM_ERROR_OF_SPOT_POSITION=6.0). With 315 frames, it was possible. | With dataset 2, I tried to use 270 frames but could not solve the structure using the above SHELXC/D/E approach (not even with MAXIMUM_ERROR_OF_SPOT_POSITION=6.0). With 315 frames, it was possible. |