Line 156: | Line 156: | ||
38.300 79.097 79.100 90.000 90.000 90.000 19.0 | 38.300 79.097 79.100 90.000 90.000 90.000 19.0 | ||
Why not use all datasets? The reason is that cellparm has a limit of 20 datasets! | Why not use all datasets? The reason is that cellparm has a limit of 20 datasets! But it seems to confirm that the cell axes are really 38.3, 79.1, 79.1. | ||
Now we run xscale with the following XSCALE.INP : | Now we run xscale with the following XSCALE.INP : | ||
<pre> | <pre> | ||
OUTPUT_FILE=temp.ahkl | OUTPUT_FILE=temp.ahkl | ||
Line 273: | Line 271: | ||
DATA SETS NUMBER OF COMMON CORRELATION RATIO OF COMMON B-FACTOR | DATA SETS NUMBER OF COMMON CORRELATION RATIO OF COMMON B-FACTOR | ||
#i #j REFLECTIONS BETWEEN i,j INTENSITIES (i/j) BETWEEN i,j | #i #j REFLECTIONS BETWEEN i,j INTENSITIES (i/j) BETWEEN i,j | ||
with these 99 lines: | with these final 99 lines: | ||
1 100 12 0.601 0.8200 0.0085 | 1 100 12 0.601 0.8200 0.0085 | ||
2 100 24 0.998 0.9001 0.5637 | 2 100 24 0.998 0.9001 0.5637 | ||
Line 420: | Line 418: | ||
total 10297 7912 22966 34.5% 5.6% 5.9% 4363 9.17 7.6% 11.5% -9% 0.741 24 | total 10297 7912 22966 34.5% 5.6% 5.9% 4363 9.17 7.6% 11.5% -9% 0.741 24 | ||
Now we are ready to run our script "bootstrap.rc" a second time. Actually it would be enough to run the CORRECT step but since it only takes 2 minutes we don't bother to change the script. After this, we run xscale a third time, using the same XSCALE.INP as the first time. The result is | == second round of bootstrap == | ||
Now we are ready to run our script "bootstrap.rc" a second time. Actually it would be enough to run the CORRECT step but since it only takes 2 minutes we don't bother to change the script. After this, we run xscale a third time, using the same XSCALE.INP (with all its 100 INPUT_FILE= lines) as the first time. The result is | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | ||
Line 450: | Line 450: | ||
so the data are practically complete, and actually quite good. The anomalous signal suggests that it may be possible to solve the structure from its anomalous signal. | so the data are practically complete, and actually quite good. The anomalous signal suggests that it may be possible to solve the structure from its anomalous signal. | ||
We can find out the correct spacegroup (19 !) with "pointless xdsin temp.ahkl". | We can find out the correct spacegroup (19 !) with "pointless xdsin temp.ahkl", and adjust our script accordingly. | ||
Now we do another round, since the completeness is so good. We can then identify those few datasets which are still not indexed in the right setting, fix those manually. It was only xtal085 which | Now we do another round, since the completeness is so good. We can then identify those few datasets which are still not indexed in the right setting, and fix those manually. It was only xtal085 which had a problem - it turned out that the indexing had not found the correct lattice, which was fixed with STRONG_PIXEL=6. | ||
The final XSCALE.LP is then: | The final XSCALE.LP is then: | ||
Line 485: | Line 485: | ||
== Optimizing the result == | == Optimizing the result == | ||
One way to improve XDS' knowledge of the geometry is to use all 15 frames for indexing while still integrating only frame 1. This is easily accomplished by changing the following lines in the script:
JOB=XYCORR INIT COLSPOT IDXREF DEFPIX | |||
DATA_RANGE=1 15 | |||
SPOT_RANGE=1 15 | |||
and by using, instead of "xds >& xds.log &", the line "../../run_xds.rc &" with the following run_xds.rc :
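<pre>
#!/bin/csh -f
# First pass: run the JOB steps listed in XDS.INP (indexing on frames 1-15)
xds
# Keep everything except the old DATA_RANGE and JOB lines
egrep -v 'DATA_RANGE|JOB' XDS.INP >x
# Second pass: integrate and correct frame 1 only
echo JOB=INTEGRATE CORRECT >XDS.INP
echo DATA_RANGE=1 1 >> XDS.INP
cat x >> XDS.INP
xds
</pre>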
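The rewrite of XDS.INP between the two xds runs can also be sketched in Python, which may be easier to adapt than the csh one-liners. This is only an illustration under stated assumptions: the function name make_integrate_input and its parameters are hypothetical and not part of XDS, and the sketch only transforms the text of XDS.INP, it does not run xds.

```python
def make_integrate_input(xds_inp_text, first_frame=1, last_frame=1):
    """Return XDS.INP text for a second pass that only integrates and corrects.

    Mirrors the csh script: every line containing DATA_RANGE or JOB is
    dropped (like egrep -v 'DATA_RANGE|JOB'), and new JOB and DATA_RANGE
    lines are written in front of the remaining lines.
    """
    kept = [
        line
        for line in xds_inp_text.splitlines()
        if "DATA_RANGE" not in line and "JOB" not in line
    ]
    header = [
        "JOB=INTEGRATE CORRECT",
        f"DATA_RANGE={first_frame} {last_frame}",
    ]
    return "\n".join(header + kept) + "\n"


if __name__ == "__main__":
    # Hypothetical minimal input; a real XDS.INP has many more keywords.
    example = (
        "JOB=XYCORR INIT COLSPOT IDXREF DEFPIX\n"
        "DATA_RANGE=1 15\n"
        "SPOT_RANGE=1 15\n"
        "SPACE_GROUP_NUMBER=19\n"
    )
    print(make_integrate_input(example), end="")
```

As in the csh version, SPOT_RANGE is left untouched, so only the integration range shrinks to frame 1.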
It also seems sensible to change "sleep 1" to "sleep 5", because each COLSPOT run now has to look at 15 frames instead of one and therefore takes a little longer. The result is indeed a bit better:
WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
8.05 798 274 304 90.1% 4.4% 4.2% 726 23.88 5.2% 3.1% 71% 1.932 49 | |||
5.69 1514 480 515 93.2% 4.5% 4.5% 1421 23.66 5.3% 3.4% 76% 1.670 83 | |||
4.65 1951 599 639 93.7% 4.3% 4.4% 1845 24.57 5.0% 3.3% 67% 1.561 139 | |||
4.03 2399 713 753 94.7% 4.1% 4.5% 2289 24.76 4.8% 3.1% 44% 1.176 154 | |||
3.60 2546 786 840 93.6% 3.9% 4.5% 2417 23.78 4.6% 3.1% 46% 1.127 175 | |||
3.29 2864 876 919 95.3% 4.2% 4.7% 2729 23.35 4.9% 3.2% 38% 1.018 199 | |||
3.04 3154 918 987 93.0% 5.0% 5.2% 3037 21.98 5.8% 3.9% 18% 0.922 231 | |||
2.85 3387 1015 1066 95.2% 5.9% 6.1% 3235 18.74 7.0% 5.2% 26% 0.992 235 | |||
2.68 3724 1082 1126 96.1% 7.2% 7.2% 3583 17.03 8.4% 6.7% 15% 0.890 278 | |||
2.55 3720 1111 1172 94.8% 8.3% 8.6% 3536 15.02 9.7% 8.1% 14% 0.857 255 | |||
2.43 4079 1198 1267 94.6% 9.8% 10.6% 3898 12.96 11.5% 10.3% 9% 0.781 290 | |||
2.32 4199 1221 1283 95.2% 11.1% 11.7% 4024 12.21 12.9% 10.8% 12% 0.911 331 | |||
2.23 4365 1282 1350 95.0% 11.4% 12.2% 4205 11.87 13.4% 12.6% 3% 0.729 319 | |||
2.15 4651 1332 1386 96.1% 13.3% 13.9% 4468 11.30 15.5% 12.5% 5% 0.821 354 | |||
2.08 4745 1380 1455 94.8% 15.0% 16.0% 4569 10.04 17.6% 14.0% -1% 0.760 358 | |||
2.01 4744 1418 1496 94.8% 15.4% 16.0% 4531 9.50 18.1% 16.3% 5% 0.820 343 | |||
1.95 5019 1487 1550 95.9% 19.6% 19.7% 4813 8.27 23.0% 19.7% -1% 0.765 359 | |||
1.90 5210 1504 1571 95.7% 21.9% 22.9% 5007 7.53 25.6% 22.8% -6% 0.740 399 | |||
1.85 5272 1561 1633 95.6% 29.1% 30.1% 5054 5.98 34.1% 28.8% 4% 0.801 366 | |||
1.80 5054 1505 1659 90.7% 33.2% 34.1% 4822 5.25 38.9% 35.2% -1% 0.790 354 | |||
total 73395 21742 22971 94.6% 7.3% 7.7% 70209 13.46 8.6% 9.8% 16% 0.890 5271 | |||
but there does not appear to be a "magic bullet" that would produce much better data than the quick bootstrap approach.
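When comparing several XSCALE.LP tables like the one above, it can help to recompute the summary numbers from the "total" line rather than reading them off by eye. The following is a minimal sketch, assuming the 14-column layout shown in the table; the function name parse_total_line and the dictionary keys are hypothetical and not part of XDS or XSCALE.

```python
def parse_total_line(line):
    """Parse the 'total' line of an XSCALE.LP statistics table.

    Assumes 14 whitespace-separated fields: the word 'total', then the
    observed, unique and possible reflection counts, followed by the
    remaining statistics columns.
    """
    fields = line.split()
    if len(fields) != 14 or fields[0] != "total":
        raise ValueError("not a 14-column 'total' line")
    observed, unique, possible = (int(f) for f in fields[1:4])
    return {
        "observed": observed,
        "unique": unique,
        "possible": possible,
        # Completeness recomputed from the counts, in percent
        "completeness_percent": 100.0 * unique / possible,
        # Completeness as printed by XSCALE, in percent
        "reported_completeness_percent": float(fields[4].rstrip("%")),
        # Average number of observations per unique reflection
        "multiplicity": observed / unique,
    }


if __name__ == "__main__":
    total = ("total 73395 21742 22971 94.6% 7.3% 7.7% 70209 13.46 "
             "8.6% 9.8% 16% 0.890 5271")
    stats = parse_total_line(total)
    print(f"completeness {stats['completeness_percent']:.1f}%, "
          f"multiplicity {stats['multiplicity']:.2f}")
```

The recomputed completeness (unique divided by possible) should agree with the printed value to within rounding, which is a quick sanity check that the line was split into the right columns.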
== Solving the structure == | |||
First, we repeat xscale after inserting FRIEDEL'S_LAW=FALSE into XSCALE.INP . This gives us | |||
NOTE: Friedel pairs are treated as different reflections. | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
8.05 804 382 476 80.3% 3.1% 3.4% 665 24.13 3.9% 2.7% 81% 2.507 50 | |||
5.69 1527 723 882 82.0% 3.4% 3.6% 1251 22.48 4.2% 3.1% 85% 2.223 87 | |||
4.65 1956 938 1136 82.6% 3.4% 3.6% 1602 22.73 4.3% 3.0% 72% 1.821 141 | |||
4.03 2400 1136 1357 83.7% 3.5% 3.6% 1943 22.62 4.4% 3.2% 46% 1.347 154 | |||
3.60 2549 1261 1533 82.3% 3.4% 3.7% 2053 21.53 4.3% 3.3% 51% 1.322 176 | |||
3.29 2867 1393 1694 82.2% 3.7% 3.9% 2347 21.22 4.7% 3.5% 35% 1.159 199 | |||
3.04 3154 1507 1830 82.3% 4.5% 4.3% 2607 19.33 5.7% 4.5% 17% 1.016 231 | |||
2.85 3389 1649 1979 83.3% 5.3% 5.2% 2761 16.37 6.7% 6.0% 27% 1.054 235 | |||
2.68 3724 1757 2104 83.5% 6.5% 6.1% 3088 14.63 8.1% 7.8% 15% 0.962 278 | |||
2.55 3720 1813 2197 82.5% 7.3% 7.6% 2999 12.84 9.2% 9.1% 16% 0.896 255 | |||
2.43 4079 1933 2384 81.1% 9.0% 9.5% 3352 11.01 11.3% 12.5% 9% 0.840 290 | |||
2.32 4199 2006 2420 82.9% 10.0% 10.5% 3474 10.17 12.7% 13.8% 14% 0.939 331 | |||
2.23 4363 2099 2551 82.3% 10.6% 11.0% 3595 9.91 13.4% 14.5% 5% 0.790 319 | |||
2.15 4651 2203 2634 83.6% 12.2% 12.5% 3827 9.29 15.3% 15.7% 7% 0.856 354 | |||
2.08 4745 2248 2758 81.5% 14.2% 14.7% 3945 8.32 18.0% 18.7% -2% 0.822 358 | |||
2.01 4744 2287 2843 80.4% 14.3% 14.6% 3896 7.92 18.1% 19.2% 7% 0.868 343 | |||
1.95 5019 2429 2945 82.5% 18.5% 18.3% 4079 6.76 23.3% 24.6% 0% 0.789 359 | |||
1.90 5210 2484 3000 82.8% 20.4% 21.0% 4282 6.06 25.6% 27.9% -4% 0.757 399 | |||
1.85 5272 2569 3119 82.4% 27.8% 28.0% 4272 4.77 35.0% 36.5% 4% 0.803 366 | |||
1.80 5054 2451 3171 77.3% 30.9% 31.1% 4092 4.20 39.0% 43.1% -3% 0.788 354 | |||
total 73426 35268 43013 82.0% 6.5% 6.7% 60130 11.57 8.2% 11.7% 20% 0.963 5279 | |||
One hint towards the contents of the "crystal" is that the information about the simulated data contained the string "1g1c". This structure has been solved (in a different spacegroup!) and can be found in the PDB; it contains 99 residues, among which there are 2 Cys and 2 Met. Thus we assume that the simulated data represent sulfur-SAD. Using [[ccp4:hkl2map|hkl2map]], we can easily find four sites with good CCall/CCweak:
shelxe.beta -m50 -a -q -h -s0.77 -b -i 1g1c 1g1c_fa |