Line 1: | Line 1: | ||
The structure is [http://www.rcsb.org/pdb/explore/explore.do?structureId=1Y13 deposited] in the PDB, solved with SAD and refined at a resolution of 2.2 Å in spacegroup P4(3)2(1)2 (#96). | |||
The data for this project were provided by Jürgen Bosch (SGPP) and are linked to [http://bl831.als.lbl.gov/example_data_sets/ACA2011/DPWTP-website/index.html the ACA 2011 workshop website]. | |||
There are two high-resolution (2 Å) datasets, E1 (wavelength 0.9794 Å) and E2 (0.9174 Å), collected (with 0.25° increments) at an ALS beamline on June 27, 2004, and a weaker dataset collected earlier at an SSRL beamline. We will only use the two ALS datasets here. | |||
== | == Dataset E1 == | ||
Use [[generate_XDS.INP]] and run xds once. Based on R-factors in the resulting CORRECT.LP, and an inspection of BKGPIX.cbf, I modified XDS.INP to have | Use [[generate_XDS.INP]] and run [[xds]] once. Based on R-factors in the resulting CORRECT.LP, and an inspection of BKGPIX.cbf, I modified XDS.INP to have | ||
INCLUDE_RESOLUTION_RANGE=40 2.1 ! too weak beyond 2.1 | INCLUDE_RESOLUTION_RANGE=40 2.1 ! too weak beyond 2.1 Å | ||
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=8000. 30000. ! raised from 7000 30000 to mask beamstop | VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=8000. 30000. ! raised from 7000 30000 to mask beamstop | ||
and ran xds again. This is the excerpt from CORRECT.LP : | and ran xds again. | ||
=== Identifying the problem === | |||
This is the excerpt from [[CORRECT.LP]] : | |||
SPACE-GROUP UNIT CELL CONSTANTS UNIQUE Rmeas COMPARED LATTICE- | SPACE-GROUP UNIT CELL CONSTANTS UNIQUE Rmeas COMPARED LATTICE- | ||
Line 45: | Line 51: | ||
SPACE GROUP NUMBER 16 | SPACE GROUP NUMBER 16 | ||
So CORRECT chooses an orthorhombic spacegroup | So CORRECT chooses an orthorhombic spacegroup. | ||
The file continues: | The file continues: | ||
Line 81: | Line 87: | ||
Some comments: | Some comments: | ||
* the "STANDARD DEVIATION OF SPOT POSITION (PIXELS)" is significantly higher (1.01) than the values reported for the 5°-batches in INTEGRATE.LP (about 0.6). This suggests that the geometry refinement has to deal with inconsistent data. | |||
* CORRECT obviously indicates an orthorhombic spacegroup. | * CORRECT obviously indicates an orthorhombic spacegroup. | ||
* the number of MISFITS is higher than 1%. From the first long table (fine-grained in resolution) in CORRECT.LP we learn that the misfits are due to faint high-resolution ice rings - so | * the number of MISFITS is higher than 1%. From the first long table (fine-grained in resolution) in CORRECT.LP we learn that the misfits are due to faint high-resolution ice rings - so this is a problem intrinsic to the data, and not to their mode of processing. | ||
To my surprise, pointless does not agree with CORRECT's standpoint: | To my surprise, pointless does not agree with CORRECT's standpoint: | ||
Line 179: | Line 186: | ||
which is much worse than the spacegroup 19 statistics (compare the ISa values - they differ by a factor of 2 !) so there may be something wrong with some assumptions we were making ... | which is much worse than the spacegroup 19 statistics (compare the ISa values - they differ by a factor of 2 !) so there may be something wrong with some assumptions we were making ... | ||
The easiest thing one can do is to inspect INTEGRATE.LP - this lists scale factor, beam divergence and mosaicity for every reflection. There's a [[jiffies|jiffy]] called "scalefactors" which grep's the relevant lines from INTEGRATE.LP. This shows the scale factor (column 3): | === Identifying a possible cause of the problem === | ||
The easiest thing one can do is to inspect INTEGRATE.LP - this lists scale factor, beam divergence and mosaicity for every frame. There's a [[jiffies|jiffy]] called "scalefactors" which greps the relevant lines from INTEGRATE.LP ("scalefactors > scales.log"). This shows the scale factor (column 3): | |||
[[File:1y13-e1-scales.png]] | [[File:1y13-e1-scales.png]] | ||
demonstrating that "something happens" between frame 372 and 373 (of course one has to look at the table to find the exact numbers). | |||
'''It should be noted that any abrupt change in conditions during the experiment is going to spoil the resulting data in one way or another. This is most true for a SAD experiment which is supposed to give accurate values for the tiny differences in intensities between Friedel-related reflections.''' | |||
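Instead of reading the exact frame numbers off the table by eye, one can let a few lines of Python find the largest step in the scale factor. This is only a minimal sketch: it assumes that scales.log holds whitespace-separated columns with the frame number in column 1 and the scale factor in column 3, and the function names <code>read_scales</code> and <code>largest_jump</code> are made up for this illustration.

```python
def read_scales(lines):
    """Yield (frame, scale) pairs from lines of a scales.log-like file.

    Assumption: column 1 is the frame number and column 3 the scale factor.
    Lines that do not fit this pattern are skipped.
    """
    for line in lines:
        cols = line.split()
        if len(cols) < 3:
            continue
        try:
            yield int(cols[0]), float(cols[2])
        except ValueError:
            continue


def largest_jump(points):
    """Return (frame_before, frame_after, step) for the biggest scale change
    between consecutive frames, or None if there are fewer than two frames."""
    points = sorted(points)
    best = None
    for (f0, s0), (f1, s1) in zip(points, points[1:]):
        step = abs(s1 - s0)
        if best is None or step > best[2]:
            best = (f0, f1, step)
    return best
```

With the jiffy's output saved as scales.log, <code>largest_jump(read_scales(open("scales.log")))</code> returns the pair of frames between which the scale factor changes most. A single large step only hints at a discontinuity; the plot and the table should still be inspected.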
=== Solving the problem === | |||
At this point it is good to look at the data for experiment E2. We find exactly the same problem of bad ISa and high "STANDARD DEVIATION OF SPOT POSITION (PIXELS)" when reducing frames 1-591 in one run of xds. | |||
With this knowledge, we are led, for E1, to reduce frames 1-372 and 373-592 separately, in spacegroup 96. For E2, we use frames 1-369 and 371-591, respectively. Frame E2-370 has a very high scale factor. | |||
This is also a good time to closely inspect the headers of the frames: | |||
% grep --binary-files=text DATE ALS/821/1y13/j1603b3PK_1_E1_37?.img | |||
gives | |||
ALS/821/1y13/j1603b3PK_1_E1_370.img:DATE=Sun Jun 27 08:55:51 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_371.img:DATE=Sun Jun 27 08:56:00 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_372.img:DATE=Sun Jun 27 08:56:08 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_373.img:DATE=Sun Jun 27 09:19:45 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_374.img:DATE=Sun Jun 27 09:19:54 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_375.img:DATE=Sun Jun 27 09:20:02 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_376.img:DATE=Sun Jun 27 09:20:10 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_377.img:DATE=Sun Jun 27 09:20:58 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_378.img:DATE=Sun Jun 27 09:21:08 2004; | |||
ALS/821/1y13/j1603b3PK_1_E1_379.img:DATE=Sun Jun 27 09:21:17 2004; | |||
and | |||
% grep --binary-files=text DATE ALS/821/1y13/j1603b3PK_1_E2_3[67]?.img | |||
gives | |||
ALS/821/1y13/j1603b3PK_1_E2_366.img:DATE=Sun Jun 27 08:55:15 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_367.img:DATE=Sun Jun 27 08:55:23 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_368.img:DATE=Sun Jun 27 08:55:32 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_369.img:DATE=Sun Jun 27 08:56:19 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_370.img:DATE=Sun Jun 27 08:56:28 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_371.img:DATE=Sun Jun 27 09:19:26 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_372.img:DATE=Sun Jun 27 09:19:34 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_373.img:DATE=Sun Jun 27 09:20:22 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_374.img:DATE=Sun Jun 27 09:20:30 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_375.img:DATE=Sun Jun 27 09:20:38 2004; | |||
ALS/821/1y13/j1603b3PK_1_E2_376.img:DATE=Sun Jun 27 09:20:47 2004; | |||
thus proving that both datasets were interrupted for about 23 minutes around frame 370. | |||
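The same check can be scripted. The following is a minimal sketch that parses the E1 <code>DATE</code> lines copied from the grep output above and reports every pause longer than a threshold. The 60-second threshold and the function names <code>parse_date_line</code> and <code>find_gaps</code> are assumptions made for this example, not part of any XDS tool.

```python
from datetime import datetime

# DATE lines for E1 frames 370-379, copied from the grep output above.
GREP_OUTPUT = """\
ALS/821/1y13/j1603b3PK_1_E1_370.img:DATE=Sun Jun 27 08:55:51 2004;
ALS/821/1y13/j1603b3PK_1_E1_371.img:DATE=Sun Jun 27 08:56:00 2004;
ALS/821/1y13/j1603b3PK_1_E1_372.img:DATE=Sun Jun 27 08:56:08 2004;
ALS/821/1y13/j1603b3PK_1_E1_373.img:DATE=Sun Jun 27 09:19:45 2004;
ALS/821/1y13/j1603b3PK_1_E1_374.img:DATE=Sun Jun 27 09:19:54 2004;
ALS/821/1y13/j1603b3PK_1_E1_375.img:DATE=Sun Jun 27 09:20:02 2004;
ALS/821/1y13/j1603b3PK_1_E1_376.img:DATE=Sun Jun 27 09:20:10 2004;
ALS/821/1y13/j1603b3PK_1_E1_377.img:DATE=Sun Jun 27 09:20:58 2004;
ALS/821/1y13/j1603b3PK_1_E1_378.img:DATE=Sun Jun 27 09:21:08 2004;
ALS/821/1y13/j1603b3PK_1_E1_379.img:DATE=Sun Jun 27 09:21:17 2004;
"""


def parse_date_line(line):
    """Return (frame_number, timestamp) for one 'path:DATE=...;' line."""
    path, _, stamp = line.partition(":DATE=")
    # The frame number is the last underscore-separated field of the file name.
    frame = int(path.rsplit("_", 1)[1].split(".")[0])
    when = datetime.strptime(stamp.strip().rstrip(";"), "%a %b %d %H:%M:%S %Y")
    return frame, when


def find_gaps(text, threshold_s=60.0):
    """List (frame_before, frame_after, seconds) for pauses above threshold_s."""
    records = sorted(
        parse_date_line(line) for line in text.splitlines() if ":DATE=" in line
    )
    gaps = []
    for (f0, t0), (f1, t1) in zip(records, records[1:]):
        pause = (t1 - t0).total_seconds()
        if pause > threshold_s:
            gaps.append((f0, f1, pause))
    return gaps


for first, second, pause in find_gaps(GREP_OUTPUT):
    print(f"pause of {pause / 60:.1f} min between frames {first} and {second}")
```

For the E1 lines this reports a single pause of 23.6 min between frames 372 and 373. The shorter pauses of just under a minute (e.g. between frames 376 and 377) stay below the chosen threshold.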
The really weird thing here is that both datasets appear to have been collected at the same time, but at different wavelengths (E1 at 0.9794 Å, E2 at 0.9184 Å), and yet the individual parts do not group by dataset when scaled together. Using the following [[XSCALE.INP]]: | |||
UNIT_CELL_CONSTANTS=103.316 103.316 131.456 90.000 90.000 90.000 | |||
SPACE_GROUP_NUMBER=96 | |||
OUTPUT_FILE=temp.ahkl | |||
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL | |||
INPUT_FILE=../e1_373-592/XDS_ASCII.HKL | |||
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL | |||
INPUT_FILE=../e2_371-591/XDS_ASCII.HKL | |||
and running [[xscale]], we obtain in XSCALE.LP: | |||
CORRELATIONS BETWEEN INPUT DATA SETS AFTER CORRECTIONS | |||
DATA SETS NUMBER OF COMMON CORRELATION RATIO OF COMMON B-FACTOR | |||
#i #j REFLECTIONS BETWEEN i,j INTENSITIES (i/j) BETWEEN i,j | |||
1 2 15943 0.978 1.0002 0.0106 | |||
1 3 22366 1.000 1.0012 -0.0008 | |||
2 3 15801 0.977 0.9983 0.0557 | |||
1 4 15648 0.979 0.9988 0.0541 | |||
2 4 14862 0.999 1.0024 -0.0007 | |||
3 4 15524 0.978 0.9999 -0.0015 | |||
which means that e1_1-372 correlates well (1.000) with e2_1-369, and e1_373-592 well (0.999) with e2_371-591, but the crosswise correlations are consistently lower (0.978, 0.977, 0.979, 0.978). The adjustment to the error model confirms this: | |||
a b ISa ISa0 INPUT DATA SET | |||
6.112E+00 1.429E-03 10.70 22.37 ../e1_1-372/XDS_ASCII.HKL | |||
1.074E+01 1.825E-03 7.14 23.79 ../e1_373-592/XDS_ASCII.HKL | |||
5.707E+00 1.621E-03 10.40 22.82 ../e2_1-369/XDS_ASCII.HKL | |||
8.547E+00 1.796E-03 8.07 24.17 ../e2_371-591/XDS_ASCII.HKL | |||
telling us that "if we merge these datasets together, their error estimates have to be increased a lot". However, if we switch to | |||
UNIT_CELL_CONSTANTS=103.316 103.316 131.456 90.000 90.000 90.000 | |||
SPACE_GROUP_NUMBER=96 | |||
OUTPUT_FILE=firstparts.ahkl | |||
INPUT_FILE=../e1_1-372/XDS_ASCII.HKL | |||
INPUT_FILE=../e2_1-369/XDS_ASCII.HKL | |||
OUTPUT_FILE=secondparts.ahkl | |||
INPUT_FILE=../e1_373-592/XDS_ASCII.HKL | |||
INPUT_FILE=../e2_371-591/XDS_ASCII.HKL | |||
we obtain | |||
a b ISa ISa0 INPUT DATA SET | |||
6.120E+00 3.673E-04 21.09 22.37 ../e1_1-372/XDS_ASCII.HKL | |||
5.713E+00 3.819E-04 21.41 22.82 ../e2_1-369/XDS_ASCII.HKL | |||
5.639E+00 3.151E-04 23.72 23.79 ../e1_373-592/XDS_ASCII.HKL | |||
5.289E+00 3.258E-04 24.09 24.17 ../e2_371-591/XDS_ASCII.HKL | |||
proving that the second parts of datasets E1 and E2 should be treated separately from the first parts. | |||
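The ISa values in both tables are reproduced by ISa = 1/sqrt(a·b), the asymptotic I/σ implied by the error-model parameters a and b. The short check below is a sketch that recomputes ISa from the a and b values quoted above; the function name <code>isa</code> and the list layout are chosen for this example only.

```python
import math


def isa(a, b):
    """Asymptotic I/sigma from the error-model parameters a and b."""
    return 1.0 / math.sqrt(a * b)


# (run, dataset, a, b, ISa as printed in XSCALE.LP), copied from the tables above.
ERROR_MODEL = [
    ("all four together", "e1_1-372",   6.112e+00, 1.429e-03, 10.70),
    ("all four together", "e1_373-592", 1.074e+01, 1.825e-03,  7.14),
    ("all four together", "e2_1-369",   5.707e+00, 1.621e-03, 10.40),
    ("all four together", "e2_371-591", 8.547e+00, 1.796e-03,  8.07),
    ("parts separated",   "e1_1-372",   6.120e+00, 3.673e-04, 21.09),
    ("parts separated",   "e2_1-369",   5.713e+00, 3.819e-04, 21.41),
    ("parts separated",   "e1_373-592", 5.639e+00, 3.151e-04, 23.72),
    ("parts separated",   "e2_371-591", 5.289e+00, 3.258e-04, 24.09),
]

for run, name, a, b, reported in ERROR_MODEL:
    print(f"{run:18s} {name:11s} computed {isa(a, b):6.2f}  reported {reported:6.2f}")
```

When the first and second parts are scaled separately, a changes little for the first parts but b drops by roughly a factor of four to five for all four datasets, which is what brings ISa back close to ISa0.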
Upon inspection of the cell parameters, we find that the cell axes of the second "halves" are shorter by a factor of 0.9908 when compared with the first parts. This suggests that they were collected at a longer wavelength! But then the wavelength values in the headers are most likely completely wrong: we can speculate that the two first parts were collected at the SeMet peak wavelength, and the two second parts at the inflection wavelength. | |||
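The size of the implied wavelength change follows from Bragg's law, λ = 2 d sin θ: for a given measured angle, the refined spacings (and hence the cell axes) scale with the wavelength assumed during processing divided by the true one. The sketch below only does this arithmetic. It assumes that the same wavelength was assumed for both parts, that the true cell did not change, and that the detector distance and geometry were the same; the function name is made up for this example.

```python
def implied_wavelength_ratio(cell_ratio):
    """Ratio (true wavelength of part 2) / (true wavelength of part 1).

    cell_ratio is (refined cell axis of part 2) / (refined cell axis of part 1).
    Assumes identical assumed wavelength, true cell and geometry for both parts.
    """
    return 1.0 / cell_ratio


ratio = implied_wavelength_ratio(0.9908)
print(f"second parts: wavelength longer by a factor of {ratio:.4f}")
```

Under these assumptions a cell-axis ratio of 0.9908 corresponds to a wavelength that is longer by about 0.9 %.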
Although |