2VB1: Difference between revisions
This reports the processing of triclinic hen egg-white lysozyme data at 0.65 Å resolution (PDB id [http://www.rcsb.org/pdb/explore/explore.do?structureId=2VB1 2VB1]). The data (sweeps a to h, each comprising 60 to 360 frames of 72 MB) were collected by Zbigniew Dauter at APS 19-ID and are available from [http://bl831.als.lbl.gov/example_data_sets/APS/19-ID/2vb1/ here]. Details of data collection, processing and refinement are [http://journals.iucr.org/d/issues/2007/12/00/be5097/index.html published].
== XDS processing ==
# use [[generate_XDS.INP]] to obtain a good starting point
# edit [[XDS.INP]] and change/add the following:
ORGX=3130 ORGY=3040 ! for ADSC, header values are subject to interpretation; these values from visual inspection
! the following is for masking the beamstop shadow in sweeps c-d
UNTRUSTED_RECTANGLE=0 3189 2960 3087 ! use XDS-viewer or ADXV to find the values
! the following is for sweeps e-h
UNTRUSTED_RECTANGLE=1 3160 3000 3070
TRUSTED_REGION=0 1.5 ! we want the whole detector area
ROTATION_AXIS=-1 0 0 ! at this beamline the spindle goes backwards!
SILICON=34.812736 ! account for theta-dependent absorption in the CCD's phosphor. The correction is only
! significant for hi-res data; 34.812736=32*(value for silicon as printed to CORRECT.LP if SILICON= not given)
MAXIMUM_NUMBER_OF_PROCESSORS=4 ! for fast processing on a machine with many cores (e.g. for 16 cores)
MAXIMUM_NUMBER_OF_JOBS=6 ! "overcommit" the available cores but on the whole this produces results faster
SPACE_GROUP_NUMBER=1 ! this is known
UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! from 2vb1
FRIEDEL'S_LAW=TRUE ! we're not concerned with the anomalous signal
Then, run "xds_par". It completes after about 5 minutes on a fast machine, and we may inspect (at least) IDXREF.LP and CORRECT.LP (see below), and use "XDS-viewer FRAME.cbf" to get a visual impression of the integration as it applies to the last frame.
By inspecting IDXREF.LP, one should make sure that everything works as it should, i.e. that a large percentage of the spots was actually indexed, e.g.:
...
63879 OUT OF 72321 SPOTS INDEXED.
...
***** DIFFRACTION PARAMETERS USED AT START OF INTEGRATION *****
REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM 63879 INDEXED SPOTS
REFINED PARAMETERS: DISTANCE BEAM AXIS CELL ORIENTATION
STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.53
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.12
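The indexing check can also be scripted. The following is a minimal Python sketch, not part of XDS: it assumes that IDXREF.LP contains a line of the form "N OUT OF M SPOTS INDEXED." as quoted above, and the function name indexed_fraction is made up for illustration.

```python
import re

def indexed_fraction(idxref_text):
    """Return the fraction of spots that IDXREF reports as indexed.

    Assumes a line of the form 'N OUT OF M SPOTS INDEXED.' is present,
    as in the IDXREF.LP excerpt quoted in the text.
    """
    match = re.search(r"(\d+)\s+OUT OF\s+(\d+)\s+SPOTS INDEXED", idxref_text)
    if match is None:
        raise ValueError("no 'SPOTS INDEXED' line found")
    indexed, total = int(match.group(1)), int(match.group(2))
    return indexed / total

# Illustrative input; in practice one would read the text of IDXREF.LP instead.
sample = "   63879 OUT OF   72321 SPOTS INDEXED.\n"
print(f"{indexed_fraction(sample):.1%} of the spots were indexed")
```

For the numbers shown here, roughly 88% of the spots are indexed.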
=== Optimization ===
The main target of optimization is the asymptotic (i.e. best) I/sigma (ISa) (Diederichs (2010) [http://dx.doi.org/10.1107/S0907444910014836 Acta Cryst. D 66, 733-40]) as printed out by CORRECT (and XSCALE). A higher ISa should mean better data.
However, ISa also rises if more reflections are thrown out as outliers ("misfits"), so merely reducing WFAC1 does not count as optimization. Please note that the default WFAC1 is 1; this should result in the rejection of about 1% of observations. If you feel that 1% is too much, increase WFAC1 to, say, 1.5; that should result in rejection of less than (say) 0.1%. This will slightly increase completeness, but will reduce I/sigma and ISa, and increase R-factors.
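To see how ISa relates to the error-model parameters a and b that CORRECT prints next to it, here is a small Python sketch. It assumes the error model sigma^2 = a*(sigma0^2 + b*I^2), for which I/sigma approaches 1/sqrt(a*b) for very strong reflections; the function name isa is illustrative only.

```python
import math

def isa(a, b):
    """Asymptotic I/sigma for the error model sigma^2 = a*(sigma0^2 + b*I^2).

    For very strong reflections the b*I^2 term dominates, so
    I/sigma tends to 1/sqrt(a*b).
    """
    return 1.0 / math.sqrt(a * b)

# a and b as printed by CORRECT in the first pass for sweep e (see below)
print(f"ISa = {isa(6.630e+00, 1.091e-04):.2f}")
```

With the a and b values of the first pass for sweep e, this reproduces the ISa printed by CORRECT to within the rounding of a and b.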
The following changes may be tested for their influence on ISa:
* copying GXPARM.XDS to XPARM.XDS
* including the information from the first integration pass in XDS.INP - just do "grep _E INTEGRATE.LP|tail -2" and get e.g.
BEAM_DIVERGENCE= 0.386 BEAM_DIVERGENCE_E.S.D.= 0.039
REFLECTING_RANGE= 0.669 REFLECTING_RANGE_E.S.D.= 0.096
and copy these two lines into XDS.INP
* preventing refinement in INTEGRATE: REFINE(INTEGRATE)= !
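The "grep _E INTEGRATE.LP|tail -2" step can be mimicked in Python, e.g. when the optimization is to be scripted over several sweeps. This is only a sketch: the function name is made up, and the sample text is an illustrative stand-in for a real INTEGRATE.LP, which contains one such pair of lines per batch of frames.

```python
def last_profile_parameters(integrate_lp_text):
    """Mimic 'grep _E INTEGRATE.LP | tail -2'.

    Returns the last two lines containing '_E', which are expected to be
    the final BEAM_DIVERGENCE= and REFLECTING_RANGE= estimates.
    """
    hits = [line.strip() for line in integrate_lp_text.splitlines() if "_E" in line]
    return hits[-2:]

# Illustrative stand-in for INTEGRATE.LP (two batches)
sample = """\
 BEAM_DIVERGENCE=   0.386  BEAM_DIVERGENCE_E.S.D.=   0.039
 REFLECTING_RANGE=  0.669  REFLECTING_RANGE_E.S.D.=  0.096
 some other output
 BEAM_DIVERGENCE=   0.428  BEAM_DIVERGENCE_E.S.D.=   0.043
 REFLECTING_RANGE=  0.880  REFLECTING_RANGE_E.S.D.=  0.126
"""

# Lines to paste into XDS.INP for the next pass
for line in last_profile_parameters(sample):
    print(line)
print("REFINE(INTEGRATE)= ! no refinement in INTEGRATE")
```

The printed lines then go into XDS.INP together with JOB= INTEGRATE CORRECT, after GXPARM.XDS has been copied to XPARM.XDS.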
== Example: sweep e ==
=== [[XDS.INP]]; as generated by [[generate_XDS.INP]] ===
generate_XDS.INP "../../APS/19-ID/2vb1/p1lyso_e.0???.img"
Then include the changes detailed above, resulting in:
<pre>
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
MAXIMUM_NUMBER_OF_PROCESSORS=4
MAXIMUM_NUMBER_OF_JOBS=6
ORGX= 3130 ORGY= 3040 ! check these values with adxv !
UNTRUSTED_RECTANGLE=1 3160 3000 3070 ! <xmin xmax ymin ymax> to mask shadow of beamstop; XDS-viewer to find out
DETECTOR_DISTANCE= 99.9954
OSCILLATION_RANGE= 0.500
X-RAY_WAVELENGTH= 0.6525486
NAME_TEMPLATE_OF_DATA_FRAMES=../../APS/19-ID/2vb1/p1lyso_e.0???.img
! REFERENCE_DATA_SET=xxx/XDS_ASCII.HKL ! e.g. to ensure consistent indexing
DATA_RANGE=1 360
SPOT_RANGE=1 180
! BACKGROUND_RANGE=1 10 ! rather use defaults (first 5 degree of rotation)
SPACE_GROUP_NUMBER=1 ! 0 if unknown
UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! PDB 2vb1
INCLUDE_RESOLUTION_RANGE=50 0 ! after CORRECT, insert high resol limit; re-run CORRECT
!FRIEDEL'S_LAW=FALSE ! This acts only on the CORRECT step
! If the anom signal turns out to be, or is known to be, very low or absent,
! use FRIEDEL'S_LAW=TRUE instead (or comment out the line); re-run CORRECT
! remove the "!" in the following line:
! STRICT_ABSORPTION_CORRECTION=TRUE
! if the anomalous signal is strong: in that case, in CORRECT.LP the three
! "CHI^2-VALUE OF FIT OF CORRECTION FACTORS" values are significantly> 1, e.g. 1.5
!
! exclude (mask) untrusted areas of detector, e.g. beamstop shadow :
! UNTRUSTED_RECTANGLE= 1800 1950 2100 2150 ! x-min x-max y-min y-max ! repeat
! UNTRUSTED_ELLIPSE= 2034 2070 1850 2240 ! x-min x-max y-min y-max ! if needed
!
! parameters with changes wrt default values:
TRUSTED_REGION=0.00 1.5 ! partially use corners of detectors; 1.41421=full use
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=7000. 30000. ! often 8000 is ok
MINIMUM_ZETA=0.05 ! integrate close to the Lorentz zone; 0.15 is default
STRONG_PIXEL=6 ! COLSPOT: only use strong reflections (default is 3)
MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3 ! default of 6 is sometimes too high
REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE
! parameters specifically for this detector and beamline:
DETECTOR= ADSC MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65000
SENSOR_THICKNESS=0.01 SILICON=34.812736
NX= 6144 NY= 6144 QX= 0.051294 QY= 0.051294 ! to make CORRECT happy if frames are unavailable
DIRECTION_OF_DETECTOR_X-AXIS=1 0 0
DIRECTION_OF_DETECTOR_Y-AXIS=0 1 0
INCIDENT_BEAM_DIRECTION=0 0 1
ROTATION_AXIS=-1 0 0 ! at e.g. SERCAT ID-22 this needs to be -1 0 0
FRACTION_OF_POLARIZATION=0.98 ! better value is provided by beamline staff!
POLARIZATION_PLANE_NORMAL=0 1 0
</pre>
=== [[CORRECT.LP]] 1st pass ===
STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.87 | |||
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.10 | |||
CRYSTAL MOSAICITY (DEGREES) 0.126 | |||
... | |||
a b ISa | |||
6.630E+00 1.091E-04 37.18 | |||
... | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
1.77 9195 4841 9501 51.0% 1.5% 1.5% 8708 48.74 2.1% 1.6% 0% 0.000 0 | |||
1.26 29991 15327 16721 91.7% 1.5% 1.6% 29328 45.26 2.1% 1.7% 0% 0.000 0 | |||
1.03 38643 19731 21636 91.2% 1.7% 1.7% 37824 38.67 2.5% 2.1% 0% 0.000 0 | |||
0.89 46156 23404 25561 91.6% 2.3% 2.4% 45504 27.56 3.3% 3.4% 0% 0.000 0 | |||
0.80 51509 26034 28868 90.2% 4.0% 4.0% 50950 17.55 5.6% 7.0% 0% 0.000 0 | |||
0.73 55989 28253 32034 88.2% 7.0% 6.8% 55472 10.98 9.8% 13.2% 0% 0.000 0 | |||
0.68 59733 30115 34776 86.6% 13.1% 13.0% 59236 6.08 18.6% 26.0% 0% 0.000 0 | |||
0.63 35385 18436 37367 49.3% 25.6% 26.9% 33898 2.99 36.3% 52.1% 0% 0.000 0 | |||
0.60 8991 4972 39725 12.5% 51.2% 56.9% 8038 1.34 72.4% 105.0% 0% 0.000 0 | |||
total 335592 171113 246189 69.5% 2.3% 2.4% 328958 19.58 3.3% 7.4% 0% 0.000 0 | |||
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 343716 | |||
NUMBER OF REJECTED MISFITS 8112 | |||
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 | |||
NUMBER OF ACCEPTED OBSERVATIONS 335604 | |||
NUMBER OF UNIQUE ACCEPTED REFLECTIONS 171119 | |||
The number of "misfits" (rejections) is higher than the expected 1%. There are two remedies: either one treats the anomalous signal (of the 10 sulfurs) as significant, i.e. uses FRIEDEL'S_LAW=FALSE, or one simply increases WFAC1 from its default of 1 to (say) 1.2.
=== [[XDS.INP]]; optimized ===
Using the output of "grep _E INTEGRATE.LP|tail -2", edit XDS.INP to have
JOB= INTEGRATE CORRECT
BEAM_DIVERGENCE= 0.428 BEAM_DIVERGENCE_E.S.D.= 0.043
REFLECTING_RANGE= 0.880 REFLECTING_RANGE_E.S.D.= 0.126
...
REFINE(INTEGRATE)= !
Then "cp GXPARM.XDS XPARM.XDS", and run another round of "xds_par". Five minutes later, we get:
=== [[CORRECT.LP]] optimization pass ===
This looks a little bit better: lower standard deviations, higher ISa, better R-factors, fewer misfits:
STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.83 | |||
STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.08 | |||
CRYSTAL MOSAICITY (DEGREES) 0.096 | |||
a b ISa | |||
6.439E+00 1.076E-04 37.98 | |||
... | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
1.77 9149 4817 9501 50.7% 1.5% 1.5% 8664 49.75 2.1% 1.5% 0% 0.000 0 | |||
1.26 30049 15348 16723 91.8% 1.5% 1.6% 29402 46.26 2.1% 1.6% 0% 0.000 0 | |||
1.03 38920 19863 21637 91.8% 1.7% 1.7% 38114 39.61 2.4% 2.0% 0% 0.000 0 | |||
0.89 46381 23508 25562 92.0% 2.2% 2.3% 45746 28.39 3.1% 3.2% 0% 0.000 0 | |||
0.80 51605 26071 28868 90.3% 3.8% 3.8% 51068 18.21 5.3% 6.5% 0% 0.000 0 | |||
0.73 56126 28314 32041 88.4% 6.6% 6.4% 55624 11.45 9.3% 12.3% 0% 0.000 0 | |||
0.68 59735 30093 34771 86.5% 12.6% 12.3% 59284 6.34 17.8% 24.8% 0% 0.000 0 | |||
0.63 35754 18620 37370 49.8% 24.1% 25.5% 34268 3.11 34.1% 48.9% 0% 0.000 0 | |||
0.60 9180 5075 39730 12.8% 48.6% 54.3% 8210 1.40 68.7% 100.5% 0% 0.000 0 | |||
total 336899 171709 246203 69.7% 2.2% 2.3% 330380 20.14 3.2% 6.9% 0% 0.000 0 | |||
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 344751 | |||
NUMBER OF REJECTED MISFITS 7842 | |||
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 | |||
NUMBER OF ACCEPTED OBSERVATIONS 336909 | |||
NUMBER OF UNIQUE ACCEPTED REFLECTIONS 171714 | |||
=== further optimization ===
Another round of optimization again improves the R-factors and I/sigma at high resolution a bit, but it also increases the number of misfits back to 8200. At this point I decided to switch to FRIEDEL'S_LAW=FALSE, and the resulting table is:
NOTE: Friedel pairs are treated as different reflections. | |||
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
1.77 9599 9023 19002 47.5% 1.5% 1.5% 1152 36.81 2.1% 1.6% 0% 0.000 0 | |||
1.26 31196 28239 33446 84.4% 1.4% 1.6% 5914 34.40 2.0% 1.6% 0% 0.000 0 | |||
1.03 40125 35205 43274 81.4% 1.7% 1.7% 9840 30.09 2.4% 2.0% 0% 0.000 0 | |||
0.89 46987 40188 51124 78.6% 2.3% 2.3% 13598 22.03 3.2% 3.4% 0% 0.000 0 | |||
0.80 52229 43723 57738 75.7% 3.9% 3.9% 17012 14.44 5.5% 6.6% 0% 0.000 0 | |||
0.73 56830 46674 64088 72.8% 7.1% 6.8% 20312 9.30 10.1% 13.2% 0% 0.000 0 | |||
0.68 60488 48814 69544 70.2% 13.9% 13.5% 23348 5.26 19.6% 27.1% 0% 0.000 0 | |||
0.63 36190 28598 74736 38.3% 28.2% 29.7% 15184 2.70 39.8% 57.3% 0% 0.000 0 | |||
0.60 9246 7246 79466 9.1% 57.8% 62.4% 4000 1.26 81.8% 122.0% 0% 0.000 0 | |||
total 342890 287710 492418 58.4% 2.8% 2.8% 110360 16.19 3.9% 9.9% 0% 0.000 0 | |||
NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 345355 | |||
NUMBER OF REJECTED MISFITS 2448 | |||
NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 | |||
NUMBER OF ACCEPTED OBSERVATIONS 342907 | |||
NUMBER OF UNIQUE ACCEPTED REFLECTIONS 287724 | |||
Indeed, this brings the number of misfits to well below 1%, which makes some sense if the anomalous signal is significant.
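The misfit percentages behind these statements can be worked out from the numbers printed by CORRECT. The short Python sketch below assumes that the relevant denominator is "NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES"; the function name is illustrative.

```python
def misfit_percentage(rejected, in_subset):
    """Percentage of observations rejected as misfits by CORRECT."""
    return 100.0 * rejected / in_subset

# (NUMBER OF REJECTED MISFITS, NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES)
passes = {
    "1st pass": (8112, 343716),
    "optimization pass": (7842, 344751),
    "FRIEDEL'S_LAW=FALSE": (2448, 345355),
}

for label, (rejected, total) in passes.items():
    print(f"{label}: {misfit_percentage(rejected, total):.2f}% misfits")
```

Under this assumption, the first two passes reject somewhat more than 2% of the observations, whereas FRIEDEL'S_LAW=FALSE gives about 0.7%.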
== XSCALE results ==
The same strategy as shown for sweep e was used for sweeps a-d and f-h. XSCALE.INP is:
SPACE_GROUP_NUMBER= 1
UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! from 2vb1 PDB entry
! cellparm for a-h gives 27.083 31.269 33.773 87.978 107.998 112.133
OUTPUT_FILE=lys-xds.ahkl
FRIEDEL'S_LAW=TRUE
RESOLUTION_SHELLS=2.91 2.06 1.68 1.45 1.30 1.19 1.10 1.03 0.97 0.92 0.88 0.84 0.81 0.78 0.75 0.73 0.71 0.69 0.67 0.65
INPUT_FILE=../a/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../b/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../c/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../d/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../e/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../f/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../g/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
INPUT_FILE=../h/XDS_ASCII.HKL
INCLUDE_RESOLUTION_RANGE=30 0.65
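Since the INPUT_FILE blocks are repetitive, XSCALE.INP can be generated rather than typed. The following Python sketch assumes the directory layout used here (one directory per sweep, each holding XDS_ASCII.HKL); the function name build_xscale_inp is made up for illustration.

```python
def build_xscale_inp(sweeps, high_res, shells, output_file="lys-xds.ahkl"):
    """Assemble XSCALE.INP text with one INPUT_FILE block per sweep directory."""
    lines = [
        "SPACE_GROUP_NUMBER= 1",
        "UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11",
        f"OUTPUT_FILE={output_file}",
        "FRIEDEL'S_LAW=TRUE",
        "RESOLUTION_SHELLS=" + " ".join(f"{s:.2f}" for s in shells),
    ]
    for sweep in sweeps:
        # each sweep was processed in its own directory ../a ... ../h
        lines.append(f"INPUT_FILE=../{sweep}/XDS_ASCII.HKL")
        lines.append(f"INCLUDE_RESOLUTION_RANGE=30 {high_res}")
    return "\n".join(lines) + "\n"

shells = [2.91, 2.06, 1.68, 1.45, 1.30, 1.19, 1.10, 1.03, 0.97, 0.92,
          0.88, 0.84, 0.81, 0.78, 0.75, 0.73, 0.71, 0.69, 0.67, 0.65]
text = build_xscale_inp("abcdefgh", 0.65, shells)
print(text)
```

Writing the returned text to XSCALE.INP gives the same keywords as listed above, without the comments.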
=== XSCALE.LP tables ===
The error model is adjusted by XSCALE:
a b ISa ISa0 INPUT DATA SET
7.094E+00 1.294E-04 33.00 38.03 ../a/XDS_ASCII.HKL
7.476E+00 1.170E-04 33.81 38.95 ../b/XDS_ASCII.HKL
7.453E+00 1.598E-04 28.98 38.00 ../c/XDS_ASCII.HKL
6.539E+00 1.640E-04 30.54 39.08 ../d/XDS_ASCII.HKL
7.304E+00 1.342E-04 31.94 37.69 ../e/XDS_ASCII.HKL
8.201E+00 1.574E-04 27.83 35.58 ../f/XDS_ASCII.HKL
8.182E+00 1.759E-04 26.36 27.60 ../g/XDS_ASCII.HKL
7.717E+00 3.694E-04 18.73 21.93 ../h/XDS_ASCII.HKL
and there are about 1500 rejected reflections. It is reassuring to note that the error model works well; the ISa goes down toward sweep h, probably because the crystal degrades. But see also the ''a posteriori'' remarks below: sweep h is the one that is most affected by a shadow on the detector.
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION | |||
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano | |||
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr | |||
2.91 16170 2112 2147 98.4% 2.2% 2.4% 16157 78.96 2.5% 1.1% -12% 0.741 2023 | |||
2.06 40349 3831 3856 99.4% 2.4% 2.7% 40345 84.89 2.6% 0.9% -9% 0.764 3803 | |||
1.68 65329 5068 5087 99.6% 3.1% 3.2% 65321 83.77 3.3% 1.0% 0% 0.847 5020 | |||
1.45 73373 6147 6163 99.7% 3.2% 3.5% 73371 78.02 3.4% 1.0% 2% 0.842 6053 | |||
1.30 71196 6651 6657 99.9% 3.2% 3.5% 71196 71.07 3.4% 1.1% 4% 0.857 6503 | |||
1.19 74542 7287 7298 99.8% 3.2% 3.4% 74534 67.06 3.3% 1.2% 5% 0.854 7060 | |||
1.10 84918 8269 8278 99.9% 3.4% 3.7% 84891 63.24 3.6% 1.3% 7% 0.853 7988 | |||
1.03 87890 8584 8603 99.8% 4.1% 4.4% 87855 56.26 4.4% 1.5% 5% 0.818 8231 | |||
0.97 92917 9460 9465 99.9% 5.2% 5.6% 92894 48.90 5.5% 1.7% 4% 0.795 9010 | |||
0.92 83994 9911 9927 99.8% 5.7% 6.3% 83969 41.67 6.0% 2.0% 6% 0.787 9358 | |||
0.88 74100 9620 9621 100.0% 6.3% 7.1% 74082 35.74 6.7% 2.5% 4% 0.772 9040 | |||
0.84 81322 11511 11518 99.9% 6.9% 7.7% 81300 30.43 7.3% 3.3% 1% 0.760 10609 | |||
0.81 67539 10239 10247 99.9% 7.1% 7.7% 67518 25.96 7.7% 4.2% 2% 0.779 9364 | |||
0.78 73980 11807 11817 99.9% 7.1% 7.3% 73951 22.34 7.7% 5.3% 2% 0.799 10699 | |||
0.75 86111 13831 13839 99.9% 8.4% 8.6% 86076 18.77 9.2% 6.8% 2% 0.809 12496 | |||
0.73 64554 10481 10488 99.9% 10.3% 10.4% 64525 15.73 11.3% 8.2% 3% 0.815 9384 | |||
0.71 71891 11727 11741 99.9% 12.8% 13.0% 71844 12.95 14.0% 10.6% 3% 0.810 10436 | |||
0.69 80168 13157 13163 100.0% 16.6% 16.9% 80065 10.16 18.2% 14.1% 2% 0.799 11662 | |||
0.67 84431 14747 14766 99.9% 22.2% 22.7% 84231 7.44 24.4% 19.7% 3% 0.798 12520 | |||
0.65 61031 15592 16551 94.2% 27.6% 30.6% 60165 4.36 31.8% 33.1% 1% 0.723 9005 | |||
total 1435805 190032 191232 99.4% 3.1% 3.3% 1434290 33.42 3.3% 3.1% 3% 0.801 170264 | |||
If two more resolution shells are added, they look like this:
0.64 23276 7411 9155 81.0% 35.0% 40.6% 22324 2.90 41.7% 47.9% 3% 0.683 3204
0.63 18044 6488 9647 67.3% 42.2% 49.7% 16630 2.22 50.7% 60.9% -5% 0.643 2437
So there is still useful signal beyond 0.65 Å.
== Some ''a posteriori'' remarks ==
* For sweeps e-h one should use TRUSTED_REGION= 0 1.2 since that already gives 0.626 Å in the corners.
* The first and last frames of sweeps g and h show a shadow in one corner of the detector. I did nothing to exclude this shadow from processing (but one should do so, at least if the resolution is to be extended beyond 0.65 Å, which the XSCALE statistics suggest to be possible). <br> One could experiment with MINIMUM_VALID_PIXEL_VALUE= 40 (or so) instead of 1 - I'd probably try that, but of course one does not want to exclude valid pixels, so the result has to be checked carefully. <br> In any case, there is no general facility in XDS to exclude bad areas of ''specific'' frames in a dataset; one needs to chop the dataset into parts and deal with each shadow separately.
== Comparison of data processing: published (2006) ''vs'' XDS results ==
<table border = "1">
<tr><b>
<td> </td>
<td> resolution (highest resolution range) </td>
<td> observations </td>
<td> unique reflections </td>
<td> Multiplicity </td>
<td> Completeness (%) </td>
<td> R merge (%) </td>
<td> mean I/sigma </td>
</b></tr>
<tr><b>
<td> published (2006) </td>
<td> 30-0.65Å (0.67-0.65Å) </td>
<td> 1331953 (12764) </td>
<td> 187165 (6353) </td>
<td> 7.1 (2.7) </td>
<td> 97.6 (67.3) </td>
<td> 4.5 (18.4) </td>
<td> 36.2 (4.2) </td>
</b></tr>
<tr><b>
<td> XDS Version Dec 06, 2010 </td>
<td> 30-0.65Å (0.67-0.65Å) </td>
<td> 1435805 (61031) </td>
<td> 190032 (15592) </td>
<td> 7.5 (3.9) </td>
<td> 99.4 (94.2) </td>
<td> 3.1 (27.6) </td>
<td> 33.4 (4.4) </td>
</b></tr>
</table>
== Availability of data from XDS processing ==
I changed XSCALE.INP to have
!FRIEDEL'S_LAW=TRUE ! by commenting it out XSCALE will use FRIEDEL'S_LAW=FALSE
! since this is how the data were processed
RESOLUTION_SHELLS=2.91 2.06 1.68 1.45 1.30 1.19 1.10 1.03 0.97 0.92 0.88 0.84 0.80 0.76 0.73 0.70 0.67 0.65 0.64 0.63
and ran XSCALE again, to get a file with reflections to 0.63 Å.
Conversion to other program systems is performed with XDSCONV. XDSCONV.INP for producing an MTZ file with intensities and anomalous signal is:
INPUT_FILE= lys-xds.ahkl
OUTPUT_FILE=temp.hkl CCP4_I
After running xdsconv, I cut-and-pasted the commands from its screen output:
f2mtz HKLOUT temp.mtz<F2MTZ.INP
cad HKLIN1 temp.mtz HKLOUT output_file_name.mtz<<EOF
LABIN FILE 1 ALL
END
EOF
and obtained output_file_name.mtz, which I moved to [https://{{SERVERNAME}}/pub/xds-datared/2vb1/xds-hewl-I.mtz xds-hewl-I.mtz]. SFCHECK statistics for this file are [https://{{SERVERNAME}}/pub/xds-datared/2vb1/sfcheck_XXXX.pdf here].
Similarly, using OUTPUT_FILE=temp.hkl CCP4, I obtained a file with amplitudes, [https://{{SERVERNAME}}/pub/xds-datared/2vb1/xds-hewl-F.mtz xds-hewl-F.mtz].
Latest revision as of 14:13, 24 March 2020
This reports processing of triclinic hen egg-white lysozyme data @ 0.65Å resolution (PDB id 2VB1). Data (sweeps a to h, each comprising 60 to 360 frames of 72MB) were collected by Zbigniew Dauter at APS 19-ID and are available from here. Details of data collection, processing and refinement are published.
XDS processing
- use generate_XDS.INP to obtain a good starting point
- edit XDS.INP and change/add the following:
ORGX=3130 ORGY=3040 ! for ADSC, header values are subject to interpretation; these values from visual inspection ! the following is for masking the beamstop shadow in sweeps c-d UNTRUSTED_RECTANGLE=0 3189 2960 3087 ! use XDS-viewer of ADXV to find the values ! the following is for sweeps e-h UNTRUSTED_RECTANGLE=1 3160 3000 3070 TRUSTED_REGION=0 1.5 ! we want the whole detector area ROTATION_AXIS=-1 0 0 ! at this beamline the spindle goes backwards! SILICON=34.812736 ! account for theta-dependant absorption in the CCD's phosphor. The correction is only ! significant for hi-res data; 34.812736=32*(value for silicon as printed to CORRECT.LP if SILICON= not given) MAXIMUM_NUMBER_OF_PROCESSORS=4 ! for fast processing on a machine with many cores (e.g. for 16 cores) MAXIMUM_NUMBER_OF_JOBS=6 ! "overcommit" the available cores but on the whole this produces results faster SPACE_GROUP_NUMBER=1 ! this is known UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! from 2vb1 FRIEDEL'S_LAW=TRUE ! we're not concerned with the anomalous signal
Then, run "xds_par". It completes after about 5 minutes on a fast machine, and we may inspect (at least) IDXREF.LP and CORRECT.LP (see below), and use "XDS-viewer FRAME.cbf" to get a visual impression of the integration as it applies to the last frame. By inspecting IDXREF.LP, one should make sure that everything works as it should, i.e. that a large percentage of reflections was actually indexed nicely, e.g.:
... 63879 OUT OF 72321 SPOTS INDEXED. ... ***** DIFFRACTION PARAMETERS USED AT START OF INTEGRATION ***** REFINED VALUES OF DIFFRACTION PARAMETERS DERIVED FROM 63879 INDEXED SPOTS REFINED PARAMETERS: DISTANCE BEAM AXIS CELL ORIENTATION STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.53 STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.12
Optimization
The main target of optimization is the asymptotic (i.e. best) I/sigma (ISa) (Diederichs (2010) Acta Cryst. D 66, 733-40) as printed out by CORRECT (and XSCALE). A higher ISa should mean better data.
However: ISa also rises if more reflections are thrown out as outliers ("misfits") so it is not considered to be optimization if just WFAC1 is reduced. Please note that the default WFAC1 is 1; this should result in the rejection of about 1% of observations. If you feel that 1% is too much then just increase WFAC1, to, say, 1.5 - that should result in rejection of less than (say) 0.1%. This will slightly increase completeness, but will reduce I/sigma and ISa, and increase R-factors.
The following quantities may be tested for their influence on ISa:
- copying GXPARM.XDS to XPARM.XDS
- including the information from the first integration pass into XDS.INP - just do "grep _E INTEGRATE.LP|tail -2" and get e.g.
BEAM_DIVERGENCE= 0.386 BEAM_DIVERGENCE_E.S.D.= 0.039 REFLECTING_RANGE= 0.669 REFLECTING_RANGE_E.S.D.= 0.096
copy these two lines into XDS.INP
- prevent refinement in INTEGRATE: REFINE(INTEGRATE)= !
Example: sweep e
XDS.INP; as generated by generate_XDS.INP
generate_XDS.INP "../../APS/19-ID/2vb1/p1lyso_e.0???.img"
Then include the changes detailed above, resulting in:
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT MAXIMUM_NUMBER_OF_PROCESSORS=4 MAXIMUM_NUMBER_OF_JOBS=6 ORGX= 3130 ORGY= 3040 ! check these values with adxv ! UNTRUSTED_RECTANGLE=1 3160 3000 3070 ! <xmin xmax ymin ymax> to mask shadow of beamstop; XDS-viewer to find out DETECTOR_DISTANCE= 99.9954 OSCILLATION_RANGE= 0.500 X-RAY_WAVELENGTH= 0.6525486 NAME_TEMPLATE_OF_DATA_FRAMES=../../APS/19-ID/2vb1/p1lyso_e.0???.img ! REFERENCE_DATA_SET=xxx/XDS_ASCII.HKL ! e.g. to ensure consistent indexing DATA_RANGE=1 360 SPOT_RANGE=1 180 ! BACKGROUND_RANGE=1 10 ! rather use defaults (first 5 degree of rotation) SPACE_GROUP_NUMBER=1 ! 0 if unknown UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! PDB 2vb1 INCLUDE_RESOLUTION_RANGE=50 0 ! after CORRECT, insert high resol limit; re-run CORRECT !FRIEDEL'S_LAW=FALSE ! This acts only on the CORRECT step ! If the anom signal turns out to be, or is known to be, very low or absent, ! use FRIEDEL'S_LAW=TRUE instead (or comment out the line); re-run CORRECT ! remove the "!" in the following line: ! STRICT_ABSORPTION_CORRECTION=TRUE ! if the anomalous signal is strong: in that case, in CORRECT.LP the three ! "CHI^2-VALUE OF FIT OF CORRECTION FACTORS" values are significantly> 1, e.g. 1.5 ! ! exclude (mask) untrusted areas of detector, e.g. beamstop shadow : ! UNTRUSTED_RECTANGLE= 1800 1950 2100 2150 ! x-min x-max y-min y-max ! repeat ! UNTRUSTED_ELLIPSE= 2034 2070 1850 2240 ! x-min x-max y-min y-max ! if needed ! ! parameters with changes wrt default values: TRUSTED_REGION=0.00 1.5 ! partially use corners of detectors; 1.41421=full use VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=7000. 30000. ! often 8000 is ok MINIMUM_ZETA=0.05 ! integrate close to the Lorentz zone; 0.15 is default STRONG_PIXEL=6 ! COLSPOT: only use strong reflections (default is 3) MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3 ! default of 6 is sometimes too high REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE ! 
parameters specifically for this detector and beamline: DETECTOR= ADSC MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65000 SENSOR_THICKNESS=0.01 SILICON=34.812736 NX= 6144 NY= 6144 QX= 0.051294 QY= 0.051294 ! to make CORRECT happy if frames are unavailable DIRECTION_OF_DETECTOR_X-AXIS=1 0 0 DIRECTION_OF_DETECTOR_Y-AXIS=0 1 0 INCIDENT_BEAM_DIRECTION=0 0 1 ROTATION_AXIS=-1 0 0 ! at e.g. SERCAT ID-22 this needs to be -1 0 0 FRACTION_OF_POLARIZATION=0.98 ! better value is provided by beamline staff! POLARIZATION_PLANE_NORMAL=0 1 0
CORRECT.LP 1st pass
STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.87 STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.10 CRYSTAL MOSAICITY (DEGREES) 0.126 ... a b ISa 6.630E+00 1.091E-04 37.18 ... SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr 1.77 9195 4841 9501 51.0% 1.5% 1.5% 8708 48.74 2.1% 1.6% 0% 0.000 0 1.26 29991 15327 16721 91.7% 1.5% 1.6% 29328 45.26 2.1% 1.7% 0% 0.000 0 1.03 38643 19731 21636 91.2% 1.7% 1.7% 37824 38.67 2.5% 2.1% 0% 0.000 0 0.89 46156 23404 25561 91.6% 2.3% 2.4% 45504 27.56 3.3% 3.4% 0% 0.000 0 0.80 51509 26034 28868 90.2% 4.0% 4.0% 50950 17.55 5.6% 7.0% 0% 0.000 0 0.73 55989 28253 32034 88.2% 7.0% 6.8% 55472 10.98 9.8% 13.2% 0% 0.000 0 0.68 59733 30115 34776 86.6% 13.1% 13.0% 59236 6.08 18.6% 26.0% 0% 0.000 0 0.63 35385 18436 37367 49.3% 25.6% 26.9% 33898 2.99 36.3% 52.1% 0% 0.000 0 0.60 8991 4972 39725 12.5% 51.2% 56.9% 8038 1.34 72.4% 105.0% 0% 0.000 0 total 335592 171113 246189 69.5% 2.3% 2.4% 328958 19.58 3.3% 7.4% 0% 0.000 0 NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 343716 NUMBER OF REJECTED MISFITS 8112 NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 NUMBER OF ACCEPTED OBSERVATIONS 335604 NUMBER OF UNIQUE ACCEPTED REFLECTIONS 171119
The number of "misfits" (rejections) is higher than expected (1 %). Either one considers the anomalous signal (of the 6 sulfurs) to be significant, or one simply increases WFAC1 from its default of 1, to (say) 1.2 .
XDS.INP; optimized
Using the output of "grep _E INTEGRATE.LP|tail -2" edit XDS.INP to have
JOB= INTEGRATE CORRECT BEAM_DIVERGENCE= 0.428 BEAM_DIVERGENCE_E.S.D.= 0.043 REFLECTING_RANGE= 0.880 REFLECTING_RANGE_E.S.D.= 0.126 ... REFINE(INTEGRATE)= !
Then "cp GXPARM.XDS XPARM.XDS", and then another round of "xds_par". Five minutes later, we get:
CORRECT.LP optimization pass
This looks a little bit better - less standard deviation, higher ISa, better R-factors, less misfits:
STANDARD DEVIATION OF SPOT POSITION (PIXELS) 0.83 STANDARD DEVIATION OF SPINDLE POSITION (DEGREES) 0.08 CRYSTAL MOSAICITY (DEGREES) 0.096 a b ISa 6.439E+00 1.076E-04 37.98 ... SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr 1.77 9149 4817 9501 50.7% 1.5% 1.5% 8664 49.75 2.1% 1.5% 0% 0.000 0 1.26 30049 15348 16723 91.8% 1.5% 1.6% 29402 46.26 2.1% 1.6% 0% 0.000 0 1.03 38920 19863 21637 91.8% 1.7% 1.7% 38114 39.61 2.4% 2.0% 0% 0.000 0 0.89 46381 23508 25562 92.0% 2.2% 2.3% 45746 28.39 3.1% 3.2% 0% 0.000 0 0.80 51605 26071 28868 90.3% 3.8% 3.8% 51068 18.21 5.3% 6.5% 0% 0.000 0 0.73 56126 28314 32041 88.4% 6.6% 6.4% 55624 11.45 9.3% 12.3% 0% 0.000 0 0.68 59735 30093 34771 86.5% 12.6% 12.3% 59284 6.34 17.8% 24.8% 0% 0.000 0 0.63 35754 18620 37370 49.8% 24.1% 25.5% 34268 3.11 34.1% 48.9% 0% 0.000 0 0.60 9180 5075 39730 12.8% 48.6% 54.3% 8210 1.40 68.7% 100.5% 0% 0.000 0 total 336899 171709 246203 69.7% 2.2% 2.3% 330380 20.14 3.2% 6.9% 0% 0.000 0 NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 344751 NUMBER OF REJECTED MISFITS 7842 NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 NUMBER OF ACCEPTED OBSERVATIONS 336909 NUMBER OF UNIQUE ACCEPTED REFLECTIONS 171714
further optimization
Another round of optimization again improves the R-factors and I/sigma at high resolution a bit, but it also increased the misfits back to 8200. At this point I decided to switch to FRIEDEL'S_LAW=FALSE, and the resulting table is:
NOTE: Friedel pairs are treated as different reflections. SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas Rmrgd-F Anomal SigAno Nano LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr 1.77 9599 9023 19002 47.5% 1.5% 1.5% 1152 36.81 2.1% 1.6% 0% 0.000 0 1.26 31196 28239 33446 84.4% 1.4% 1.6% 5914 34.40 2.0% 1.6% 0% 0.000 0 1.03 40125 35205 43274 81.4% 1.7% 1.7% 9840 30.09 2.4% 2.0% 0% 0.000 0 0.89 46987 40188 51124 78.6% 2.3% 2.3% 13598 22.03 3.2% 3.4% 0% 0.000 0 0.80 52229 43723 57738 75.7% 3.9% 3.9% 17012 14.44 5.5% 6.6% 0% 0.000 0 0.73 56830 46674 64088 72.8% 7.1% 6.8% 20312 9.30 10.1% 13.2% 0% 0.000 0 0.68 60488 48814 69544 70.2% 13.9% 13.5% 23348 5.26 19.6% 27.1% 0% 0.000 0 0.63 36190 28598 74736 38.3% 28.2% 29.7% 15184 2.70 39.8% 57.3% 0% 0.000 0 0.60 9246 7246 79466 9.1% 57.8% 62.4% 4000 1.26 81.8% 122.0% 0% 0.000 0 total 342890 287710 492418 58.4% 2.8% 2.8% 110360 16.19 3.9% 9.9% 0% 0.000 0 NUMBER OF REFLECTIONS IN SELECTED SUBSET OF IMAGES 345355 NUMBER OF REJECTED MISFITS 2448 NUMBER OF SYSTEMATIC ABSENT REFLECTIONS 0 NUMBER OF ACCEPTED OBSERVATIONS 342907 NUMBER OF UNIQUE ACCEPTED REFLECTIONS 287724
Indeed this brings the number of misfits to well below 1%, and it does make some sense.
XSCALE results
The same strategy as shown for sweep e was used for sweeps a-d and f-h. XSCALE.INP is:
SPACE_GROUP_NUMBER= 1 UNIT_CELL_CONSTANTS= 27.07 31.25 33.76 87.98 108.00 112.11 ! from 2vb1 PDB entry ! cellparm for a-h gives 27.083 31.269 33.773 87.978 107.998 112.133
OUTPUT_FILE=lys-xds.ahkl FRIEDEL'S_LAW=TRUE RESOLUTION_SHELLS=2.91 2.06 1.68 1.45 1.30 1.19 1.10 1.03 0.97 0.92 0.88 0.84 0.81 0.78 0.75 0.73 0.71 0.69 0.67 0.65 INPUT_FILE=../a/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../b/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../c/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../d/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../e/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../f/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../g/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65 INPUT_FILE=../h/XDS_ASCII.HKL INCLUDE_RESOLUTION_RANGE=30 0.65
XSCALE.LP tables
The error model is adjusted by XSCALE:
      a          b        ISa    ISa0   INPUT DATA SET
  7.094E+00  1.294E-04   33.00   38.03  ../a/XDS_ASCII.HKL
  7.476E+00  1.170E-04   33.81   38.95  ../b/XDS_ASCII.HKL
  7.453E+00  1.598E-04   28.98   38.00  ../c/XDS_ASCII.HKL
  6.539E+00  1.640E-04   30.54   39.08  ../d/XDS_ASCII.HKL
  7.304E+00  1.342E-04   31.94   37.69  ../e/XDS_ASCII.HKL
  8.201E+00  1.574E-04   27.83   35.58  ../f/XDS_ASCII.HKL
  8.182E+00  1.759E-04   26.36   27.60  ../g/XDS_ASCII.HKL
  7.717E+00  3.694E-04   18.73   21.93  ../h/XDS_ASCII.HKL
and there are about 1500 rejected reflections. It is reassuring to note that the error model works well; the ISa goes down toward sweep h, probably because the crystal degrades. But see also the "a posteriori remarks" below: sweep h is the one that is most affected by a shadow on the detector.
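The ISa column in the error-model table is reproduced by 1/sqrt(a*b), which is, as far as I understand it, how XDS derives ISa from the two error-model parameters. The sketch below only re-does that arithmetic with the a and b values printed by XSCALE; the names are illustrative.

```python
import math


def isa(a: float, b: float) -> float:
    """Asymptotic I/sigma implied by the error-model parameters a and b."""
    return 1.0 / math.sqrt(a * b)


# (a, b, ISa as printed by XSCALE) for sweeps a to h, from the table above.
ERROR_MODEL = {
    "a": (7.094, 1.294e-4, 33.00),
    "b": (7.476, 1.170e-4, 33.81),
    "c": (7.453, 1.598e-4, 28.98),
    "d": (6.539, 1.640e-4, 30.54),
    "e": (7.304, 1.342e-4, 31.94),
    "f": (8.201, 1.574e-4, 27.83),
    "g": (8.182, 1.759e-4, 26.36),
    "h": (7.717, 3.694e-4, 18.73),
}

for sweep, (a, b, printed) in ERROR_MODEL.items():
    print(f"sweep {sweep}: computed ISa {isa(a, b):6.2f}, printed {printed:6.2f}")
```

Small differences in the last digit are expected, because a and b are printed with only four significant figures.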
 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

     2.91       16170    2112      2147       98.4%       2.2%      2.4%    16157   78.96     2.5%     1.1%   -12%   0.741    2023
     2.06       40349    3831      3856       99.4%       2.4%      2.7%    40345   84.89     2.6%     0.9%    -9%   0.764    3803
     1.68       65329    5068      5087       99.6%       3.1%      3.2%    65321   83.77     3.3%     1.0%     0%   0.847    5020
     1.45       73373    6147      6163       99.7%       3.2%      3.5%    73371   78.02     3.4%     1.0%     2%   0.842    6053
     1.30       71196    6651      6657       99.9%       3.2%      3.5%    71196   71.07     3.4%     1.1%     4%   0.857    6503
     1.19       74542    7287      7298       99.8%       3.2%      3.4%    74534   67.06     3.3%     1.2%     5%   0.854    7060
     1.10       84918    8269      8278       99.9%       3.4%      3.7%    84891   63.24     3.6%     1.3%     7%   0.853    7988
     1.03       87890    8584      8603       99.8%       4.1%      4.4%    87855   56.26     4.4%     1.5%     5%   0.818    8231
     0.97       92917    9460      9465       99.9%       5.2%      5.6%    92894   48.90     5.5%     1.7%     4%   0.795    9010
     0.92       83994    9911      9927       99.8%       5.7%      6.3%    83969   41.67     6.0%     2.0%     6%   0.787    9358
     0.88       74100    9620      9621      100.0%       6.3%      7.1%    74082   35.74     6.7%     2.5%     4%   0.772    9040
     0.84       81322   11511     11518       99.9%       6.9%      7.7%    81300   30.43     7.3%     3.3%     1%   0.760   10609
     0.81       67539   10239     10247       99.9%       7.1%      7.7%    67518   25.96     7.7%     4.2%     2%   0.779    9364
     0.78       73980   11807     11817       99.9%       7.1%      7.3%    73951   22.34     7.7%     5.3%     2%   0.799   10699
     0.75       86111   13831     13839       99.9%       8.4%      8.6%    86076   18.77     9.2%     6.8%     2%   0.809   12496
     0.73       64554   10481     10488       99.9%      10.3%     10.4%    64525   15.73    11.3%     8.2%     3%   0.815    9384
     0.71       71891   11727     11741       99.9%      12.8%     13.0%    71844   12.95    14.0%    10.6%     3%   0.810   10436
     0.69       80168   13157     13163      100.0%      16.6%     16.9%    80065   10.16    18.2%    14.1%     2%   0.799   11662
     0.67       84431   14747     14766       99.9%      22.2%     22.7%    84231    7.44    24.4%    19.7%     3%   0.798   12520
     0.65       61031   15592     16551       94.2%      27.6%     30.6%    60165    4.36    31.8%    33.1%     1%   0.723    9005
    total     1435805  190032    191232       99.4%       3.1%      3.3%  1434290   33.42     3.3%     3.1%     3%   0.801  170264
If two more resolution shells are added, they look like this:
     0.64       23276    7411      9155       81.0%      35.0%     40.6%    22324    2.90    41.7%    47.9%     3%   0.683    3204
     0.63       18044    6488      9647       67.3%      42.2%     49.7%    16630    2.22    50.7%    60.9%    -5%   0.643    2437
So there is still useful signal beyond 0.65 A.
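To make "useful signal" concrete, one can list the outermost shells together with their mean I/sigma and completeness and apply a cutoff. The sketch below uses the values from the XSCALE tables above; the cutoff of mean I/sigma >= 2 is only an assumed rule of thumb for illustration, not a criterion stated on this page, and the names are hypothetical.

```python
# (resolution limit in A, mean I/sigma, completeness in %) for the three
# outermost shells, copied from the XSCALE statistics above.
OUTER_SHELLS = [
    (0.65, 4.36, 94.2),
    (0.64, 2.90, 81.0),
    (0.63, 2.22, 67.3),
]


def shells_with_signal(shells, min_i_over_sigma=2.0):
    """Return the resolution limits whose mean I/sigma reaches the cutoff.

    The default cutoff of 2.0 is an assumption made for this example.
    """
    return [d for d, i_over_sigma, _ in shells if i_over_sigma >= min_i_over_sigma]


for d, i_over_sigma, completeness in OUTER_SHELLS:
    print(f"{d:.2f} A: I/sigma = {i_over_sigma:.2f}, completeness = {completeness:.1f}%")

print("shells passing the assumed cutoff:", shells_with_signal(OUTER_SHELLS))
```

Note that completeness drops quickly beyond 0.65 A, so the additional shells add signal but cover only part of reciprocal space.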
Some a posteriori remarks
- For sweeps e-h one should use TRUSTED_REGION= 0 1.2 since that already gives 0.626 A in the corners.
- The first and last frames of sweeps g and h show a shadow in one corner of the detector. I did nothing to exclude this shadow from processing (but one should do so, at least if the resolution is to be extended beyond 0.65 A, which the XSCALE statistics suggest is possible).
One could experiment with MINIMUM_VALID_PIXEL_VALUE= 40 (or so) instead of 1 - I'd probably try that, but of course one does not want to exclude valid pixels so the result has to be carefully checked.
Anyway, there is no general facility in XDS to exclude bad areas of specific frames in a dataset; one needs to chop the dataset into parts and deal with each shadow separately.
Comparison of data processing: published (2006) vs XDS results
| resolution (highest resolution range) | observations | unique reflections | Multiplicity | Completeness (%) | R merge (%) | mean I/sigma |
published (2006) | 30-0.65Å (0.67-0.65Å) | 1331953 (12764) | 187165 (6353) | 7.1 (2.7) | 97.6 (67.3) | 4.5 (18.4) | 36.2 (4.2) |
XDS Version Dec 06, 2010 | 30-0.65Å (0.67-0.65Å) | 1435805 (61031) | 190032 (15592) | 7.5 (3.9) | 99.4 (94.2) | 3.1 (27.6) | 33.4 (4.4) |
Availability of data from XDS processing
I changed XSCALE.INP to have
 !FRIEDEL'S_LAW=TRUE ! by commenting it out XSCALE will use FRIEDEL'S_LAW=FALSE
                     ! since this is how the data were processed
 RESOLUTION_SHELLS=2.91 2.06 1.68 1.45 1.30 1.19 1.10 1.03 0.97 0.92 0.88 0.84 0.80 0.76 0.73 0.70 0.67 0.65 0.64 0.63
and ran XSCALE again, to get a file with reflections to 0.63 A.
Conversion to other program systems is performed with XDSCONV. XDSCONV.INP for producing an MTZ file with intensities and anomalous signal is:
 INPUT_FILE= lys-xds.ahkl
 OUTPUT_FILE=temp.hkl CCP4_I
After running xdsconv, I cut-and-paste the screen output:
 f2mtz HKLOUT temp.mtz<F2MTZ.INP
 cad HKLIN1 temp.mtz HKLOUT output_file_name.mtz<<EOF
 LABIN FILE 1 ALL
 END
 EOF
and obtain output_file_name.mtz which I mv to xds-hewl-I.mtz. SFCHECK statistics for this file are here.
Similarly, using OUTPUT_FILE=temp.hkl CCP4 I obtained a file with amplitudes, xds-hewl-F.mtz.
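The two XDSCONV.INP variants differ only in the format keyword after the output file name. As a minimal sketch, the helper below (hypothetical, and deliberately limited to the two formats used on this page) produces either variant; running xdsconv, f2mtz and cad is still done as described above.

```python
def build_xdsconv_inp(fmt="CCP4_I", input_file="lys-xds.ahkl",
                      output_file="temp.hkl"):
    """Return XDSCONV.INP text for intensities (CCP4_I) or amplitudes (CCP4)."""
    # Only the two output formats used in this article are accepted here.
    if fmt not in ("CCP4_I", "CCP4"):
        raise ValueError("this sketch only handles CCP4_I and CCP4")
    return f"INPUT_FILE= {input_file}\nOUTPUT_FILE={output_file} {fmt}\n"


intensities_inp = build_xdsconv_inp("CCP4_I")  # leads to xds-hewl-I.mtz
amplitudes_inp = build_xdsconv_inp("CCP4")     # leads to xds-hewl-F.mtz
print(intensities_inp)
print(amplitudes_inp)
```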