2QVO.xds: Difference between revisions




Revision as of 11:30, 14 March 2011

XDS data reduction

dataset 1

Using "generate_XDS.INP ../../APS/22-ID/2qvo/ACA10_AF1382_1.0???" we obtain:

JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
ORGX= 1996.00 ORGY= 2028.00  ! check these values with adxv !
DETECTOR_DISTANCE= 125.000
OSCILLATION_RANGE= 1.000
X-RAY_WAVELENGTH= 1.90000
NAME_TEMPLATE_OF_DATA_FRAMES=../../APS/22-ID/2qvo/ACA10_AF1382_1.0???
! REFERENCE_DATA_SET=xxx/XDS_ASCII.HKL ! e.g. to ensure consistent indexing
DATA_RANGE=1 360
SPOT_RANGE=1 180
! BACKGROUND_RANGE=1 10 ! rather use defaults (first 5 degree of rotation)

SPACE_GROUP_NUMBER=0                   ! 0 if unknown
UNIT_CELL_CONSTANTS= 70 80 90 90 90 90 ! put correct values if known
INCLUDE_RESOLUTION_RANGE=50 0  ! after CORRECT, insert high resol limit; re-run CORRECT

FRIEDEL'S_LAW=FALSE     ! This acts only on the CORRECT step
! If the anom signal turns out to be, or is known to be, very low or absent,
! use FRIEDEL'S_LAW=TRUE instead (or comment out the line); re-run CORRECT

! remove the "!" in the following line:
! STRICT_ABSORPTION_CORRECTION=TRUE
! if the anomalous signal is strong: in that case, in CORRECT.LP the three
! "CHI^2-VALUE OF FIT OF CORRECTION FACTORS" values are significantly> 1, e.g. 1.5
!
! exclude (mask) untrusted areas of detector, e.g. beamstop shadow :
! UNTRUSTED_RECTANGLE= 1800 1950 2100 2150 ! x-min x-max y-min y-max ! repeat
! UNTRUSTED_ELLIPSE= 2034 2070 1850 2240 ! x-min x-max y-min y-max ! if needed
!
! parameters with changes wrt default values:
TRUSTED_REGION=0.00 1.2  ! partially use corners of detectors; 1.41421=full use
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS=7000. 30000. ! often 8000 is ok
MINIMUM_ZETA=0.05        ! integrate close to the Lorentz zone; 0.15 is default
STRONG_PIXEL=6           ! COLSPOT: only use strong reflections (default is 3)
MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=3 ! default of 6 is sometimes too high
REFINE(INTEGRATE)=CELL BEAM ORIENTATION ! AXIS DISTANCE

! parameters specifically for this detector and beamline:
DETECTOR= CCDCHESS MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65500
NX= 4096 NY= 4096 QX= .0732420000 QY= .0732420000 ! to make CORRECT happy if frames are unavailable
DIRECTION_OF_DETECTOR_X-AXIS=1 0 0
DIRECTION_OF_DETECTOR_Y-AXIS=0 1 0
INCIDENT_BEAM_DIRECTION=0 0 1
ROTATION_AXIS=1 0 0    ! at e.g. SERCAT ID-22 this needs to be -1 0 0
FRACTION_OF_POLARIZATION=0.98  ! better value is provided by beamline staff!
POLARIZATION_PLANE_NORMAL=0 1 0

Now we run xds_par. This runs to completion. We should at least inspect, using XDS-Viewer, the file FRAME.cbf since this shows us the last frame of the dataset, with boxes superimposed which correspond to the expected locations of reflections.

The automatic spacegroup determination (CORRECT.LP) comes up with

 LATTICE-  BRAVAIS-   QUALITY  UNIT CELL CONSTANTS (ANGSTROEM & DEGREES)    REINDEXING TRANSFORMATION
CHARACTER  LATTICE     OF FIT      a      b      c   alpha  beta gamma

*  44        aP          0.0      41.2   53.5   53.5  90.3  90.1  90.1   -1  0  0  0  0  1  0  0  0  0 -1  0
*  31        aP          0.8      41.2   53.5   53.5  89.7  90.1  89.9    1  0  0  0  0  1  0  0  0  0  1  0
*  25        mC          1.4      75.4   75.8   41.2  90.0  90.1  90.0    0  1 -1  0  0 -1 -1  0 -1  0  0  0
*  35        mP          1.8      53.5   41.2   53.5  90.1  90.3  90.1    0 -1  0  0  1  0  0  0  0  0  1  0
*  23        oC          3.1      75.4   75.8   41.2  90.0  90.1  90.0    0  1 -1  0  0 -1 -1  0 -1  0  0  0
*  20        mC          3.9      75.8   75.4   41.2  90.1  90.0  90.0    0  1  1  0  0  1 -1  0 -1  0  0  0
*  34        mP          5.1      41.2   53.5   53.5  90.3  90.1  90.1    1  0  0  0  0  0  1  0  0 -1  0  0
*  33        mP          5.3      41.2   53.5   53.5  90.3  90.1  90.1   -1  0  0  0  0  1  0  0  0  0 -1  0
*  32        oP          6.1      41.2   53.5   53.5  90.3  90.1  90.1   -1  0  0  0  0  1  0  0  0  0 -1  0
*  21        tP          7.3      53.5   53.5   41.2  90.1  90.1  90.3    0  1  0  0  0  0 -1  0 -1  0  0  0
   39        mC        249.8     114.5   41.2   53.5  90.1  90.3  69.0    1 -2  0  0  1  0  0  0  0  0  1  0

and further down lists

SPACE-GROUP         UNIT CELL CONSTANTS            UNIQUE   Rmeas  COMPARED  LATTICE-
  NUMBER      a      b      c   alpha beta gamma                            CHARACTER

      5      75.8   75.4   41.2  90.0  90.0  90.0     900    40.8     5882    20 mC
  *  75      53.5   53.5   41.2  90.0  90.0  90.0     469     8.4     6313    21 tP
     89      53.5   53.5   41.2  90.0  90.0  90.0     282    39.2     6500    21 tP
     21      75.4   75.8   41.2  90.0  90.0  90.0     506    39.8     6276    23 oC
      5      75.4   75.8   41.2  90.0  90.1  90.0     901    40.7     5881    25 mC
      1      41.2   53.5   53.5  89.7  90.1  89.9    1699     8.2     5083    31 aP
     16      41.2   53.5   53.5  90.0  90.0  90.0     521    39.8     6261    32 oP
      3      53.5   41.2   53.5  90.0  90.3  90.0     931     8.2     5851    35 mP
      3      41.2   53.5   53.5  90.0  90.1  90.0     918    40.7     5864    33 mP
      3      41.2   53.5   53.5  90.0  90.1  90.0     918    40.9     5864    34 mP
      1      41.2   53.5   53.5  90.3  90.1  90.1    1699     8.2     5083    44 aP

thus suggesting spacegroup #75, but we should bear in mind that this determination does not take screw axes into account. Therefore we run "pointless xdsin XDS_ASCII.HKL", which tells us that this is actually spacegroup P4_2 (#77). Alternatively, we could have inspected the list further down in CORRECT.LP:

  REFLECTIONS OF TYPE H,0,0  0,K,0  0,0,L OR EXPECTED TO BE ABSENT (*)
  --------------------------------------------------------------------

  H    K    L  RESOLUTION  INTENSITY     SIGMA    INTENSITY/SIGMA  #OBSERVED

   0    0    1    41.248   0.8487E+01  0.1339E+01         6.34           4 
   0    0    3    13.749  -0.7977E-03  0.3786E+01         0.00           4 
   0    0    4    10.312   0.1305E+06  0.4660E+04        27.99           1 
   0    0    5     8.250   0.1318E+01  0.6316E+01         0.21           4 
   0    0    6     6.875   0.2939E+05  0.5284E+03        55.61           4 
   0    0    7     5.893   0.5439E+01  0.9235E+01         0.59           4 
   0    0    8     5.156   0.1298E+05  0.2371E+03        54.73           4 
   0    0    9     4.583   0.3308E+02  0.1327E+02         2.49           4 
   0    0   10     4.125   0.3809E+05  0.6867E+03        55.47           4 
   0    0   11     3.750  -0.1987E+02  0.1767E+02        -1.12           4 
   0    0   12     3.437   0.5539E+04  0.1097E+03        50.48           4 
   0    0   13     3.173   0.2144E+01  0.2071E+02         0.10           4 
   0    0   14     2.946   0.2717E+04  0.6252E+02        43.46           4 
   0    0   15     2.750   0.1350E+02  0.2482E+02         0.54           4 
   0    0   16     2.578   0.1178E+04  0.4383E+02        26.88           4 
   0    0   17     2.426  -0.7420E+01  0.3549E+02        -0.21           4 
   0    0   18     2.292   0.4104E+03  0.4681E+02         8.77           4 

and realize that this also tells us that the spacegroup is 77, not 75.
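
The odd/even pattern in this listing can also be checked mechanically. The following is a minimal sketch, written for a POSIX shell with awk rather than csh; it assumes the 0,0,L rows have been copied from CORRECT.LP into a file that is here called absences.txt (a hypothetical name), and it simply averages I/sigma separately for odd and even L. For a 4_2 screw axis, the 0,0,L reflections with odd L are expected to be absent.

```shell
#!/bin/sh
# Rows copied from the 0,0,L listing in CORRECT.LP
# (columns: H K L resolution intensity sigma I/sigma #observed).
# The file name absences.txt is an assumption made for this sketch.
cat > absences.txt <<'EOF'
   0    0    1    41.248   0.8487E+01  0.1339E+01         6.34           4
   0    0    3    13.749  -0.7977E-03  0.3786E+01         0.00           4
   0    0    4    10.312   0.1305E+06  0.4660E+04        27.99           1
   0    0    5     8.250   0.1318E+01  0.6316E+01         0.21           4
   0    0    6     6.875   0.2939E+05  0.5284E+03        55.61           4
   0    0    7     5.893   0.5439E+01  0.9235E+01         0.59           4
   0    0    8     5.156   0.1298E+05  0.2371E+03        54.73           4
   0    0    9     4.583   0.3308E+02  0.1327E+02         2.49           4
   0    0   10     4.125   0.3809E+05  0.6867E+03        55.47           4
   0    0   11     3.750  -0.1987E+02  0.1767E+02        -1.12           4
   0    0   12     3.437   0.5539E+04  0.1097E+03        50.48           4
   0    0   13     3.173   0.2144E+01  0.2071E+02         0.10           4
   0    0   14     2.946   0.2717E+04  0.6252E+02        43.46           4
   0    0   15     2.750   0.1350E+02  0.2482E+02         0.54           4
   0    0   16     2.578   0.1178E+04  0.4383E+02        26.88           4
   0    0   17     2.426  -0.7420E+01  0.3549E+02        -0.21           4
   0    0   18     2.292   0.4104E+03  0.4681E+02         8.77           4
EOF

# Average I/sigma (column 7) separately for odd and even L (column 3).
summary=$(awk '$1 == 0 && $2 == 0 {
    if ($3 % 2) { odd += $7; nodd++ } else { even += $7; neven++ }
}
END { printf "odd %.2f even %.2f", odd / nodd, even / neven }' absences.txt)

echo "$summary"
```

With the rows above, the mean I/sigma is about 1 for odd L and about 40 for even L. Reflections with L = 6, 10 and 14 are strong, which is consistent with a 4_2 axis (0,0,L present for even L) rather than 4_1 or 4_3 (present only for L = 4n).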

After this comes the table that tells us the quality of our data:

      NOTE:      Friedel pairs are treated as different reflections.

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno   Nano
  LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    6.06        4189     556       560       99.3%       2.4%      2.7%     4187   66.74     2.6%     1.1%    74%   1.841     247
    4.31        7575    1008      1008      100.0%       2.6%      2.9%     7575   62.90     2.8%     1.2%    62%   1.463     473
    3.53        9468    1283      1283      100.0%       3.4%      3.2%     9468   53.37     3.6%     1.7%    41%   1.200     612
    3.06       11364    1540      1540      100.0%       5.1%      4.7%    11364   34.45     5.5%     3.1%    17%   0.995     739
    2.74       12628    1695      1695      100.0%      10.2%     10.4%    12628   17.09    11.0%     7.9%     2%   0.797     819
    2.50       14121    1916      1916      100.0%      21.5%     23.1%    14121    8.42    23.1%    17.1%    -4%   0.691     926
    2.31       15155    2079      2079      100.0%      46.6%     50.5%    15155    3.92    50.2%    38.6%    -1%   0.734    1010
    2.16       12185    2104      2228       94.4%     113.3%    117.0%    12178    1.44   124.7%   119.0%     5%   0.753    1018
    2.04        5134    1601      2347       68.2%     274.7%    291.2%     4913    0.40   325.5%   400.7%     1%   0.608     606
   total       91819   13782     14656       94.0%       5.7%      5.9%    91589   20.24     6.2%    15.0%    12%   0.897    6450

So the anomalous signal extends to about 3.3 A (which is where the "Anomal Corr" column would cross 30%), and the useful resolution extends to about 2.16 A, I'd say. Please note that this table treats Friedel mates separately; merging them increases I/sigma by roughly another factor of 1.41 (the square root of 2).
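
The 3.3 A estimate can be reproduced by interpolating between the two resolution shells that bracket the 30% level. This is only a rough sketch under stated assumptions: the table rows are copied into a file called shells.txt (a hypothetical name), "Anomal Corr" is taken from column 12, and a linear interpolation in resolution between shell limits is used, which is a crude approximation rather than anything XDS itself does.

```shell
#!/bin/sh
# Resolution shells copied from the CORRECT.LP table (without the "total" line).
# The file name shells.txt is an assumption made for this sketch.
cat > shells.txt <<'EOF'
6.06  4189  556  560  99.3%   2.4%   2.7%  4187 66.74   2.6%   1.1% 74% 1.841  247
4.31  7575 1008 1008 100.0%   2.6%   2.9%  7575 62.90   2.8%   1.2% 62% 1.463  473
3.53  9468 1283 1283 100.0%   3.4%   3.2%  9468 53.37   3.6%   1.7% 41% 1.200  612
3.06 11364 1540 1540 100.0%   5.1%   4.7% 11364 34.45   5.5%   3.1% 17% 0.995  739
2.74 12628 1695 1695 100.0%  10.2%  10.4% 12628 17.09  11.0%   7.9%  2% 0.797  819
2.50 14121 1916 1916 100.0%  21.5%  23.1% 14121  8.42  23.1%  17.1% -4% 0.691  926
2.31 15155 2079 2079 100.0%  46.6%  50.5% 15155  3.92  50.2%  38.6% -1% 0.734 1010
2.16 12185 2104 2228  94.4% 113.3% 117.0% 12178  1.44 124.7% 119.0%  5% 0.753 1018
2.04  5134 1601 2347  68.2% 274.7% 291.2%  4913  0.40 325.5% 400.7%  1% 0.608  606
EOF

# Find the first pair of shells where Anomal Corr drops below the threshold
# and interpolate linearly between their resolution limits.
cutoff=$(awk -v thr=30 '
{
    res = $1 + 0
    cc = $12; sub(/%/, "", cc); cc += 0
    if (NR > 1 && !found && prev_cc >= thr && cc < thr) {
        printf "%.2f", prev_res - (prev_cc - thr) / (prev_cc - cc) * (prev_res - res)
        found = 1
    }
    prev_res = res; prev_cc = cc
}' shells.txt)

echo "anomalous signal to about $cutoff A"
```

The crossing lies between the 3.53 A shell (41%) and the 3.06 A shell (17%), which gives roughly 3.3 A.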

We could now modify XDS.INP to have

JOB=CORRECT  ! not XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
SPACE_GROUP_NUMBER=   77
UNIT_CELL_CONSTANTS=    53.10    53.10    40.90  90.000  90.000  90.000

and run xds again to obtain the final CORRECT.LP and XDS_ASCII.HKL in the correct spacegroup. For all practical purposes, however, the statistics in 75 and 77 are the same (the 8 reflections known to be extinct do not make much difference).
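
If one prefers not to edit XDS.INP by hand, the three changes can be applied with sed. This is a sketch only: it works on a shortened demonstration file (XDS.INP.demo, a hypothetical name) and writes the result to XDS.INP.new, so that a real XDS.INP is never overwritten; the keyword values are the ones given above.

```shell
#!/bin/sh
# Shortened stand-in for the real XDS.INP; the file names are assumptions.
cat > XDS.INP.demo <<'EOF'
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
SPACE_GROUP_NUMBER=0  ! 0 if unknown
UNIT_CELL_CONSTANTS= 70 80 90 90 90 90 ! put correct values if known
DATA_RANGE=1 360
EOF

# Replace the JOB, space group and cell lines; all other lines pass through.
sed -e 's/^JOB=.*/JOB=CORRECT/' \
    -e 's/^SPACE_GROUP_NUMBER=.*/SPACE_GROUP_NUMBER= 77/' \
    -e 's/^UNIT_CELL_CONSTANTS=.*/UNIT_CELL_CONSTANTS= 53.10 53.10 40.90 90.000 90.000 90.000/' \
    XDS.INP.demo > XDS.INP.new

cat XDS.INP.new
```

After checking XDS.INP.new, it would be copied over XDS.INP before re-running xds.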

Following this, we create XDSCONV.INP with the lines

SPACE_GROUP_NUMBER=   77  ! can leave out if CORRECT already ran in #77
UNIT_CELL_CONSTANTS=  53.10 53.10 40.90 90 90 90 ! same here
INPUT_FILE=XDS_ASCII.HKL
OUTPUT_FILE=temp.hkl CCP4

and run "xdsconv", and then

f2mtz HKLOUT temp.mtz<F2MTZ.INP
cad HKLIN1 temp.mtz HKLOUT output_file_name.mtz<<EOF
LABIN FILE 1 ALL
END
EOF

which gives us output_file_name.mtz, which we rename to xds-2ovo-1-F.mtz. Similarly, using

OUTPUT_FILE=temp.hkl CCP4_I

we end up with an MTZ file with intensities, which we rename to xds-2ovo-1-I.mtz.

dataset 2

This works exactly the same way as dataset 1.

SHELXC/D/E structure solution

This is done in a subdirectory of the XDS data reduction directory. Here, we generate XDSCONV.INP (I used MERGE=TRUE, sometimes the results are better that way) and run xdsconv and SHELXC:

#!/bin/csh -f
 
cat > XDSCONV.INP <<end
INPUT_FILE=../XDS_ASCII.HKL
OUTPUT_FILE=temp.hkl SHELX
MERGE=TRUE
FRIEDEL'S_LAW=FALSE
end
 
xdsconv 
 
shelxc j <<end
SAD   temp.hkl
CELL 53.10 53.10 40.90 90 90 90
SPAG P42
MAXM 2
end
 
This writes j.hkl, j_fa.hkl and j_fa.ins. However, we overwrite j_fa.ins now:
cat > j_fa.ins <<end
TITL j_fa.ins SAD in P42
CELL  0.98000   53.10   53.10   40.90   90.00   90.00   90.00
LATT  -1
SYMM -Y, X, 1/2+Z
SYMM -X, -Y, Z
SYMM Y, -X, 1/2+Z
SFAC S
UNIT   128
SHEL 999 3.0
FIND 3
NTRY 100
MIND -1.0 2.2
ESEL 1.3
TEST 0 99
SEED 1
PATS
HKLF 3
END
end
 
shelxd j_fa

This gives best CC All/Weak of 35.61 / 26.03 for dataset 2, and best CC All/Weak of 36.74 / 21.55 for dataset 1. 

Next we run G. Sheldrick's beta version of SHELXE (version 2009/4):

 shelxe.beta j j_fa -a6 -q -h -s0.55 -m20 -b 

Some important lines in the output: for dataset 2, I get
    79 residues left after pruning, divided into chains as follows:
 A:  20   B:  22   C:  37
 
 CC for partial structure against native data =  50.42 %
 ...
  <wt> = 0.300, Contrast = 0.731, Connect. = 0.817 for dens.mod. cycle 20
 ...
 Estimated mean FOM = 0.659   Pseudo-free CC = 68.71 %

for dataset 1, I get
    80 residues left after pruning, divided into chains as follows:
 A:  23   B:  57

 CC for partial structure against native data =  45.79 %
 ...
 <wt> = 0.300, Contrast = 0.711, Connect. = 0.812 for dens.mod. cycle 20
 ...
 Estimated mean FOM = 0.611   Pseudo-free CC = 63.70 %


clearly indicating that the structure can be solved with each of the two datasets individually.


For completeness, we run the inverse hand:

 shelxe.beta j j_fa -a6 -q -h -s0.55 -m20 -b -i

but of course this gives much worse statistics.

Optimization of data reduction

The only safe way to optimize the data reduction is to look at external quality indicators. Internal R-factors, and even the correlation coefficient of the anomalous signal, are of comparatively little value. A readily available external quality indicator is CC All/CC Weak as obtained by SHELXD. WFAC1 was already discussed above. Another candidate for optimization is MAXIMUM_ERROR_OF_SPOT_POSITION, which defaults to 3.0. For these data this default appears to be too small, because the STANDARD DEVIATION OF SPOT POSITION (PIXELS), as reported by IDXREF, INTEGRATE and CORRECT after refinement, is quite high (1.5 and more). This prevents XDS from using all the reflections for geometry refinement. I found that MAXIMUM_ERROR_OF_SPOT_POSITION=6.0 significantly improved the internal statistics (mostly the R-factors, but not so much the correlation coefficient of the anomalous signal), and improved the CC All/CC Weak indicators (to more than 40). SHELXE then produces significantly better and more complete models. Try for yourself!

One thing I noticed is that if I specify the known spacegroup in IDXREF, the results are worse than if the integration is performed in P1. Likewise, another attempted optimization did not work: recycling GXPARM.XDS as XPARM.XDS, and thus imposing the lattice symmetry in the geometry refinement in INTEGRATE. These findings may correspond to the fact that in P1 the angles do not refine to 90.0xx or 89.9xx degrees. In other words, the metric symmetry is not as well fulfilled as it should be. This results in fairly large deviations from the ideal P42 positions; the refinement of cell parameters in P1 partly compensates for this. I have, however, no idea why this deviation from metric symmetry could occur.
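
To try several values of MAXIMUM_ERROR_OF_SPOT_POSITION side by side, one could prepare one directory per trial value. The sketch below only generates the input files and does not run XDS. The directory names (maxerr_*), the demonstration file XDS.INP.base and the intermediate trial value 4.5 are assumptions made for illustration; only 3.0 (the default) and 6.0 come from the text above.

```shell
#!/bin/sh
# Shortened stand-in for the real XDS.INP; the file name is an assumption.
cat > XDS.INP.base <<'EOF'
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
DATA_RANGE=1 360
MAXIMUM_ERROR_OF_SPOT_POSITION=3.0
EOF

for v in 3.0 4.5 6.0; do
    dir="maxerr_$v"
    mkdir -p "$dir"
    # Drop any existing setting, then append the trial value.
    grep -v '^ *MAXIMUM_ERROR_OF_SPOT_POSITION=' XDS.INP.base > "$dir/XDS.INP"
    echo "MAXIMUM_ERROR_OF_SPOT_POSITION=$v" >> "$dir/XDS.INP"
    # Where XDS is installed, each trial would then be run with:
    #   (cd "$dir" && xds_par)
    # Note that a relative NAME_TEMPLATE_OF_DATA_FRAMES path must be
    # adjusted when XDS is run from a subdirectory.
done

ls maxerr_*/XDS.INP
```

Each run would then be followed by the XDSCONV, SHELXC and SHELXD steps described above, comparing CC All/CC Weak between the directories.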

Optimization of structure solution

There are some parameters in the SHELXC/D/E approach above that could be optimized as well. First of all, MERGE=TRUE in XDSCONV.INP later turned out to be the wrong choice (using the default MERGE=FALSE gives a model with 85 consecutive residues for dataset 1). Then, of course, the resolution limit for SHELXD could be varied, as could the solvent content for SHELXE. For SHELXE in particular, many things could be tried.

Limits

With dataset 2, I tried to use 270 frames but could not solve the structure using the above SHELXC/D/E approach (not even with MAXIMUM_ERROR_OF_SPOT_POSITION=6.0). With 315 frames, it was possible.