SSX
Round 1: processing the data, and determining the space group
Using
#!/bin/bash -f
# Process each of the 100 wedges in its own directory, in P1
for f in `seq 1 100`; do
  export OUT=wedge0`printf "%03d" $f`
  export NAMES="$PWD/Illuin/microfocus/xtal"`printf "%03d" $f`"_1_00\?.img"
  rm -rf $OUT
  mkdir $OUT
  cd $OUT
  generate_XDS.INP $NAMES
  # adapt the generated XDS.INP: spot range, space group, cell, trusted region, resolution
  sed -i s"/SPOT_RANGE=1 1/SPOT_RANGE=1 3/" XDS.INP
  sed -i s"/SPACE_GROUP_NUMBER=0/SPACE_GROUP_NUMBER=1/" XDS.INP
  sed -i s"/UNIT_CELL_CONSTANTS= 70 80 90/UNIT_CELL_CONSTANTS=38.3 79.1 79.1/" XDS.INP
  sed -i s"/TRUSTED_REGION=0.0 1.2/TRUSTED_REGION=0 1/" XDS.INP
  sed -i s"/INCLUDE_RESOLUTION_RANGE=50 0/INCLUDE_RESOLUTION_RANGE=99 1.8/" XDS.INP
  /usr/local/bin/xds_par
  cd ..
done
# Write the XSCALE.INP header, then append one INPUT_FILE= entry per XDS_ASCII.HKL
mkdir xscale
cd xscale
cat >XSCALE.INP <<eof
SPACE_GROUP_NUMBER= 1
UNIT_CELL_CONSTANTS= 38.3 79.1 79.1 90 90 90
OUTPUT_FILE=temp.ahkl
SAVE_CORRECTION_IMAGES=FALSE
FRIEDEL'S_LAW=TRUE
eof
find $PWD/../wedge* -name XDS_ASCII.HKL | awk '{print "INPUT_FILE=",$0;print "NBATCH=1 CORRECTIONS=DECAY"}' >> XSCALE.INP
we obtain in P1
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS   R-FACTOR  R-FACTOR COMPARED I/SIGMA    R-meas  CC(1/2)  Anomal   SigAno    Nano
     LIMIT    OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                        Corr

      8.03        3014     908       958       94.8%      44.5%     42.0%     2896    2.55     52.1%    65.0*      3     0.983     231
      5.68        5502    1679      1788       93.9%      46.8%     42.5%     5239    2.50     54.8%    50.3*      6     1.001     390
      4.64        6996    2164      2292       94.4%      47.5%     42.3%     6656    2.48     55.9%    68.4*      5     1.080     495
      4.01        8079    2580      2735       94.3%      48.7%     42.5%     7591    2.38     57.3%    50.0*      2     1.106     557
      3.59        9167    2904      3099       93.7%      52.1%     42.7%     8694    2.36     61.7%    43.6*     -6     1.017     599
      3.28       10276    3226      3397       95.0%      53.3%     43.3%     9728    2.35     62.8%    36.0*      1     1.104     708
      3.03       11040    3472      3687       94.2%      54.5%     44.3%    10500    2.17     64.2%    44.4*      2     1.044     728
      2.84       12022    3771      3977       94.8%      55.9%     47.2%    11424    1.97     65.8%    36.2*      3     0.999     835
      2.68       12705    3985      4227       94.3%      58.5%     51.0%    12065    1.78     68.8%    37.8*     -3     0.934     898
      2.54       13370    4252      4489       94.7%      59.5%     56.2%    12670    1.61     70.5%    30.1*      4     0.887     869
      2.42       14299    4505      4744       95.0%      62.4%     63.6%    13594    1.46     73.7%    30.2*     -2     0.824     979
      2.32       14835    4647      4915       94.5%      63.8%     70.0%    14083    1.35     75.1%    29.9*     -2     0.765    1041
      2.23       15599    4917      5181       94.9%      65.7%     72.6%    14809    1.31     77.5%    27.6*     -1     0.756    1075
      2.15       15888    4965      5272       94.2%      65.1%     78.6%    15117    1.28     76.9%    26.8*     -2     0.708    1115
      2.07       16872    5324      5601       95.1%      69.1%     88.1%    16035    1.14     81.6%    22.2*      3     0.687    1119
      2.01       16856    5349      5649       94.7%      73.4%     92.5%    15988    1.06     86.5%    19.7*     -3     0.673    1144
      1.95       17842    5666      5976       94.8%      76.7%    105.9%    16959    0.97     90.8%    20.7*     -8     0.606    1189
      1.89       18102    5767      6069       95.0%      84.4%    127.9%    17152    0.85     99.9%    15.1*     -1     0.590    1183
      1.84       18633    5933      6256       94.8%      92.8%    162.0%    17667    0.72    109.8%    17.6*      0     0.533    1236
      1.80       15519    5405      6479       83.4%     103.0%    194.1%    14280    0.58    122.7%    18.2*      1     0.503     940
     total      256616   81419     86791       93.8%      54.3%     51.3%   243147    1.43     64.0%    64.6*      0     0.788   17331
and feed this to pointless:
pointless xdsin temp.ahkl
Scores for each symmetry element

Nelmt  Lklhd  Z-cc    CC        N  Rmeas    Symmetry & operator (in Lattice Cell)

  1   0.854   5.41   0.54     801  0.706     identity
  2   0.842   4.62   0.46     785  0.819 **  2-fold l ( 0 0 1) {-h,-k,l}
  3   0.867   5.13   0.51     746  0.912 **  2-fold k ( 0 1 0) {-h,k,-l}
  4   0.837   5.64   0.56     735  0.807 **  2-fold h ( 1 0 0) {h,-k,-l}
  5   0.869   4.96   0.50     742  0.757 **  2-fold   ( 1-1 0) {-k,-h,-l}
  6   0.846   5.52   0.55     719  0.789 **  2-fold   ( 1 1 0) {k,h,-l}
  7   0.852   5.44   0.54    1325  1.146 **  4-fold l ( 0 0 1) {-k,h,l}{k,-h,l}
...
...
Best Solution: space group P 42 21 2
   Reindex operator: [k,l,h]
   Laue group probability: 0.989
   Systematic absence probability: 0.915
   Total probability: 0.905
   Space group confidence: 0.874
   Laue group confidence 0.986
   Unit cell: 79.10 79.10 38.30 90.00 90.00 90.00
   79.10 to 13.70 - Resolution range used for Laue group search
   79.10 to 1.80 - Resolution range in file, used for systematic absence check
   Number of batches in file: 3
The data do not appear to be twinned, from the L-test
$$ <!--SUMMARY_END-->
HKLIN spacegroup: P 1 primitive triclinic
$TEXT:Warning:$$ $$
The input crystal system is primitive triclinic (Cell: 38.30 79.10 79.10 90.00 90.00 90.00)
The crystal system chosen for output is primitive tetragonal (Cell: 79.10 79.10 38.30 90.00 90.00 90.00)
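When many pointless runs have to be compared, the suggested space group and reindexing operator can be pulled out of a saved log. The following is only a sketch: it assumes the log was saved to a file (for example with `pointless xdsin temp.ahkl > pointless.log`) and that the lines are labelled "Best Solution:" and "Reindex operator:" as in the output above. `best_solution` is a hypothetical helper name, not part of pointless.

```bash
# Print the space group and the reindex operator suggested by pointless.
# Assumption: the log contains lines labelled "Best Solution:" and
# "Reindex operator:" as shown in the output above.
best_solution() {
    # $1: saved pointless log file
    sed -n -e 's/^.*Best Solution: *space group *//p' \
           -e 's/^.*Reindex operator: *//p' "$1"
}

# Possible use (file name is an example):
#   pointless xdsin temp.ahkl > pointless.log
#   best_solution pointless.log
```

If a log contains several such lines, all of them are printed, so the result should still be checked by eye.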
Based on the P4(2)2(1)2 suggestion, we may try to modify the header of XSCALE.INP to
SPACE_GROUP_NUMBER= 94
UNIT_CELL_CONSTANTS= 79.1 79.1 38.3 90 90 90
OUTPUT_FILE=temp.ahkl
SAVE_CORRECTION_IMAGES=FALSE
FRIEDEL'S_LAW=TRUE
REIDX=0 1 0 0 0 0 1 0 1 0 0 0
where the last line takes care of the shuffling of axes into the order k,l,h, and obtain
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS   R-FACTOR  R-FACTOR COMPARED I/SIGMA    R-meas  CC(1/2)  Anomal   SigAno    Nano
     LIMIT    OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                        Corr

      8.03        2978     167       167      100.0%      53.6%     45.8%     2978    5.94     55.1%    99.2*     22     1.190      76
      5.68        5488     274       274      100.0%      54.0%     46.1%     5488    6.12     55.4%    97.0*     20     0.915     175
      4.64        6976     338       338      100.0%      55.4%     46.1%     6976    6.25     57.0%    99.1*     15     0.983     237
      4.01        8069     390       390      100.0%      57.5%     46.3%     8069    6.01     59.0%    93.7*      8     0.991     294
      3.59        9191     440       440      100.0%      63.9%     46.7%     9191    5.80     65.5%    89.2*      3     1.071     338
      3.28       10239     474       474      100.0%      63.8%     47.0%    10239    5.85     65.4%    89.4*      4     1.119     375
      3.03       11037     511       511      100.0%      66.0%     47.5%    11037    5.33     67.6%    91.7*      3     1.068     412
      2.84       12014     547       547      100.0%      69.6%     49.1%    12014    4.80     71.2%    82.2*     -1     1.092     447
      2.68       12698     580       580      100.0%      72.2%     51.0%    12698    4.34     73.9%    83.8*     -7     0.969     478
      2.54       13360     612       612      100.0%      73.5%     54.1%    13360    3.98     75.3%    73.4*      4     1.025     511
      2.42       14299     642       642      100.0%      76.8%     58.2%    14299    3.59     78.6%    57.0*      6     1.016     545
      2.32       14827     667       667      100.0%      77.8%     62.3%    14827    3.38     79.6%    70.3*      1     0.924     563
      2.23       15588     698       698      100.0%      79.5%     64.6%    15588    3.22     81.3%    64.9*     -1     0.914     597
      2.15       15888     705       705      100.0%      79.3%     68.0%    15888    3.23     81.1%    52.5*     -5     0.882     614
      2.07       16867     754       754      100.0%      82.7%     74.7%    16867    2.92     84.6%    50.1*      3     0.920     647
      2.01       16847     754       754      100.0%      86.1%     77.3%    16847    2.73     88.1%    47.6*     -3     0.839     658
      1.95       17842     799       799      100.0%      90.4%     86.7%    17842    2.47     92.4%    49.3*      1     0.822     696
      1.89       18095     810       811       99.9%      96.8%    101.2%    18095    2.21     99.1%    44.6*     -4     0.773     707
      1.84       18633     829       829      100.0%     106.4%    126.3%    18633    1.90    108.9%    39.6*     -6     0.730     736
      1.80       15510     824       863       95.5%     118.1%    151.4%    15500    1.46    121.2%    32.3*      2     0.688     699
     total      256446   11815     11855       99.7%      64.9%     51.6%   256436    3.61     66.5%    97.9*      1     0.910    9805
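To see what the REIDX= line does to a single reflection, the twelve numbers can be read as a 3x4 matrix applied row by row: h' = m1*h + m2*k + m3*l + m4, and likewise for k' and l'. The sketch below assumes this reading of REIDX=; `reidx` is a hypothetical helper for illustration and is not part of XDS or XSCALE.

```bash
# Apply a REIDX-style matrix (12 integers, read row-wise as 3 rows of 4)
# to one reflection h k l.
# Assumption: h' = m1*h + m2*k + m3*l + m4, and likewise for k' and l'.
reidx() {
    # $1: the 12 REIDX numbers as one string; $2 $3 $4: h k l
    echo "$1 $2 $3 $4" | awk '{
        h = $13; k = $14; l = $15
        print $1*h + $2*k + $3*l + $4, $5*h + $6*k + $7*l + $8, $9*h + $10*k + $11*l + $12
    }'
}

# The matrix used above sends (h,k,l) = (1,2,3) to (k,l,h):
reidx "0 1 0 0 0 0 1 0 1 0 0 0" 1 2 3    # prints: 2 3 1
```

The new indices are (k,l,h) of the old ones, matching the reindex operator [k,l,h] reported by pointless.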
Analysis with
xscale_isocluster -dim 2 -clu 2 temp.ahkl
yields an iso.pdb that does not show a single compact cluster but a severely elongated cloud. We must therefore investigate whether the data have lower than tetragonal symmetry. Running XSCALE with
SPACE_GROUP_NUMBER=16
UNIT_CELL_CONSTANTS=38.3 79.1 79.1 90 90 90
gives a new temp.ahkl, with orthorhombic symmetry.
xscale_isocluster -dim 2 -clu 2 temp.ahkl
gives
psi= 0.1692468 nhalo= 0
cluster: 1 center: 2 elements: 51 core: 51 halo: 0
cluster: 2 center: 6 elements: 49 core: 49 halo: 0
and prepares XSCALE.1.INP (and XSCALE.2.INP) for further use.
coot iso.pdb
shows two well separated clouds.
Using XSCALE.1.INP with its 51 XDS_ASCII.HKL files, and changing the line !INCLUDE RESOLUTION_RANGE= 0 0 to FRIEDEL'S_LAW=TRUE, we get
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS   R-FACTOR  R-FACTOR COMPARED I/SIGMA    R-meas  CC(1/2)  Anomal   SigAno    Nano
     LIMIT    OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                        Corr

      8.03        1493     297       306       97.1%      11.8%     23.7%     1467    6.04     13.0%    98.2*     52*    0.662     123
      5.68        2829     514       521       98.7%      18.9%     24.2%     2796    5.98     20.9%    96.1*     26*    0.778     258
      4.64        3576     638       646       98.8%      23.3%     24.2%     3554    6.07     25.7%    93.3*     12     0.829     346
      4.01        4140     748       756       98.9%      28.2%     24.5%     4105    5.84     31.0%    89.4*     -5     0.818     418
      3.59        4735     838       852       98.4%      30.9%     25.0%     4709    5.72     33.9%    86.7*      5     0.983     470
      3.28        5268     912       921       99.0%      34.7%     25.8%     5228    5.52     38.0%    85.9*      0     1.005     533
      3.03        5664     982       994       98.8%      37.8%     27.4%     5634    4.90     41.4%    82.1*      4     1.031     563
      2.84        6114    1065      1068       99.7%      40.4%     31.7%     6082    4.13     44.4%    82.5*      5     0.963     613
      2.68        6486    1127      1133       99.5%      44.5%     37.2%     6450    3.54     48.9%    74.8*      1     0.824     644
      2.54        6819    1188      1197       99.2%      48.2%     44.6%     6784    3.01     53.0%    70.4*      1     0.816     709
      2.42        7278    1249      1259       99.2%      51.9%     54.7%     7249    2.56     56.9%    70.6*      4     0.751     756
      2.32        7595    1297      1304       99.5%      55.9%     63.4%     7555    2.26     61.5%    58.5*      4     0.729     809
      2.23        7943    1361      1371       99.3%      57.8%     66.4%     7903    2.16     63.3%    63.5*     -3     0.687     844
      2.15        8093    1375      1385       99.3%      60.1%     75.4%     8054    2.03     65.9%    66.7*      3     0.664     860
      2.07        8561    1476      1482       99.6%      64.8%     88.3%     8512    1.76     71.1%    53.0*      7     0.640     914
      2.01        8613    1473      1482       99.4%      68.3%     95.8%     8570    1.60     74.9%    60.6*     -1     0.628     928
      1.95        9048    1566      1571       99.7%      73.1%    112.2%     9004    1.41     80.2%    56.7*     -3     0.571     966
      1.89        9236    1580      1593       99.2%      82.6%    142.1%     9204    1.19     90.8%    56.3*     -5     0.504    1000
      1.84        9467    1618      1631       99.2%      92.8%    180.0%     9432    0.96    101.9%    43.2*      4     0.467    1007
      1.80        7927    1570      1701       92.3%     104.8%    225.2%     7811    0.70    116.1%    42.6*     -5     0.425     785
     total      130885   22874     23173       98.7%      38.3%     41.0%   130103    2.77     42.1%    92.0*      3     0.703   13546
At this point, we run
xdscc12 -w XSCALE.1.HKL | grep ^a | sort -nk6
and find that data sets 1 and 17 are wrongly included in the cloud of 51 data sets. Thus they are removed manually from XSCALE.INP. We then re-run XSCALE with MERGE=TRUE. The resulting XSCALE.1.HKL is then used as REFERENCE_DATA_SET for a second round of integration with XDS.
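The manual removal of data sets can also be scripted. A minimal sketch follows, under two assumptions that should be checked first: that the data set numbers printed by xdscc12 follow the order of the INPUT_FILE= lines, and that each INPUT_FILE= line is followed directly by the parameter lines that belong to it. `drop_datasets` is a hypothetical helper, not part of XDS.

```bash
# Remove selected data sets from an XSCALE input file.
# Data sets are counted by their INPUT_FILE= lines, starting at 1.
# Assumption: the lines after an INPUT_FILE= line (up to the next one)
# belong to that data set and are dropped together with it.
drop_datasets() {
    # $1: XSCALE input file; remaining arguments: data set numbers to drop
    local inp=$1
    shift
    awk -v drop=" $* " '
        /^[ \t]*INPUT_FILE=/ { n++; skip = (index(drop, " " n " ") > 0) }
        !skip { print }
    ' "$inp"
}

# Possible use (the output file name is an example):
#   drop_datasets XSCALE.1.INP 1 17 > XSCALE.trimmed.INP
```

Header lines before the first INPUT_FILE= line are always kept, so the space group, cell and output settings survive unchanged.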
pointless xdsin XSCALE.1.HKL
gives
   Spacegroup         TotProb SysAbsProb     Reindex         Conditions

    P 21 21 21 ( 19)    0.896  0.924                         h00: h=2n, 0k0: k=2n, 00l: l=2n (zones 1,2,3)
    ..........
    P 2 21 21 ( 18)     0.044  0.045                         0k0: k=2n, 00l: l=2n (zones 2,3)
    ..........
    P 21 21 2 ( 18)     0.015  0.015                         h00: h=2n, 0k0: k=2n (zones 1,2)
    ..........
    P 21 2 21 ( 18)     0.014  0.014                         h00: h=2n, 00l: l=2n (zones 1,3)
---------------------------------------------------------------
Space group confidence (= Sqrt(Score * (Score - NextBestScore))) = 0.87
Laue group confidence (= Sqrt(Score * (Score - NextBestScore))) = 0.97
Selecting space group P 21 21 21 as there is a single space group with the highest score
<!--SUMMARY_BEGIN--> $TEXT:Result: $$ $$
Best Solution: space group P 21 21 21
   Reindex operator: [h,k,l]
   Laue group probability: 0.970
   Systematic absence probability: 0.924
   Total probability: 0.896
   Space group confidence: 0.874
   Laue group confidence 0.966
   Unit cell: 38.30 79.10 79.10 90.00 90.00 90.00
   79.10 to 2.47 - Resolution range used for Laue group search
   79.10 to 1.80 - Resolution range in file, used for systematic absence check
thus we now know the space group: P2(1)2(1)2(1) (number 19).
Round 2: using the REFERENCE_DATA_SET
The processing script integrate.rc is changed a bit:
#!/bin/bash -f
# Second round: same wedges, now in space group 19 and indexed against the reference
for f in `seq 1 100`; do
  export OUT=wedge0`printf "%03d" $f`
  export NAMES="$PWD/Illuin/microfocus/xtal"`printf "%03d" $f`"_1_00\?.img"
  rm -rf $OUT
  mkdir $OUT
  cd $OUT
  generate_XDS.INP $NAMES
  # new in round 2: reference data set and MINIMUM_I/SIGMA
  echo REFERENCE_DATA_SET=../reference.hkl >> XDS.INP
  echo MINIMUM_I/SIGMA=50 >>XDS.INP
  sed -i s"/SPOT_RANGE=1 1/SPOT_RANGE=1 3/" XDS.INP
  sed -i s"/SPACE_GROUP_NUMBER=0/SPACE_GROUP_NUMBER=19/" XDS.INP
  sed -i s"/UNIT_CELL_CONSTANTS= 70 80 90/UNIT_CELL_CONSTANTS=38.3 79.1 79.1/" XDS.INP
  sed -i s"/TRUSTED_REGION=0.0 1.2/TRUSTED_REGION=0 1/" XDS.INP
  sed -i s"/INCLUDE_RESOLUTION_RANGE=50 0/INCLUDE_RESOLUTION_RANGE=99 1.8/" XDS.INP
  /usr/local/bin/xds_par
  cd ..
done
# Write the XSCALE.INP header, then append one INPUT_FILE= entry per XDS_ASCII.HKL
mkdir xscale
cd xscale
cat >XSCALE.INP <<eof
SPACE_GROUP_NUMBER= 19
UNIT_CELL_CONSTANTS= 38.3 79.1 79.1 90 90 90
OUTPUT_FILE=temp.ahkl
SAVE_CORRECTION_IMAGES=FALSE
eof
find $PWD/../wedge* -name XDS_ASCII.HKL | awk '{print "INPUT_FILE=",$0;print "NBATCH=3 CORRECTIONS=ALL"}' >> XSCALE.INP
and we get as XSCALE.LP :
NOTE: Friedel pairs are treated as different reflections.

SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS   R-FACTOR  R-FACTOR COMPARED I/SIGMA    R-meas  CC(1/2)  Anomal   SigAno    Nano
     LIMIT    OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                        Corr

      8.04        2960     473       476       99.4%       6.2%      5.5%     2955   29.90      6.7%    99.8*     86*    2.824     166
      5.68        5486     890       894       99.6%       4.9%      5.9%     5478   27.38      5.3%    99.7*     86*    2.384     363
      4.64        6934    1136      1138       99.8%       4.9%      5.8%     6918   27.64      5.4%    99.8*     76*    1.829     480
      4.02        8066    1363      1367       99.7%       5.3%      5.9%     8045   26.67      5.9%    99.6*     57*    1.426     590
      3.59        9121    1535      1539       99.7%       6.1%      6.3%     9092   25.58      6.7%    99.6*     50*    1.298     666
      3.28       10222    1690      1694       99.8%       6.8%      6.8%    10203   24.69      7.5%    99.4*     36*    1.204     751
      3.04       10990    1831      1834       99.8%       8.5%      8.0%    10970   21.40      9.3%    99.3*     22*    1.086     827
      2.84       12065    1993      1999       99.7%      11.2%     11.1%    12038   17.68     12.2%    99.0*     24*    1.085     894
      2.68       12771    2120      2124       99.8%      14.7%     15.1%    12738   14.78     16.1%    98.4*     14*    0.960     952
      2.54       13054    2196      2198       99.9%      18.9%     20.2%    13026   12.53     20.8%    97.7*     13*    0.867     995
      2.42       14290    2372      2375       99.9%      24.9%     27.1%    14261   10.34     27.3%    96.1*      6     0.813    1083
      2.32       14704    2432      2438       99.8%      29.8%     32.5%    14676    9.21     32.6%    95.1*      8     0.843    1115
      2.23       15623    2582      2593       99.6%      33.0%     35.0%    15587    8.83     36.1%    93.0*      6     0.831    1180
      2.15       15732    2610      2613       99.9%      37.1%     39.2%    15697    8.10     40.6%    91.0*      8     0.818    1203
      2.08       16782    2788      2795       99.7%      44.1%     47.0%    16741    7.01     48.3%    88.3*      4     0.797    1276
      2.01       16783    2802      2809       99.8%      46.8%     48.7%    16747    6.54     51.2%    89.5*      3     0.807    1293
      1.95       18262    3043      3051       99.7%      56.5%     58.0%    18221    5.61     61.9%    85.9*      0     0.803    1402
      1.89       17810    2979      2988       99.7%      68.3%     69.8%    17769    4.63     74.8%    80.0*      7     0.864    1374
      1.84       18503    3112      3117       99.8%      87.5%     90.3%    18454    3.55     96.0%    69.6*      3     0.838    1435
      1.80       16130    2988      3185       93.8%     101.2%    110.5%    15959    2.77    111.7%    62.9*      2     0.798    1276
     total      256288   42935     43227       99.3%      13.4%     14.0%   255575   11.63     14.6%    99.6*     21*    0.975   19321
The structure can now easily be solved with hkl2map!
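To compare the scaling rounds at a glance, the overall ("total") row of the statistics table can be extracted from each XSCALE.LP. This is a sketch only: it assumes the 14-column layout of the tables shown on this page and takes the last "total" row in the file. `xscale_totals` is a hypothetical helper name, not part of XSCALE.

```bash
# Print completeness, I/SIGMA, R-meas and CC(1/2) from the last "total"
# row of an XSCALE.LP statistics table.
# Assumption: the row has the 14 whitespace-separated columns of the
# tables shown on this page.
xscale_totals() {
    # $1: XSCALE.LP
    awk '$1 == "total" && NF >= 14 { c = $5; i = $9; r = $10; cc = $11 }
         END { if (c != "") print c, i, r, cc }' "$1"
}

# Possible use:
#   xscale_totals xscale/XSCALE.LP
```

For the round 2 table above, the four fields of the total row are 99.3%, 11.63, 14.6% and 99.6*.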