Line 1: | Line 1: | ||
The following is based on the experience of a protein crystallographer who one day obtained a small-molecule dataset and managed to solve and refine it without prior knowledge of what the crystallized substance was. It was a very rewarding experience, which is why it's written up here. | The following is based on the experience of a protein crystallographer who one day obtained a small-molecule dataset and managed to solve and refine it without prior knowledge of what the crystallized substance was, and without experience in small-molecule crystallography. It was a very rewarding experience (see the figure at the bottom), which is why it's written up here. | ||
This is just a case study. To understand things, one has to read http://shelx.uni-ac.gwdg.de/SHELX/shelx.pdf . | This is just a case study. To understand things, one has to read http://shelx.uni-ac.gwdg.de/SHELX/shelx.pdf . | ||
== | == Reduce the data with your favourite data processing software == | ||
I use [[xds:Main_Page|XDS]]. The decision about the spacegroup has to be postponed, but it surely helps if the correct Laue group is employed during scaling. In the case considered here, the CORRECT step suggested P222 (XDS should really only suggest "222 point symmetry" because CORRECT does not look at systematic absences at this point). | I use [[xds:Main_Page|XDS]]. The decision about the spacegroup has to be postponed, but it surely helps if the correct Laue group is employed during scaling. In the case considered here, the CORRECT step suggested P222 (XDS should really only suggest "222 point symmetry" because CORRECT does not look at systematic absences at this point). | ||
== convert the reflection file to HKLF 4 format (intensities!) | == Determine the spacegroup == | ||
The HKLF 4 format is what the SHELX programs read. I used [[xds:XDSCONV|XDSCONV]] and the following | |||
If there are different spacegroup possibilities then (downstream, in structure solution and refinement) we need to try all of them in turn, until we hit one that refines really satisfactorily (R-factor below, say, 5%) and gives a structure that makes sense. | |||
=== use [[XPREP]] to find out possible spacegroups === | |||
First, convert the reflection file to HKLF 4 format (intensities!). The HKLF 4 format is what the SHELX programs read. I used [[xds:XDSCONV|XDSCONV]] and the following XDSCONV.INP: | |||
SPACE_GROUP_NUMBER= 1 | |||
UNIT_CELL_CONSTANTS= 14.433 28.704 8.488 90.000 90.000 90.000 | |||
INPUT_FILE=XDS_ASCII.HKL | INPUT_FILE=XDS_ASCII.HKL | ||
OUTPUT_FILE=temp.hkl | OUTPUT_FILE=temp.hkl | ||
To preserve the full information about systematic absences for use in [[XPREP]], it is important that XDSCONV runs in spacegroup 1. This does not necessarily mean that CORRECT also has to run in spacegroup 1, because XDS_ASCII.HKL contains all observations no matter in which spacegroup the CORRECT step runs. As long as the spacegroup used in the CORRECT step is primitive, this works nicely. But if some re-indexing between CORRECT's spacegroup and P1 is necessary (as for I, F, C, R lattices), then it is probably safest to simply run CORRECT in P1. | |||
answer the question concerning the cell axes, and then hit <Enter> several (about 6) times until the program suggests a list of spacegroups - this choice is going to be important. It may help to observe whether it's centrosymmetric or not, from the line: Mean |E*E-1| = 0.939 [expected .968 centrosym and .736 non-centrosym]. Fortunately there's only one spacegroup consistent with the data: | |||
answer the question concerning the cell axes, and then hit <Enter> several times until the program suggests a list of spacegroups - this choice is going to be important. It | |||
<pre> | <pre> | ||
SPACE GROUP DETERMINATION | |||
Lattice exceptions: P A B C I F Obv Rev All | |||
N (total) = 0 28832 28824 28788 28823 43222 38376 38344 57564 | |||
N (int>3sigma) = 0 17961 18421 18158 17862 27270 24715 24627 36959 | |||
Mean intensity = 0.0 22.7 23.7 24.8 23.4 23.7 24.7 24.8 24.8 | |||
Mean int/sigma = 0.0 9.6 10.0 9.9 9.6 9.8 10.0 10.0 10.0 | |||
Crystal system O and Lattice type P selected | |||
Mean |E*E-1| = 0.939 [expected .968 centrosym and .736 non-centrosym] | |||
Chiral flag NOT set | |||
Systematic absence exceptions: | Systematic absence exceptions: | ||
b-- c-- n-- 21-- -c- -a- -n- -21- --a --b --n --21 | b-- c-- n-- 21-- -c- -a- -n- -21- --a --b --n --21 | ||
N | N 1884 1884 1892 7 988 1014 992 28 545 541 534 72 | ||
N I>3s 706 706 0 0 304 0 304 0 0 203 203 0 | N I>3s 706 706 0 0 304 0 304 0 0 203 203 0 | ||
<I> | <I> 25.2 25.2 0.5 0.0 18.2 0.4 18.1 0.4 0.4 25.0 25.4 0.4 | ||
<I/s> | <I/s> 7.3 7.3 0.5 0.2 6.6 0.5 6.6 0.5 0.4 7.4 7.6 0.4 | ||
Line 29: | Line 54: | ||
Option Space Group No. Type Axes CSD R(sym) N(eq) Syst. Abs. CFOM | Option Space Group No. Type Axes CSD R(sym) N(eq) Syst. Abs. CFOM | ||
[A] | [A] Pccn # 56 centro 3 196 0.023 10123 0.5 / 6.6 2.23 | ||
Option [ | Option [A] chosen | ||
</pre> | </pre> | ||
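The two checks read off the XPREP output above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not of how XPREP itself decides: the numbers are copied from the listing, while the function names and the cut-offs (no reflection with I>3s, mean I/sigma below 1.0) are assumptions made for this example.

```python
# Values copied from the XPREP listing above.
E2_MINUS_1_OBSERVED = 0.939
E2_MINUS_1_CENTRO = 0.968      # expected for a centrosymmetric structure
E2_MINUS_1_NONCENTRO = 0.736   # expected for a non-centrosymmetric structure

HEADERS = "b-- c-- n-- 21-- -c- -a- -n- -21- --a --b --n --21".split()
N_STRONG = [706, 706, 0, 0, 304, 0, 304, 0, 0, 203, 203, 0]   # row "N I>3s"
MEAN_I_OVER_SIGMA = [7.3, 7.3, 0.5, 0.2, 6.6, 0.5,
                     6.6, 0.5, 0.4, 7.4, 7.6, 0.4]             # row "<I/s>"


def looks_centrosymmetric(observed,
                          centro=E2_MINUS_1_CENTRO,
                          noncentro=E2_MINUS_1_NONCENTRO):
    """Return True if the observed mean |E*E-1| is closer to the
    centrosymmetric expectation than to the non-centrosymmetric one."""
    return abs(observed - centro) < abs(observed - noncentro)


def obeyed_absences(headers, n_strong, i_over_sigma, max_i_over_sigma=1.0):
    """List the symmetry elements whose absence condition is obeyed.

    Assumed criterion (hypothetical, for illustration only): no reflection
    with I > 3 sigma and a mean I/sigma below max_i_over_sigma.
    """
    return [h for h, n, s in zip(headers, n_strong, i_over_sigma)
            if n == 0 and s < max_i_over_sigma]


if __name__ == "__main__":
    print("centrosymmetric?", looks_centrosymmetric(E2_MINUS_1_OBSERVED))
    print("obeyed absences:",
          obeyed_absences(HEADERS, N_STRONG, MEAN_I_OVER_SIGMA))
```

The columns that survive the filter are the ones where the reflections that should be absent really are weak, which is the same information XPREP uses when it proposes a spacegroup.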
After that, say "c" for "define unit-cell CONTENTS", and input a reasonable number of carbon atoms (I used C20). Get out of this menu with "E". Then, choose "f" for "set up shelxtl FILES". Then, answer the question "XM/SHELXD (M) or XS/SHELXS (S) format [S]:" with "m" since we're going to use shelxd for solving the structure. Answer the question about the name (I used the spacegroup number as I knew I would have to test several possibilities). Finally, "q"uit the program. | After that, say "c" for "define unit-cell CONTENTS", and input a reasonable number of carbon atoms (I used C20). Get out of this menu with "E". Then, choose "f" for "set up shelxtl FILES". Then, answer the question "XM/SHELXD (M) or XS/SHELXS (S) format [S]:" with "m" since we're going to use shelxd for solving the structure. Answer the question about the name (I used the spacegroup number as I knew I would have to test several possibilities). Finally, "q"uit the program. This writes 56.ins : | ||
TITL | TITL 56 in Pccn | ||
CELL 0.71073 | CELL 0.71073 14.4330 28.7040 8.4880 90.000 90.000 90.000 | ||
ZERR 11.00 0. | ZERR 11.00 0.0029 0.0057 0.0017 0.000 0.000 0.000 | ||
LATT 1 | LATT 1 | ||
SYMM 0.5-X, -Y, | SYMM 0.5-X, 0.5-Y, Z | ||
SYMM -X, 0.5+Y, -Z | SYMM -X, 0.5+Y, 0.5-Z | ||
SYMM 0.5+X, | SYMM 0.5+X, -Y, 0.5-Z | ||
SFAC C | SFAC C | ||
UNIT 220 | UNIT 220 | ||
Line 79: | Line 76: | ||
END | END | ||
== | == Solve the structure with [[SHELX C/D/E|SHELXD]] == | ||
Just run "shelxd | Just run "shelxd 56". You may interrupt it with Ctrl-C once it has found a good solution, as suggested by | ||
Try | Try 11:20 Peaks 99 92 87 87 87 83 77 73 71 70 68 68 64 64 64 63 62 62 61 60 | ||
R = 0. | R = 0.294, Min.fun. = 0.747, <cos> = 0.491, Ra = 0.235 | ||
Try | Try 11, CC All/Weak 59.81 / 46.01, best 59.81 / 46.01, best final CC 0.00 | ||
Peaklist optimization cycle 1 CC = | Peaklist optimization cycle 1 CC = 77.51 % BG = 0.322 for 22 atoms | ||
Peaks: 99 | Peaks: 99 90 87 85 82 77 75 74 66 64 64 64 63 63 62 57 39 39 36 36 33 31 | ||
Fragments: | Fragments: 17 5 | ||
Peaklist optimization cycle 2 CC = | Peaklist optimization cycle 2 CC = 88.80 % BG = 0.225 for 25 atoms | ||
Peaks: 99 | Peaks: 99 95 89 88 87 84 82 79 78 78 77 76 75 75 74 73 73 71 71 69 67 65 40 | ||
Fragments: | Fragments: 25 | ||
Peaklist optimization cycle 3 CC = | Peaklist optimization cycle 3 CC = 88.85 % BG = 0.223 for 25 atoms | ||
Peaks: 99 | Peaks: 99 96 89 87 86 86 82 79 79 76 76 75 75 75 73 73 72 71 69 69 67 65 63 | ||
Fragments: | Fragments: 25 | ||
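If the SHELXD output has been saved to a file, the CC of each peaklist optimization cycle can be extracted with a short script. This is a minimal sketch: the function name is hypothetical, and the regular expression assumes the lines look exactly like the ones quoted above.

```python
import re

# Matches lines such as
# "Peaklist optimization cycle 2 CC = 88.80 % BG = 0.225 for 25 atoms"
CYCLE_RE = re.compile(
    r"Peaklist optimization cycle\s+(\d+)\s+CC\s*=\s*([\d.]+)\s*%"
)


def peaklist_cc(log_text):
    """Return a list of (cycle number, CC in percent) found in the log text."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in CYCLE_RE.finditer(log_text)]


# Sample lines copied from the SHELXD output quoted above.
SAMPLE = """\
 Peaklist optimization cycle 1 CC = 77.51 % BG = 0.322 for 22 atoms
 Peaklist optimization cycle 2 CC = 88.80 % BG = 0.225 for 25 atoms
 Peaklist optimization cycle 3 CC = 88.85 % BG = 0.223 for 25 atoms
"""

if __name__ == "__main__":
    for cycle, cc in peaklist_cc(SAMPLE):
        print(f"cycle {cycle}: CC = {cc:.2f} %")
```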
The resulting 56.res is: | |||
<pre> | <pre> | ||
REM TRY | REM TRY 23 FINAL CC 88.85 TIME 3 SECS | ||
REM Fragments: | REM Fragments: 25 | ||
REM | REM | ||
TITL | TITL 56 in Pccn | ||
CELL 0.71073 | CELL 0.71073 14.4330 28.7040 8.4880 90.000 90.000 90.000 | ||
ZERR 11.00 0. | ZERR 11.00 0.0029 0.0057 0.0017 0.000 0.000 0.000 | ||
LATT 1 | LATT 1 | ||
SYMM 0.5-X, -Y, | SYMM 0.5-X, 0.5-Y, Z | ||
SYMM -X, 0.5+Y, -Z | SYMM -X, 0.5+Y, 0.5-Z | ||
SYMM 0.5+X, | SYMM 0.5+X, -Y, 0.5-Z | ||
SFAC C | SFAC C | ||
UNIT 220 | UNIT 220 | ||
C001 1 0. | C001 1 0.45835 0.41566 0.09083 11.00000 0.1 99.00 | ||
C002 1 0. | C002 1 0.36894 0.55007 -0.58932 11.00000 0.1 95.84 | ||
C003 1 0. | C003 1 0.52129 0.72099 -0.95623 11.00000 0.1 89.35 | ||
C004 1 0. | C004 1 0.67521 0.30725 0.04587 11.00000 0.1 87.55 | ||
C005 1 0. | C005 1 0.40328 0.54911 -0.45947 11.00000 0.1 85.96 | ||
... | ... | ||
C021 1 0.60567 0.70055 -0.97749 11.00000 0.1 66.94 | |||
C022 1 0.49503 0.62079 -0.48787 11.00000 0.1 64.91 | |||
C023 1 0.60066 0.62034 -0.48599 11.00000 0.1 63.62 | |||
C024 1 0.63251 0.26331 0.06189 11.00000 0.1 63.01 | |||
C025 1 0.47217 0.73227 -1.09548 11.00000 0.1 61.79 | |||
HKLF 4 | HKLF 4 | ||
END | END | ||
</pre> | </pre> | ||
== | == Refine using [[SHELXL]] == | ||
Insert | Copy 56.res to 56.ins. Insert | ||
ACTA | ACTA | ||
LIST 6 | LIST 6 | ||
L.S. 10 | L.S. 10 | ||
after the UNIT 220 instruction, and run "shelxl | after the UNIT 220 instruction, and run "shelxl 56". This gives a first refined model, and its electron density map, plus the relevant statistics. | ||
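The copy-and-insert step can also be scripted, which is handy when several spacegroup candidates have to be tried. The following is a minimal sketch, not part of SHELX: the function name `res_to_ins` is hypothetical, and it simply places the three instructions named above directly after the first UNIT line.

```python
from pathlib import Path

# Instructions to add, as given in the text above.
EXTRA_INSTRUCTIONS = ["ACTA", "LIST 6", "L.S. 10"]


def res_to_ins(res_text, extra=EXTRA_INSTRUCTIONS):
    """Return the .ins text: the .res text with the extra instructions
    inserted directly after the first UNIT instruction."""
    out = []
    inserted = False
    for line in res_text.splitlines():
        out.append(line)
        if not inserted and line.split()[:1] == ["UNIT"]:
            out.extend(extra)
            inserted = True
    if not inserted:
        raise ValueError("no UNIT instruction found")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    res = Path("56.res")
    if res.exists():
        # Writes 56.ins next to 56.res; then run "shelxl 56" as described.
        Path("56.ins").write_text(res_to_ins(res.read_text()))
```

Only the first UNIT line is used as the anchor, so the atom list and the trailing HKLF 4 and END lines are passed through unchanged.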
=== general idea of refining a structure === | === general idea of refining a structure === | ||
Line 158: | Line 147: | ||
For the H atoms, we just cut-and-paste the atoms from the bottom of the .res file into those lines where the other atoms are, if the distances to existing (heavy) atoms are close to 1 A. | For the H atoms, we just cut-and-paste the atoms from the bottom of the .res file into those lines where the other atoms are, if the distances to existing (heavy) atoms are close to 1 A. | ||
== Finishing the structure == | === Finishing the structure === | ||
Finally we switch to anisotropic refinement by putting an | Finally we switch to anisotropic refinement by putting an |