SHELXL: Difference between revisions

4,138 bytes added ,  2 June 2015
m
No edit summary
 
(23 intermediate revisions by 3 users not shown)
Line 1: Line 1:
For the other [http://shelx.uni-ac.gwdg.de/SHELX/ SHELX] programs featured in this Wiki, see [[SHELX C/D/E]] !
<br>
== Refinement of macromolecules with SHELXL ==


SHELXL is a very general crystal structure refinement program that is equally suitable for the refinement of minerals, organometallic structures, oligonucleotides, or proteins (or any mixture thereof) against X-ray or neutron single (or twinned!) crystal data. The price of this generality is that it is somewhat slower than programs specifically written only for protein structure refinement; on the other hand, a multiple-CPU version (adapted by Kay Diederichs) compensates for this. Any protein- (or DNA-) specific information must be input to SHELXL by the user in the form of refinement restraints, etc. Refinement of macromolecules using SHELXL has been discussed by Sheldrick & Schneider (1997).<br>
SHELXL is a very general crystal structure refinement program that is equally suitable for the refinement of minerals, organometallic structures, oligonucleotides, or proteins (or any mixture thereof) against X-ray or neutron single (or twinned!) crystal data. The price of this generality is that it is somewhat slower than programs specifically written only for protein structure refinement; on the other hand, a multiple-CPU version (adapted by Kay Diederichs) compensates for this. Any protein- (or DNA-) specific information must be input to SHELXL by the user in the form of refinement restraints, etc. <br>
 
Despite this generality, it must be emphasized that SHELXL is not suitable for refinements at resolutions lower than about 2.0 Å because, unlike [[Refmac]] and [[phenix.refine]], it does not provide (side-chain) torsion angle restraints, and that a least-squares refinement program such as SHELXL will suffer more from model bias than a program based on maximum likelihood. Also the Babinet bulk solvent model used in SHELXL is in need of improvement. Almost always the initial refinement will have been performed with another program and SHELXL will be used for the final refinement, perhaps involving extension to very high resolution, modeling of disorder, anisotropic refinement and the least-squares estimation of parameter errors. Thus the starting point for a SHELXL refinement will usually be a PDB format file from the previous refinement. Even when SHELXL has to be used for the refinement of a non-merohedrally twinned structure at lower resolution, the starting model is likely to be in the form of a PDB file from a molecular replacement solution.<br>


Despite this generality, it must be emphasized that SHELXL is not suitable for refinements at resolutions lower than about 2.0 Å because, unlike [[Refmac]] and [[PHENIX|phenix.refine]], it does not provide (side-chain) torsion angle restraints, and that a least-squares refinement program such as SHELXL will suffer more from model bias than a program based on maximum likelihood. Also the Babinet bulk solvent model used in SHELXL is in need of improvement. Almost always the initial refinement will have been performed with another program and SHELXL will be used for the final refinement, perhaps involving extension to very high resolution, modeling of disorder, anisotropic refinement and the least-squares estimation of parameter errors. Thus the starting point for a SHELXL refinement will usually be a PDB format file from the previous refinement. Even when SHELXL has to be used for the refinement of a non-merohedrally twinned structure at lower resolution, the starting model is likely to be in the form of a PDB file from a molecular replacement solution.<br>


== Input files for SHELXL ==
Line 10: Line 12:
SHELXL usually requires two input files: an .ins file containing crystal data, instructions and atoms, and an .hkl file containing h, k, l, F<sup>2</sup> and &sigma;(F<sup>2</sup>) in fixed ‘HKLF 4’ format [alternatively F and &sigma;(F) may be input; this requires the instruction ‘HKLF 3’]. The .ins file will usually be generated from a PDB format file using the ‘I’ option in SHELXPRO. This sets up the TITL...UNIT instructions followed by standard refinement instructions, restraints, instructions for generating hydrogen atoms (commented out until needed) and atoms in '''''crystal coordinates'''''. For residues other than the 20 standard amino acids, suitable restraints (see below) must be added by hand. The ‘I’ option in SHELXPRO provides a way of renumbering the residues; since SHELXL does not (currently) recognize chain identifiers, chains must be emulated by (for example) adding 1000, 2000 etc. to the residue numbers. SHELXPRO can also perform the reverse operation when preparing a PDB file for deposition (the ‘B’ option). After each refinement job, the output .res file is edited or renamed to a new .ins file that serves as the input for the next refinement job. The updating of the .res file to .ins may also be performed by the ‘U’ option in SHELXPRO; do not use the ‘I’ option and the .pdb file for this, because all the special instructions in the .ins file will be lost.<br>
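The residue renumbering described above can be sketched as follows. This is a minimal illustration, not the SHELXPRO implementation: the function names are hypothetical, and the offset of 1000 per chain simply follows the example in the text (chain A becomes 1000 + n, chain B 2000 + n, and so on).

```python
# Hypothetical helpers illustrating chain emulation by residue-number offsets.
# The offset of 1000 per chain is an assumption taken from the example in the text.
CHAIN_IDS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CHAIN_OFFSET = 1000


def to_shelxl_resnum(chain_id, resnum, offset=CHAIN_OFFSET):
    """Map a (chain, residue number) pair to a single SHELXL residue number."""
    if not 0 < resnum < offset:
        raise ValueError("residue number must lie between 1 and offset - 1")
    chain_index = CHAIN_IDS.index(chain_id) + 1  # A -> 1, B -> 2, ...
    return chain_index * offset + resnum


def from_shelxl_resnum(shelxl_resnum, offset=CHAIN_OFFSET):
    """Reverse mapping, as needed when preparing a PDB file for deposition."""
    chain_index, resnum = divmod(shelxl_resnum, offset)
    if chain_index < 1 or resnum == 0:
        raise ValueError("residue number was not produced by to_shelxl_resnum")
    return CHAIN_IDS[chain_index - 1], resnum
```

For example, `to_shelxl_resnum("B", 42)` gives 2042, and `from_shelxl_resnum(2042)` recovers `("B", 42)`. A chain with more than 999 residues would need a larger offset.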


The .hkl file contains the reflection intensity data. It is not necessary to sort the data, eliminate systematic absences or merge equivalents; SHELXL can do this anyway. If it is desired to refine (using complex scattering factors) against separate F<sup>2</sup>-values for h,k,l and -h,-k,-l some care is needed; there are problems using data processing software (such as CCP4) that does not keep these measurements separate, and ‘MERG 2’ must be specified in the .ins file to prevent SHELXL from merging the Friedel opposites (and setting all f” values to zero). A further problem on continuing a refinement started with another program is to ensure consistent flagging of the free-R reflections. For this reason it is strongly recommended that Tim Gr&uuml;ne's program [[mtz2hkl]] (available from the SHELX download site) is used for this conversion. The Bruker XPREP program provides general facilities for setting Rfree flags and for transferring and extending free-R flags consistently from one reflection file to another taking space group symmetry into account. When twinning or NCS are present, it is better to flag thin resolution shells; otherwise random reflections should be flagged.
The .hkl file contains the reflection intensity data. It is not necessary to sort the data, eliminate systematic absences or merge equivalents; SHELXL can do this anyway. If it is desired to refine (using complex scattering factors) against separate F<sup>2</sup>-values for h,k,l and -h,-k,-l some care is needed; there are problems using data processing software (such as CCP4) that does not keep these measurements separate, and ‘MERG 2’ must be specified in the .ins file to prevent SHELXL from merging the Friedel opposites (and setting all f” values to zero). A further problem on continuing a refinement started with another program is to ensure consistent flagging of the free-R reflections. For this reason it is strongly recommended that Tim Gr&uuml;ne's program [[mtz2hkl]] is used for this conversion. The Bruker [[XPREP]] program provides general facilities for setting Rfree flags and for transferring and extending free-R flags consistently from one reflection file to another taking space group symmetry into account. When twinning or NCS are present, it is better to flag thin resolution shells; otherwise random reflections should be flagged.<br>
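To make the fixed-format nature of the .hkl file concrete, here is a small sketch that writes and reads one reflection line. It assumes the layout usually quoted for ‘HKLF 4’, namely three 4-character integer fields (h, k, l) followed by two 8-character real fields (F<sup>2</sup> and &sigma;(F<sup>2</sup>)); the function names are hypothetical, optional further columns are ignored, and the SHELX documentation should be consulted before relying on it.

```python
# Sketch of the assumed fixed-width 'HKLF 4' layout: 3 x I4 followed by 2 x F8.2.
# Function names are illustrative only; no batch/flag column is handled here.

def format_hklf4(h, k, l, f2, sigma):
    """Format one reflection as a fixed-width line."""
    line = "%4d%4d%4d%8.2f%8.2f" % (h, k, l, f2, sigma)
    if len(line) != 28:
        # A value too large for its field would silently shift the columns.
        raise ValueError("value does not fit the fixed-width field; rescale the data")
    return line


def parse_hklf4(line):
    """Read h, k, l, F^2 and sigma(F^2) back by column position, not by whitespace."""
    h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
    f2 = float(line[12:20])
    sigma = float(line[20:28])
    return h, k, l, f2, sigma
```

Reading by column position matters because adjacent fields may touch (for example a large negative index directly after another index), in which case splitting on whitespace would fail.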


== SHELXL Output files ==


SHELXL writes an updated parameter file with the extension .res in the same format as the input .ins file, and an output .fcf file containing phased reflection data in CIF format. The .fcf file can be used for depositing the reflection data with the PDB, and both the .res and the .fcf file can be read by Coot to enable the refined atoms and &sigma;<sub>A</sub>-weighted maps to be displayed directly.<br>


SHELXL writes an updated parameter file with the extension .res in the same format as the input .ins file, a .pdb file with the new atom coordinates (unfortunately one has to add the space group to the CRYST1 record before Coot can read this file) and an output .fcf file containing phased reflection data in CIF format. The .fcf file can be used for depositing the reflection data with the PDB, and both the .res and the .fcf file can be read by Coot to enable the refined atoms and &sigma;<sub>A</sub>-weighted maps to be displayed directly. <br>




Line 40: Line 42:
<b>SADI_54 0.04 FE SG_6 FE SG_9 FE SG_39 FE SG_42</b><br>     


restrains the bond lengths in the FeS<sub>4</sub> unit to be equal, but without a target value, with an esd of 0.04 Å. The central iron atom is in residue number 54 and the four cysteine sulfurs are all in different residues.<br>
restrains the bond lengths in the FeS<sub>4</sub> unit to be equal, but without a target value, with an esd of 0.04 Å. The central iron atom is in residue number 54 and the four cysteine sulfurs are all in different residues. The SADI 'similar distance' restraints provide a convenient way of restraining all sulfate ions in the structure to be regular tetrahedra with approximately equal S-O distances: <br>
 
<b>SADI_SO4 S O1 S O2 S O3 S O4<br>
SADI_SO4 O1 O2 O1 O3 O1 O4 O2 O3 O2 O4 O3 O4</b><br>
 
For a disordered sulfate on a symmetry axis it may be necessary to use the EQIV instruction to enable symmetry equivalents to be included in such restraints (explained in Sheldrick (2008) ''Acta Crystallogr''. '''A64''', 112-122).<br>


<b>FLAT_* 0.3 O_- CA_- N C_- CA</b><br>     
Line 47: Line 54:


The PRODRG server (http://davapc1.bioch.dundee.ac.uk/programs/prodrg/) is recommended for generating restraints in SHELX format for ligands etc.; the ‘J’ option in SHELXPRO can also be useful for this if a model is already available. Files of DNA and RNA restraints are available from the SHELX download site.<br>
== Chiral volumes ==
SHELXL defines a chiral volume as the volume of the 'unit-cell' that can be constructed using the three interatomic vectors from the atom in question; this can be calculated as a determinant using orthogonal cartesian coordinates. SHELXL restricts chiral volumes to cases where an atom makes exactly three bonds to other non-hydrogen atoms; hydrogen atoms are ignored. The sign is determined by evaluating the determinant with the rows representing the three vectors in the order of their ASCII codes, and so is independent of the order of the atoms in the input file. This means that the alpha carbon in the 19 standard chiral L-amino-acids will always have a chiral volume of about +2.5 Å<sup>3</sup> (using the Cahn-Ingold-Prelog R and S convention would have required L-Cys to have the opposite sign). CB of Ile has a chiral volume of 2.495 but CB of Thr is -2.628. However the CHIV instruction in SHELXL also has other uses, e.g.
<b>CHIV_VAL C</b><br>
<b>CHIV_VAL 2.516 CA</b><br>
<b>CHIV_VAL -2.622 CB</b>
This restrains the chiral volume of the carbonyl carbon to be zero (the default) with the default esd (0.1 Å<sup>3</sup>), i.e. restrains it to be planar. CB is not chiral for valine, but the above restraint makes sure that CG1 and CG2 are named conventionally. (The RCSB now uses this idea to check the naming of H-atoms in -CH<sub>2</sub>- groups, which is one of the reasons why the hydrogens should be removed before depositing the structure; they are always recalculated anyway before use, e.g. by MolProbity.) If you want all the alpha-carbons of the alanines to have the same chiral volume but would like to refine its value, a SHELXL 'free variable' can be used (here #3):
<b>CHIV_ALA 31 CA</b>
(i.e. 1*fv(3)); if there is a D-Ala in the structure as well:
<b>CHIV_DAL 29 CA</b>
(i.e. -1*fv(3)).<br>
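The determinant described at the start of this section can be sketched in a few lines. This is only an illustration of the geometry, not SHELXL's code: the function name is hypothetical, coordinates are assumed to be orthogonal cartesian in Å, and the caller must supply the three neighbours in the order required by the sign convention (the sorting of neighbours that SHELXL performs is not reproduced here).

```python
# Illustrative sketch: chiral volume as the determinant (scalar triple product)
# of the three interatomic vectors from the central atom to its neighbours.
# Neighbour ordering is left to the caller; it determines the sign.

def chiral_volume(center, a, b, c):
    """Return the signed volume spanned by the vectors center->a, center->b, center->c."""
    u = [a[i] - center[i] for i in range(3)]
    v = [b[i] - center[i] for i in range(3)]
    w = [c[i] - center[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))
```

Swapping any two neighbours changes the sign of the result, and a value of zero means that the central atom and its three neighbours are coplanar, which is why a target of zero acts as a planarity restraint.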




Line 76: Line 104:
== Modeling disorder ==


There are many ways of modeling disorder using SHELXL, but for macromolecules the most convenient is to retain the same atom and residue names for the two or more components and assign a different "part number" (analogous to the PDB alternative site flag) to each component. With this technique, no change is required to the input restraints, etc. Atoms in the same component will normally have a common occupancy that is assigned to a free variable (fv). The starting values for the free variables are given, in order, on the FVAR instruction; note that there is no free variable number 1 (adding 10 fixes a parameter); the first FVAR parameter is the overall scale factor. Residues Glu_12 and Cys_38 have disordered side-chains in the example; their occupancies are tied to fv(2) (for the atoms in component [PART] 1) and to 1-fv(2) for the atoms in component 2 for Glu_12, and similarly fv(4) and 1-fv(4) for Cys_38. This ensures that the sum of occupancies for both components is held at unity. ‘21.0’ is interpreted as 1.0 times fv(2), and ‘-21.0’ as 1.0 times [1-fv(2)]. 
There are many ways of modeling disorder using SHELXL, but for macromolecules the most convenient is to retain the same atom and residue names for the two or more components and assign a different "part number" (analogous to the PDB alternative site flag) to each component. With this technique, no change is required to the input restraints, etc. Atoms in the same component will normally have a common occupancy that is assigned to a free variable (fv). The starting values for the free variables are given, in order, on the FVAR instruction; note that there is no free variable number 1 (adding 10 fixes a parameter); the first FVAR parameter is the overall scale factor. Residues Glu_12 and Cys_38 have disordered side-chains in the example; their occupancies are tied to fv(2) (for the atoms in component [PART] 1) and to 1-fv(2) for the atoms in component 2 for Glu_12, and similarly fv(4) and 1-fv(4) for Cys_38. This ensures that the sum of occupancies for both components is held at unity. ‘21.0’ is interpreted as 1.0 times fv(2), and ‘-21.0’ as 1.0 times [1-fv(2)]. This notation is not very intuitive, but it is concise and very flexible. A common example is the use of a single free variable to describe the occupancies of all the atoms in both components of a disordered side-chain, e.g.<br>
This notation is not very intuitive, but it is concise and very flexible. Free variables may also be used in DFIX and CHIV restraints. Thus ’CHIV_PRO 31 CA’ would cause the chiral volumes of all proline CA atoms to be restrained to free variable number 3, which itself is allowed to refine. In this way reasonable geometrical restraints can be applied even when the target values are unknown. By restraining distances to be equal to a free variable using DFIX, a standard deviation of the mean distance may be calculated rigorously using full-matrix least-squares algebra.
 
If there are three or more disorder components, then each of the common occupancies must be assigned to a separate free variable (e.g. as 51, 61 and 71), and their sum can be restrained to unity by the use of a SUMP restraint, e.g.:<br>
<b>PART 1<br>
CB  1  ...  ...  ...  31  ...<br>
OG  4  ...  ...  ...  31  ...<br>
PART 2<br>
CB  1  ...  ...  ... -31  ...<br>
OG  4  ...  ...  ... -31 ...<br>
PART 0</b><br>
 
for a disordered serine. The starting value of the occupancy p is given as the third FVAR parameter; the two components will be assigned occupancies p and 1-p. Note that it is desirable to split CB even if no splitting can be seen in the maps, so that when hydrogens are added later with e.g. <br>
 
<b>HFIX_SER 23 CB</b><br>
 
(before the first atom) the correct disordered hydrogens will be generated fully automatically. If there are three or more disorder components, then each of the common occupancies must be assigned to a separate free variable (e.g. as 51, 61 and 71), and their sum can be restrained to unity by the use of a SUMP restraint, e.g.:<br>


<b>SUMP 1 0.01 1 5 1 6 1 7 </b><br>
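Read as a linear restraint on the free variables, the SUMP instruction above can be written out as follows. This assumes the usual reading of the fields as target value, esd, and then pairs of (coefficient, free variable number); the notation fv(n) follows the text.

```latex
% SUMP 1 0.01  1 5  1 6  1 7
% target = 1, esd = 0.01, coefficients 1, 1, 1 applied to fv(5), fv(6), fv(7)
\[
  1 \cdot \mathrm{fv}(5) + 1 \cdot \mathrm{fv}(6) + 1 \cdot \mathrm{fv}(7) = 1
  \qquad (\text{esd } 0.01)
\]
```

Because this is a restraint rather than a constraint, the three occupancies are pulled towards a sum of unity within the stated esd instead of being forced to it exactly.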
Free variables may also be used in DFIX and CHIV restraints. Thus <br>
<b>CHIV_PRO 31 CA</b><br>
would cause the chiral volumes of all proline CA atoms to be restrained to free variable number 3, which itself is allowed to refine. In this way reasonable geometrical restraints can be applied even when the target values are unknown. By restraining distances to be equal to a free variable using DFIX, a standard deviation of the mean distance may be calculated rigorously using full-matrix least-squares algebra.<br>
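The free-variable notation used in this section can be decoded mechanically. The following sketch covers only the cases that appear in the text (codes such as 21, -21, 31 and 29 that refer to free variables 2 and above); the function name is hypothetical, and the handling of other codes (fixed or freely refined parameters) is deliberately left out.

```python
# Sketch of decoding a SHELXL free-variable code, under the reading given in the text:
#   10*m + p      ->  p * fv(m)          e.g.  31 -> 1 * fv(3),  29 -> -1 * fv(3)
#   -(10*m + p)   ->  p * (1 - fv(m))    e.g. -21 -> 1 * (1 - fv(2))
# fvar holds the FVAR values in order; fvar[0] is the overall scale factor.

def decode_fv_code(code, fvar):
    """Return the parameter value implied by a free-variable code."""
    m = round(abs(code) / 10)        # free variable number (|p| is assumed to be < 5)
    p = abs(code) - 10 * m           # multiplier, may be negative (29 -> m=3, p=-1)
    if m < 2 or m > len(fvar):
        raise ValueError("this sketch only handles references to fv(2)..fv(n)")
    fv = fvar[m - 1]
    return p * fv if code > 0 else p * (1.0 - fv)
```

With `fvar = [1.0, 0.7, 2.5]`, the codes 21 and -21 give occupancies of 0.7 and 0.3, so the two components always sum to unity, while 31 and 29 give chiral volume targets of +2.5 and -2.5.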




Line 86: Line 132:


SHELXL provides facilities for refining against data from merohedral, pseudo-merohedral, and non-merohedral twins (Herbst-Irmer & Sheldrick, 1998). Refinement against data from merohedrally twinned crystals is particularly straightforward, requiring only the twin law (a 3x3 matrix) and starting values for the volume fractions of the twin components. Failure to recognize such twinning not only results in high R-factors and poor quality maps, it can also lead to incorrect biochemical conclusions (Luecke, Richter & Lanyi, 1998). Twinning can often be detected by statistical tests (Yeates & Fam, 1999), and it is probably much more widespread in macromolecular crystals than is generally appreciated!
No changes are needed to the .hkl file for merohedral twinning (but the data should be merged in the lower of the two relevant Laue groups). For non-merohedral twinning a special (‘HKLF 5’) format is required.<br>
No changes are needed to the .hkl file for merohedral twinning (but the data should be merged in the lower of the two relevant Laue groups). For non-merohedral twinning a special (‘HKLF 5’) format is required for the intensity data file.<br>
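The role of the twin law and the volume fraction can be illustrated for the simplest case of a two-component merohedral twin. This is a sketch of the general idea rather than of SHELXL's internals: the function names, the example twin law and the symbol alpha for the volume fraction of the second component are all assumptions made for illustration.

```python
# Sketch: two-component merohedral twin.
# The twin law (a 3x3 matrix) maps each reflection hkl onto the reflection of the
# second domain that overlaps with it; the observed intensity is modelled as a
# volume-fraction-weighted sum of the two.

def apply_twin_law(hkl, twin_law):
    """Transform reflection indices by a 3x3 twin-law matrix (given as rows)."""
    return tuple(sum(twin_law[i][j] * hkl[j] for j in range(3)) for i in range(3))


def twinned_intensity(i_hkl, i_twin, alpha):
    """Combine the intensities of the two domains; alpha is the minor volume fraction."""
    return (1.0 - alpha) * i_hkl + alpha * i_twin


# Example twin law (assumed for illustration): exchange h and k, invert l.
EXAMPLE_TWIN_LAW = ((0, 1, 0),
                    (1, 0, 0),
                    (0, 0, -1))
```

With this example law, reflection (1, 2, 3) overlaps with (2, 1, -3), and for alpha = 0.4 intensities of 100 and 50 combine to 80. As alpha approaches 0.5 the two contributions become equally weighted, which is why the data then appear to have higher symmetry than the true Laue group.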




Line 108: Line 154:
== Obtaining the SHELX programs ==


SHELXC/D/E and test data may be downloaded from the SHELX fileserver. The application form should be printed out from http://shelx.uni-ac.gwdg.de/SHELX/ This form should be completed and faxed to +49-551-392582.  Downloading instructions will then be emailed to the address given on the form, so please write the email address CLEARLY.  The programs are free to academics but a small license fee is required for 'for-profit' use.  <br>
SHELXC/D/E and test data may be downloaded from the SHELX fileserver. Users should register online at http://shelx.uni-ac.gwdg.de/SHELX/ .  Downloading instructions will then be emailed.  The programs are free to academics but a small license fee is required for 'for-profit' use.  <br>
 
== Installing the multiprocessor version on a Mac ==
 
 
The mp version of SHELXL runs on all 16 processors of a Mac (two quad core with hyperthreading). In a test case, a refinement with a total processor time of 70.7 seconds finished in less than six seconds.
 
The following packages need to be installed before the compilation:
 
* XCode 312_2621_developerdvd.dmg  (downloaded from apple - 996 MB)
* Intel Fortran compiler Professional (31 day evaluation version)
*# m_cprof_p_11.0.059.dmg  (downloaded from intel - 343 MB)
*# m_cprof_ifort_redist_p_11.0.059.dmg (downloaded from intel - 20.3 MB)
 
The compilation works smoothly, but instead of the -static flag it is necessary to use the -static-intel flag. A 64 bit compilation is invoked with:
 
ifort -axPT -openmp -ip -static-intel shelxh_omp.f shelxlv_omp.f -o shelxl_omp.64bit


Update 6/2010: Problems exist with Xcode 3.2.2. The workaround is to add the -use-asm flag. See http://software.intel.com/en-us/articles/intel-fortran-for-mac-os-x-incompatible-with-xcode-322/


== References and other sources of information ==
Line 115: Line 178:
Sheldrick, G.M. (2008). "A short history of SHELX", ''Acta Crystallogr''. '''A64''', 112-122 [''Standard reference for all SHELX... programs''].<br>


Sheldrick, G.M. & Schneider, T.R. (1997). ''Methods Enzymol''. '''277''', 319-343 [Macromolecular refinement with SHELXL].
Gruene, T. ''et al.'' (2014). "Refinement of Macromolecular Structures against Neutron Data with SHELXL-2013", ''J. Appl. Cryst''. '''47''', 462-466 [''Reference for refinement against neutron data and for hydrogen restraints''].
 
Sheldrick, G.M. & Schneider, T.R. (1997). ''Methods Enzymol''. '''277''', 319-343 [''Macromolecular refinement with SHELXL''].


The following additional sources of information may be found via the SHELX homepage (http://shelx.uni-ac.gwdg.de/SHELX):  "SHELX-97 Manual as PDF", "Mini-protein refinement tutorial", "P1-Lysozyme refinement tutorial", "Thomas Schneider's FAQs" and "FAQs: Macromolecules".