SHELX C/D/E: Difference between revisions

6,714 bytes added ,  5 February
m
(→‎critical parameters: reason to use MIND -3.5)
 
(22 intermediate revisions by 2 users not shown)
Line 1: Line 1:
SHELXC, SHELXD and SHELXE are stand-alone executables that do not require environment variables or parameter files etc., so all that is needed to install them is to put them in a directory that is in the ‘path’ (e.g. /usr/local/bin or ~/bin under Linux). There is a detailed description of these programs in the paper: <i>"Experimental phasing with SHELXC/D/E: combining chain tracing with density modification"</i>. Sheldrick, G.M. (2010). <i>Acta Cryst.</i> <b>D66</b>, 479-485. It is  
SHELXC, SHELXD and SHELXE are stand-alone executables that do not require environment variables or parameter files etc., so all that is needed to install them is to put them in a directory that is in the ‘path’ (e.g. /usr/local/bin or ~/bin under Linux). There is a detailed description of these programs in the paper: <i>"Experimental phasing with SHELXC/D/E: combining chain tracing with density modification"</i>. Sheldrick, G.M. (2010). <i>Acta Cryst.</i> <b>D66</b>, 479-485. It is  
available as "Open Access" at http://dx.doi.org/10.1107/S0907444909038360 and should be cited whenever these programs are used.
available as "Open Access" at http://dx.doi.org/10.1107/S0907444909038360 and should be cited whenever these programs are used.
[[hkl2map]] is a graphical user interface that makes it easy to use these programs.
[[xds:xdsgui|XDSGUI]] is a graphical user interface for XDS that also makes it easy to use these programs.




Line 75: Line 80:
== SHELXD ==
== SHELXD ==


=== critical parameters ===


In general the critical parameters for locating heavy atoms with SHELXD are:
In general the critical parameters for locating heavy atoms with SHELXD are:


# The resolution cutoff. In the MAD case this is best determined by finding where the correlation coefficient between the signed anomalous differences for wavelengths with the highest anomalous signal (PEAK and HREM or PEAK and INFL) falls below about 30%. For SAD a less reliable guide is where the mean value of |&Delta;F|/&sigma;(&Delta;F) falls below about 1.2 (a value of 0.8 would indicate pure noise), and for S-SAD with CuK&alpha; the data can be truncated where I/&sigma; for the native data falls below 30. If unmerged data are used, SHELXC calculates a correlation coefficient between two randomly selected subsets of the signed anomalous differences; this is a better indicator because it does not require that the intensity esds are on an absolute scale, but it does require a reasonable redundancy and again the data can be truncated where it drops to below 30% (the CCP4 program SCALA prints a similar statistic).
=== Resolution cutoff (SHEL) ===
# The estimated number of sites (FIND) should be within about 20% of the true number. For SeMet or S-SAD phasing there should be a sharp drop in the occupancy after the last true site. For iodide soaks, a good rule of thumb is to start with a number of iodide sites equal to the number of amino-acids in the asymmetric unit divided by 15. If after SHELXD occupancy refinement the occupancy of the last site is more than 0.2 it might be worth increasing this number, and vice versa.
In the MAD case this is best determined by finding where the correlation coefficient between the signed anomalous differences for wavelengths with the highest anomalous signal (PEAK and HREM or PEAK and INFL) falls below about 30%. For SAD a less reliable guide is where the mean value of |&Delta;F|/&sigma;(&Delta;F) falls below about 1.2 (a value of 0.8 would indicate pure noise), and for S-SAD with CuK&alpha; the data can be truncated where I/&sigma; for the native data falls below 30. If unmerged data are used, SHELXC calculates a correlation coefficient between two randomly selected subsets of the signed anomalous differences; this is a better indicator because it does not require that the intensity esds are on an absolute scale, but it does require a reasonable redundancy and again the data can be truncated where it drops to below 30% (XDS and the CCP4 programs aimless/SCALA print a similar statistic).
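The 30% rule above can be sketched in a few lines of Python. This is only an illustration: the function name <code>resolution_cutoff</code>, the input layout (one <code>(d_min, cc_percent)</code> pair per resolution shell, ordered from low to high resolution) and the example numbers are assumptions, not SHELXC output; the per-shell values have to be copied by hand from the SHELXC (or XDS/aimless) listing.

```python
def resolution_cutoff(shells, threshold=30.0):
    """Return d_min (in Angstrom) of the last shell before the anomalous CC
    first drops below `threshold` percent, or None if even the first shell
    is below the threshold.

    `shells` is a list of (d_min, cc_percent) tuples ordered from low to
    high resolution (hypothetical layout, filled in by hand).
    """
    cutoff = None
    for d_min, cc in shells:
        if cc < threshold:
            break  # stop at the first weak shell; later noisy shells are ignored
        cutoff = d_min
    return cutoff


# Made-up per-shell values, for illustration only.
example_shells = [(8.0, 85.0), (4.0, 70.0), (3.0, 48.0),
                  (2.6, 33.0), (2.4, 21.0), (2.2, 9.0)]
print(resolution_cutoff(example_shells))
```

The value returned would be a candidate for the second parameter of the SHEL instruction; stopping at the first shell below the threshold avoids being misled by a noisy outer shell that happens to bounce back above 30%.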
# If the resolution d (second parameter on SHEL card) is > 2.0Å the disulfide bonds may not fully resolved, but in the range 2.8>d>2.0 the DSUL instruction may be used to fit S−S units to the density. This can dramatically improve the final phase quality. If DSUL is used, the first MIND parameter should be set to -3.5 (so that each disulfide is found once only) and disulfides should be counted as single (super-sulfur) atoms for FIND.
 
# A common 'user error' is to set MIND -3.5 even though the distances between heavy atoms are less than 3.5 Å.  For example, in a Fe<sub>4</sub>S<sub>4</sub> cluster the Fe...Fe distance is about 2.7 Å, so MIND -2 would be appropriate. A disulfide bond has a length of 2.03 Å so then MIND -1.5 could be used to resolve the sulfur atoms, however if DSUL is used for this purpose MIND -3.5 is required.
=== Number of sites (FIND) ===
# If heavy atoms can lie on special positions (as is the case with an iodide soak in a space group with twofold axes) the rejection of atoms on special positions should be switched off by giving the second MIND parameter as -0.1 (as in the above thaumatin example).
The estimated number of sites (FIND) should be within about 20% of the true number. For SeMet or S-SAD phasing there should be a sharp drop in the occupancy after the last true site. For iodide soaks, a good rule of thumb is to start with a number of iodide sites equal to the number of amino-acids in the asymmetric unit divided by 15. If after SHELXD occupancy refinement the occupancy of the last site is more than 0.2 it might be worth increasing this number, and vice versa.
# In cubic space groups the Patterson seeding (PATS) is slow and less effective, it is recommended that 'PATS' is replaced by 'WEED 0.3'.<br>
 
Note that SHELXD searches for 40% more sites than the number requested with FIND, because there are often additional minor sites arising from other heavy atoms such as Cl or Ca. If, after an initial SHELXD run, you do not lower FIND so that the last (Nth) site in the .res file has an occupancy > 0.2, you can either edit the .res file and remove the sites with occupancy < 0.2, or run SHELXE with -hN, where N is the number of the last site with occupancy > 0.2.
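The two rules of thumb in this section (residues divided by 15 for iodide soaks, and the 0.2 occupancy threshold) can be written down as a small Python sketch. The function names and the example occupancies are hypothetical; the occupancies are assumed to have been read by hand from the SHELXD .res file and to be sorted in decreasing order.

```python
def iodide_find_estimate(n_residues):
    """Starting value for FIND in an iodide soak: residues in the ASU / 15."""
    return max(1, round(n_residues / 15))


def sites_above(occupancies, min_occ=0.2):
    """Count the leading sites whose occupancy exceeds `min_occ`.

    `occupancies` is assumed to be sorted in decreasing order.
    """
    n = 0
    for occ in occupancies:
        if occ <= min_occ:
            break
        n += 1
    return n


# Made-up occupancies, for illustration only.
example_occ = [1.00, 0.85, 0.60, 0.31, 0.18, 0.12]
n_good = sites_above(example_occ)
print(iodide_find_estimate(207))  # 207 residues in the asymmetric unit
print(f"-h{n_good}")              # candidate SHELXE switch, as described above
```

With these made-up numbers the sketch suggests FIND 14 for a 207-residue asymmetric unit and keeps four sites; the threshold of 0.2 is the one quoted in the text and is a guide, not a hard limit.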
 
=== Disulfides (DSUL) ===
If the resolution d (second parameter on the SHEL card) is > 2.0 Å the disulfide bonds may not be fully resolved, but in the range 2.8 > d > 2.0 Å the DSUL instruction may be used to fit S−S units to the density. This can dramatically improve the final phase quality. If DSUL is used, the first MIND parameter should be set to -3.5 (so that each disulfide is found once only) and disulfides should be counted as single (super-sulfur) atoms for FIND (i.e. each disulfide given in DSUL counts as two atoms for FIND).
 
=== Minimum distance between atoms (MIND) ===
A common 'user error' is to set MIND -3.5 even though the distances between heavy atoms are less than 3.5 Å. For example, in a Fe<sub>4</sub>S<sub>4</sub> cluster the Fe...Fe distance is about 2.7 Å, so MIND -2 would be appropriate. A disulfide bond has a length of 2.03 Å, so MIND -1.5 could be used to resolve the individual sulfur atoms; however, if DSUL is used for this purpose, MIND -3.5 is required.
 
If heavy atoms can lie on special positions (as is the case with an iodide soak in a space group with twofold axes) the rejection of atoms on special positions should be switched off by giving the second MIND parameter as -0.1 .
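The 'user error' described above is easy to check for before starting a run. The following Python sketch is a hypothetical helper (SHELXD itself performs no such check, as far as this page states); it only encodes the two rules given in the text: with DSUL the first MIND parameter should be -3.5, and otherwise its magnitude should be smaller than the shortest expected heavy-atom distance.

```python
def check_mind(mind, shortest_distance, use_dsul=False):
    """Return a list of warnings for the first MIND parameter.

    mind              -- first MIND parameter as given to SHELXD (e.g. -3.5)
    shortest_distance -- shortest expected heavy-atom distance in Angstrom
    use_dsul          -- True if DSUL is used to resolve disulfides
    """
    warnings = []
    if use_dsul:
        if abs(mind + 3.5) > 1e-6:
            warnings.append("DSUL is in use: set the first MIND parameter to -3.5")
    elif abs(mind) >= shortest_distance:
        warnings.append(
            f"|MIND| = {abs(mind)} is not below the shortest expected "
            f"distance of {shortest_distance} A"
        )
    return warnings


print(check_mind(-3.5, 2.7))                   # Fe4S4 cluster: warns
print(check_mind(-2.0, 2.7))                   # Fe4S4 cluster: no warning
print(check_mind(-3.5, 2.03, use_dsul=True))   # disulfides with DSUL: no warning
```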


=== Interpretation of results ===
For MAD, a CC of 40 to 50% indicates a good solution; for SAD etc., values around 30% may well be correct, especially if the same solution or group of solutions has the highest values of CC, CC(weak) and PATFOM, and they are well separated from the values for the non-solutions.  The CC values tend to increase as the resolution is lowered.  Heavy atom soaks truncated to low resolution often give spuriously high CC values, but these 'solutions' can be recognized as false by their low CC(weak) values.<br>
For MAD, a CC of 40 to 50% indicates a good solution; for SAD etc., values around 30% may well be correct, especially if the same solution or group of solutions has the highest values of CC, CC(weak) and PATFOM, and they are well separated from the values for the non-solutions.  The CC values tend to increase as the resolution is lowered.  Heavy atom soaks truncated to low resolution often give spuriously high CC values, but these 'solutions' can be recognized as false by their low CC(weak) values.<br>


Line 164: Line 178:
(or xx_i.hat).
(or xx_i.hat).


=== Full list of SHELXE options (defaults in brackets) ===
=== Full SHELXE help output ===


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+  SHELXE  -  PHASING AND DENSITY MODIFICATION  -  Version 2023/1  +
+  Copyright (c)  George M. Sheldrick and Isabel Uson 2001-23      +
+  Started at 18:30:57 on 24 Jan 2024                              +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A typical SHELXE job for SAD, MAD, SIR or SIRAS phasing could be:
shelxe xx xx_fa -s0.5 -z -a10 -O
where xx.hkl contains native data and xx_fa.hkl, which should have
been created by SHELXC or XPREP, contains FA and alpha. The heavy
atoms are read from xx_fa.res, which can be generated by SHELXD or
ANODE. 'xx' and 'xx_fa' may be replaced by any strings that make
legal file names. If these heavy atoms are present in the native
structure (e.g. for sulfur-SAD but not SIRAS for an iodide soak)
-h is required (or e.g. -h8 to use only the first 8). -z optimizes
the substructure at the start of the phasing. -z9 limits the number
of heavy atoms to 9. If -z is specified without a number,
no limit is imposed. Normally the heavy atom enantiomorph is not
known, so SHELXE should also be run with the -i switch to invert
the heavy atoms and if necessary the space group; this writes
files xx_i.phs instead of xx.phs etc., so may be run in parallel.
-a sets the number of global autotracing cycles. -a not followed
by a number sets 30 cycles or three cycles after a CC of 30 has been
exceeded, whichever is less. -n generates NCS operators from heavy
atom positions, e.g. -n6 for six-fold NCS or -n if the number of
copies is not known. -n imposes NCS during tracing. If NCS is
defined in a pda file -n may not be used. -p traces a DNA or RNA
backbone, -p10 would restrict this search to 10 phosphates.
To start from a MR model without other phase information, the PDB
file from MR should be renamed xx.pda and input to SHELXE, e.g.
shelxe xx.pda -s0.5 -a20
The number of tracing cycles is usually more here to reduce model
bias. If the MR model is large but does not fit well, -o
should be included to prune it before density modification, the
revised model is then written to xx.pdo.
Tracing from an MR model requires a favorable combination of model
quality, solvent content and data resolution. If e.g. SAD phase
information is available, even if it is too weak for phasing on
its own, the two approaches may be combined:
shelxe xx.pda xx_fa -s0.5 -a10 -h -z
The phases from the MR model are used to generate the heavy atom
substructure. This is used to derive experimental phases that are
then combined with the phases from the MR model (MRSAD). The -h,
-o and -z flags are often needed for this mode.
If approximate phases are available, SHELXE may be used to refine
them and make a poly-Ala trace:
shelxe xx.zzz -s0.5 -a3
where zzz is phi (phs file format), fcf (from SHELXL) or hlc
(Hendrickson-Lattman coefficients, e.g. from SHARP or BP3).
In all cases, native data are read from xx.hkl in SHELX format,
and the density modified phases are output to xx.phs (or xx_i.phs
if -i was set). The listing file is xx.lst (or xx_i.lst). If
xx_fa.hkl is read, substructure phases are output to xx.pha (or
xx_i.pha) and the revised substructure is written to xx.hat
(or xx_i.hat). If -o is used to improve a model in xx.pda, the
revised model is output to xx.pdo.
Full list of SHELXE options (defaults in brackets):
==================================================
  -aN - N cycles autotracing [off]
  -aN - N cycles autotracing [off]
-AX - maximum random initial rotation in deg. for -O [-A3.0]
  -bX - B-value to weight anomalous map (xx.pha and xx.hat) [-b5.0]
  -bX - B-value to weight anomalous map (xx.pha and xx.hat) [-b5.0]
  -B or -B1 - refine one B-value for complete trace [off]
  -B1 - anti-parallel beta sheet, -B2 parallel and -B3 both [off]
-B2 - refine one B-value per traced chain [off]
-B3 - refine one B-value per traced residue [on]
  -cX - fraction of pixels in crossover region [-c0.4]
  -cX - fraction of pixels in crossover region [-c0.4]
  -dX - truncate reflection data to X Angstroms [off]
  -dX - truncate reflection data to X Angstroms [off]
-D  - fuse disulfides before looking for NCS [off]
  -eX - add missing 'free lunch' data up to X Angstroms [dmin+0.2]
  -eX - add missing 'free lunch' data up to X Angstroms [dmin+0.2]
  -f  - read F rather than intensity from native .hkl file [off]
  -f  - read F rather than intensity from native .hkl file [off]
Line 181: Line 268:
  -h or -hN - (N) heavy atoms also present in native structure [-h0]
  -h or -hN - (N) heavy atoms also present in native structure [-h0]
  -i  - invert space group and input (sub)structure or phases [off]
  -i  - invert space group and input (sub)structure or phases [off]
  -IN - in global cycle 1 only, do N cycles DM (free lunch if -e set) [off]
  -IN - in cycle 1 only, do N cycles DM (free lunch if -e) [off]
  -kX - minimum height/sigma for heavy atom sites in xx.hat [-k4.5]
  -kX - minimum height/sigma for heavy atom sites in xx.hat [-k4.5]
  -KN - keep starting fragment unchanged for N global cycles [off]
  -KN - keep starting fragment unchanged for N global cycles [off]
Line 189: Line 276:
  -mN - N iterations of density modification per global cycle [-m20]
  -mN - N iterations of density modification per global cycle [-m20]
  -n or -nN - apply N-fold NCS to traces [off]
  -n or -nN - apply N-fold NCS to traces [off]
-O or -ON - N random-start rigid-group domain searches [off]
  -o or -oN - prune up to N residues to optimize CC for xx.pda [off]
  -o or -oN - prune up to N residues to optimize CC for xx.pda [off]
  -q - search for alpha-helices [off]
  -O  - trace side chains [off]
-p or -pN - search for N DNA or RNA phosphates (-p = -p12) [off]
  -qN - search for alpha-helices of length 6<N<15; -q sets -q7 [off]
-Q  - search for 12-helix, extended by sliding (overrides -q) [off]
  -rX - FFT grid set to X times maximum indices [-r3.0]
  -rX - FFT grid set to X times maximum indices [-r3.0]
  -sX - solvent fraction [-s0.45]
  -sX - solvent fraction [-s0.45]
  -tX - time factor for helix and peptide search [-t1.0]
-SX - radius of sphere of influence. Increase for low res [-S2.42]
  -tX - time for initial searches (-t3 or more if difficult) [-t1.0]
  -uX - allocable memory in MB for fragment optimization [-u500]
  -uX - allocable memory in MB for fragment optimization [-u500]
  -UX - abort if less than X% of initial CA stay within 0.7A [-U0]
  -UX - abort if less than X% of initial CA stay within 0.7A [-U0]
Line 202: Line 292:
  -yX - highest resol. in Ang. for calc. phases from xx.pda [-y1.8]
  -yX - highest resol. in Ang. for calc. phases from xx.pda [-y1.8]
  -zN - substructure optimization for a maximum of N atoms [off]
  -zN - substructure optimization for a maximum of N atoms [off]
  -z  - subucture optimization, number of atoms not limited [off]
  -z  - substructure optimization, number of atoms not limited [off]
  -ZX - maximum shift in Ang. from initial position for -O [-Z1.0]
  -t values of 3.0 or more switch to more accurate but appreciably
slower tracing algorithms, this is recommended when the resolution
is poor or the initial phase information is weak; -a10 is preferred.
In case of side chain tracing with -O, sequence will be docked
and output only once CC>30 so poly-alanine tracing scores
can be used to identify solutions as before.
Please cite: I. Uson & G.M. Sheldrick (2018), "An introduction to
experimental phasing of macromolecules illustrated by SHELX;
new autotracing features" Acta Cryst. D74, 106-116
(Open Access) if SHELXE proves useful.
 
Meaning of additional output when using the -x option:
 
MPE and wMPE are given as two numbers, the one after the '/' is for centric reflections only.
 
The first nine numbers in the row after locating a strand or in the 'Global chain diagnostics' are the percentages of CA within 0-0.1, 0.1-0.2, 0.2-0.3Å etc from the nearest CA in the reference structure. The tenth number is the percentage further than 0.9Å from the nearest CA.
 
The next number is 100 times the number of CA found divided by the number expected for the whole structure. The last number is the mean distance of a CA atom from the nearest CA in the reference structure, whereby distances greater than 2.5Å are replaced by 2.5. One should always look at the second number from the right; for a good trace it should be as low as possible. If you are expanding from a MR solution the program also tells you the percentages of starting atoms retained.


=== Phasing and density modification ===
=== Phasing and density modification ===
Line 254: Line 363:
for a solved structure (25 to 50%). The solution with the best CC is
for a solved structure (25 to 50%). The solution with the best CC is
written to name.pdb and its phases to name.phs for input to e.g. Coot.
written to name.pdb and its phases to name.phs for input to e.g. Coot.
=== How to tell SHELXE about NCS in a molecular replacement solution PDB file ===
(communicated by Isabel Usón) Insert a line
REMARK 299 NCS GROUP BEGIN
before the ATOM (or HETATM) lines of each NCS group (e.g. chain), and insert the line
REMARK 299 NCS GROUP END
after the last of these. The -n option is not needed then. The output of SHELXE should tell you about the fact that it understood the NCS specification.


== RIP with SHELXC/D/E ==
== RIP with SHELXC/D/E ==
Line 336: Line 453:
==  SAD/MAD with automatic backbone building ==
==  SAD/MAD with automatic backbone building ==


  shelxe.beta exp1 exp1_fa -a -q -h -s0.6 -m20 -b
  shelxe exp1 exp1_fa -a -q -h -s0.6 -m20 -b


will use exp1.hkl, exp1_fa.hkl, exp1.ins (as above) and will try 3 cycles of backbone building.
will use exp1.hkl, exp1_fa.hkl, exp1.ins (as above) and will try 3 cycles of backbone building.
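Because the heavy-atom enantiomorph is normally not known, the same job is usually repeated with -i; the help output above notes that the inverted job writes xx_i.* files and can therefore run in parallel. A minimal Python sketch that only assembles the two argument lists (it does not start SHELXE; the helper name <code>both_hands</code> is hypothetical):

```python
def both_hands(native, fa, options):
    """Return the SHELXE argument lists for the original and inverted hand."""
    original = ["shelxe", native, fa, *options]
    inverted = original + ["-i"]  # -i inverts the substructure (and space group if needed)
    return original, inverted


orig_cmd, inv_cmd = both_hands(
    "exp1", "exp1_fa", ["-a", "-q", "-h", "-s0.6", "-m20", "-b"]
)
print(" ".join(orig_cmd))
print(" ".join(inv_cmd))
```

The two lists could then be passed to, for example, <code>subprocess.Popen</code> to run both hands side by side; the hand that gives the clearly higher CC and a traceable map is the correct one.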
Line 359: Line 476:
== Obtaining the SHELX programs ==
== Obtaining the SHELX programs ==


SHELXC/D/E and test data may be downloaded from the SHELX fileserver (shelx97 directory). The application form should be printed out from http://shelx.uni-ac.gwdg.de/SHELX/ This form should be completed and faxed to +49-551-3922582.  Downloading instructions will then be emailed to the address given on the form, so please write the email address CLEARLY.  The programs are free to academics but a small license fee is required for 'for-profit' use.   
SHELXC/D/E are distributed with [https://www.ccp4.ac.uk/ CCP4].
 
The programs and test data may also be downloaded from the [http://shelx.uni-goettingen.de/bin/ SHELX fileserver]. First fill in the application form at http://shelx.uni-goettingen.de/register.php ; a password and downloading instructions will then be emailed to the address given on the form.  The programs are free to academics but a small license fee is required for 'for-profit' use.   


Beta-test versions are also available from time to time. They are announced by George Sheldrick and are available from the beta-test directory. The username and password for accessing these may be obtained from GS.
Beta-test versions are also available from time to time. They are announced by George Sheldrick and are available from the beta-test directory. The username and password for accessing these may be obtained from GS.


The data merging and internal correlation coefficient as a function of resolution have been fine-tuned in the beta-test SHELXC (available since Aug 19, 2011). The changes only apply when unmerged data are read in and should improve very weak noisy anomalous data.
[[hkl2map]] can be downloaded from a website at EMBL Hamburg. XDSGUI can be downloaded from its [[xds:XDSGUI|XDSwiki article]].
 
== See also ==
 
[[Solve a small-molecule structure]]


== References ==
== References ==


If these programs prove useful, you may wish to cite (and read!):<br>
If these programs prove useful, you may wish to cite (and read!):<br>
Sheldrick, G.M. (2008). "A short history of SHELX", ''Acta Crystallogr''. '''D64''', 112-122 [''Standard reference for all SHELX... programs''].<br>


Sheldrick, G.M., Hauptman, H.A., Weeks, C.M., Miller, R. & Usón, I. (2001). "Ab initio phasing". In ''International Tables for Crystallography'', Vol. F, Eds. Rossmann, M.G. & Arnold, E., IUCr and Kluwer Academic Publishers, Dordrecht pp. 333-351 [''Full background to the dual-space recycling used in SHELXD''].<br>
Sheldrick, G.M., Hauptman, H.A., Weeks, C.M., Miller, R. & Usón, I. (2001). "Ab initio phasing". In ''International Tables for Crystallography'', Vol. F, Eds. Rossmann, M.G. & Arnold, E., IUCr and Kluwer Academic Publishers, Dordrecht pp. 333-351 [''Full background to the dual-space recycling used in SHELXD''].<br>
Line 379: Line 500:
Nanao, M.H., Sheldrick, G.M. & Ravelli, R.B.G. (2005). "Improving radiation-damage substructures for RIP", ''Acta Crystallogr''. '''D61''', 1227-1237 [''Practical details of RIP phasing with SHELXC/D/E''].<br>
Nanao, M.H., Sheldrick, G.M. & Ravelli, R.B.G. (2005). "Improving radiation-damage substructures for RIP", ''Acta Crystallogr''. '''D61''', 1227-1237 [''Practical details of RIP phasing with SHELXC/D/E''].<br>


Uson, I., Stevenson, C.E.M., Lawson, D.M. & Sheldrick, G.M. (2007). "Structure determination of the O-methyltransferase NovP using the `free lunch algorithm' as implemented in SHELXE", ''Acta Crystallogr''. '''D63''', 1069-1074 [''Implementation of the FLA in SHELXE''].<br>
Usón, I., Stevenson, C.E.M., Lawson, D.M. & Sheldrick, G.M. (2007). "Structure determination of the O-methyltransferase NovP using the `free lunch algorithm' as implemented in SHELXE", ''Acta Crystallogr''. '''D63''', 1069-1074 [''Implementation of the FLA in SHELXE''].<br>
 
[http://scripts.iucr.org/cgi-bin/paper?sc5010 Sheldrick, G.M. (2008). "A short history of SHELX", ''Acta Crystallogr''. '''D64''', 112-122] [''Standard reference for all SHELX* programs''].
 
[http://dx.doi.org/10.1107/S0907444909038360 Sheldrick, G.M. (2010). "Experimental phasing with SHELXC/D/E: combining chain tracing with density modification", ''Acta Cryst'' '''D66''', 479-485.]
 
[https://doi.org/10.1107/S0907444913027534 Thorn, A. & Sheldrick, G.M. (2013). "Extending molecular-replacement solutions with SHELXE", ''Acta Cryst'' '''D69''', 2251-2256.]
 
 
[https://journals.iucr.org/d/issues/2018/02/00/ba5271/ba5271.pdf Usón, I. & Sheldrick, G. M. (2018). An introduction to experimental phasing of macromolecules illustrated by SHELX; new autotracing features. ''Acta Cryst.'' '''D74''', 106-116.]
 


Sheldrick, G.M. (2010). "Experimental phasing with SHELXC/D/E: combining chain tracing with density modification", ''Acta Cryst'' '''D66''', 479-485. "Open Access" at http://dx.doi.org/10.1107/S0907444909038360
[https://journals.iucr.org/d/issues/2024/01/00/qu5004/index.html Usón, I. & Sheldrick, G. M. (2024). Modes and model building in ''SHELXE''. ''Acta Cryst.'' '''D80''', 4-15.]


<br>
<br>See also the [http://shelx.uni-goettingen.de/ SHELX homepage]
See also the SHELX homepage at: http://shelx.uni-ac.gwdg.de/SHELX/
<br>
<br>