SHELX C/D/E: Difference between revisions

2,468 bytes added ,  30 January 2020
no edit summary
No edit summary
(17 intermediate revisions by 2 users not shown)
Line 1: Line 1:
SHELXC, SHELXD and SHELXE are stand-alone executables that do not require environment variables or parameter files etc., so all that is needed to install them is to put them in a directory that is in the ‘path’ (e.g. /usr/local/bin or ~/bin under Linux). There is a detailed description of these programs in the paper: <i>"Experimental phasing with SHELXC/D/E: combining chain tracing with density modification"</i>. Sheldrick, G.M. (2010). <i>Acta Cryst.</i> <b>D66</b>, 479-485. It is  
SHELXC, SHELXD and SHELXE are stand-alone executables that do not require environment variables or parameter files etc., so all that is needed to install them is to put them in a directory that is in the ‘path’ (e.g. /usr/local/bin or ~/bin under Linux). There is a detailed description of these programs in the paper: <i>"Experimental phasing with SHELXC/D/E: combining chain tracing with density modification"</i>. Sheldrick, G.M. (2010). <i>Acta Cryst.</i> <b>D66</b>, 479-485. It is  
available as "Open Access" at http://dx.doi.org/10.1107/S0907444909038360 and should be cited whenever these programs are used.
available as "Open Access" at http://dx.doi.org/10.1107/S0907444909038360 and should be cited whenever these programs are used.
[[hkl2map]] is a graphical user interface that makes it easy to use these programs.
[[xds:xdsgui|XDSGUI]] is a graphical user interface for XDS that also makes it easy to use these programs.




Line 48: Line 53:
data should be in SHELX .hkl or SCALEPACK .sca format; many other programs,
data should be in SHELX .hkl or SCALEPACK .sca format; many other programs,
including SCALA and XPREP, can output .sca format too. The keywords CELL,
including SCALA and XPREP, can output .sca format too. The keywords CELL,
SPAG (space group) SPAG (space group) and FIND (number of heavy atoms) are
SPAG (space group) and FIND (number of heavy atoms) are
always required, SFAC, MIND, NTRY, SHEL, ESEL and DSUL may be given and are
always required, SFAC, MIND, NTRY, SHEL, ESEL and DSUL may be given and are
written to the file xx_fa.ins for SHELXD. MAXM can be used to reserve
written to the file xx_fa.ins for SHELXD. MAXM can be used to reserve
Line 72: Line 77:
inconsistent indexing when more than one dataset is involved. In addition,
inconsistent indexing when more than one dataset is involved. In addition,
the mean value of |E^2-1| is calculated for each dataset to detect twinning.
the mean value of |E^2-1| is calculated for each dataset to detect twinning.


== SHELXD ==
== SHELXD ==


=== critical parameters ===


In general the critical parameters for locating heavy atoms with SHELXD are:
In general the critical parameters for locating heavy atoms with SHELXD are:


# The resolution cutoff. In the MAD case this is best determined by finding where the correlation coefficient between the signed anomalous differences for wavelengths with the highest anomalous signal (PEAK and HREM or PEAK and INFL) falls below about 30%. For SAD a less reliable guide is where the mean value of |&Delta;F|/&sigma;(&Delta;F) falls below about 1.2 (a value of 0.8 would indicate pure noise), and for S-SAD with CuK&alpha; the data can be truncated where I/&sigma; for the native data falls below 30. If unmerged data are used, SHELXC calculates a correlation coefficient between two randomly selected subsets of the signed anomalous differences; this is a better indicator because it does not require that the intensity esds are on an absolute scale, but it does require a reasonable redundancy and again the data can be truncated where it drops to below 30% (the CCP4 program SCALA prints a similar statistic).
=== Resolution cutoff (SHEL) ===
# The estimated number of sites (FIND) should be within about 20% of the true number. For SeMet or S-SAD phasing there should be a sharp drop in the occupancy after the last true site. For iodide soaks, a good rule of thumb is to start with a number of iodide sites equal to the number of amino-acids in the asymmetric unit divided by 15. If after SHELXD occupancy refinement the occupancy of the last site is more than 0.2 it might be worth increasing this number, and vice versa.
In the MAD case this is best determined by finding where the correlation coefficient between the signed anomalous differences for wavelengths with the highest anomalous signal (PEAK and HREM or PEAK and INFL) falls below about 30%. For SAD a less reliable guide is where the mean value of |&Delta;F|/&sigma;(&Delta;F) falls below about 1.2 (a value of 0.8 would indicate pure noise), and for S-SAD with CuK&alpha; the data can be truncated where I/&sigma; for the native data falls below 30. If unmerged data are used, SHELXC calculates a correlation coefficient between two randomly selected subsets of the signed anomalous differences; this is a better indicator because it does not require that the intensity esds are on an absolute scale, but it does require a reasonable redundancy and again the data can be truncated where it drops to below 30% (XDS and the CCP4 programs aimless/SCALA print a similar statistic).
# A common 'user error' is to set MIND -3.5 even though the distances between heavy atoms are less than 3.5 Å.  For example, in a Fe<sub>4</sub>S<sub>4</sub> cluster the Fe...Fe distance is about 2.7 Å, so MIND -2 would be appropriate. A disulfide bond has a length of 2.03 Å so then MIND -1.5 could be used to resolve the sulfur atoms, however if DSUL is used for this purpose MIND -3.5 is required.
# If heavy atoms can lie on special positions (as is the case with an iodide soak in a space group with twofold axes) the rejection of atoms on special positions should be switched off by giving the second MIND parameter as -0.1 (as in the above thaumatin example).
# In cubic space groups the Patterson seeding (PATS) is slow and less effective, it is recommended that 'PATS' is replaced by 'WEED 0.3'.<br>
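The resolution-cutoff rule described above (truncate where the anomalous correlation coefficient falls below about 30%) can be sketched in a few lines of Python. This is only an illustration: the function name, the input layout (pairs of shell resolution limit and CC in percent) and the example numbers are assumptions, not SHELXC output or part of any SHELX program.

```python
def resolution_cutoff(shells, threshold=30.0):
    """Return the high-resolution limit (in Angstrom) of the last shell
    whose anomalous CC is still at or above `threshold` percent.

    `shells` is a list of (d_min, cc_percent) pairs ordered from low to
    high resolution, i.e. with decreasing d_min. Returns None if even
    the first shell is below the threshold.
    """
    cutoff = None
    for d_min, cc in shells:
        if cc < threshold:
            break  # first shell below the threshold: stop here
        cutoff = d_min
    return cutoff


# Made-up values for illustration only (not real SHELXC output).
example_shells = [(8.0, 85.2), (6.0, 78.9), (5.0, 66.0), (4.0, 51.3),
                  (3.5, 38.7), (3.2, 27.5), (3.0, 31.0), (2.8, 12.4)]
print(resolution_cutoff(example_shells))  # prints 3.5
```

The sketch stops at the first shell that drops below the threshold, so a later shell that happens to rise above 30% again (3.0 Å in the example) is ignored. The resulting value would be a candidate for the second parameter of the SHEL instruction.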


=== Number of sites (FIND) ===
The estimated number of sites (FIND) should be within about 20% of the true number. For SeMet or S-SAD phasing there should be a sharp drop in the occupancy after the last true site. For iodide soaks, a good rule of thumb is to start with a number of iodide sites equal to the number of amino-acids in the asymmetric unit divided by 15. If after SHELXD occupancy refinement the occupancy of the last site is more than 0.2 it might be worth increasing this number, and vice versa.
Note that SHELXD searches for 40% more sites than the number requested with FIND, because there are often additional minor sites arising from heavy atoms such as Cl or Ca. If, after an initial SHELXD run, you do not adjust FIND downwards so that the last (Nth) site in the .res file has occupancy > 0.2, you can either edit the .res file and remove the sites with occupancy < 0.2, or run SHELXE with -hN, where N is the number of the last site with occupancy > 0.2.
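As a minimal sketch of the occupancy rule above, the following Python function returns N, the number of the last site whose occupancy exceeds 0.2. The function name and the example occupancies are hypothetical; reading the occupancies out of the .res file is deliberately left out, since the sketch makes no assumption about that file's layout.

```python
def sites_to_keep(occupancies, min_occupancy=0.2):
    """Return N, the 1-based number of the last site whose occupancy is
    above `min_occupancy`, or 0 if no site qualifies.

    `occupancies` lists the refined site occupancies in the order in
    which the sites appear in the .res file.
    """
    n = 0
    for index, occupancy in enumerate(occupancies, start=1):
        if occupancy > min_occupancy:
            n = index
    return n


# Made-up occupancies for illustration only.
example_occupancies = [1.00, 0.93, 0.88, 0.81, 0.35, 0.17, 0.12]
n = sites_to_keep(example_occupancies)
print(f"-h{n}")  # prints -h5
```

The same N could be used as a revised FIND value for a second SHELXD run.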
=== Disulfides (DSUL) ===
If the resolution d (second parameter on SHEL card) is > 2.0Å the disulfide bonds may not be fully resolved, but in the range 2.8>d>2.0 the DSUL instruction may be used to fit S−S units to the density. This can dramatically improve the final phase quality. If DSUL is used, the first MIND parameter should be set to -3.5 (so that each disulfide is found once only) and disulfides should be counted as single (super-sulfur) atoms for FIND (i.e. each disulfide given in DSUL counts as two atoms for FIND).
=== Minimum distance between atoms (MIND) ===
A common 'user error' is to set MIND -3.5 even though the distances between heavy atoms are less than 3.5 Å.  For example, in a Fe<sub>4</sub>S<sub>4</sub> cluster the Fe...Fe distance is about 2.7 Å, so MIND -2 would be appropriate. A disulfide bond has a length of 2.03 Å so then MIND -1.5 could be used to resolve the sulfur atoms, however if DSUL is used for this purpose MIND -3.5 is required.
If heavy atoms can lie on special positions (as is the case with an iodide soak in a space group with twofold axes) the rejection of atoms on special positions should be switched off by giving the second MIND parameter as -0.1 (as in the above thaumatin example).
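The MIND pitfalls described above can be turned into a simple sanity check. The sketch below is an assumption-laden illustration: the function name and its arguments are hypothetical, and it only encodes the two rules stated in the text (the first MIND parameter must be shorter than the shortest expected heavy-atom distance, unless DSUL is used, in which case it should be -3.5).

```python
def check_mind(first_mind, shortest_distance, use_dsul=False):
    """Return a list of warnings for the first MIND parameter.

    `first_mind` is the value as given on the MIND instruction (usually
    negative, e.g. -3.5); `shortest_distance` is the shortest expected
    heavy-atom distance in Angstrom.
    """
    warnings = []
    min_dist = abs(first_mind)
    if use_dsul:
        # With DSUL each disulfide should be found once only.
        if min_dist != 3.5:
            warnings.append("DSUL is used: first MIND parameter should be -3.5")
    elif min_dist >= shortest_distance:
        warnings.append(
            f"MIND {first_mind} cannot resolve atoms {shortest_distance} A apart"
        )
    return warnings


print(check_mind(-3.5, 2.7))   # Fe4S4 cluster: one warning
print(check_mind(-2.0, 2.7))   # prints []
```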
=== Interpretation of results ===
For MAD, a CC of 40 to 50% indicates a good solution, for SAD etc. values around 30% may well be correct, especially if the same solution or group of solutions has the highest values of CC, CC(Weak) and PATFOM, and they are well separated from the values for the non-solutions.  The CC values tend to increase as the resolution is lowered.  Heavy atom soaks truncated to low resolution often give spuriously high CC values, but these 'solutions' can be recognized as false by their low CC(weak) values.<br>
For MAD, a CC of 40 to 50% indicates a good solution, for SAD etc. values around 30% may well be correct, especially if the same solution or group of solutions has the highest values of CC, CC(Weak) and PATFOM, and they are well separated from the values for the non-solutions.  The CC values tend to increase as the resolution is lowered.  Heavy atom soaks truncated to low resolution often give spuriously high CC values, but these 'solutions' can be recognized as false by their low CC(weak) values.<br>


Line 169: Line 183:
  -AX - maximum random initial rotation in deg. for -O [-A3.0]
  -AX - maximum random initial rotation in deg. for -O [-A3.0]
  -bX - B-value to weight anomalous map (xx.pha and xx.hat) [-b5.0]
  -bX - B-value to weight anomalous map (xx.pha and xx.hat) [-b5.0]
-B or -B1 - refine one B-value for complete trace [off]
-B2 - refine one B-value per traced chain [off]
-B3 - refine one B-value per traced residue [on]
  -cX - fraction of pixels in crossover region [-c0.4]
  -cX - fraction of pixels in crossover region [-c0.4]
  -dX - truncate reflection data to X Angstroms [off]
  -dX - truncate reflection data to X Angstroms [off]
Line 181: Line 192:
  -h or -hN - (N) heavy atoms also present in native structure [-h0]
  -h or -hN - (N) heavy atoms also present in native structure [-h0]
  -i  - invert space group and input (sub)structure or phases [off]
  -i  - invert space group and input (sub)structure or phases [off]
  -IN - in global cycle 1 only, do N cycles DM (free lunch if -e set) [off]
  -IN - in cycle 1 only, do N cycles DM (free lunch if -e) [off]
  -kX - minimum height/sigma for heavy atom sites in xx.hat [-k4.5]
  -kX - minimum height/sigma for heavy atom sites in xx.hat [-k4.5]
  -KN - keep starting fragment unchanged for N global cycles [off]
  -KN - keep starting fragment unchanged for N global cycles [off]
Line 201: Line 212:
  -x  - diagnostics, requires PDB reference file xx.ent [off]
  -x  - diagnostics, requires PDB reference file xx.ent [off]
  -yX - highest resol. in Ang. for calc. phases from xx.pda [-y1.8]
  -yX - highest resol. in Ang. for calc. phases from xx.pda [-y1.8]
-YX - SAD phase shift factor [-Y0.5]
  -zN - substructure optimization for a maximum of N atoms [off]
  -zN - substructure optimization for a maximum of N atoms [off]
  -z - subucture optimization, number of atoms not limited [off]
  -z - substructure optimization, number of atoms not limited [off]
  -ZX - maximum shift in Ang. from initial position for -O [-Z1.0]
  -ZX - maximum shift in Ang. from initial position for -O [-Z1.0]
Meaning of additional output when using the -x option:
MPE and wMPE are given as two numbers, the one after the '/' is for centric reflections only.
The first nine numbers in the row after locating a strand or in the 'Global chain diagnostics' are the percentages of CA within 0-0.1, 0.1-0.2, 0.2-0.3Å etc from the nearest CA in the reference structure. The tenth number is the percentage further than 0.9Å from the nearest CA.
The next number is 100 times the number of CA found divided by the number expected for the whole structure. The last number is the mean distance of a CA atom from the nearest CA in the reference structure, whereby distances greater than 2.5Å are replaced by 2.5. One should always look at the second number from the right; for a good trace it should be as low as possible. If you are expanding from a MR solution the program also tells you the percentages of starting atoms retained.


=== Phasing and density modification ===
=== Phasing and density modification ===
Line 226: Line 246:
=== The free lunch algorithm (FLA) ===
=== The free lunch algorithm (FLA) ===


The new switch -e may be used to extrapolate the data to the specified resolution (the '''''free lunch algorithm'''''), based closely on work by the Bari group (Caliandro ''et al''., ''Acta Crystallogr''. (2005) '''D61''', 556-565) and independently implemented in the program [[Acorn]] (Yao ''et al''., (2005) ''Acta Crystallogr''. '''D61''', 1465-1475): -e1.0 can produce spectacular results when applied to data collected to 1.6 to 2.0 Å, but since a large number of cycles is required (-m400) and the 'contrast' and 'connectivity' become unreliable (the pseudo-free CC is the only reliable map quality indicator when the FLA is used), it may be best to establish the substructure enantiomorph and solvent content without -e first. The default setting when -e is not specified is to fill in missing low and medium resolution data but not to extrapolate to higher resolution than actually measured (to switch off this filling in, use -e999). The resolution requirements for the FLA still need to be explored, but so far there have been no reports of it causing a deterioration in map quality, and in a few cases the mean phase error was reduced by as much as 30º relative to density modification without it.<br>
The switch -e may be used to extrapolate the data to the specified resolution (the '''''free lunch algorithm'''''), based closely on work by the Bari group (Caliandro ''et al''., ''Acta Crystallogr''. (2005) '''D61''', 556-565) and independently implemented in the program [[Acorn]] (Yao ''et al''., (2005) ''Acta Crystallogr''. '''D61''', 1465-1475): -e1.0 can produce spectacular results when applied to data collected to 1.6 to 2.0 Å, but since a large number of cycles is required (-m400) and the 'contrast' and 'connectivity' become unreliable (the pseudo-free CC is the only reliable map quality indicator when the FLA is used), it may be best to establish the substructure enantiomorph and solvent content without -e first. The default setting when -e is not specified is to fill in missing low and medium resolution data but not to extrapolate to higher resolution than actually measured (to switch off this filling in, use -e999). The resolution requirements for the FLA still need to be explored, but so far there have been no reports of it causing a deterioration in map quality, and in a few cases the mean phase error was reduced by as much as 30º relative to density modification without it.<br>


=== how to find out if a molecular replacement solution is correct or wrong ===
=== How to find out if a molecular replacement solution is correct or wrong ===


From a [https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1111&L=ccp4bb&F=&S=&P=41951 November 2011 posting of George Sheldrick on CCP4BB]: We have unintentionally discovered a very simple way of telling whether
From a [https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1111&L=ccp4bb&F=&S=&P=41951 November 2011 posting of George Sheldrick on CCP4BB]: We have unintentionally discovered a very simple way of telling whether
Line 254: Line 274:
for a solved structure (25 to 50%). The solution with the best CC is
for a solved structure (25 to 50%). The solution with the best CC is
written to name.pdb and its phases to name.phs for input to e.g. Coot.
written to name.pdb and its phases to name.phs for input to e.g. Coot.
=== How to tell SHELXE about NCS in a molecular replacement solution PDB file ===
(communicated by Isabel Uson) Insert a line
REMARK 299 NCS GROUP BEGIN
before the ATOM (or HETATM) lines of each NCS group (e.g. chain), and insert the line
REMARK 299 NCS GROUP END
after the last of these. The -n option is not needed then. The output of SHELXE should tell you about the fact that it understood the NCS specification.
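A minimal Python sketch of the REMARK 299 insertion described above. It treats each chain as one NCS group and assumes that the ATOM/HETATM records of a chain are contiguous, with no other record types (such as ANISOU) interleaved. The function name is hypothetical, and the chain identifier is read from column 22 of the record, as in the standard PDB format.

```python
NCS_BEGIN = "REMARK 299 NCS GROUP BEGIN"
NCS_END = "REMARK 299 NCS GROUP END"


def mark_ncs_groups(pdb_lines):
    """Return a copy of `pdb_lines` with REMARK 299 lines inserted
    before the first and after the last ATOM/HETATM line of each chain."""
    out = []
    current_chain = None  # chain ID of the group that is currently open
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21:22]  # column 22 holds the chain identifier
            if chain != current_chain:
                if current_chain is not None:
                    out.append(NCS_END)  # close the previous chain
                out.append(NCS_BEGIN)
                current_chain = chain
        elif current_chain is not None:
            out.append(NCS_END)  # a non-atom record ends the group
            current_chain = None
        out.append(line)
    if current_chain is not None:
        out.append(NCS_END)  # file ended while a group was open
    return out


# Truncated records, for illustration only.
example = [
    "REMARK example input",
    "ATOM      1  CA  ALA A   1",
    "ATOM      2  CA  GLY A   2",
    "ATOM      3  CA  ALA B   1",
    "END",
]
print("\n".join(mark_ncs_groups(example)))
```

For groups that do not coincide with chains, the BEGIN and END lines would have to be placed by hand as described above.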


== RIP with SHELXC/D/E ==
== RIP with SHELXC/D/E ==
Line 336: Line 364:
==  SAD/MAD with automatic backbone building ==
==  SAD/MAD with automatic backbone building ==


  shelxe.beta exp1 exp1_fa -a -q -h -s0.6 -m20 -b
  shelxe exp1 exp1_fa -a -q -h -s0.6 -m20 -b


will use exp1.hkl, exp1_fa.hkl, exp1.ins (as above) and will try 3 cycles of backbone building.
will use exp1.hkl, exp1_fa.hkl, exp1.ins (as above) and will try 3 cycles of backbone building.
Line 359: Line 387:
== Obtaining the SHELX programs ==
== Obtaining the SHELX programs ==


SHELXC/D/E and test data may be downloaded from the SHELX fileserver (shelx97 directory). The application form should be printed out from http://shelx.uni-ac.gwdg.de/SHELX/ This form should be completed and faxed to +49-551-3922582.  Downloading instructions will then be emailed to the address given on the form, so please write the email address CLEARLY.  The programs are free to academics but a small license fee is required for 'for-profit' use.   
SHELXC/D/E and test data may be downloaded from the [http://shelx.uni-goettingen.de/bin/ SHELX fileserver]. First fill in the application form at http://shelx.uni-goettingen.de/register.php. A password and downloading instructions will then be emailed to the address given on the form.  The programs are free to academics but a small license fee is required for 'for-profit' use.   


Beta-test versions are also available from time to time. They are announced by George Sheldrick and are available from the beta-test directory. The username and password for accessing these may be obtained from GS.
Beta-test versions are also available from time to time. They are announced by George Sheldrick and are available from the beta-test directory. The username and password for accessing these may be obtained from GS.


The data merging and internal correlation coefficient as a function of resolution have been fine-tuned in the beta-test SHELXC (available since Aug 19, 2011). The changes only apply when unmerged data are read in and should improve very weak noisy anomalous data.
[[hkl2map]] can be downloaded from a website at EMBL Hamburg.


== References ==
== References ==
Line 369: Line 397:
If these programs prove useful, you may wish to cite (and read!):<br>
If these programs prove useful, you may wish to cite (and read!):<br>


Sheldrick, G.M. (2008). "A short history of SHELX", ''Acta Crystallogr''. '''D64''', 112-122 [''Standard reference for all SHELX... programs''].<br>
[http://scripts.iucr.org/cgi-bin/paper?sc5010 Sheldrick, G.M. (2008). "A short history of SHELX", ''Acta Crystallogr''. '''D64''', 112-122] [''Standard reference for all SHELX* programs''].<br>


Sheldrick, G.M., Hauptman, H.A., Weeks, C.M., Miller, R. & Usón, I. (2001). "Ab initio phasing". In ''International Tables for Crystallography'', Vol. F, Eds. Rossmann, M.G. & Arnold, E., IUCr and Kluwer Academic Publishers, Dordrecht pp. 333-351 [''Full background to the dual-space recycling used in SHELXD''].<br>
Sheldrick, G.M., Hauptman, H.A., Weeks, C.M., Miller, R. & Usón, I. (2001). "Ab initio phasing". In ''International Tables for Crystallography'', Vol. F, Eds. Rossmann, M.G. & Arnold, E., IUCr and Kluwer Academic Publishers, Dordrecht pp. 333-351 [''Full background to the dual-space recycling used in SHELXD''].<br>
Line 375: Line 403:
Schneider, T.R. & Sheldrick, G.M. (2002). "Substructure Solution with SHELXD", ''Acta Crystallogr''. '''D58''', 1772-1779 [''Heavy atom location with SHELXD''].<br>
Schneider, T.R. & Sheldrick, G.M. (2002). "Substructure Solution with SHELXD", ''Acta Crystallogr''. '''D58''', 1772-1779 [''Heavy atom location with SHELXD''].<br>


Sheldrick, G.M. (2002), "Macromolecular phasing with SHELXE", ''Z. Kristallogr''. '''217''', 644-650 [''The definitive reference for SHELXE, usually cited wrongly''].<br>
Sheldrick, G.M. (2002), "Macromolecular phasing with SHELXE", ''Z. Kristallogr''. '''217''', 644-650  


Nanao, M.H., Sheldrick, G.M. & Ravelli, R.B.G. (2005). "Improving radiation-damage substructures for RIP", ''Acta Crystallogr''. '''D61''', 1227-1237 [''Practical details of RIP phasing with SHELXC/D/E''].<br>
Nanao, M.H., Sheldrick, G.M. & Ravelli, R.B.G. (2005). "Improving radiation-damage substructures for RIP", ''Acta Crystallogr''. '''D61''', 1227-1237 [''Practical details of RIP phasing with SHELXC/D/E''].<br>


Uson, I., Stevenson, C.E.M., Lawson, D.M. & Sheldrick, G.M. (2007). "Structure determination of the O-methyltransferase NovP using the `free lunch algorithm' as implemented in SHELXE", ''Acta Crystallogr''. '''D63''', 1069-1074 [''Implementation of the FLA in SHELXE''].<br>
Uson, I., Stevenson, C.E.M., Lawson, D.M. & Sheldrick, G.M. (2007). "Structure determination of the O-methyltransferase NovP using the `free lunch algorithm' as implemented in SHELXE", ''Acta Crystallogr''. '''D63''', 1069-1074 [''Implementation of the FLA in SHELXE''].<br>
[http://dx.doi.org/10.1107/S0907444909038360 Sheldrick, G.M. (2010). "Experimental phasing with SHELXC/D/E: combining chain tracing with density modification", ''Acta Cryst'' '''D66''', 479-485.]
[https://doi.org/10.1107/S0907444913027534 Thorn, A. & Sheldrick, G.M. (2013). "Extending molecular-replacement solutions with SHELXE", ''Acta Cryst'' '''D69''', 2251-2256.]
<br>
<br>
See also the SHELX homepage at: http://shelx.uni-ac.gwdg.de/SHELX/
See also the [http://shelx.uni-goettingen.de/ SHELX homepage]
<br>
<br>