Phenix
PHENIX (Python-based Hierarchical ENvironment for Integrated Xtallography) is a software suite for the automated determination and refinement of macromolecular structures using X-ray crystallography and other methods. It integrates well with CCP4-formatted files for I/O, is highly automated, and very straightforward to use.
The suite (Phenix home page; documentation) has a GUI program (phenix) which can be used to run the programs, but they also work from the command line.
A short help, such as usage and options, is printed out by all PHENIX command line tools: just type phenix.TOOLNAME and hit Enter (or Return).
There is also version-specific documentation, e.g. http://www.phenix-online.org/version_docs/dev-572 documents development version 572.
The documentation below focuses on the non-GUI command-line tools. It may be incomplete, out of date, or in places incorrect.
Crystallographic data
phenix.xtriage - assessing data quality
phenix.explore_metric_symmetry - investigate different settings
phenix.explore_metric_symmetry --unit_cell=145,44,67,90,110.5,90 --space_group=C2 --other_unit_cell=67,44,136,90,96,90 --other_space_group=p2
phenix.reflection_statistics - compare datasets
There may be one or two data files.
phenix.xmanip - structure factor file manipulations
phenix.model_vs_data - statistics
Not yet documented. Just run "phenix.model_vs_data model.pdb data.hkl", where data.hkl is a reflection file in any of the commonly used formats. phenix.model_vs_data can output a map whose type is defined as:
[p][m]Fo+[q][D]Fc[kick][filled].
Examples: 2mFo-DFc, 3.2Fo-2.3Fc, Fc, anom, fo-fc_kick.
So, if you say
phenix.model_vs_data model.pdb data.mtz --map=fc
you will get an MTZ file with the requested structure factors.
phenix.model_vs_data model.pdb data.mtz --comprehensive
will list (among other things) map CC for all atoms or per residue.
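The map-type notation [p][m]Fo+[q][D]Fc[kick][filled] can be made concrete with a small parser. The following is only a sketch of how the notation decomposes, not the parser PHENIX itself uses. The function name parse_map_type and the "_kick" / "_filled" suffix spelling are assumptions (the latter is extrapolated from the "fo-fc_kick" example), and special keywords such as "anom" are deliberately not handled.

```python
import re

# Hypothetical pattern for the [p][m]Fo+[q][D]Fc[kick][filled] notation.
# The "_kick" / "_filled" suffixes are an assumption; PHENIX may accept more.
_MAP_TYPE = re.compile(
    r"""^
    (?:(?P<p>\d+(?:\.\d+)?)?(?P<m>m)?(?P<fo>fo))?
    (?:(?P<sign>[+-])?(?P<q>\d+(?:\.\d+)?)?(?P<d>d)?(?P<fc>fc))?
    (?P<kick>_kick)?
    (?P<filled>_filled)?
    $""",
    re.VERBOSE | re.IGNORECASE,
)


def parse_map_type(text):
    """Split a map-type string such as '2mFo-DFc' into its components."""
    match = _MAP_TYPE.match(text.strip())
    if match is None or not (match.group("fo") or match.group("fc")):
        raise ValueError("not an Fo/Fc style map type: %r" % text)
    g = match.groupdict()
    # A term that is present without an explicit number has coefficient 1;
    # a term that is absent has coefficient 0.
    p = float(g["p"] or 1) if g["fo"] else 0.0
    q = float(g["q"] or 1) if g["fc"] else 0.0
    if g["sign"] == "-":
        q = -q
    return {
        "p": p,
        "m": bool(g["m"]),
        "q": q,
        "d": bool(g["d"]),
        "kick": bool(g["kick"]),
        "filled": bool(g["filled"]),
    }
```

With this sketch, "2mFo-DFc" yields p = 2 with the m weight, and q = -1 with the D weight; "Fc" yields p = 0 and q = 1.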
phenix.fmodel - calculate structure factors from model
phenix.cif_as_mtz - convert cif to mtz format
phenix.find_tls_groups
- identifies suitable atom selections for TLS refinement
- similar to TLSMD, but uses cross-validation to yield one unique solution
Experimental phasing
phenix.autosol - experimental phasing "wizard"
phenix.autosol uses HYSS, SOLVE, Phaser, RESOLVE, xtriage and phenix.refine to solve a structure and generate experimental phases with the MAD, MIR, SIR, or SAD methods
Molecular replacement
phenix.automr - interface to Phaser and Resolve
This "wizard" provides an interface to Phaser molecular replacement and feeds the results of molecular replacement directly into the AutoBuild Wizard for automated model rebuilding
phenix.phaser
Officially documented in the phaserwiki. It can be run from the commandline (and can serve as a replacement for the CCP4 phaser which is an older version!) and by the Phaser-MR GUI which supports the fine-tuning of parameters.
If you run this:
phenix.phaser params.eff
it will use the Phenix-style configuration file, but if you just run "phenix.phaser" with no arguments (or a shell redirect from a file), it will use the CCP4-style keyword input.
This is an example of params.eff.
phenix.sculptor - automate selection and editing of molecular replacement (MR) models
phenix.ensembler - multiple superposition tool to automate construction of ensembles for MR
Ligands
phenix.reel - restraints editor especially for ligands
phenix.elbow - electronic Ligand Builder and Optimisation Workbench
Model building and completion
phenix.autobuild - "wizard" for model rebuilding and completion
phenix.phase_and_build, phenix.build_one_model are fast ways to obtain results.
phenix.ligandfit - "wizard" carrying out fitting of flexible ligands to electron density maps
phenix.find_helices - rapid helix fitting to a map
phenix.fit_loops - fill short gaps using a loop library, and longer gaps (up to 15 residues) iteratively
phenix.assign_sequence - sequence assignment and linkage of neighboring segments
phenix.ligand_identification
Refinement with phenix.refine
Example for use of phenix.refine
basic usage
phenix.refine model.pdb data.mtz
Here "data.mtz" is your reflection data file. PHENIX automatically recognizes most of the known file formats, so it can be MTZ, CNS or ...
advanced usage
phenix.refine model.pdb data.mtz strategy=rigid_body+individual_sites+individual_adp \
  simulated_annealing=true optimize_wxc=true optimize_wxu=true main.number_of_macro_cycles=5 \
  ordered_solvent=True
This will do the following:
- Rigid body refinement in the first cycle only (MZ protocol = VERY high convergence radius);
- Refinement of individual coordinates (xyz) and B-factors in every cycle with optimized weights (warning: optimize_wxc=true optimize_wxu=true makes the program take much longer!);
- Simulated annealing in the second and the next-to-last cycles;
- Finding (and, if necessary, removing) water molecules.
Warning: the file model.pdb in this example should not have any ANISOU records! If it has any, these would be refined as individual anisotropic which is most likely not desired.
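When such command lines are generated from a script, it helps to keep the strategy and the keyword=value parameters as data. The sketch below only assembles the argument list shown in the advanced example above; refine_command is a hypothetical helper, not part of PHENIX, and nothing is executed.

```python
import shlex


def refine_command(model, data, strategy, params=None):
    """Build the argument list for a phenix.refine run (nothing is executed)."""
    args = ["phenix.refine", model, data, "strategy=" + "+".join(strategy)]
    # Parameters are passed through verbatim as keyword=value pairs.
    for name, value in (params or {}).items():
        args.append("{}={}".format(name, value))
    return args


command = refine_command(
    "model.pdb",
    "data.mtz",
    ["rigid_body", "individual_sites", "individual_adp"],
    {
        "simulated_annealing": "true",
        "optimize_wxc": "true",
        "optimize_wxu": "true",
        "main.number_of_macro_cycles": 5,
        "ordered_solvent": "True",
    },
)
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

The list form can be handed to subprocess.run(command) on a machine where PHENIX is installed and on the PATH.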
Ligands
If some ligand in model.pdb is unknown, phenix.refine will complain:
Sorry: Fatal problems interpreting PDB file:
  Number of atoms with unknown nonbonded energy type symbols: 18
  Please edit the PDB file to resolve the problems and/or supply a CIF file with matching restraint definitions, along with apply_cif_modification and apply_cif_link parameter definitions if necessary (see phenix.refine documentation).
  Also note that phenix.elbow is available to create restraint definitions for unknown ligands.
In that case, just running
phenix.elbow model.pdb --do-all --output=all_ligands
will produce all_ligands.cif, which may be fed to phenix.refine by
phenix.refine model.pdb data.mtz all_ligands.cif ...
If no PDB file for a ligand is available, its SMILES string should be input to phenix.elbow, and phenix.ready_set should then be run with phenix.elbow's CIF file to generate the LINK records (e.g. for a non-natural amino acid that is part of the polypeptide chain).
Constraints and restraints in real and reciprocal space
Hydrogens
Use phenix.ready_set to add hydrogens to your PDB file, and (except at ultra-high resolution) use the riding hydrogen model in phenix.refine (this is the default, so you do not have to specify anything). phenix.ready_set internally uses phenix.elbow for ligands and phenix.reduce for the protein. phenix.pdbtools can also add hydrogens (FIXME: what are the differences?). Hydrogens should not be used in NCS and TLS groups, so it might be a good idea to add "and not (element H or element D)" to all selection strings. See the phenix.refine documentation.
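If selection strings are produced by a script, the hydrogen exclusion can be appended automatically. This is a minimal sketch; without_hydrogens is a hypothetical helper name. The parentheses around the original selection matter, because otherwise a selection containing "or" would be combined with the exclusion incorrectly.

```python
def without_hydrogens(selection):
    """Wrap a selection so that H and D atoms are excluded from it."""
    return "({}) and not (element H or element D)".format(selection)


# Example: an NCS or TLS group covering two chains.
print(without_hydrogens("chain A or chain B"))
```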
Occupancy
Adding "occupancy" to the "strategy" options will refine the occupancies of those parts of the model that have alternate conformations.
Example:
occupancies {
  constrained_group {
    selection = "chain A and resseq 105 and altloc A"
    selection = "chain B and resseq 105 and altloc B"
  }
}
Essentially, the above selection says: "alternative conformation A of residue 105 in chain A is coupled with alternative conformation B of (NCS related) residue 105 in chain B". The sum of the refined occupancies will be 1 in this case. It is essential that the altlocs in the two selections are different: this turns the non-bonded interaction off, so the residues will not get pushed apart.
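The requirement that coupled selections carry different altlocs is easy to get wrong when writing such blocks by hand. The following sketch writes a constrained_group block and refuses to do so if two selections share an altloc. The helper names are hypothetical, and the altloc is found with a simple regular expression, which is an assumption about how the selections are written.

```python
import re


def altloc_of(selection):
    """Return the altloc identifier named in a selection string, or None."""
    match = re.search(r"\baltloc\s+(\w+)", selection)
    return match.group(1) if match else None


def constrained_group(selections):
    """Format an occupancies/constrained_group block for the given selections."""
    altlocs = [altloc_of(s) for s in selections]
    # Every selection must name an altloc, and no altloc may be repeated.
    if None in altlocs or len(set(altlocs)) != len(altlocs):
        raise ValueError("each selection needs its own, distinct altloc")
    lines = ["occupancies {", "  constrained_group {"]
    lines += ['    selection = "{}"'.format(s) for s in selections]
    lines += ["  }", "}"]
    return "\n".join(lines)


print(constrained_group([
    "chain A and resseq 105 and altloc A",
    "chain B and resseq 105 and altloc B",
]))
```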
NCS
Automatic detection of NCS groups:
phenix.refine data.hkl model.pdb main.ncs=True
Manual specification of NCS groups:
phenix.refine data.hkl model.pdb ncs_groups.params main.ncs=True
where ncs_groups.params contains e.g.:
refinement.ncs.restraint_group {
  reference = chain A
  selection = chain B
  selection = chain C
}
refinement.ncs.restraint_group {
  reference = chain E
  selection = chain F
}
Secondary structure restraints
phenix.refine model.pdb data.mtz main.secondary_structure_restraints=true
You can find more information about secondary structure restraints in the PHENIX Newsletter (pages 12-17).
Low resolution refinement
Use an existing high resolution model (e.g. in a different spacegroup) for restraining the dihedrals:
phenix.refine data.hkl model.pdb main.reference_model_restraints=True reference_model.file=reference.pdb
The behaviour can be modified with the keywords reference_model.limit (default 15 degrees) and reference_model.sigma (default probably 1 degree; the current documentation says 1 Angstrom, which is probably not right).
In the case where your working model has four chains (A, B, C, D) and your reference model has only chain A, the selections would look like this:
refinement.reference_model.reference_group {
  reference = chain A
  selection = chain A
}
refinement.reference_model.reference_group {
  reference = chain A
  selection = chain B
}
refinement.reference_model.reference_group {
  reference = chain A
  selection = chain C
}
refinement.reference_model.reference_group {
  reference = chain A
  selection = chain D
}
See the documentation.
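Because the reference_group blocks are repetitive, they can be generated for any number of chains. This is a sketch that reproduces the parameter layout of the four-chain example above; reference_groups is a hypothetical helper, and the chain identifiers are taken from that example.

```python
def reference_groups(reference_chain, working_chains):
    """Write one reference_group block per chain of the working model."""
    blocks = []
    for chain in working_chains:
        blocks.append(
            "refinement.reference_model.reference_group {\n"
            "  reference = chain %s\n"
            "  selection = chain %s\n"
            "}" % (reference_chain, chain)
        )
    return "\n".join(blocks)


# Reference model with chain A only, working model with chains A to D.
print(reference_groups("A", ["A", "B", "C", "D"]))
```

The output can be saved to a parameter file and passed to phenix.refine on the command line, in the same way as ncs_groups.params in the NCS section.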
TLS
- run your model through the TLSMD server to identify TLS domains (it will produce PHENIX-friendly TLS group selections);
http://skuld.bmsc.washington.edu/~tlsmd/
- use these selections for TLS refinement in PHENIX: see http://www.phenix-online.org/documentation/refinement.htm
for example:
phenix.refine model.pdb data.hkl strategy=individual_sites+individual_adp+tls tls_selections.def
with tls_selections.def something like:
refinement.refine {
  adp {
    tls = chain 'A'
    tls = chain 'B'
  }
}
Rigid body
example for file rigid_body.def defining 2 rigid bodies:
refinement.refine.sites {
  rigid_body = chain 'A' or chain 'B'
  rigid_body = chain 'L' or chain 'M'
}
Fix His/Asn/Gln sidechain orientations
Use
phenix.refine data.hkl model.pdb main.nqh_flips=True
to automatically flip these sidechains to make them better fit the density and/or hydrogen bonding pattern.
Real-space refinement
A good writeup is at http://cci.lbl.gov/~afonine/rsr.pdf . In short, use
phenix.refine model.pdb data.hkl fix_rotamers=true
It would probably be a good idea to also use main.nqh_flips=True (but maybe this is already integrated into fix_rotamers=true ?)
Atom selection
e.g.
phenix.refine model.pdb data.mtz refine.sites.individual="not (chain A and resseq 123:156)"
or
phenix.refine model.pdb data.mtz strategy=individual_adp adp.individual.iso="chain A and resseq 10:20"
The latter will refine only the B-factors of residues 10 to 20 of chain A. It should be noted that the overall B-factor can change by ± a constant. This is because the trace of the overall anisotropic scale matrix is subtracted from it and added to all atoms and to Bsol.
Switching off specific interactions
- In specific (rare !) situations one wants to exclude specific interactions. The pdb_interpretation.custom_nonbonded_symmetry_exclusion=<selection> command line keyword was designed for this purpose.
- To switch off the interaction between a specific atom and its environment, e.g. to obtain unbiased (by restraints) estimates of distances, see http://www.phenix-online.org/documentation/refinement.htm#anch80 - you just add restraints of the form:
refinement.geometry_restraints.edits {
  zn_selection = chain X and resname ZN and resid 200 and name ZN
  his117_selection = chain X and resname HIS and resid 117 and name NE2
  bond {
    action = *add
    atom_selection_1 = $zn_selection
    atom_selection_2 = $his117_selection
    distance_ideal = 2.1
    sigma = 0.02
    # use slack=None if you _want_ to restrain, use large slack if not
    slack = 1
  }
}
Refinement with mmtbx.lockit
From RWGK's posting to phenixbb on Nov 14, 2010:
We have a tool for quick real-space refinement that's geared towards making the geometry ideal in the end. I'm not sure it is useful in your situation, but may be worth a try. It works like this:
mmtbx.lockit your.pdb your_refine_001_map_coeffs.mtz \
  map.coeff_labels.f=2FOFCWT,PH2FOFCWT coordinate_refinement.run=True \
  atom_selection='resname LIG'
It works in two stages. First it attempts to maximize the real-space weight allowing for a significant (but not totally unreasonable) distortion of the geometry. This is meant to move the ligand into the density. In the second stage it scales down the "best" real-space weight and runs a number of real-space refinements until the selected atoms do not move anymore. The expected result is nearly ideal geometry.
The procedure is usually very quick. If it turns out to be useful we could integrate it into phenix.refine, to be run after reciprocal-space refinement.
The mmtbx.lockit command is not as user-friendly as phenix.refine. It only works with mtz files, you have to manually specify the mtz labels, and the error messages may be unhelpful. Also be sure there is a valid CRYST1 card in your pdb file.
Maps
phenix.maps - a command line tool to compute various maps
Seems to have no specific documentation. Can do B-factor sharpening for improving low-resolution maps.
phenix.real_space_correlation - compute correlation between two maps
Can work with ensembles of structures. Seems to have no specific documentation. Can also calculate map CC for all atoms or per residue.
phenix.get_cc_mtz_mtz
phenix.fobs_minus_fobs_map - calculate difference density
Seems to have no specific documentation.
phenix.multi_crystal_average
phenix.grow_density - local density improvement
As originally described in Acta Cryst. (1997). D53, 540-543 (in development). Seems to have no specific documentation.
phenix.mtz2map
with output=xplor produces an X-PLOR style map. Adding a PDB file will result in a masked map.
NCS usage
phenix.find_ncs - identification of NCS operators
from protein coordinates (chains), heavy atom coordinates, or a density map
phenix.superpose_maps - transforms maps following a molecular superposition
Seems to have no specific documentation.
phenix.apply_ncs - applying NCS to a molecule to generate all NCS copies
Model analysis and manipulation
phenix.pdbtools - PDB model manipulations and statistics
e.g.
phenix.pdbtools your_model.pdb --show-adp-statistics
will show you complete statistics about B-factors;
phenix.pdbtools your_model.pdb --show-geometry-statistics
will show you complete statistics about stereochemistry,
phenix.pdbtools your_model.pdb set_b_iso=25.3 selection="chain A and resname ALA and name CA"
will set B=25.3 for all CA atoms in all ALA residues of chain A.
phenix.pdb_interpretation - PDB bonds, distances, dihedrals, ...
phenix.pdb_interpretation model_1.pdb ligand.cif
will result in an output file model_1.pdb.geo which contains ALL geometry information (bonds, angles, torsions, planarity, non-bonded ...) for each and every atom in your model.
phenix.reduce - tool for adding hydrogens to a PDB model
phenix.superpose_pdbs - Superposition of models
phenix.superpose_ligands - Superposition of ligands
Example files at [1]
phenix.get_cc_mtz_pdb - shift model to find origin
Assuming map_coeffs1.mtz corresponds to model_1.pdb,
phenix.get_cc_mtz_pdb map_coeffs1.mtz model_2.pdb
will create offset.pdb, which is a copy of model_2.pdb adjusted for the origin of map_coeffs1.mtz, and which therefore superimposes on model_1.pdb with space-group symmetry plus allowed origin shifts. This will not change the hand, however.
Validation
phenix.polygon
starts the GUI and runs calculations resulting in a POLYGON drawing of important characteristics of your PDB file in relation to the data
phenix.validate_model and phenix.validate
are also GUI-only
phenix.ramalyze, phenix.rotalyze, and phenix.cbetadev
phenix.clashscore
Other programs
phenix.tls - tool to convert between total and residual ADPs
It can recognize Refmac and phenix.refine formats of TLS records in PDB files.
Tips and Tricks
A handy tip: to check the syntax of a Phenix parameter file (for any program, not just phenix.refine), you can run this command (replacing params.eff with the file of interest):
libtbx.phil params.eff
If it works, it will just print out the parameters - if not, the error message should give some indication where the error occurred.
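On a machine without PHENIX, the most common mistake in a parameter file (an unbalanced brace) can be caught with a crude pre-check. The sketch below is not a substitute for libtbx.phil: it only counts braces, treats everything after "#" as a comment, and does not understand quoted strings. The function name check_braces is hypothetical.

```python
def check_braces(text):
    """Return None if the braces balance, otherwise a message naming a line."""
    open_lines = []  # line numbers of the "{" that are still open
    for number, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore comments
        for char in code:
            if char == "{":
                open_lines.append(number)
            elif char == "}":
                if not open_lines:
                    return "line %d: unmatched '}'" % number
                open_lines.pop()
    if open_lines:
        return "line %d: '{' is never closed" % open_lines[-1]
    return None


example = """refinement.refine {
  adp {
    tls = chain 'A'
  }
"""
print(check_braces(example))
```

For the example, the outer scope opened on the first line is reported as never closed.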
See also
http://phenix-online.org/presentations/neutron_japan_2009/phenix_japan_part1.pdf
http://cci.lbl.gov/~afonine/for_ak/validation.pdf
- 42 pages of general introduction to structure refinement: [2]
- 45 pages of phenix.refine overview (including extended details about its use from the command line): [3]
- 42 pages of "Some Facts About Maps": [4]
- 50 pages of "Crystallographic Structure Validation": [5]
- 31 pages of introduction to PHENIX: [6]
server producing custom RNA/DNA base pairing restraints
References
- electronic Ligand Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate and restraint generation. Moriarty NW, Grosse-Kunstleve RW, Adams PD. (2009) Acta Cryst. D65, 1074-1080.
- phenix.model_vs_data: a high-level tool for the calculation of crystallographic model and data statistics. Afonine PV, Grosse-Kunstleve RW, Chen VB, Headd JJ, Moriarty NW, Richardson JS, Richardson DC, Urzhumtsev A, Zwart PH, Adams PD. (2010) J Appl Crystallogr. 43, 669-676. [7]