How can we judge the quality of a data set? There are several possibilities:

  1. numerical indicators, like R-values, I/sigma and the like
  2. graphical representations

This article serves to demonstrate pathological cases. It collects examples for

  1. problems with the hardware (e.g. detector, beamline, goniostat, beam, cryo)
  2. problems with the crystal
  3. problems with data processing

Hardware problems

Scale factor plot in case of spindle problems

 

INTEGRATE.LP lists the scale factor for every frame (column 3). The plot shows spikes roughly every 13 frames, indicating that the spindle moved too fast at these points (and too slowly in the adjacent frames). This problem is quite detrimental to data quality.
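To look at the per-frame values outside of a plotting program, the table can be read with a short script. The following is a minimal sketch, not an official XDS tool. It assumes that a per-frame row in INTEGRATE.LP consists of exactly ten whitespace-separated numeric fields with the frame number in column 1; the positions of the scale factor (column 3) and the mosaicity (column 10) are taken from the text above. The function names, the window size and the 5% tolerance are arbitrary choices made for illustration.

```python
from statistics import median


def read_frame_table(text):
    """Return (frame, scale, mosaicity) tuples from INTEGRATE.LP-style text.

    Assumption: a per-frame row has exactly ten numeric fields, with the
    frame number first, the scale factor third and the mosaicity tenth.
    Lines that do not fit this pattern (headers, other tables) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue
        try:
            frame = int(fields[0])
            numbers = [float(f) for f in fields]
        except ValueError:
            continue
        rows.append((frame, numbers[2], numbers[9]))
    return rows


def flag_scale_spikes(rows, window=5, tolerance=0.05):
    """Return frame numbers whose scale factor deviates from the local median.

    A frame is flagged when its scale differs by more than `tolerance`
    (as a fraction) from the median of the frames within `window` frames
    on either side.
    """
    scales = [scale for _, scale, _ in rows]
    flagged = []
    for i, (frame, scale, _) in enumerate(rows):
        lo = max(0, i - window)
        hi = min(len(scales), i + window + 1)
        local = median(scales[lo:hi])
        if local > 0 and abs(scale - local) / local > tolerance:
            flagged.append(frame)
    return flagged


if __name__ == "__main__":
    with open("INTEGRATE.LP") as handle:
        table = read_frame_table(handle.read())
    print("frames with unusual scale factors:", flag_scale_spikes(table))
```

Comparing against a local median rather than the overall mean keeps a slow drift of the scale factor (for example from radiation damage or changing illuminated volume) from being reported as a spike.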

Mosaicity plots in case of spindle problems

 

The same data set: the mosaicity estimates of individual frames (column 10 in INTEGRATE.LP) are strongly influenced by this. The "jumps" in the curve arise because INTEGRATE was run with MAXIMUM_NUMBER_OF_JOBS=8: each of the 8 jobs uses the orientation matrix from IDXREF for its initial batch, and since that matrix does not seem to match the actual orientation, the mosaicity appears high. Only after geometry refinement (green line) is the result reasonable (and thus the intensity estimates will not be affected). The estimate for the second batch of each job is much better, because it uses the orientation obtained from the geometry refinement as a starting point.

Exactly why the IDXREF estimate is off, and whether it has something to do with the spindle problem, is unknown.

With MAXIMUM_NUMBER_OF_JOBS=1 the plot would definitely not look like this. It would be much smoother, because each batch of data "knows" about the orientation of the previous one.
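The jumps described above can also be located numerically. The sketch below assumes that the frame numbers and the per-frame mosaicity values have already been extracted into two lists of equal length. The function name and the threshold `min_rise` (in the same units as the mosaicity values) are hypothetical and would need adjusting to the data set at hand.

```python
def find_mosaicity_jumps(frames, mosaicity, min_rise=0.05):
    """Return the frame numbers at which the mosaicity estimate rises abruptly.

    A jump is reported when the value increases by more than `min_rise`
    relative to the preceding frame. The first frame has no predecessor
    and is therefore never reported.
    """
    jumps = []
    for i in range(1, len(mosaicity)):
        if mosaicity[i] - mosaicity[i - 1] > min_rise:
            jumps.append(frames[i])
    return jumps


if __name__ == "__main__":
    # Synthetic values for illustration only, not taken from a real data set.
    frames = list(range(1, 17))
    mosaicity = [0.30 if n in (1, 9) else 0.15 for n in frames]
    print(find_mosaicity_jumps(frames, mosaicity))
```

If the reported frames are spaced at regular intervals matching the number of jobs, this supports the explanation that the jumps mark the first batch of each job rather than a property of the crystal.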

 

The same hardware problem, but a different data set: here, the mosaicity estimates of individual frames are less affected, because the initial orientation from IDXREF is good. The oscillations are hard to see at this scale, since the period of the scale factor changes is only on the order of 13 frames.

 

Zoomed version of the above. The oscillations are more clearly visible.
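When a periodic disturbance is hard to see by eye, its period can be estimated from the per-frame values directly. The following is a minimal sketch using a plain (unnormalized) autocorrelation. It assumes the per-frame values are already available as a list; the function name and the lag limits are hypothetical choices.

```python
def estimate_period(values, min_lag=2, max_lag=40):
    """Return the lag (in frames) with the highest autocorrelation.

    The series is mean-subtracted, and for each lag the sum of products
    of values that are `lag` frames apart is computed. Returns None if
    the series is too short to test any lag.
    """
    n = len(values)
    if n == 0:
        return None
    mean = sum(values) / n
    dev = [v - mean for v in values]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        score = sum(dev[i] * dev[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    # Synthetic series for illustration: a spike on every 13th frame.
    scales = [1.5 if i % 13 == 0 else 1.0 for i in range(130)]
    print(estimate_period(scales))
```

Because the sum is not normalized by the number of overlapping frames, shorter lags are slightly favoured over their multiples, which is the desired behaviour when looking for the fundamental period. Real data with a slow overall trend may need detrending first.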