CC1/2: Difference between revisions


== Why CC<sub>1/2</sub> can be negative ==

There is a mathematical reason, explained in §4.1 of [https://cms.uni-konstanz.de/index.php?eID=tx_nawsecuredl&u=0&g=0&t=1475179096&hash=5cf64234a23a794a1894c5408384c57208d7b602&file=fileadmin/biologie/ag-strucbio/pdfs/Assman2016_JApplCryst.pdf Assmann, G., Brehm, W. and Diederichs, K. (2016) Identification of rogue datasets in serial crystallography. J. Appl. Cryst. 49, 1021-1028.]

==CC<sub>1/2</sub> calculation==

CC<sub>1/2</sub> is calculated by:

: <math>CC_{1/2}=\frac{\sigma^2_{\tau}}{\sigma^2_{\tau}+\sigma^2_{\epsilon}} =\frac{\sigma^2_{y}- \frac{1}{2}\sigma^2_{\epsilon}}{\sigma^2_{y}+ \frac{1}{2}\sigma^2_{\epsilon}} </math>

This requires calculation of <math>\sigma^2_{y} </math>, the variance of the average intensities across the unique reflections of a resolution shell, and <math>\sigma^2_{\epsilon} </math>, the average of all sample variances of the mean across all unique reflections of a resolution shell.
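The second form of the formula makes the sign behaviour visible: the estimate becomes negative whenever <math>\sigma^2_{y} </math> is smaller than <math>\frac{1}{2}\sigma^2_{\epsilon} </math>. A minimal Python sketch, in which the function name <code>cc_half</code> and the input values are illustrative assumptions and not taken from any program:

```python
def cc_half(sigma2_y: float, sigma2_eps: float) -> float:
    """CC1/2 from the variance of the merged intensities (sigma2_y) and the
    average variance of the half-dataset means (sigma2_eps)."""
    half_eps = 0.5 * sigma2_eps
    return (sigma2_y - half_eps) / (sigma2_y + half_eps)


# Hypothetical input values, chosen only to show the sign behaviour.
print(cc_half(100.0, 20.0))  # sigma2_y > sigma2_eps / 2: positive
print(cc_half(5.0, 20.0))    # sigma2_y < sigma2_eps / 2: negative
```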

== Implementation ==

===''' <math>\sigma^2_{\epsilon} </math>''' - unweighted===

The average of all sample variances of the mean across all unique reflections of a resolution shell is obtained by calculating the sample variance of the mean for every unique reflection i by:

<math>\sigma^2_{\epsilon i} = \frac{1}{n-1} \cdot \left ( \sum^n_{j} x^2_{j} - \frac{\left ( \sum^n_{j}x_{j} \right )^2}{ n} \right ) / \frac{n}{2} </math>

Here <math>x_{j} </math> is a single observation j out of the n observations of reflection i. The sample variance is divided by <math>\frac{n}{2} </math> because the quantity of interest is the variance of the mean (merged) intensity of a half-dataset, which contains n/2 of the observations; see [https://en.wikipedia.org/wiki/Sample_mean_and_covariance#Variance_of_the_sample_mean variance of the sample mean]. These "variances of means" are then averaged over all N unique reflections of the resolution shell:

<math>\sigma^2_{\epsilon} = \frac{1}{N} \sum^N_{i} \sigma^2_{\epsilon i} </math>
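The two steps above can be sketched in Python as follows. This is only an illustration of the formulas as written here; the function names and the input values are hypothetical, and a resolution shell is assumed to be given as one list of observed intensities per unique reflection.

```python
from typing import Sequence


def variance_of_half_mean(obs: Sequence[float]) -> float:
    """Unbiased sample variance of the observations, divided by n/2."""
    n = len(obs)
    if n < 2:
        raise ValueError("need at least two observations per reflection")
    sum_x = sum(obs)
    sum_x2 = sum(x * x for x in obs)
    sample_var = (sum_x2 - sum_x * sum_x / n) / (n - 1)
    return sample_var / (n / 2)


def sigma2_eps(reflections: Sequence[Sequence[float]]) -> float:
    """Unweighted average of the per-reflection values over the shell."""
    per_reflection = [variance_of_half_mean(obs) for obs in reflections]
    return sum(per_reflection) / len(per_reflection)


# Hypothetical shell with two unique reflections of four observations each.
print(sigma2_eps([[10.0, 12.0, 14.0, 16.0], [1.0, 2.0, 3.0, 4.0]]))
```

The sum-of-squares form follows the formula literally. For large intensities it can lose precision through cancellation, so a two-pass variance such as <code>statistics.variance</code> may be preferable in practice.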

----

===''' <math>\sigma^2_{y} </math>'''===

The unbiased sample variance from all averaged intensities of all unique reflections is calculated by:

<math>\sigma^2_{y} = \frac{1}{N-1} \cdot \left ( \sum^N_{i} \overline{x}^2 - \frac{\left ( \sum^N_{i} \overline{x} \right )^2}{ N} \right ) </math>

Here <math>\overline{x}= \frac{1}{n} \sum^n_{j} x_{j}</math> is the average intensity of all n observations, from all frames/crystals, of one unique reflection i. The sums run over all N unique reflections in a resolution shell.
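As a sketch under the same assumption as before (one list of observed intensities per unique reflection; the function name and input values are hypothetical), this quantity can be computed with Python's standard <code>statistics</code> module:

```python
from statistics import mean, variance
from typing import Sequence


def sigma2_y(reflections: Sequence[Sequence[float]]) -> float:
    """Unbiased sample variance of the merged (averaged) intensities."""
    merged = [mean(obs) for obs in reflections]
    # statistics.variance uses the 1/(N-1) normalisation and needs N >= 2.
    return variance(merged)


# Hypothetical shell: the merged intensities are 2, 6 and 10.
print(sigma2_y([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]))
```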

== Example ==

An example is shown for a very simplified data file (unmerged ASCII.HKL). Only two frames/crystals (datasets) are considered, and the diffraction pattern consists of only two unique reflections. Each unique reflection has three symmetry-related observations in each dataset, i.e. six observations in total.

<pre>
First reflection with 6 observations:
     h     k     l        int     σ(int)  #dataset
     2     0     0  9.156E+02  3.686E+00   1
     0     2     0  5.584E+02  3.093E+00   1
     0     0     2  6.301E+02  2.405E+01   1
     2     0     0  9.256E+02  3.686E+00   2
     0     2     0  2.584E+02  3.093E+00   2
     0     0     2  7.301E+02  2.405E+01   2
</pre>

<math>\overline{x} </math>, the average intensity of all observations from all frames/crystals of this reflection = 669.6999

<math>\sigma^2_{\epsilon i} </math>, the unbiased sample variance of all observations of this unique reflection i, divided by n/2 (n = 6) = 20848.2198 (62544.6597/3)

<pre>
Second reflection with 6 observations:
     h     k     l        int     σ(int)  #dataset
     1     1     2  2.395E+01  8.932E+01   1
     1     2     1  9.065E+01  7.407E+00   1
     2     1     1  5.981E+01  9.125E+00   1
     1     1     2  3.395E+01  8.932E+01   2
     1     2     1  9.065E+01  7.407E+00   2
     2     1     1  1.608E+01  2.215E+01   2
</pre>

<math>\overline{x} </math>, the average intensity of all observations from all frames/crystals of this reflection = 52.5150

<math>\sigma^2_{\epsilon i} </math>, the unbiased sample variance of all observations of this unique reflection i, divided by n/2 (n = 6) = 363.3267 (1089.9803/3)

<math>\sigma^2_{\epsilon} </math> , the average of all the <math>\sigma^2_{\epsilon i} </math> = 10605.7733

<math>\sigma^2_{y} </math>, the variance of all the averaged intensities = 190458.6533

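As a result of these calculations, CC<sub>1/2</sub> = <math>\frac{190458.6533 - \frac{1}{2} \cdot 10605.7733}{190458.6533 + \frac{1}{2} \cdot 10605.7733} \approx 0.9458 </math>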
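The worked example can be reproduced with a short Python script. This is a sketch, not the code of any data-processing program: the intensities are typed in by hand from the two tables above and are already grouped by unique reflection, and all variable names are illustrative.

```python
from statistics import mean, variance

# Intensities from the two tables above, grouped by unique reflection.
reflection_1 = [915.6, 558.4, 630.1, 925.6, 258.4, 730.1]
reflection_2 = [23.95, 90.65, 59.81, 33.95, 90.65, 16.08]
shell = [reflection_1, reflection_2]

# Merged (average) intensity of every unique reflection.
merged = [mean(obs) for obs in shell]

# Unbiased sample variance per reflection, divided by n/2.
eps_i = [variance(obs) / (len(obs) / 2) for obs in shell]

sigma2_eps = mean(eps_i)     # average over the unique reflections
sigma2_y = variance(merged)  # variance of the merged intensities

cc_half = (sigma2_y - 0.5 * sigma2_eps) / (sigma2_y + 0.5 * sigma2_eps)

print(f"merged intensities: {merged[0]:.4f}, {merged[1]:.4f}")
print(f"sigma2_eps_i:       {eps_i[0]:.4f}, {eps_i[1]:.4f}")
print(f"sigma2_eps:         {sigma2_eps:.4f}")
print(f"sigma2_y:           {sigma2_y:.4f}")
print(f"CC1/2:              {cc_half:.4f}")
```

Differences in the later decimal places relative to the values quoted above are presumably due to rounding.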
