DeltaCC12: Difference between revisions
Revision as of 12:54, 5 September 2018
ΔCC12 is a quantity that detects non-isomorphous datasets/frames. As described in Assmann and Diederichs (2016), ΔCC12 is calculated with the σ-τ method. This method calculates the Pearson correlation coefficient for the special case of two sets of values (intensities) that randomly deviate from their true values, but is not influenced by a random number sequence as shown in Karplus and Diederichs (2012). With the σ-τ method, CC12 is first calculated for all datasets/frames (CC12_overall) and then for all datasets/frames except one dataset i, which is omitted from the calculation (CC12_i). The difference between the two quantities is ΔCC12.
- [math]\displaystyle{ \Delta CC_{1/2}= CC_{1/2 overall}-CC_{1/2 i} }[/math]
If ΔCC12 > 0, CC12_overall is larger than CC12_i: omitting dataset i from the calculation lowers CC12, so the dataset improves the merged data and should be kept. If ΔCC12 < 0, CC12_overall is smaller than CC12_i: omitting dataset i raises CC12, so the dataset impairs the merged data and should be excluded. CC12 is calculated by:
- [math]\displaystyle{ CC_{1/2}=\frac{\sigma^2_{\tau}}{\sigma^2_{\tau}+\sigma^2_{\epsilon}} =\frac{\sigma^2_{y}- \frac{1}{2}\sigma^2_{\epsilon}}{\sigma^2_{y}+ \frac{1}{2}\sigma^2_{\epsilon}} }[/math]
This requires calculation of [math]\displaystyle{ \sigma^2_{y} }[/math], the variance of the average intensities across the unique reflections of a resolution shell, and [math]\displaystyle{ \sigma^2_{\epsilon} }[/math], the average of all sample variances of the mean across all unique reflections of a resolution shell.
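As a minimal sketch of the two formulas above, the following Python snippet evaluates CC12 from given values of σ²_y and σ²_ε and forms ΔCC12 for one omitted dataset. The function names `cc_half` and `delta_cc_half` and all numbers are hypothetical and chosen only for illustration; they are not taken from any program.

```python
def cc_half(var_y: float, var_eps: float) -> float:
    """CC1/2 from sigma^2_y and sigma^2_epsilon (second form of the formula above)."""
    return (var_y - 0.5 * var_eps) / (var_y + 0.5 * var_eps)


def delta_cc_half(cc_overall: float, cc_without_i: float) -> float:
    """Delta CC1/2 for dataset i: CC1/2 of all datasets minus CC1/2 without dataset i."""
    return cc_overall - cc_without_i


# Made-up variances, for illustration only.
cc_overall = cc_half(var_y=10.0, var_eps=2.0)    # all datasets
cc_without_i = cc_half(var_y=10.0, var_eps=1.0)  # dataset i omitted
delta = delta_cc_half(cc_overall, cc_without_i)

# A negative value means that omitting dataset i raises CC1/2.
print(f"overall={cc_overall:.4f}  without i={cc_without_i:.4f}  delta={delta:+.4f}")
```

With these example numbers ΔCC12 is negative, which in the terms used above marks dataset i as a candidate for exclusion.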
Implementation
[math]\displaystyle{ \sigma^2_{y} }[/math]
The unbiased sample variance from all averaged intensities of all unique reflections is calculated by:
[math]\displaystyle{ \sigma^2_{y} = \frac{1}{N-1} \cdot \left ( \sum^N_{i} x^2_i - \frac{\left ( \sum^N_{i}x_{i} \right )^2}{ N} \right ) }[/math]
Here [math]\displaystyle{ x_{i} }[/math] is the average intensity of all observations, from all frames/crystals, of one unique reflection i, and N is the number of unique reflections in the resolution shell.
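The formula for σ²_y can be sketched in Python as follows. This is an illustration under the assumption that the averaged intensities of one resolution shell are already available as a plain list; the name `variance_of_means` and the example values are hypothetical.

```python
def variance_of_means(mean_intensities):
    """Unbiased sample variance of the per-reflection average intensities x_i."""
    n_refl = len(mean_intensities)  # N, number of unique reflections in the shell
    if n_refl < 2:
        raise ValueError("need at least two unique reflections")
    sum_x = sum(mean_intensities)
    sum_x2 = sum(x * x for x in mean_intensities)
    # (sum of squares - (sum)^2 / N) / (N - 1), as in the formula above
    return (sum_x2 - sum_x ** 2 / n_refl) / (n_refl - 1)


# Made-up averaged intensities of four unique reflections.
shell_means = [2.0, 4.0, 4.0, 6.0]
var_y = variance_of_means(shell_means)
print(var_y)
```

The sum-of-squares form mirrors the formula term by term; for very large intensities a two-pass or Welford-style variance may be numerically safer.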
[math]\displaystyle{ \sigma^2_{\epsilon} }[/math] - unweighted
The average of all sample variances of the mean across all unique reflections of a resolution shell is obtained by calculating the sample variance of the mean for every unique reflection i by:
[math]\displaystyle{ \sigma^2_{\epsilon i} = \frac{1}{n-1} \cdot \left ( \sum^n_{j} x^2_{j} - \frac{\left ( \sum^n_{j}x_{j} \right )^2}{ n} \right ) / \frac{n}{2} }[/math]
Here [math]\displaystyle{ x_{j} }[/math] is a single observation j out of the n observations of one reflection i. The sample variance is divided by the factor [math]\displaystyle{ \frac{n}{2} }[/math] because the variance of the sample mean (the merged observations) is the quantity of interest. As we are considering CC12, it is divided by [math]\displaystyle{ \frac{n}{2} }[/math] and not by n as described in [1] (https://en.wikipedia.org/wiki/Sample_mean_and_covariance#Variance_of_the_sample_mean), because we are calculating the random errors of the merged intensities of a half-dataset. The individual variance terms [math]\displaystyle{ \sigma^2_{\epsilon i} }[/math] are then summed over all unique reflections in a resolution shell and divided by N, the total number of unique reflections:
[math]\displaystyle{ \sigma^2_{\epsilon} = \sum^N_{i} \sigma^2_{\epsilon i} / N }[/math]
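The unweighted σ²_ε can be sketched in the same way. The snippet assumes that the observations of one resolution shell are grouped per unique reflection as a list of lists; the name `sigma_eps_unweighted` and the data are hypothetical. Skipping reflections with fewer than two observations, and therefore counting only the contributing reflections in N, is an assumption of this sketch and is not stated in the text above.

```python
def sigma_eps_unweighted(observations_per_reflection):
    """Average over unique reflections of (sample variance of observations) / (n / 2)."""
    terms = []
    for obs in observations_per_reflection:
        n = len(obs)  # number of observations of this unique reflection
        if n < 2:
            # Assumption: a sample variance is undefined for n < 2, so skip.
            continue
        sum_x = sum(obs)
        sum_x2 = sum(x * x for x in obs)
        sample_var = (sum_x2 - sum_x ** 2 / n) / (n - 1)
        # Divide by n/2: variance of the mean of a half-dataset.
        terms.append(sample_var / (n / 2))
    if not terms:
        raise ValueError("no reflection with at least two observations")
    # N is taken as the number of reflections that contributed a term.
    return sum(terms) / len(terms)


# Made-up observations of two unique reflections.
shell_obs = [[1.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
var_eps = sigma_eps_unweighted(shell_obs)
print(var_eps)
```

For the first reflection the sample variance is 2 and n/2 is 1; for the second the sample variance is 20/3 and n/2 is 2, so the two terms are 2 and 10/3 and their average is 8/3.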
[math]\displaystyle{ \sigma^2_{\epsilon} }[/math] - weighted
to be edited