CC1/2: Difference between revisions

25 bytes removed ,  13 April 2019
→‎\sigma^2_{\epsilon}: change nomenclature from sigma to s
(→‎\sigma^2_{\epsilon}: change nomenclature from sigma to s)
Line 14: Line 14:
With <math>x_{j,i} </math> , a single observation j of all observations n of one reflection i, the average of all sample variances of the mean across all unique reflections of a resolution shell is obtained by calculating the unbiased sample variance of the mean for every unique reflection i by:
With <math>x_{j,i} </math> , a single observation j of all observations n of one reflection i, the average of all sample variances of the mean across all unique reflections of a resolution shell is obtained by calculating the unbiased sample variance of the mean for every unique reflection i by:


<math>\sigma^2_{\epsilon i} =  \frac{1}{n_{i}-1} \cdot \left ( \sum^{n_{i}}_{j} x^2_{j,i} - \frac{\left ( \sum^{n_{i}}_{j}x_{j,i} \right )^2}{n_{i}} \right )    / \frac{n_{i}}{2} </math>
<math>s^2_{\epsilon i} =  \frac{1}{n_{i}-1} \cdot \left ( \sum^{n_{i}}_{j} x^2_{j,i} - \frac{\left ( \sum^{n_{i}}_{j}x_{j,i} \right )^2}{n_{i}} \right )    / \frac{n_{i}}{2} </math>


<math>\sigma^2_{\epsilon i} </math> is divided by the factor  <math>\frac{n}{2} </math>, because the variance of the sample mean (intensities of the merged observations) is the quantity of interest. The division by '''n/2''' takes care of providing the variance of the mean ([https://en.wikipedia.org/wiki/Sample_mean_and_covariance#Variance_of_the_sample_mean ]) (merged) intensity of the '''half'''-datasets, as defined in [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3457925/ Karplus and Diederichs (2012)]. These "variances of means" are averaged over all unique reflections of the resolution shell:
<math>s^2_{\epsilon i} </math> is divided by the factor  <math>\frac{n}{2} </math>, because the variance of the sample mean (intensities of the merged observations) is the quantity of interest. The division by '''n/2''' takes care of providing the variance of the mean ([https://en.wikipedia.org/wiki/Sample_mean_and_covariance#Variance_of_the_sample_mean ]) (merged) intensity of the '''half'''-datasets, as defined in [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3457925/ Karplus and Diederichs (2012)]. These "variances of means" are averaged over all unique reflections of the resolution shell:
   
   
<math>\sigma^2_{\epsilon}=\sum^N_{i} \sigma^2_{\epsilon i} / N </math>  
<math>\sigma^2_{\epsilon}=\sum^N_{i} \sigma^2_{\epsilon i} / N </math>  
Line 26: Line 26:
If the standard deviations <math>\sigma_{int\_j,i} </math> for the single observations are considered as weights for the CC<sub>1/2</sub> calculation, with <math>w_{j,i}=\frac{1}{\sigma_{int\_j,i}^2} </math> the unbiased '''weighted''' sample variance of the mean for every unique reflection i is obtained by:  
If the standard deviations <math>\sigma_{int\_j,i} </math> for the single observations are considered as weights for the CC<sub>1/2</sub> calculation, with <math>w_{j,i}=\frac{1}{\sigma_{int\_j,i}^2} </math> the unbiased '''weighted''' sample variance of the mean for every unique reflection i is obtained by:  


<math>\sigma^2_{\epsilon i\_w} =  \frac{n_{i}}{n_{i}-1} \cdot \left ( \frac{\sum^{n_{i}}_{j}w_{j,i} x^2_{j,i}}{\sum^{n_{i}}_{j}w_{j,i}} -\left ( \frac{ \sum^{n_{i}}_{j}w_{j,i}x_{j,i} }{\sum^{n_{i}}_{j}w_{j,i}}\right )^2 \right )    / \frac{n_{i}}{2} </math>
<math>s^2_{\epsilon i\_w} =  \frac{n_{i}}{n_{i}-1} \cdot \left ( \frac{\sum^{n_{i}}_{j}w_{j,i} x^2_{j,i}}{\sum^{n_{i}}_{j}w_{j,i}} -\left ( \frac{ \sum^{n_{i}}_{j}w_{j,i}x_{j,i} }{\sum^{n_{i}}_{j}w_{j,i}}\right )^2 \right )    / \frac{n_{i}}{2} </math>


These " weighted variances of means" are averaged over all unique reflections of the resolution shell:
These " weighted variances of means" are averaged over all unique reflections of the resolution shell:


<math>\sigma^2_{\epsilon_w}=\sum^N_{i} \sigma^2_{\epsilon i\_w} / N </math>  
<math>s^2_{\epsilon_w}=\sum^N_{i} s^2_{\epsilon i\_w} / N </math>  


It should be noted that it is not straightforward to define the correct way to calculate a weighted variance (and the weighted variance of the mean). The formula <math>s^2_w =  \frac{n_{i}}{n_{i}-1} \cdot \left ( \frac{\sum^{n_{i}}_{j}w_{j,i} x^2_{j,i}}{\sum^{n_{i}}_{j}w_{j,i}} -\left ( \frac{ \sum^{n_{i}}_{j}w_{j,i}x_{j,i} }{\sum^{n_{i}}_{j}w_{j,i}}\right )^2 \right )</math> is - after some manipulation - the same as that found at [https://stats.stackexchange.com/questions/6534/how-do-i-calculate-a-weighted-standard-deviation-in-excel],[https://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf].
It should be noted that it is not straightforward to define the correct way to calculate a weighted variance (and the weighted variance of the mean). The formula <math>s^2_w =  \frac{n_{i}}{n_{i}-1} \cdot \left ( \frac{\sum^{n_{i}}_{j}w_{j,i} x^2_{j,i}}{\sum^{n_{i}}_{j}w_{j,i}} -\left ( \frac{ \sum^{n_{i}}_{j}w_{j,i}x_{j,i} }{\sum^{n_{i}}_{j}w_{j,i}}\right )^2 \right )</math> is - after some manipulation - the same as that found at [https://stats.stackexchange.com/questions/6534/how-do-i-calculate-a-weighted-standard-deviation-in-excel],[https://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf].
2,684

edits