Test set: Difference between revisions

131 bytes added, 2 June 2015
Reference to Ian Tickle et al. for sigma(Rfree)/Rfree = 1/sqrt(2n)
 
(3 intermediate revisions by 2 users not shown)
Line 1: Line 1:
The following is based on a CCP4BB discussion around June 17, 2008 entitled: "How many reflections for Rfree?"
The following is based on a CCP4BB discussion around June 17, 2008 entitled: "How many reflections for [[iucr:Free_R_factor|R<sub>free</sub>]]?"


First of all, the test set is that set of reflections put aside for unbiased calculation of statistical quantities, in particular R_free and sigmaA.
First of all, the test set is that set of reflections put aside for unbiased calculation of statistical quantities, in particular [[iucr:Free_R_factor|R<sub>free</sub>]] and sigmaA.


The need to find a good compromise for the size of the test set has been discussed by Axel Brunger in a "Methods in Enzymology" (1997) paper. He writes:
Line 9: Line 9:
  and the need to avoid a deleterious effect on the atomic model by omission of too much experimental data.


==How precise is the estimate of Rfree for a certain number of test set reflections?==
==How precise is the estimate of R<sub>free</sub> for a certain number of test set reflections?==
The estimate for the relative error of R_free is 1/sqrt(n), where n is the size of the test set. So if n is 1000, and the R_free is 31%, you would expect its relative error to be 31%/sqrt(1000), which is about 1%.
The estimate for the relative error of [[iucr:Free_R_factor|R<sub>free</sub>]] is 1/sqrt(2n), where n is the size of the test set (Tickle et al., Acta Cryst. (2000) D56, 442-450). So if n is 1000, the relative error is 1/sqrt(2000), or about 2.2%; for an [[iucr:Free_R_factor|R<sub>free</sub>]] of 31%, that corresponds to a standard error of 31%/sqrt(2000), which is about 0.7%.
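As a quick check of the arithmetic, the following minimal sketch evaluates sigma(Rfree)/Rfree = 1/sqrt(2n) for a given test set size. The function name `rfree_standard_error` is hypothetical and chosen only for illustration; it is not part of any crystallographic package.

```python
import math


def rfree_standard_error(r_free, n_test):
    """Return (relative_error, absolute_error) of R_free.

    Assumes the estimate sigma(R_free)/R_free = 1/sqrt(2n),
    where n is the number of test set reflections.
    """
    if n_test <= 0:
        raise ValueError("test set size must be positive")
    relative = 1.0 / math.sqrt(2.0 * n_test)
    absolute = r_free * relative
    return relative, absolute


# Example from the text: n = 1000 reflections, R_free = 31%
relative, absolute = rfree_standard_error(0.31, 1000)
print(f"relative error: {relative:.1%}")  # about 2.2%
print(f"standard error: {absolute:.1%}")  # about 0.7%
```

Because the error falls only as 1/sqrt(n), quadrupling the test set size halves the uncertainty, which is why very large test sets yield diminishing returns.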


I believe this is from a paper of Ian Tickle (FIXME: reference).


==How many reflections do you need to get a good estimate of the sigmaA values (as a function of resolution) needed to calibrate the likelihood target?==