Cc analysis: Difference between revisions

Since the data sets are composed of many measurements, they can be thought of as residing in a high-dimensional space: in the case of crystallography, that dimension is the number of unique reflections; in the case of images, the number of pixels.


If N is the number of data sets and <math>cc_{ij}</math> denotes the correlation coefficient between data sets i and j, cc_analysis minimizes
<math>\Phi(\mathbf{x})=\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(cc_{ij}-x_{i}\cdot x_{j}\right)^{2}</math>

as a function of <math>\mathbf{x}</math>, the column vector of the N low-dimensional vectors <math>\{x_{k}\}</math>. The minimization can start from random positions or, more elegantly and efficiently, from starting positions obtained by eigendecomposition of the matrix of correlation coefficients after its missing values have been estimated.
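
The minimization described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code of the cc_analysis program: the function names are hypothetical, unknown correlation coefficients are assumed to be marked as NaN, missing values are crudely estimated by the mean of the known ones, the diagonal is set to 1, and plain gradient descent with step halving stands in for whatever minimizer is actually used.

```python
import numpy as np

def phi_and_grad(x, cc, known):
    """Return Phi (sum over known pairs i<j) and its gradient with respect to x."""
    # Residuals cc_ij - x_i . x_j, set to zero for unknown pairs and the diagonal.
    resid = np.where(known, cc - x @ x.T, 0.0)
    # 'known' is symmetric, so every pair is counted twice: hence the factor 0.5.
    return 0.5 * np.sum(resid ** 2), -2.0 * resid @ x

def eigen_start(cc, known, ndim):
    """Starting vectors from the eigendecomposition of a filled-in CC matrix."""
    # Assumption: missing values are estimated by the mean of the known CCs.
    filled = np.where(known, cc, cc[known].mean())
    np.fill_diagonal(filled, 1.0)  # assumption: unit self-correlation
    w, v = np.linalg.eigh(filled)
    top = np.argsort(w)[::-1][:ndim]  # indices of the ndim largest eigenvalues
    return v[:, top] * np.sqrt(np.clip(w[top], 0.0, None))

def minimize_phi(cc, ndim=2, n_iter=500):
    """Gradient descent with step halving; returns the vectors x and Phi(x)."""
    cc = np.asarray(cc, dtype=float)
    known = ~np.isnan(cc)
    np.fill_diagonal(known, False)  # the sum in Phi runs over i < j only
    x = eigen_start(cc, known, ndim)
    f, g = phi_and_grad(x, cc, known)
    step = 1e-2
    for _ in range(n_iter):
        if np.linalg.norm(g) < 1e-10:
            break  # gradient vanishes: stationary point reached
        while step > 1e-12:
            x_new = x - step * g
            f_new, g_new = phi_and_grad(x_new, cc, known)
            if f_new <= f:
                break  # accept the first step that does not increase Phi
            step *= 0.5
        else:
            break  # no acceptable step found
        x, f, g = x_new, f_new, g_new
        step *= 2.0  # try a larger step in the next iteration
    return x, f
```

Calling <code>minimize_phi(cc, ndim=2)</code> on a symmetric N×N array returns an N×2 array whose rows are the vectors <math>x_{k}</math>, together with the final value of <math>\Phi</math>.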
 
As the resulting vectors x are in low-dimensional space, and the data sets reside in high-dimensional space, the procedure may be considered as ''multidimensional scaling''. Other multidimensional scaling procedures exist, but this particular one was first described in [http://journals.iucr.org/d/issues/2017/04/00/rr5141/index.html Diederichs, Acta D (2017)]. Alternatively, we can think of the procedure as ''unsupervised learning'', because it "learns" from the given CCs and predicts the unknown CCs, or rather the relations of even those data sets that have nothing (crystallography: no reflections; imaging: no pixels) in common.
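
Because the minimized function fits each <math>cc_{ij}</math> by the dot product <math>x_{i}\cdot x_{j}</math>, the same dot product can serve as an estimate for pairs whose CC could not be measured. A minimal sketch, assuming the vectors are available as rows of a NumPy array; the function name and the example vectors are hypothetical:

```python
import numpy as np

def predicted_cc(x):
    """Matrix of dot products x_i . x_j, used as estimates of cc_ij."""
    x = np.asarray(x, dtype=float)
    return x @ x.T

# Hypothetical two-dimensional vectors for three data sets.
x = np.array([[0.8, 0.0],
              [0.6, 0.6],
              [0.0, 0.9]])
estimates = predicted_cc(x)
# estimates[0, 2] is the estimate for data sets 0 and 2,
# whether or not that pair has any measurements in common.
```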


== Properties of cc_analysis ==