Cc analysis: Difference between revisions

Line 15: Line 15:
# if all CCs are known, the solution is unique in terms of lengths of vectors, and angles between them. However, a rotated (around the origin) or inverted (through the origin) arrangement of the vectors leaves the functional unchanged, because these transformations do not change lengths and angles.  
# as long as the problem is over-determined, the vectors can be calculated. Unknown CCs between data sets (e.g. in case of crystallographic data sets that don't have common reflections) can be estimated from the dot product of their vectors. Over-determination means: each data set has to be related (directly, or indirectly, i.e. through others) to any other by at least as many CCs as the desired number of dimensions.
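The two points above can be illustrated with a minimal Python sketch. The vectors and their values are hypothetical, chosen only for illustration; the helper names <code>dot</code> and <code>rotate2d</code> are not part of <code>cc_analysis</code>. The sketch estimates an unmeasured CC as a dot product and checks that rotating or inverting all vectors does not change it.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rotate2d(v, phi):
    """Rotate a 2-dimensional vector by phi radians around the origin."""
    x, y = v
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi))

# Hypothetical vectors for three data sets (illustrative values only).
vectors = {1: (0.9, 0.1), 2: (0.8, 0.3), 3: (0.2, 0.7)}

# Estimate of the CC between data sets 1 and 3, assumed not to be measured.
cc_13 = dot(vectors[1], vectors[3])

# Rotating or inverting every vector leaves all dot products unchanged.
rotated = {k: rotate2d(v, 1.0) for k, v in vectors.items()}
inverted = {k: (-v[0], -v[1]) for k, v in vectors.items()}
cc_13_rotated = dot(rotated[1], rotated[3])
cc_13_inverted = dot(inverted[1], inverted[3])
```

The rotation angle of 1.0 radian is arbitrary; any angle gives the same dot products up to floating-point rounding.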


== The program ==
<code>cc_analysis</code> calculates the vectors from the pairwise correlation coefficients. The (low) dimension must be specified, and a file with lines specifying the correlation coefficients must be provided.  


  CC_ANALYSIS version 22.10.2018 (K. Diederichs). No redistribution please!
  CC_ANALYSIS version 30.12.2018 (K. Diederichs). No redistribution please!
  cc_analysis -dim <dim> [-b] [-w] [-z] <input.dat> <output.dat>
  <input.dat> has lines with items: i j corr [ncorr]
Line 26: Line 25:
  -w option: calculate weights from number of correlated items (4th item on input line)
  -z option: use Fisher z-transformation
  -f option: skip some calculations (fast)
  -m <iters> option: use <iters> (default 20) least-squares iterations
  -t <threads> option: use <threads> (default 8) threads
Notes:
* the number of vectors must be > 2*(low dimension). Typical number of dimensions is 2 or 3, but depending on the problem it could of course be much more.
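
For reference, the Fisher z-transformation named by the <code>-z</code> option is the standard mapping z = artanh(r) of a correlation coefficient r. The sketch below shows only the transformation and its inverse; how <code>cc_analysis</code> applies it internally is not stated here, and the function names are hypothetical.

```python
import math

def fisher_z(r):
    """Fisher z-transformation of a correlation coefficient r, with -1 < r < 1."""
    return math.atanh(r)

def inverse_fisher_z(z):
    """Map a z value back to a correlation coefficient."""
    return math.tanh(z)

# Example: transform a CC of 0.5 and map it back.
z_half = fisher_z(0.5)
r_back = inverse_fisher_z(z_half)
```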
Line 92: Line 95:
     5  0.8626  0.1361  0.8733  0.1564


The output is: <vector #> <x> <y> <length> <angle> for each vector.
The output is: <vector #> <x> <y> <length> <angle> for each vector, in this 2-dimensional case; equivalently for higher dimensions.
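
As a consistency check on the output columns, the following Python sketch recomputes <length> and <angle> from <x> and <y> for the row of vector 5 shown above. It assumes that the length is the Euclidean norm and that the angle is given in radians, measured from the x axis; this assumption agrees with the listed values to within the rounding of the printed coordinates. The function name is hypothetical.

```python
import math

def length_and_angle(x, y):
    """Return the Euclidean length and the angle (radians, from the x axis) of (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

# Row for vector 5 in the example output: x = 0.8626, y = 0.1361
length, angle = length_and_angle(0.8626, 0.1361)
```

The recomputed values can be compared with the <length> and <angle> columns of the same row (0.8733 and 0.1564).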