Cc analysis: Difference between revisions

Cc analysis (view source)

Revision as of 11:53, 5 July 2022

2,131 bytes added, 5 July 2022

→‎Example

Kay


@@ Line 15: / Line 15: @@
 # if all CCs are known, the solution is unique in terms of lengths of vectors, and angles between them. However, a rotated (around the origin) or inverted (through the origin) arrangement of the vectors leaves the functional unchanged, because these transformations do not change lengths and angles.
 # as long as the problem is over-determined, the vectors can be calculated. Unknown CCs between data sets (e.g. crystallographic data sets that have no common reflections) can be estimated from the dot product of their vectors. Over-determination means that each data set has to be related to every other data set, directly or indirectly (i.e. through others), by at least as many CCs as the desired number of dimensions.
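
The two points above can be checked numerically. The following is a minimal sketch, not part of <code>cc_analysis</code>: it takes the five 2-dimensional vectors printed by the Python code in the Example section, estimates a CC as a dot product, and confirms that rotating or inverting all vectors leaves the dot products unchanged. The helper names <code>dot</code>, <code>rotate</code> and <code>invert</code> are made up for this illustration.

```python
import math

# Representative vectors copied from the 2-dimensional Python output
# shown in the Example section (one (x, y) pair per data set).
vectors = [
    (0.0241, -0.0098),
    (0.7484, 0.1425),
    (0.9359, 0.0150),
    (0.9758, -0.0112),
    (0.8624, -0.1501),
]


def dot(a, b):
    """Dot product of two vectors; this is the estimated CC."""
    return sum(p * q for p, q in zip(a, b))


def rotate(v, phi):
    """Rotate a 2-dimensional vector around the origin by phi radians."""
    x, y = v
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi))


def invert(v):
    """Invert a vector through the origin."""
    return tuple(-c for c in v)


# Estimated CC between data sets 2 and 3 (list indices 1 and 2).
estimated = dot(vectors[1], vectors[2])
print(f"estimated CC(2,3) = {estimated:.4f}")  # input value was 0.7026

# Rotation and inversion change the coordinates but not the dot products.
rotated = [rotate(v, 0.7) for v in vectors]
inverted = [invert(v) for v in vectors]
print(f"after rotation:  {dot(rotated[1], rotated[2]):.4f}")
print(f"after inversion: {dot(inverted[1], inverted[2]):.4f}")
```

The same dot product could be evaluated for a pair of data sets whose CC is missing from the input, which is how an unknown CC would be estimated.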
 == The program ==
 <code>cc_analysis</code> calculates the vectors from the pairwise correlation coefficients. The (low) dimension must be specified, and a file with lines specifying the correlation coefficients must be provided.
-  CC_ANALYSIS version 22.10.2018 (K. Diederichs). No redistribution please!
+  CC_ANALYSIS version 30.12.2018 (K. Diederichs). No redistribution please!
   cc_analysis -dim <dim> [-b] [-w] [-z] <input.dat> <output.dat>
   <input.dat> has lines with items: i j corr [ncorr]
@@ Line 26: / Line 25: @@
   -w option: calculate weights from number of correlated items (4th item on input line)
   -z option: use Fisher z-transformation
+ -f option: skip some calculations (fast)
+ -m <iters> option: use <iters> (default 20) least-squares iterations
+ -t <threads> option: use <threads> (default 8) threads
 Notes:
 * the number of vectors must be > 2*(low dimension). The typical number of dimensions is 2 or 3, but depending on the problem it can be much higher.
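
For reference, the Fisher z-transformation named by the <code>-z</code> option is the standard mapping z = artanh(r) of a correlation coefficient r. The sketch below only shows the transformation and its inverse; how <code>cc_analysis</code> uses the transformed values internally is not described on this page, and the function names are chosen for this illustration.

```python
import math


def fisher_z(r):
    """Fisher z-transformation of a correlation coefficient with -1 < r < 1."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Back-transformation from z to a correlation coefficient."""
    return math.tanh(z)


# Round trip for a few CC values taken from the example correlation matrix.
for r in (0.017, 0.7026, 0.9131):
    z = fisher_z(r)
    print(f"r = {r:.4f}  z = {z:.4f}  back = {inverse_fisher_z(z):.4f}")
```

The transformation stretches values near +1 and -1, so differences between high correlation coefficients become larger on the z scale.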
-A Linux binary is available [ftp://strucbio.biologie.uni-konstanz.de/pub/cc_analysis].
+Python code is available [https://strucbio.biologie.uni-konstanz.de/pub/cc_analysis.py] under GPL.
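
An input file with lines of the form <code>i j corr</code> can be produced from a correlation matrix with a few lines of Python. This is a hedged sketch, not part of the distributed code: the function names are hypothetical, the 1-based numbering of data sets is an assumption, and unknown CCs (NaN) are simply left out. It also checks the note above that the number of vectors must exceed twice the low dimension.

```python
import math

NAN = float("nan")

# Correlation matrix from the Example section; the diagonal is undefined.
cc_matrix = [
    [NAN, 0.017, 0.0222, 0.0233, 0.0226],
    [0.017, NAN, 0.7026, 0.7287, 0.6241],
    [0.0222, 0.7026, NAN, 0.9131, 0.8049],
    [0.0233, 0.7287, 0.9131, NAN, 0.8432],
    [0.0226, 0.6241, 0.8049, 0.8432, NAN],
]


def format_cc_lines(matrix):
    """Return one 'i j corr' line per known pair (upper triangle only).

    Data sets are numbered from 1 here, which is an assumption.
    """
    lines = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            cc = matrix[i][j]
            if math.isnan(cc):
                continue  # unknown CC: omit the line
            lines.append(f"{i + 1} {j + 1} {cc:.4f}")
    return lines


def enough_vectors(n_vectors, dim):
    """Check the requirement: number of vectors > 2 * (low dimension)."""
    return n_vectors > 2 * dim


cc_lines = format_cc_lines(cc_matrix)
print("\n".join(cc_lines))
print("enough vectors for 2 dimensions:", enough_vectors(len(cc_matrix), 2))
```

Redirecting the printed lines to a file gives something that can be passed as <code><input.dat></code>. With 5 data sets, 2 dimensions satisfy the requirement, whereas 3 dimensions would not.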
 == Example ==
@@ Line 91: / Line 94: @@
   0.9760  0.0087  0.9760  0.0089
   0.8626  0.1361  0.8733  0.1564
+</pre>
+The output is: <vector #> <x> <y> <length> <angle> for each vector in this 2-dimensional case; higher dimensions are handled analogously.
-The output is: <vector #> <x> <y> <length> <angle> for each vector.
+The Python code produces the following output:
+<pre>
+bash-4.2$ python /usr/local/bin/cc_analysis.py 2 cc.dat
+===
+Correlation matrix parsed from infile:
+[[   nan 0.017  0.0222 0.0233 0.0226]
+ [0.017     nan 0.7026 0.7287 0.6241]
+ [0.0222 0.7026    nan 0.9131 0.8049]
+ [0.0233 0.7287 0.9131    nan 0.8432]
+ [0.0226 0.6241 0.8049 0.8432    nan]]
+===
+Correction factor for 2nd and higher eigenvalue(s):
+.8000
+===
+Interpretation of correlation matrix as dot product matrix:
+---
+all h_i by iterative approach:
+initial values:
+[0.1459 0.7198 0.7815 0.7919 0.7574]
+refinement by iteration:
+#13: [0.0242 0.7414 0.9389 0.9798 0.8551]
+===
+Uncorrected eigenvalue(s):
+used:
+[3.1228 0.0126]
+unused:
+[ 0.0045 -0.0004 -0.0167]
+---
+Corrected eigenvalue(s):
+used:
+[3.1228 0.0158]
+iter      RMS  max_chg  rms_chg
+  0.00345        -        -
+  0.00241 -0.04680  0.00403
+  0.00127  0.01783  0.00275
+  0.00057  0.01297  0.00162
+  0.00029  0.00570  0.00073
+  0.00023  0.00182  0.00026
+  0.00023 -0.00047  0.00008
+  0.00022 -0.00042  0.00004
+  0.00022 -0.00042  0.00004
+  0.00022 -0.00045  0.00005
+  0.00022 -0.00045  0.00005
+  0.00022 -0.00045  0.00005
+  0.00021 -0.00044  0.00005
+  0.00021 -0.00043  0.00005
+  0.00021 -0.00043  0.00005
+  0.00021 -0.00042  0.00004
+  0.00021 -0.00042  0.00004
+  0.00020 -0.00042  0.00004
+  0.00020 -0.00042  0.00004
+  0.00020 -0.00042  0.00004
+  0.00020 -0.00042  0.00004
+  0.0241 -0.0098
+  0.7484  0.1425
+  0.9359  0.0150
+  0.9758 -0.0112
+  0.8624 -0.1501
+===
+Finished outputting 2-dimensional representative vectors! =)
+</pre>
+The 5 lines at the bottom give the solution. The coordinates agree with those of the Fortran program within an rms deviation of 0.0055; however, they are mirrored across the x-axis and thus represent an inverted solution.
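
The <length> and <angle> columns of the Fortran output can be recomputed from <x> and <y>. The sketch below is an illustration only: it assumes that the angle is given in radians, which is inferred from the two Fortran output rows shown above rather than stated anywhere on this page, and the function names are hypothetical. It also shows that mirroring across the x-axis keeps the length and flips the sign of the angle.

```python
import math


def length_and_angle(x, y):
    """Return (length, angle) of a 2-dimensional vector.

    The angle is measured from the x-axis in radians (an assumption
    inferred from the Fortran output rows in the Example section).
    """
    return math.hypot(x, y), math.atan2(y, x)


def mirror_x_axis(x, y):
    """Mirror a vector across the x-axis."""
    return x, -y


# The two Fortran output rows shown in the Example section: (x, y).
fortran_rows = [(0.9760, 0.0087), (0.8626, 0.1361)]

for x, y in fortran_rows:
    length, angle = length_and_angle(x, y)
    m_length, m_angle = length_and_angle(*mirror_x_axis(x, y))
    print(f"x={x:.4f} y={y:.4f} length={length:.4f} angle={angle:.4f} "
          f"mirrored: length={m_length:.4f} angle={m_angle:.4f}")
```

Because only the sign of the angle changes, all lengths and all angles between vectors are preserved, so the mirrored arrangement reproduces the same correlation coefficients.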
