View source: R/estimate_indirect_corr.R
estimate_indirect_corr | R Documentation |
This function estimates indirect correlations starting from an incomplete input
correlation matrix. The indirect correlations to be estimated must be indicated by NA
values in the input correlation matrix.
estimate_indirect_corr(
corrMatStart,
force_estimate = FALSE,
widen_factor = 0.2
)
corrMatStart |
Matrix object representing a correlation matrix. Indirect correlations to be
estimated must be indicated by NA values. |
force_estimate |
Boolean flag. When this flag is TRUE, if the obtained correlation matrix is not positive definite, it is approximated to the nearest positive definite matrix based on the Frobenius norm. Matrix approximation may alter fixed initial correlations. If this flag is set to FALSE (default option), matrix approximation is skipped and a warning message is returned. |
widen_factor |
Number between 0 and 1. If there is a unique path, the range for that indirect correlation is computed as cost +/- widen_factor*cost, where cost is the cost of that path. The default value is 0.2. |
Indirect correlations are estimated by solving a constrained optimization problem. Starting
from the fixed correlations, a correlation graph is built. Then, for each pair of variables whose indirect
correlation is unknown (i.e. marked with NA), all possible paths between them are considered (without
visiting a node more than once). The cost of each path is computed by multiplying the correlations along it.
The maximum and minimum costs provide a reasonable range for the indirect correlation value.
If no path exists between two nodes, that indirect correlation is not estimated and is
automatically set to 0. If there is a unique path, the range for that indirect correlation is computed as cost +/- widen_factor*cost,
where cost is the cost of the unique existing path. The default value of widen_factor is 0.2.
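The path-based bounds described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `path_costs` and `indirect_bounds` are hypothetical helper names, fixed correlations are taken to be the non-NA off-diagonal entries, and no clipping of the bounds to [-1, 1] is attempted.

```r
# Hypothetical helper (not part of the package): enumerate all simple paths
# between `from` and `to` and return the product of correlations along each.
path_costs <- function(corr_mat, from, to) {
  n <- nrow(corr_mat)
  costs <- numeric(0)
  walk <- function(node, visited, cost) {
    if (node == to) {
      costs <<- c(costs, cost)   # store the cost of a completed path
      return(invisible(NULL))
    }
    for (nxt in seq_len(n)) {
      w <- corr_mat[node, nxt]
      # follow only fixed (non-NA) correlations towards unvisited nodes
      if (nxt != node && !visited[nxt] && !is.na(w)) {
        visited_next <- visited
        visited_next[nxt] <- TRUE
        walk(nxt, visited_next, cost * w)
      }
    }
    invisible(NULL)
  }
  visited <- rep(FALSE, n)
  visited[from] <- TRUE
  walk(from, visited, 1)
  costs
}

# Hypothetical helper: range of an indirect correlation from its path costs.
indirect_bounds <- function(corr_mat, from, to, widen_factor = 0.2) {
  costs <- path_costs(corr_mat, from, to)
  if (length(costs) == 0) return(NULL)   # no path: nothing to estimate
  if (length(costs) == 1) {
    # unique path: cost +/- widen_factor * cost, sorted so lower <= upper
    b <- sort(costs + c(-1, 1) * widen_factor * costs)
  } else {
    b <- range(costs)                    # several paths: min and max cost
  }
  c(lower = b[1], upper = b[2])
}

# Toy graph: X1-X2 (0.8), X2-X3 (0.5), X1-X4 (-0.6), X3-X4 (-0.5)
m <- diag(4)
m[m == 0] <- NA
m[1, 2] <- m[2, 1] <- 0.8
m[2, 3] <- m[3, 2] <- 0.5
m[1, 4] <- m[4, 1] <- -0.6
m[3, 4] <- m[4, 3] <- -0.5

# Two paths link X1 and X3 (costs 0.4 and 0.3)
bounds_13 <- indirect_bounds(m, 1, 3)

# Chain X1-X2-X3 only: unique path with cost 0.4, widened by 20%
m3 <- m[1:3, 1:3]
bounds_unique <- indirect_bounds(m3, 1, 3)
```

In this toy case `bounds_13` spans the two path costs, while `bounds_unique` is the single cost 0.4 widened by `widen_factor`.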
Given the bounds of the indirect correlations, a constrained optimization problem is solved by minimizing the negative of the minimum eigenvalue of the correlation matrix. The starting values for the indirect correlations are set to the midpoints of the computed bounds. If the estimated matrix is not positive definite, the user can force a second optimization step in which the previously obtained matrix is approximated by its nearest positive definite matrix. The Frobenius norm is used to measure the distance between matrices. Note that this step may alter the initial fixed correlations.
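A minimal sketch of this optimization step follows, under loud assumptions: base R `optim` with method "L-BFGS-B" is swapped in for the `fmincon` solver so that the example has no extra dependencies, the bounds 0.32 and 0.48 are assumed to come from a unique path of cost 0.4 widened by 20%, and `fill_matrix` and `neg_min_eig` are hypothetical helper names. The `corr = TRUE` argument passed to `nearPD` is also an assumption about how the fallback might be called.

```r
# Toy incomplete correlation matrix: only corr(X1, X3) is unknown.
corr_mat <- matrix(c(1,   0.8, NA,
                     0.8, 1,   0.5,
                     NA,  0.5, 1), nrow = 3, byrow = TRUE)

# Positions of the unknown entries (upper triangle only)
idx <- which(is.na(corr_mat) & upper.tri(corr_mat), arr.ind = TRUE)

# Assumed bounds for the unknown correlation
lower <- 0.32
upper <- 0.48

# Hypothetical helper: insert candidate values and keep the matrix symmetric
fill_matrix <- function(x) {
  m <- corr_mat
  m[idx] <- x
  m[idx[, 2:1, drop = FALSE]] <- x
  m
}

# Objective: negative of the minimum eigenvalue of the completed matrix
neg_min_eig <- function(x) {
  -min(eigen(fill_matrix(x), symmetric = TRUE, only.values = TRUE)$values)
}

# Start from the midpoint of the bounds, as described in the text
x0 <- (lower + upper) / 2
fit <- optim(x0, neg_min_eig, method = "L-BFGS-B", lower = lower, upper = upper)

corr_fit <- fill_matrix(fit$par)
is_pd <- min(eigen(corr_fit, symmetric = TRUE, only.values = TRUE)$values) > 0

# Optional second step: nearest positive definite correlation matrix.
# This may change the fixed correlations as well.
if (!is_pd) {
  corr_fit <- as.matrix(Matrix::nearPD(corr_fit, corr = TRUE)$mat)
}
```

In this toy case every value inside the bounds already gives a positive definite matrix, so the `nearPD` branch is not taken; it is included only to show where the forced approximation would fit.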
An indirect correlation between two variables can be estimated only if they are linked by at least one path in the correlation graph. If no path exists for any of the declared indirect correlations, this function prints a warning message and plots the correlation graph to help with debugging.
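Whether a declared indirect correlation is estimable can be checked before calling the function with a simple reachability test. The sketch below is an assumption-laden illustration: `has_path` is a hypothetical helper, not a package function, and it treats every non-NA off-diagonal entry as an edge of the correlation graph.

```r
# Hypothetical helper: TRUE if `to` is reachable from `from` through
# fixed (non-NA) correlations, using a breadth-first search.
has_path <- function(corr_mat, from, to) {
  adj <- !is.na(corr_mat)
  diag(adj) <- FALSE
  reached <- from
  frontier <- from
  while (length(frontier) > 0) {
    # neighbours of the current frontier that have not been reached yet
    nbrs <- which(colSums(adj[frontier, , drop = FALSE]) > 0)
    frontier <- setdiff(nbrs, reached)
    reached <- c(reached, frontier)
  }
  to %in% reached
}

# Toy matrix: only X1-X2 is fixed, so X3 and X4 are isolated nodes
m <- diag(4)
m[m == 0] <- NA
m[1, 2] <- m[2, 1] <- 0.7

# All declared indirect correlations (NA entries in the upper triangle)
unknown <- which(is.na(m) & upper.tri(m), arr.ind = TRUE)
estimable <- apply(unknown, 1, function(p) has_path(m, p[1], p[2]))

# FALSE here: no declared indirect correlation is linked by a path
any_estimable <- any(estimable)
```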
A list containing
corrMatFinal | Matrix object containing the final correlation matrix with indirect correlations estimated |
optim | List of objects containing the outputs provided by the solver (fmincon of the pracma package) used for the constrained optimization. It is returned if the optimization step converges to a positive definite matrix, or if the optimization step fails and force_estimate is set to FALSE. |
optim1 | Same as optim. It is returned only when the constrained optimization does not converge to a positive definite correlation matrix and force_estimate is set to TRUE. |
optim2 | List of objects containing the outputs provided by the function nearPD of the Matrix package, which is used to approximate the matrix obtained from the constrained optimization problem with the nearest positive definite correlation matrix. It is returned only when the constrained optimization does not converge to a positive definite correlation matrix and force_estimate is set to TRUE. |
optimizationBounds | A matrix object with N(= number of indirect correlations) rows and 4 columns reporting:
|
Alessandro De Carlo alessandro.decarlo01@universitadipavia.it
nearPD
fmincon
estimate_corr_bounds
# define initial correlation structure
c_start <- diag(rep(1, 10))
c_start[1, 2] <- -0.6
c_start[1, 3] <- -0.75
c_start[2, 3] <- 0.95
c_start[2, 4] <- 0.75
c_start[2, 6] <- -0.6
c_start[2, 8] <- 0.75
c_start[3, 4] <- 0.6
c_start[3, 8] <- 0.75
c_start[4, 7] <- 0.6
c_start[4, 8] <- 0.75
c_start[5, 7] <- -0.95
# symmetric correlation structure
c_start <- c_start + t(c_start) - diag(rep(1, ncol(c_start)))
# assign NA to indirect correlations to be estimated
c_start[c_start == 0] <- NA
# names of variables
colnames(c_start) <- paste0("X", 1:10)
rownames(c_start) <- paste0("X", 1:10)
# plot initial correlation matrix
plot_graph_corr(c_start, "Graph of Initial Correlation Matrix")
# estimate the indirect correlations
r <- estimate_indirect_corr(c_start)
# see final output
plot_graph_corr(r$corrMatFinal, "Graph of Final Correlation Matrix")