estimate_indirect_corr: Estimation of Indirect Correlations

In AlessandroDeCarlo27/mvlognCorrEst: Sampling from a multivariate Log-Normal distribution using Log-Normal parameters and Estimation of Indirect Correlations

View source: R/estimate_indirect_corr.R

estimate_indirect_corr

R Documentation

Estimation of Indirect Correlations

Description

This function estimates indirect correlations starting from an incomplete correlation matrix supplied as input. The indirect correlations to be estimated must be marked with NA values in the input correlation matrix.

Usage

estimate_indirect_corr(
  corrMatStart,
  force_estimate = FALSE,
  widen_factor = 0.2
)

Arguments

`corrMatStart`	Matrix object representing a correlation matrix. Indirect correlations to be estimated must be indicated by `NA`. Because the matrix must be symmetric, each unknown correlation appears twice, so the matrix must contain at least two `NA` values.
`force_estimate`	Logical flag. When it is `TRUE` and the obtained correlation matrix is not positive definite, the matrix is approximated by the nearest positive definite matrix, with distance measured by the Frobenius norm. This approximation may alter the fixed initial correlations. When it is `FALSE` (the default), the approximation is skipped and a warning message is returned.
`widen_factor`	Number between 0 and 1. If there is a unique path, the range for that indirect correlation is computed as `cost +/- widen_factor*cost`, where `cost` is the cost of the unique existing path. The default value is 0.2.

Details

Indirect correlations are estimated by solving a constrained optimization problem. Starting from the fixed correlations, a correlation graph is built. Then, for each pair of variables whose indirect correlation is unknown (i.e. marked with NA), all possible paths between them are considered, without visiting any node more than once. The cost of each path is computed by multiplying the correlations along it. The maximum and minimum costs provide a reasonable range for the indirect correlation value. If no path exists between two nodes, that indirect correlation is not estimated and is automatically set to 0. If there is a unique path, the range for that indirect correlation is computed as cost +/- widen_factor*cost, where cost is the cost of the unique existing path. The default value of widen_factor is 0.2.
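The path-based bounds described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `path_costs` and `indirect_bounds` are hypothetical helper names, and using the absolute value of the cost for the half-width (so that `lower <= upper` also holds for a negative cost) is an assumption.

```r
# Hypothetical helper (not part of mvlognCorrEst): enumerate all simple paths
# between two nodes of the correlation graph and return the product of the
# correlations along each path.
path_costs <- function(corr, from, to) {
  costs <- numeric(0)
  walk <- function(node, visited, cost) {
    if (node == to) {
      costs <<- c(costs, cost)
      return(invisible(NULL))
    }
    # neighbours: nodes linked to `node` by a known, non-zero correlation
    nbrs <- which(!is.na(corr[node, ]) & corr[node, ] != 0)
    # never visit a node more than once
    for (nxt in setdiff(nbrs, visited)) {
      walk(nxt, c(visited, nxt), cost * corr[node, nxt])
    }
  }
  walk(from, from, 1)
  costs
}

# Hypothetical helper: turn the path costs into a (lower, upper) range.
indirect_bounds <- function(corr, from, to, widen_factor = 0.2) {
  costs <- path_costs(corr, from, to)
  if (length(costs) == 0) return(NULL)  # no path: nothing to estimate
  if (length(costs) == 1) {
    # unique path: cost +/- widen_factor * cost (abs() is an assumption)
    half_width <- widen_factor * abs(costs)
    return(c(lower = costs - half_width, upper = costs + half_width))
  }
  c(lower = min(costs), upper = max(costs))
}

# Four variables; the fixed correlations are (1,2), (2,3), (2,4) and (3,4)
corr <- diag(4)
corr[1, 2] <- 0.8
corr[2, 3] <- 0.5
corr[2, 4] <- 0.6
corr[3, 4] <- 0.7
corr <- corr + t(corr) - diag(4)
corr[corr == 0] <- NA

# Paths from 1 to 3: 1-2-3 (0.8 * 0.5) and 1-2-4-3 (0.8 * 0.6 * 0.7)
indirect_bounds(corr, 1, 3)
```

The recursion is exponential in the number of nodes in the worst case, which is acceptable for a small illustrative graph like this one.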

Given the bounds on the indirect correlations, a constrained optimization problem is solved by minimizing the negative of the minimum eigenvalue of the correlation matrix (that is, by maximizing the minimum eigenvalue). The starting values for the indirect correlations are set to the midpoints of the computed bounds. If the estimated matrix is not positive definite, the user can force a second step (`force_estimate = TRUE`) in which the previously obtained matrix is approximated by its nearest positive definite matrix. The Frobenius norm is used to measure the distance between matrices. Note that this step may alter the initial fixed correlations.
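As a rough illustration of the objective described above, the sketch below works through a three-variable case with a single unknown correlation. It is not the package's code: base R `optimize` stands in for the `fmincon` solver, the names `build_corr` and `neg_min_eig` are hypothetical, and the numeric values are made up for the example. Note that `optimize` performs a one-dimensional search over the interval and takes no starting value, so the midpoint is only evaluated for comparison.

```r
# Fixed correlations r12 and r23; r13 is the indirect correlation to estimate.
r12 <- 0.8
r23 <- 0.5

# Bounds from the unique path 1-2-3 with widen_factor = 0.2
cost  <- r12 * r23
lower <- cost - 0.2 * cost
upper <- cost + 0.2 * cost

# Rebuild the full correlation matrix for a candidate value of r13
build_corr <- function(r13) {
  matrix(c(1,   r12, r13,
           r12, 1,   r23,
           r13, r23, 1), nrow = 3, byrow = TRUE)
}

# Objective: negative of the minimum eigenvalue of the correlation matrix
neg_min_eig <- function(r13) {
  -min(eigen(build_corr(r13), symmetric = TRUE, only.values = TRUE)$values)
}

# Midpoint of the bounds (the starting value described in the text)
start <- (lower + upper) / 2
neg_min_eig(start)

# Stand-in solver: bounded one-dimensional minimization
fit <- optimize(neg_min_eig, interval = c(lower, upper))
corr_final <- build_corr(fit$minimum)

# Positive definite if the minimum eigenvalue is strictly positive
is_pos_def <- -fit$objective > 0
```

With several unknown correlations the same objective would take a vector argument and need a multivariate bounded solver; the one-dimensional case is used here only to keep the sketch dependency-free.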

An indirect correlation between two variables can be estimated only if they are linked by at least one path in the correlation graph. If no path exists for any of the declared indirect correlations, this function prints a warning message and plots the correlation graph to aid debugging.

Value

A list containing

`corrMatFinal`	Matrix object containing the final correlation matrix with indirect correlations estimated

`optim`	List of objects containing the outputs provided by the solver (`fmincon` of `pracma` package) used for the constrained optimization. It is returned if the optimization step converges to a positive definite matrix or if the optimization step fails and `force_estimate` is set to `FALSE`.

`optim1`	Same as `optim`. It is returned only when the constrained optimization does not converge to a positive definite correlation matrix and `force_estimate` is set to `TRUE`.

`optim2`	List of objects containing the outputs of the function `nearPD` of the `Matrix` package, which is used to replace the matrix obtained from the constrained optimization with its nearest positive definite correlation matrix. It is returned only when the constrained optimization does not converge to a positive definite correlation matrix and `force_estimate` is set to `TRUE`.

`optimizationBounds`	A matrix object with N (= number of indirect correlations) rows and 4 columns reporting: `var1`, the numerical index of `X1`, the first variable of the indirect correlation pair; `var2`, the numerical index of `X2`, the second variable of the indirect correlation pair; `lower`, the lower bound of the range of the indirect correlation between `var1` and `var2`; `upper`, the upper bound of the range of the indirect correlation between `var1` and `var2`.
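To see why the `nearPD` step behind `optim2` can change fixed correlations, the sketch below applies `Matrix::nearPD` directly to a small matrix that is not positive definite. It assumes the `Matrix` package is installed; the argument `corr = TRUE` and the example matrix are assumptions made for illustration and may differ from what `estimate_indirect_corr` passes internally.

```r
library(Matrix)  # provides nearPD()

# A symmetric matrix with unit diagonal that is NOT positive definite
bad <- matrix(c( 1.0, 0.9, -0.9,
                 0.9, 1.0,  0.9,
                -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)
min_eig_bad <- min(eigen(bad, symmetric = TRUE)$values)  # negative

# corr = TRUE (an assumption here) asks for a correlation matrix,
# i.e. the unit diagonal is preserved
approx_pd <- nearPD(bad, corr = TRUE)
fixed <- as.matrix(approx_pd$mat)

# The result is positive definite ...
min_eig_fixed <- min(eigen(fixed, symmetric = TRUE)$values)

# ... but the off-diagonal entries have moved away from their input values
max_change <- max(abs(fixed - bad))

# Frobenius norm of the difference, as reported by nearPD itself
frob_dist <- approx_pd$normF
```

Comparing `fixed` with `bad` entry by entry shows which of the original correlations were altered and by how much.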

Author(s)

Alessandro De Carlo alessandro.decarlo01@universitadipavia.it

Examples

# load the package
library(mvlognCorrEst)

# define initial correlation structure (upper triangle only)
c_start <- diag(rep(1, 10))
c_start[1, 2] <- -0.6
c_start[1, 3] <- -0.75
c_start[2, 3] <- 0.95
c_start[2, 4] <- 0.75
c_start[2, 6] <- -0.6
c_start[2, 8] <- 0.75
c_start[3, 4] <- 0.6
c_start[3, 8] <- 0.75
c_start[4, 7] <- 0.6
c_start[4, 8] <- 0.75
c_start[5, 7] <- -0.95
# make the correlation structure symmetric
c_start <- c_start + t(c_start) - diag(rep(1, ncol(c_start)))
# assign NA to indirect correlations to be estimated
c_start[c_start == 0] <- NA
# names of variables
colnames(c_start) <- paste(rep("X", 10), 1:10, sep = "")
rownames(c_start) <- paste(rep("X", 10), 1:10, sep = "")
# plot initial correlation matrix
plot_graph_corr(c_start, "Graph of Initial Correlation Matrix")
# estimate the indirect correlations
r <- estimate_indirect_corr(c_start)
# see final output
plot_graph_corr(r$corrMatFinal, "Graph of Final Correlation Matrix")

AlessandroDeCarlo27/mvlognCorrEst documentation built on March 23, 2023, 10:11 a.m.
