HGMND: Heterogeneous Graphical Model for Non-Negative Data

`HGMND` is the main function of the package. It estimates the conditional dependence matrices of the variables in each of several datasets.

HGMND(x,
     setting,
     h,
     centered,
     mat.adj,
     lambda1,
     lambda2,
     gamma   = 1,
     maxit   = 200,
     tol     = 1e-5,
     silent  = TRUE)

`x`	a list of data matrices sharing the same variables in their columns.
`setting`	a string indicating the data distribution; must be one of `"gaussian"`, `"gamma"`, or `"exp"`.
`h`	the function `h(x)` used in the h-generalized score matching loss, which returns a list containing `hx = h(x)` and its derivative `hpx = hp(x)`, where `x` is the data matrix. See details for more information.
`centered`	logical, if `centered = TRUE`, the data distribution is assumed centered with η = 0.
`mat.adj`	the adjacency matrix of the network among the multiple datasets, containing only 0s and 1s. Only the upper triangle of `mat.adj` is used.
`lambda1`	the non-negative tuning parameter that controls the sparsity level of the estimates.
`lambda2`	the non-negative tuning parameter that controls the homogeneity level of the estimates.
`gamma`	the step size parameter in ADMM. Defaults to `1`.
`maxit`	the maximum number of iterations. Defaults to `200`.
`tol`	the tolerance in the convergence criterion. Defaults to `1e-5`.
`silent`	logical, if `silent = FALSE`, the primal and dual feasibility and the time used in each ADMM iteration are printed to the console.

`h` can be generated by the function `get_h_hp` in the package genscore. For more details, see Yu, Lin & Gilks (2020) and Yu, Drton & Shojaie (2019); the full references are given below.
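As an illustration of the interface described for the `h` argument (a function of the data matrix that returns a list with `hx` and `hpx`), the sketch below writes such a function by hand in base R. The name `make_h_trunc` and the truncated-identity choice h(x) = min(x, c0) are assumptions made for this example only; whether `HGMND` accepts a hand-written `h` is not confirmed here, and `get_h_hp` remains the documented way to obtain one.

```r
# Hypothetical helper (not part of HGMND or genscore): builds an h function
# with h(x) = min(x, c0) and derivative h'(x) = 1 for x < c0, 0 otherwise.
make_h_trunc <- function(c0 = 3) {
  function(x) {
    list(hx  = pmin(x, c0),    # element-wise h(x), keeps the matrix shape
         hpx = (x < c0) * 1)   # element-wise derivative as a 0/1 matrix
  }
}

h_trunc <- make_h_trunc(3)

# Toy non-negative data matrix: 3 observations, 2 variables
x_toy <- matrix(c(0.5, 2, 4, 1, 3.5, 0.1), nrow = 3)
out   <- h_trunc(x_toy)

str(out)  # a list with two 3 x 2 matrices, hx and hpx
```

Both components have the same dimensions as the input matrix, which matches the description of `hx = h(x)` and `hpx = hp(x)` being evaluated on the data matrix.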

Suppose there are M datasets. The network among them is required to be connected and to have exactly M - 1 edges, and is therefore acyclic (a tree). This condition is sufficient for computational feasibility, and it does not prevent the method from being applied to diverse network structures.
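The requirement above (connected, with M - 1 edges) can be checked before calling `HGMND`. The following base R sketch builds a chain network and tests the condition with a breadth-first search. The helper names `make_chain_adj` and `is_tree_adj` are hypothetical and not part of the package, and the sketch assumes that only the strict upper triangle of `mat.adj` encodes edges.

```r
# Hypothetical helper: chain network 1 - 2 - ... - M as an upper-triangular
# 0/1 matrix (assumption: diagonal entries are irrelevant and left at 0).
make_chain_adj <- function(M) {
  adj <- matrix(0, M, M)
  if (M > 1) adj[cbind(1:(M - 1), 2:M)] <- 1
  adj
}

# Hypothetical helper: TRUE if the upper triangle of adj describes a
# connected network with exactly M - 1 edges.
is_tree_adj <- function(adj) {
  M   <- nrow(adj)
  up  <- adj * upper.tri(adj)   # keep only the strict upper triangle
  sym <- up + t(up)             # symmetrise for neighbour look-up
  if (sum(up) != M - 1) return(FALSE)
  visited    <- rep(FALSE, M)
  visited[1] <- TRUE
  queue      <- 1
  while (length(queue) > 0) {   # breadth-first search from dataset 1
    node  <- queue[1]
    queue <- queue[-1]
    nbrs  <- which(sym[node, ] > 0 & !visited)
    visited[nbrs] <- TRUE
    queue <- c(queue, nbrs)
  }
  all(visited)
}

chain4 <- make_chain_adj(4)
is_tree_adj(chain4)

# Three edges on four datasets, but dataset 4 is isolated
bad <- matrix(0, 4, 4)
bad[1, 2] <- bad[1, 3] <- bad[2, 3] <- 1
is_tree_adj(bad)
```

Counting edges alone is not enough: the second matrix has M - 1 edges but contains a cycle and leaves one dataset unreachable, so the connectivity check is what rejects it.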

`HGMND` returns a list containing the estimated conditional dependence matrix of each dataset, with the following components:

`Theta`	a 3-dimensional array containing the estimates of the conditional dependence matrices. The third dimension indexes the datasets.
`M`	an integer, the number of datasets.
`P`	an integer, dimension of the random vector of interest.
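To show how an array shaped like `Theta` might be read, the sketch below extracts the edges (non-zero off-diagonal entries) of each slice. It uses a small hand-made array, `Theta_toy`, instead of real `HGMND` output; the helper `edges_of` and the threshold `eps` are assumptions made for illustration.

```r
# Toy stand-in for result$Theta: P = 3 variables, M = 2 datasets
P <- 3
M <- 2
Theta_toy <- array(0, dim = c(P, P, M))
Theta_toy[, , 1] <- matrix(c(1, 0.4, 0,
                             0.4, 1, 0,
                             0, 0, 1), P, P)
Theta_toy[, , 2] <- matrix(c(1, 0.4, 0,
                             0.4, 1, -0.2,
                             0, -0.2, 1), P, P)

# Hypothetical helper: row/column indices of upper-triangular entries whose
# absolute value exceeds eps, i.e. the estimated edges of one dataset.
edges_of <- function(theta, eps = 1e-8) {
  which(abs(theta) > eps & upper.tri(theta), arr.ind = TRUE)
}

# One edge list per dataset, taken along the third dimension
edge_list <- lapply(seq_len(dim(Theta_toy)[3]),
                    function(m) edges_of(Theta_toy[, , m]))
n_edges   <- sapply(edge_list, nrow)
n_edges
```

In this toy array the edge between variables 1 and 2 appears in both slices, while the edge between variables 2 and 3 appears only in the second.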

Yu, S., Drton, M., & Shojaie, A. (2019). Generalized Score Matching for Non-Negative Data. J. Mach. Learn. Res., 20(76), 1-71.

Yu, S., Lin, L., & Gilks, W. (2020). genscore: Generalized Score Matching Estimators. R package version 1.0.2. https://CRAN.R-project.org/package=genscore.

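# Example: HGMND with simulated data
library(HGMND)
data(HGMND_SimuData)

# h function for the generalized score matching loss (mode "mcp", parameters 1 and 5)
h              <- genscore::get_h_hp("mcp", 1, 5)
# Rescale each column of every dataset without centering
HGMND_SimuData <- lapply(HGMND_SimuData, function(x) scale(x, center = FALSE))
# Chain network among the datasets: dataset m is linked to dataset m + 1
mat.chain      <- diag(length(HGMND_SimuData))
diag(mat.chain[-nrow(mat.chain), -1]) <- 1

result <- HGMND(x        = HGMND_SimuData,
                setting  = "gaussian",
                h        = h,
                centered = FALSE,
                mat.adj  = mat.chain,
                lambda1  = 0.086,
                lambda2  = 3.6,
                gamma    = 1,
                tol      = 1e-3,
                silent   = TRUE)
# 3-dimensional array of estimates; the third dimension indexes the datasets
Theta          <- result[["Theta"]]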