Description
This function efficiently conducts model selection via cross validation for learning time-varying graphical models. It performs a grid search over h (the bandwidth of the kernel smoothed sample covariance/correlation matrix), d (the width of the neighborhood centered at each position where a graph is estimated) and lambda (the tuning parameter of the lasso penalty at each position where a graph is estimated).
Usage

loggle.cv(X, pos = 1:ncol(X),
  h.list = seq(0.1, 0.3, 0.05),
  d.list = c(0, 0.001, 0.01, 0.025, 0.05, 0.075,
    0.1, 0.15, 0.2, 0.25, 0.3, 1),
  lambda.list = seq(0.15, 0.35, 0.02), cv.fold = 5,
  fit.type = "pseudo", return.select = TRUE,
  select.type = "all_flexible", cv.vote.thres = 0.8,
  early.stop.thres = 5, epi.abs = 1e-4, epi.rel = 1e-2,
  max.step = 500, detrend = TRUE, fit.corr = TRUE,
  h.correct = TRUE, num.thread = 1, print.detail = TRUE)
Arguments

X: a p by N data matrix containing observations on a time grid ranging from 0 to 1: p – number of variables, N – number of time points. The nominal time for the k-th time point is (k-1)/(N-1).

pos: a vector constituting a subset of {1, 2, ..., N}: indices of the time points where graphs are estimated, default = 1:N.

h.list: a vector with values between 0 and 1: a grid of bandwidths for the kernel smoothed sample covariance/correlation matrix, default = seq(0.1, 0.3, 0.05).

d.list: a vector with values between 0 and 1: a grid of widths of the neighborhood centered at each time point specified by pos, default = c(0, 0.001, 0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.25, 0.3, 1).

lambda.list: a vector: a grid of tuning parameters of the lasso penalty at each time point specified by pos, default = seq(0.15, 0.35, 0.02).

cv.fold: a scalar: number of cross-validation folds, default = 5.

fit.type: a string: "likelihood" – likelihood estimation, "pseudo" – pseudo likelihood estimation, or "space" – sparse partial correlation estimation, default = "pseudo".

return.select: logical: if TRUE, return the model selection result, default = TRUE.

select.type: a string: "all_flexible" – the optimal d and lambda can vary across time points, "d_fixed" – the optimal d is fixed while the optimal lambda can vary across time points, "all_fixed" – the optimal d and lambda are fixed across time points, default = "all_flexible".

cv.vote.thres: a scalar between 0 and 1: an edge is kept after cv.vote if and only if it exists in no less than cv.vote.thres*cv.fold cross-validation folds, default = 0.8.

early.stop.thres: a scalar: stopping criterion in grid search – the grid search over smaller lambda's stops when the ratio between the number of detected edges and the number of variables exceeds early.stop.thres, default = 5.

epi.abs: a scalar or a vector of the same length as d.list: absolute tolerance in the ADMM stopping criterion, default = 1e-4.

epi.rel: a scalar or a vector of the same length as d.list: relative tolerance in the ADMM stopping criterion, default = 1e-2.

max.step: an integer: maximum number of steps in the ADMM iteration, default = 500.

detrend: logical: if TRUE, subtract a kernel weighted moving average from each variable in the data matrix (i.e., detrending); if FALSE, subtract the overall average from each variable (i.e., centering), default = TRUE.

fit.corr: logical: if TRUE, use the sample correlation matrix in model fitting; if FALSE, use the sample covariance matrix, default = TRUE.

h.correct: logical: if TRUE, adjust the bandwidth for validation sets to account for the difference in sample size, default = TRUE.

num.thread: an integer: number of threads used in parallel computing, default = 1.

print.detail: logical: if TRUE, print details of the model fitting procedure, default = TRUE.
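The nominal-time convention for the columns of X can be checked with a short sketch (N is arbitrary here):

```r
# Nominal time of the k-th time point is (k-1)/(N-1), spanning [0, 1].
N <- 100
tk <- (seq_len(N) - 1) / (N - 1)
range(tk)  # 0 to 1
```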
Details

This function conducts a grid search for the optimal h, d and lambda. It is a wrapper of loggle.cv.h and calls loggle.cv.h for each candidate h.
The model fitting methods based on pseudo-likelihood (fit.type = "pseudo" or fit.type = "space") are usually less computationally intensive than the one based on likelihood (fit.type = "likelihood"), with similar model fitting performance.
select.type = "all_flexible" is for the situation where both the extent of structure smoothness (controlled by d) and the extent of graph sparsity (controlled by lambda) are expected to vary across time points. If only the extent of graph sparsity varies across time points, select.type = "d_fixed" should be used. If both are expected to be similar across time points, select.type = "all_fixed" should be used.
cv.vote.thres controls the trade-off between false discovery rate and power in model selection. A large value of cv.vote.thres decreases the false discovery rate but also reduces power.
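The cv.vote rule can be sketched as follows; the per-fold adjacency matrices below are random stand-ins for illustration, not the package's internal objects:

```r
# Sketch of cv.vote: an edge is kept if and only if it is detected
# in at least cv.vote.thres * cv.fold training folds.
set.seed(1)
cv.fold <- 5
cv.vote.thres <- 0.8
p <- 4
# hypothetical estimated adjacency matrices, one per training fold
adj.folds <- replicate(cv.fold,
  matrix(rbinom(p * p, 1, 0.5), p, p), simplify = FALSE)
vote <- Reduce(`+`, adj.folds)                   # per-edge detection counts
adj.voted <- (vote >= cv.vote.thres * cv.fold) * 1
```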
early.stop.thres is used as an early stopping criterion. With a limited sample size, a very complicated model seldom achieves a competitive cv score. Therefore, in the grid search over lambda, when the estimated model is already large (as determined by early.stop.thres), we stop fitting models for smaller lambda's, as they usually lead to even more complicated models. In this case, the output corresponding to the remaining lambda's will be NA.
If no pre-processing has been done to the data matrix X, detrend = TRUE is recommended: each variable in the data matrix is detrended by subtracting its kernel weighted moving average.
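A minimal sketch of the kind of kernel weighted moving average that detrending subtracts (the package's kernel choice and bandwidth handling may differ):

```r
# Detrend one variable by subtracting a Gaussian-kernel weighted
# moving average over the nominal time grid (illustration only).
N <- 100
tgrid <- (0:(N - 1)) / (N - 1)
x <- sin(2 * pi * tgrid) + rnorm(N, sd = 0.2)  # one variable over time
h <- 0.1                                       # smoothing bandwidth
trend <- sapply(tgrid, function(t) {
  w <- dnorm((tgrid - t) / h)                  # kernel weights centered at t
  sum(w * x) / sum(w)
})
x.detrended <- x - trend
```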
fit.corr = TRUE is recommended so that all variables are on the same scale. If fit.corr = FALSE is used, the default values of lambda may need to be changed accordingly.
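The effect of fitting on the correlation matrix is to put all variables on a common scale, as converting a covariance matrix to a correlation matrix does; a quick sketch with made-up data:

```r
# Correlation puts all variables on a common scale, so a single lambda
# grid is comparable across variables (illustration only).
set.seed(2)
X.demo <- matrix(rnorm(200), 50, 4)
X.demo[, 1] <- 100 * X.demo[, 1]   # one variable on a much larger scale
S <- cov(X.demo)                   # scale-dependent entries
R <- cov2cor(S)                    # unit diagonal, off-diagonals in [-1, 1]
```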
h.correct = TRUE is recommended when calculating the kernel smoothed sample covariance matrix for validation sets, because the bandwidth h should reflect the difference in sample size between training and validation sets.
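As a hypothetical illustration of why such a correction matters, the standard nonparametric rate h ∝ n^(-1/5) suggests scaling the bandwidth by the training/validation sample size ratio; this is a sketch of the idea, not necessarily the package's exact rule:

```r
# Hypothetical bandwidth adjustment based on the rate h ~ n^(-1/5);
# the package's actual correction may differ.
N <- 500
cv.fold <- 5
h.train <- 0.2
n.train <- N * (cv.fold - 1) / cv.fold   # samples in a training set
n.val <- N / cv.fold                     # samples in a validation set
h.val <- h.train * (n.train / n.val)^(1/5)
```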
Value

cv.result.h: a list of model fitting results from loggle.cv.h for each h in h.list.

cv.select.result: results from loggle.cv.select if return.select = TRUE.
Author(s)

Yang, J. and Peng, J.

References

Yang, J. & Peng, J. (2018), 'Estimating Time-Varying Graphical Models', arXiv preprint arXiv:1804.03811.
See Also

loggle for learning time-varying graphical models, loggle.cv.select for model selection based on cross validation results, loggle.cv.vote for learning time-varying graphical models using cv.vote, loggle.cv.h for learning time-varying graphical models via cross validation (with h fixed), loggle.cv.select.h for model selection based on cross validation results (with h fixed), loggle.refit for model refitting based on estimated graphs, example for the description of an example dataset of time-varying graphs, example_source for generating the example dataset of time-varying graphs, stockdata for the description of a stock price dataset of S&P 500 companies, stockdata_source for obtaining the stock price dataset of S&P 500 companies.
Examples

data(example)  # load example dataset
# data matrix and true precision matrices
X <- example$X
Omega.true <- example$Omega.true
dim(X) # dimension of data matrix
p <- nrow(X) # number of variables
# positions of time points to estimate graphs
pos <- round(seq(0.1, 0.9, length=9)*(ncol(X)-1)+1)
K <- length(pos)
# estimate time-varying graphs and conduct model
# selection via cross-validation
# num.thread can be set as large as the number of cores
# on a multi-core machine (however, when p is large,
# be cautious about memory overflow)
ts <- proc.time()
result <- loggle.cv(X, pos, h.list = c(0.2, 0.25),
  d.list = c(0, 0.05, 0.15, 1),
  lambda.list = c(0.2, 0.25), cv.fold = 3,
  fit.type = "pseudo", cv.vote.thres = 1,
  num.thread = 1)
te <- proc.time()
sprintf("Time used for loggle.cv: %.2fs", (te-ts)[3])
# optimal values of h, d and lambda, and number of
# selected edges at each time point
select.result <- result$cv.select.result
print(cbind("time" = seq(0.1, 0.9, length=9),
"h.opt" = rep(select.result$h.opt, K),
"d.opt" = select.result$d.opt,
"lambda.opt" = select.result$lambda.opt,
"edge.num.opt" = select.result$edge.num.opt))
# false discovery rate (FDR) and power based on
# true precision matrices for selected model
edge.num.opt <- select.result$edge.num.opt
edge.num.true <- sapply(1:K, function(i)
(sum(Omega.true[[pos[i]]]!=0)-p)/2)
edge.num.overlap <- sapply(1:K, function(i)
(sum(select.result$adj.mat.opt[[i]]
& Omega.true[[pos[i]]])-p)/2)
perform.matrix <- cbind(
"FDR" = 1 - edge.num.overlap / edge.num.opt,
"power" = edge.num.overlap / edge.num.true)
print(apply(perform.matrix, 2, mean))