Description Usage Arguments Details Value Author(s) References See Also Examples
Description

This function efficiently learns time-varying graphical models for a given set of tuning parameters. In contrast to loggle, cv.vote is also applied in estimating the time-varying graphs.
Usage
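A sketch of a typical call, assembled from the argument list below. The values shown for d, lambda and cv.vote.thres are taken from the example at the end of this page rather than from the package source, so treat them as placeholders:

result <- loggle.cv.vote(X, pos = 1:ncol(X), h = 0.8*ncol(X)^(-1/5),
                         d = 0.15, lambda = 0.25, cv.fold = 5,
                         fit.type = "pseudo", refit = TRUE, cv.vote.thres = 0.8,
                         epi.abs = 1e-5, epi.rel = 1e-3, max.step = 500,
                         detrend = TRUE, fit.corr = TRUE, num.thread = 1,
                         print.detail = TRUE)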
Arguments

X: a p by N data matrix containing observations on a time grid ranging from 0 to 1, where p is the number of variables and N is the number of time points. The nominal time for the k-th time point is (k-1)/(N-1).

pos: a vector constituting a subset of {1, 2, ..., N}: indices of the time points at which graphs are estimated, default = 1:N (see the sketch after this argument list).

h: a scalar between 0 and 1: bandwidth of the kernel smoothed sample covariance/correlation matrix, default = 0.8*N^(-1/5).
d: a scalar or a vector of the same length as pos, with values between 0 and 1: width of the temporal neighborhood centered at each time point specified by pos, used when estimating the graph structure.
lambda: a scalar or a vector of the same length as pos: tuning parameter of the lasso (sparsity) penalty at each time point specified by pos.
cv.fold: a scalar: number of cross-validation folds, default = 5.

fit.type: a string: "likelihood" – likelihood estimation, "pseudo" – pseudo likelihood estimation, or "space" – sparse partial correlation estimation, default = "pseudo".
refit: logical: if TRUE, conduct model refitting given the learned graph structures, default = TRUE.
cv.vote.thres: a scalar between 0 and 1: an edge is kept after cv.vote if and only if it appears in no less than cv.vote.thres * cv.fold of the cross-validation folds.
epi.abs: a scalar: absolute tolerance in the ADMM stopping criterion, default = 1e-5.

epi.rel: a scalar: relative tolerance in the ADMM stopping criterion, default = 1e-3.

max.step: an integer: maximum number of ADMM iteration steps, default = 500.
detrend: logical: if TRUE, subtract the kernel weighted moving average of each variable in the data matrix (i.e., detrending); if FALSE, subtract the overall average of each variable (i.e., centering), default = TRUE.

fit.corr: logical: if TRUE, use the sample correlation matrix in model fitting; if FALSE, use the sample covariance matrix, default = TRUE.
num.thread: an integer: number of threads used in parallel computing, default = 1.

print.detail: logical: if TRUE, print details of the model fitting procedure, default = TRUE.
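For example, using the nominal-time formula given for X above, the indices in pos for a set of target nominal times can be built as follows (this is the same construction used in the Examples; X is the p by N data matrix):

# indices of the time points whose nominal times are closest to 0.1, 0.2, ..., 0.9
N <- ncol(X)                                    # number of time points
pos <- round(seq(0.1, 0.9, length = 9) * (N - 1) + 1)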
Details

This function implements cross-validation for time-varying graph estimation: loggle is applied to each cross-validation fold to obtain fold-wise estimated time-varying graphs, and cv.vote is then applied across the folds to obtain the final estimated time-varying graphs.
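As an illustration of the voting step (a minimal sketch, not the package's internal code; cv_vote_edges and adj.fold are hypothetical names), an edge is retained when it appears in at least cv.vote.thres * cv.fold of the fold-wise graphs:

cv_vote_edges <- function(adj.fold, cv.vote.thres = 0.8) {
  # adj.fold: hypothetical list of p x p fold-wise adjacency (or precision) matrices
  cv.fold <- length(adj.fold)
  vote.count <- Reduce(`+`, lapply(adj.fold, function(A) A != 0))  # edge votes
  vote.count >= cv.vote.thres * cv.fold   # TRUE where an edge is kept
}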
The model fitting methods based on pseudo-likelihood (fit.type = "pseudo" or fit.type = "space") are usually less computationally intensive than the one based on likelihood (fit.type = "likelihood"), with similar model fitting performance.
cv.vote.thres controls the trade-off between false discovery rate and power. A larger value of cv.vote.thres decreases the false discovery rate but also reduces power.
If no pre-processing has been done to the data matrix X, detrend = TRUE is recommended, so that each variable in the data matrix is detrended by subtracting its kernel weighted moving average.
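Conceptually, the detrending step for a single variable looks like the sketch below (the Gaussian kernel and the bandwidth h.detrend are assumptions for illustration; the package's own kernel and bandwidth handling may differ):

detrend_variable <- function(x, h.detrend = 0.1) {
  # x: observations of one variable at nominal times (0:(N-1))/(N-1)
  N <- length(x)
  tt <- (0:(N - 1)) / (N - 1)
  trend <- sapply(tt, function(t0) {
    w <- dnorm((tt - t0) / h.detrend)   # kernel weights centered at t0
    sum(w * x) / sum(w)                 # kernel weighted moving average at t0
  })
  x - trend                             # detrended series
}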
fit.corr = TRUE is recommended so that all variables are on the same scale. If fit.corr = FALSE is used, the default value of lambda may need to be changed accordingly.
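For intuition, a kernel smoothed sample covariance/correlation matrix at nominal time t0 (the quantity that h and fit.corr refer to) can be sketched as follows; this is an illustration under a Gaussian kernel, not the package's exact implementation:

kernel_corr <- function(X, t0, h, fit.corr = TRUE) {
  # X: p x N data matrix, already detrended or centered
  N <- ncol(X)
  tt <- (0:(N - 1)) / (N - 1)
  w <- dnorm((tt - t0) / h)
  w <- w / sum(w)                   # normalized kernel weights
  S <- X %*% (w * t(X))             # p x p kernel weighted sample covariance
  if (fit.corr) cov2cor(S) else S   # correlation puts all variables on one scale
}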
Value

result.fold: a list of model fitting results from loggle for each cross-validation fold.

Omega: a list of estimated precision matrices at the time points specified by pos.

edge.num: a vector of the numbers of edges at the time points specified by pos.

edge: a list of edges at the time points specified by pos.
Author(s)

Yang, J. and Peng, J.

References

Yang, J. and Peng, J. (2018). 'Estimating Time-Varying Graphical Models'. arXiv preprint arXiv:1804.03811.

See Also

loggle for learning time-varying graphical models, loggle.cv for learning time-varying graphical models via cross-validation, loggle.cv.select for model selection based on cross-validation results.
Examples

library(loggle)   # load the loggle package (provides loggle.cv.vote and the example dataset)
data(example)     # load example dataset
# data matrix and true precision matrices
X <- example$X
Omega.true <- example$Omega.true
dim(X) # dimension of data matrix
p <- nrow(X) # number of variables
# positions of time points to estimate graphs
pos <- round(seq(0.1, 0.9, length=9)*(ncol(X)-1)+1)
K <- length(pos)
# estimate time-varying graphs
# num.thread can be set as large as number of cores
# on a multi-core machine (however when p is large,
# memory overflow should also be taken caution of)
ts <- proc.time()
result <- loggle.cv.vote(X, pos, h = 0.1, d = 0.15,
  lambda = 0.25, cv.fold = 3, fit.type = "pseudo",
  refit = TRUE, cv.vote.thres = 0.8, num.thread = 1)
te <- proc.time()
sprintf("Time used for loggle.cv.vote: %.2fs", (te-ts)[3])
# number of edges at each time point
print(cbind("time" = seq(0.1, 0.9, length=9),
"edge.num" = result$edge.num))
# graph at each time point
library(igraph)
par(mfrow = c(3, 3))
for(k in 1:length(pos)) {
  adj.matrix <- result$Omega[[k]] != 0
  net <- graph.adjacency(adj.matrix, mode = "undirected", diag = FALSE)
  set.seed(0)
  plot(net, vertex.size = 10, vertex.color = "lightblue",
       vertex.label = NA, edge.color = "black", layout = layout.circle)
  # nominal time of the k-th estimated graph is (pos[k]-1)/(N-1)
  title(main = paste("t =", round((pos[k]-1)/(ncol(X)-1), 2)), cex.main = 0.8)
}
# false discovery rate (FDR) and power based on
# true precision matrices
edge.num.true <- sapply(1:K, function(i)
  (sum(Omega.true[[pos[i]]] != 0) - p)/2)
edge.num.overlap <- sapply(1:K, function(i)
  (sum(result$Omega[[i]] & Omega.true[[pos[i]]]) - p)/2)
perform.matrix <- cbind(
  "FDR" = 1 - edge.num.overlap / result$edge.num,
  "power" = edge.num.overlap / edge.num.true)
print(apply(perform.matrix, 2, mean))