tbfl: Threshold block fused lasso (TBFL) algorithm for change point...

Description Usage Arguments Value Author(s) Examples

View source: R/LinearDetect-package.R

Description

Perform the threshold block fused lasso (TBFL) algorithm to detect the structural breaks in large scale high-dimensional non-stationary linear regression models.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
tbfl(
  method,
  data_y,
  data_x = NULL,
  lambda.1.cv = NULL,
  lambda.2.cv = NULL,
  q = 1,
  max.iteration = 100,
  tol = 10^(-2),
  block.size = NULL,
  blocks = NULL,
  refit = FALSE,
  fixed_index = NULL,
  HBIC = FALSE,
  gamma.val = NULL,
  optimal.block = TRUE,
  optimal.gamma.val = 1.5,
  block.range = NULL
)

Arguments

method

method name for the model: Constant: Mean-shift Model; MvLR: Multivariate Linear Regression; MLR: Multiple Linear Regression; VAR: Vector autoregression; GGM: Gaussian graphical model

data_y

input data matrix (response), with each column representing the time series component

data_x

input data matrix (predictor), with each column representing the time series component

lambda.1.cv

tuning parmaeter lambda_1 for fused lasso

lambda.2.cv

tuning parmaeter lambda_2 for fused lasso

q

the AR order

max.iteration

max number of iteration for the fused lasso

tol

tolerance for the fused lasso

block.size

the block size

blocks

the blocks

refit

logical; if TRUE, refit the model, if FALSE, use BIC to find a thresholding value and then output the parameter estimates without refitting. Default is FALSE.

fixed_index

index for linear regression model with only partial compoenents change.

HBIC

logical; if TRUE, use high-dimensional BIC, if FALSE, use orginal BIC. Default is FALSE.

gamma.val

hyperparameter for HBIC, if HBIC == TRUE.

optimal.block

logical; if TRUE, grid search to find optimal block size, if FALSE, directly use the default block size. Default is TRUE.

optimal.gamma.val

hyperparameter for optimal block size, if optimal.blocks == TRUE. Default is 1.5.

block.range

the search domain for optimal block size.

Value

A list object, which contains the followings

cp.first

a set of selected break point after the first block fused lasso step

cp.final

a set of selected break point after the final exhaustive search step

beta.hat.list

a list of estimated parameter coefficient matrices for each stationary segementation

beta.est

a list of estimated parameter coefficient matrices for each block

beta.final

a list of estimated parameter coefficient matrices for each stationary segementation, using BIC thresholding or refitting the model.

beta.full.final

For GGM only. A list of p \times p matrices for each stationary segementation. The off-diagonal entries are same as the beta.final.

jumps

The change (jump) of the values in estimated parameter coefficient matrix.

bn.optimal

The optimal block size.

bn.range

The values of block size in grid search.

HBIC.full

The HBIC values.

pts.full

The selected change points for each block size.

Author(s)

Yue Bai, baiyue@ufl.edu

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
#### constant model
TT <- 10^3; # number of observations/samples
p.y <- 50; # dimension of observed Y
brk <- c(floor(TT/3),floor(2*TT/3), TT+1)
m <- length(brk)
d <- 5 #number of non-zero coefficient
### generate coefficient
constant.full <- matrix(0, p.y, m)
set.seed(1)
constant.full[sample(1:p.y, d, replace = FALSE), 1] <- runif(d, -1, -0.5);
constant.full[sample(1:p.y, d, replace = FALSE), 2] <- runif(d, 0.5, 1);
constant.full[sample(1:p.y, d, replace = FALSE), 3] <- runif(d, -1, -0.5);
e.sigma <- as.matrix(1*diag(p.y))
try <- constant.sim.break(nobs = TT, cnst = constant.full, sigma = e.sigma, brk = brk)
data_y <- try$series_y; data_y <- as.matrix(data_y, ncol = p.y)
### Fit the model
method <- c("Constant")
temp <- tbfl(method, data_y, block.size = 40, optimal.block = FALSE) #use a single block size
temp$cp.final
temp$beta.final
temp <- tbfl(method, data_y) # using optimal block size




#### multiple linear regression
TT <- 2*10^3; # number of observations/samples
p.y <- 1; # dimension of observed Y
p.x <- 20
brk <- c(floor(TT/4), floor(2*TT/4), floor(3*TT/4), TT+1)
m <- length(brk)
d <- 15 #number of non-zero coefficient
###generate coefficient beta
beta.full <- matrix(0, p.y, p.x*m)
set.seed(1)
aa <- c(-3, 5, -3, 3)
for(i in 1:m){beta.full[1, (i-1)*p.x+sample(1:p.x, d, replace = FALSE)] <- aa[i] + runif(d, -1, 1);}
e.sigma <- as.matrix(1*diag(p.y))
try <- lm.sim.break(nobs = TT, px = p.x, phi = beta.full, sigma = e.sigma, sigma_x = 1, brk = brk)
data_y <- try$series_y; data_y <- as.matrix(data_y, ncol = p.y)
data_x <- try$series_x; data_x <- as.matrix(data_x)
### Fit the model
method <- c("MLR")
temp <- tbfl(method, data_y, data_x)
temp$cp.final #change points
temp$beta.final #final estimated parameters (after BIC threshold)
temp_refit <- tbfl(method, data_y, data_x, refit = TRUE)
temp_refit$beta.final #final estimated parameters (refitting the model)




#### Gaussian Graphical model
TT <- 3*10^3; # number of observations/samples
p.x <- 20 # dimension of obsrved X
# TRUE BREAK POINTS WITH T+1 AS THE LAST ELEMENT
brk <- c(floor(TT/3), floor(2*TT/3), TT+1)
m <- length(brk)
###generate precision matrix and covariance matrix
eta = 0.1
d <- ceiling(p.x*eta)
sigma.full <- matrix(0, p.x, p.x*m)
omega.full <- matrix(0, p.x, p.x*m)
aa <- 1/d
for(i in 1:m){
if(i%%2==1){
ajmatrix <- matrix(0, p.x, p.x)
for(j in 1:(floor(p.x/5)) ){
ajmatrix[ ((j-1)*5+1): (5*j), ((j-1)*5+1): (5*j)] <- 1
}
}
if(i%%2==0){
ajmatrix <- matrix(0, p.x, p.x)
for(j in 1:(floor(p.x/10)) ){
ajmatrix[ seq(((j-1)*10+1), (10*j), 2), seq(((j-1)*10+1), (10*j), 2)] <- 1
ajmatrix[ seq(((j-1)*10+2), (10*j), 2), seq(((j-1)*10+2), (10*j), 2)] <- 1
}
}
theta <- aa* ajmatrix
# force it to be positive definite
if(min(eigen(theta)$values) <= 0){
print('add noise')
theta = theta - (min(eigen(theta)$values)-0.05) * diag(p.x)
}
sigma.full[, ((i-1)*p.x+1):(i*p.x)]  <- as.matrix(solve(theta))
omega.full[, ((i-1)*p.x+1):(i*p.x)]  <- as.matrix(theta)
}
# simulate data
try <- ggm.sim.break(nobs = TT, px = p.x, sigma = sigma.full, brk = brk)
data_y <- try$series_x; data_y <- as.matrix(data_y)
### Fit the model
method <- c("GGM")
#use a single block size
temp <- tbfl(method,data_y = data_y,block.size = 80,optimal.block = FALSE)
temp$cp.final #change points
temp$beta.final

LinearDetect documentation built on March 22, 2021, 9:06 a.m.