RF: Adaptive Fence model selection (Restricted Fence)

View source: R/methodRF.R

Description

Adaptive Fence model selection (Restricted Fence)

Usage

RF(full, data, groups, B = 100, grid = 101, bandwidth = NA,
  plot = FALSE, method = c("marginal", "conditional"), id = "id",
  cpus = parallel::detectCores())

Arguments

full

formula of full model

data

Data frame containing the variables used in the model formulas

groups

A list of formulas, one (full) model for each bin (group) of variables

B

Number of bootstrap samples (parametric bootstrap for lmer)

grid

grid for c

bandwidth

Bandwidth for the kernel smoothing function

plot

Logical; whether to draw the plot

method

Either the marginal (GEE) or the conditional approach

id

Subject or cluster id variable

cpus

Number of CPU cores to use for parallel computation
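
As an illustration of what the full and groups arguments look like, the sketch below builds one formula per bin with base R's reformulate(). The variable names and the split into two bins are assumptions chosen for illustration; any way of producing a list of formulas should serve equally well.

```r
# Hypothetical bins of predictor names (illustrative, not prescribed by RF())
bin1_vars <- c("V.1", "I.1", "I.2a", "I.2b", paste0("V", 6:8))
bin2_vars <- paste0("V", 9:16)

# Full model: response y on all predictors from both bins
full <- reformulate(c(bin1_vars, bin2_vars), response = "y")

# One formula per bin, collected in a list for the groups argument
groups <- list(
  reformulate(bin1_vars, response = "y"),
  reformulate(bin2_vars, response = "y")
)
```

Building the formulas from character vectors keeps the bin membership in one place, so the full model and the per-bin models cannot drift apart.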

Details

In Jiang et al. (2008), the adaptive c value is chosen from the highest peak in the p* vs. c plot. In Jiang et al. (2009), a 95% CI is taken into account when making this adaptive choice of c. In Thuan Nguyen et al. (2014), the adaptive c value is chosen from the first peak. This approach works better with moderate sample sizes or weak signals. Empirically, the first peak becomes the highest peak as the sample size increases or the signals become stronger.
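
The difference between the "first peak" and "highest peak" rules can be sketched in a few lines of base R. The p* values below are made up for illustration, and the local-peak definition (no smaller than the left neighbour, strictly larger than the right one) is an assumption; the package may treat ties, boundaries, and confidence intervals differently.

```r
# Hypothetical p* values on a grid of c (illustrative numbers, not package output)
cs    <- seq(0, 10, length.out = 11)
pstar <- c(0.40, 0.55, 0.70, 0.60, 0.50, 0.65, 0.90, 0.80, 0.60, 0.50, 0.45)

# Assumed peak rule: >= left neighbour and > right neighbour
is_peak <- function(p) {
  n     <- length(p)
  left  <- c(-Inf, p[-n])
  right <- c(p[-1], -Inf)
  p >= left & p > right
}

peaks        <- which(is_peak(pstar))
first_peak   <- peaks[1]                        # first-peak rule
highest_peak <- peaks[which.max(pstar[peaks])]  # highest-peak rule

c_first   <- cs[first_peak]
c_highest <- cs[highest_peak]
```

In this toy curve the two rules pick different values of c; with a stronger signal the early peak would tend to dominate and the two choices would coincide.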

Value

models

list all model candidates in the model space

B

list the number of bootstrap samples that have been used

lack_of_fit_matrix

list a matrix of Qs for all model candidates (in columns). Each row is for each bootstrap sample

Qd_matrix

list a matrix of QM - QM.tilde for all model candidates. Each row is for each bootstrap sample

bandwidth

list the value of bandwidth

model_mat

list a matrix of selected models at each c values in grid (in columns). Each row is for each bootstrap sample

freq_mat

list a matrix of coverage probabilities (frequency/smooth_frequency) of each selected model for a given c value (index)

c

list the adaptive choice of c value from which the parsimonious model is selected

sel_model

list the selected (parsimonious) model given the adaptive c value
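
To make the relationship between model_mat and the coverage frequencies concrete, here is a minimal sketch that tabulates, for each c value, the most frequently selected model and its relative frequency across bootstrap samples. The matrix is hypothetical and the model labels are placeholders; the actual layout and smoothing used inside RF() may differ.

```r
# Hypothetical model_mat: 6 bootstrap samples (rows) x 4 values of c (columns)
model_mat <- matrix(
  c("M1", "M1", "M2", "M3",
    "M1", "M2", "M2", "M3",
    "M2", "M2", "M2", "M3",
    "M1", "M2", "M2", "M2",
    "M3", "M2", "M2", "M3",
    "M1", "M1", "M2", "M3"),
  nrow = 6, byrow = TRUE
)

# For each c (column): the modal model and the share of samples selecting it
top_model <- apply(model_mat, 2, function(col) names(which.max(table(col))))
pstar     <- apply(model_mat, 2, function(col) max(table(col)) / length(col))
```

Here pstar plays the role of the unsmoothed p* curve: one empirical frequency per grid value of c.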

Note

bandwidth = (cs[2] - cs[1]) * 3, i.e. three times the grid spacing between two adjacent c values.
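
A small sketch of this bandwidth rule, applied with a kernel smoother from base R. The grid, the raw frequencies, and the use of stats::ksmooth() with a normal kernel are all assumptions for illustration; this page does not state which smoother RF() calls internally.

```r
# Hypothetical grid of c values with 101 points
cs <- seq(0, 5, length.out = 101)

# Bandwidth rule from the Note: three grid steps
bandwidth <- (cs[2] - cs[1]) * 3

# Hypothetical raw frequencies: a smooth bump plus a deterministic wiggle
freq <- 0.5 + 0.4 * exp(-(cs - 2)^2) + 0.05 * sin(40 * cs)

# Kernel-smooth the frequencies on the same grid (assumed smoother: ksmooth)
smooth_freq <- ksmooth(cs, freq, kernel = "normal",
                       bandwidth = bandwidth, x.points = cs)$y
```

Tying the bandwidth to the grid spacing means a finer grid yields a narrower smoothing window.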

References

Examples

## Not run: 
library(fence)
library(MASS)  # provides mvrnorm()
r = 1234; set.seed(r)
n = 100; p = 15; rho = 0.6
# non-zero coefficients: V.1, I.1, I.2a, V6, V7, V9, V12
beta = c(1,1,1,0,1,1,0,1,0,0,1,0,0,0,0)
id = rep(1:n, each = 3)
V.1 = rep(1, n*3)
I.1 = rep(c(1,-1), each = 150)
I.2a = rep(c(0,1,-1), n)
I.2b = rep(c(0,-1,1), n)
x = matrix(rnorm(n*3*11), nrow = n*3, ncol = 11)
x = cbind(id, V.1, I.1, I.2a, I.2b, x)
# within-subject correlation matrix with entries rho^|i-j|
R = diag(3)
for (i in 1:3) {
  for (j in 1:3) {
    R[i,j] = rho^(abs(i-j))
  }
}
e = as.vector(t(mvrnorm(n, rep(0, 3), R)))
y = as.vector(x[,-1] %*% beta) + e
data = data.frame(x, y)  # unnamed columns of x are named V6, ..., V16
raw = "y ~ V.1 + I.1 + I.2a + I.2b"
for (i in 6:16) { raw = paste0(raw, " + V", i) }; full = as.formula(raw)
bin1 = "y ~ V.1 + I.1 + I.2a + I.2b"
for (i in 6:8) { bin1 = paste0(bin1, " + V", i) }; bin1 = as.formula(bin1)
bin2 = "y ~ V9"
for (i in 10:16) { bin2 = paste0(bin2, " + V", i) }; bin2 = as.formula(bin2)
# May take longer than 30 min since there are two stages in this RF procedure
obj1.RF = RF(full = full, data = data, groups = list(bin1, bin2), method = "conditional")
obj1.RF$sel_model
obj2.RF = RF(full = full, data = data, groups = list(bin1, bin2), B = 100, method = "marginal")
obj2.RF$sel_model

## End(Not run)

fence documentation built on May 1, 2019, 11:32 p.m.
