est.confounder.num: Estimate the number of confounders

Description

Estimate the number of confounders

Usage

est.confounder.num(
  formula,
  X.data = NULL,
  Y,
  method = c("bcv", "ed"),
  rmax = 20,
  nRepeat = 20,
  bcv.plot = TRUE,
  log = ""
)

est.factor.num(
  Y,
  method = c("bcv", "ed"),
  rmax = 20,
  nRepeat = 12,
  bcv.plot = TRUE,
  log = ""
)

Arguments

formula

a formula specifying the known covariates, including both primary variables and nuisance variables, which are separated by |. The variables before | are primary variables and the variables after | are nuisance variables. If there are no nuisance variables, | is not needed and formula becomes an ordinary formula in which all covariates are treated as primary. When it is ambiguous where the intercept should go, cate includes it in X.nuis.

X.data

the data frame used for formula

Y

outcome, n*p matrix

method

method used to estimate the number of factors. Two choices are currently available: "ed", the eigenvalue difference method proposed by Onatski (2010), and "bcv", the bi-cross-validation method proposed by Owen and Wang (2015). "bcv" tends to detect more weak factors and takes longer to run.

rmax

the maximum number of factors to consider. If the estimated number of factors is rmax, then users are encouraged to increase rmax and run again. Default is 20.

nRepeat

the number of repeats of bi-cross-validation. A larger nRepeat gives a more accurate estimate of the bcv error but takes longer to run.

bcv.plot

whether to plot the relative bcv error against the number of estimated ranks. The relative bcv error is the entrywise mean squared error divided by the average of the estimated noise variance.

log

if log = "y", then the y-axis of the bcv plot is in log scale.
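
The advice for rmax above (increase it and run again when the estimate equals rmax) can be automated with a small wrapper. The sketch below is illustrative only: estimate_with_growing_rmax, rmax_cap, and toy_estimator are hypothetical names that are not part of cate, and the toy estimator merely stands in for a real call.

```r
# Hypothetical helper (not part of cate): rerun an estimator with a larger
# rmax whenever the estimate lands on the rmax boundary.
estimate_with_growing_rmax <- function(estimator, rmax = 20, rmax_cap = 160) {
  repeat {
    res <- estimator(rmax)
    # method = "ed" returns a number; method = "bcv" returns a list with r
    r_hat <- if (is.list(res)) res$r else res
    if (r_hat < rmax || rmax >= rmax_cap) {
      return(list(r = r_hat, rmax = rmax))
    }
    rmax <- min(2 * rmax, rmax_cap)  # double rmax, up to the cap
  }
}

# Toy stand-in estimator: pretends the answer is 30, truncated at rmax.
toy_estimator <- function(rmax) min(30, rmax)

result <- estimate_with_growing_rmax(toy_estimator, rmax = 20)
print(result)
```

With cate loaded, the estimator argument could be function(rmax) est.factor.num(Y, method = "ed", rmax = rmax); the is.list() branch covers the list returned when method is "bcv".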

Value

If method is "ed", returns the estimated number of confounders/factors. If method is "bcv", returns a list with the following components:

r

estimated number of confounders/factors

errors

a vector of the relative bcv errors, of length 1 + rmax
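
As a rough guide to working with the "bcv" return value, the sketch below builds a mock list of the same shape and inspects it with base R. The numbers are made up for illustration, and labelling the entries of errors with ranks 0 to rmax is an assumption suggested by the length 1 + rmax, not something stated above. The plot is only a base-graphics analogue of the bcv plot with log = "y".

```r
# Mock result with made-up numbers, shaped like the "bcv" return value.
rmax <- 5
res <- list(r = 2, errors = c(3.1, 1.8, 1.2, 1.3, 1.5, 1.9))

# One error per candidate rank; the labelling 0:rmax is an assumption.
ranks <- 0:rmax
error_table <- data.frame(rank = ranks, error = res$errors)
print(error_table)

# Rank with the smallest relative error in this mock.
best_rank <- ranks[which.min(res$errors)]

# Base-graphics analogue of the bcv plot with a log-scale y-axis.
plot(ranks, res$errors, type = "b", log = "y",
     xlab = "number of factors", ylab = "relative bcv error")
abline(v = res$r, lty = 2)  # mark the reported estimate
```

In this mock, r coincides with the rank that minimises the error; whether that always holds for the real output is not stated in this page.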

Functions

est.factor.num: Estimate the number of factors

References

A. B. Owen and J. Wang (2015), Bi-cross-validation for factor analysis. Statistical Science, 31(1), 119–139.

A. Onatski (2010), Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics, 92(4).

Examples

library(cate)

## example for est.confounder.num
data <- gen.sim.data(n = 50, p = 50, r = 5)
X.data <- data.frame(X1 = data$X1)
est.confounder.num(~ X1 | 1, X.data, data$Y, method = "ed")
est.confounder.num(~ X1 | 1, X.data, data$Y, method = "bcv")

## example for est.factor.num
n <- 50
p <- 100
r <- 5
Z <- matrix(rnorm(n * r), n, r)
Gamma <- matrix(rnorm(p * r), p, r)
Y <- Z %*% t(Gamma) + rnorm(n * p)

est.factor.num(Y, method = "ed")
est.factor.num(Y, method = "bcv")
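
To give some intuition for the "ed" option, the sketch below simulates data as in the est.factor.num example and prints the leading eigenvalues together with the differences between adjacent ones. It shows only the raw quantities that an eigenvalue-difference approach inspects; it does not implement the decision rule of Onatski (2010) and is not how cate computes its estimate. The seed and the scaling by p are arbitrary choices made for this illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 50
p <- 100
r <- 5
Z <- matrix(rnorm(n * r), n, r)
Gamma <- matrix(rnorm(p * r), p, r)
Y <- Z %*% t(Gamma) + matrix(rnorm(n * p), n, p)

# Eigenvalues of Y %*% t(Y) / p, obtained from the singular values of Y.
eigenvalues <- svd(Y)$d^2 / p

# Differences between adjacent eigenvalues (largest eigenvalue first).
gaps <- -diff(eigenvalues)

print(round(eigenvalues[1:8], 2))
print(round(gaps[1:8], 2))
```

With r strong factors one would expect the first r eigenvalues to stand apart from the rest, but the exact values depend on the simulated draw.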

cate documentation built on July 2, 2020, 4:08 a.m.