Given a covariate matrix and output vector, this function first adjusts the covariates for underlying factors and then performs model selection.
X 
an n x p covariate matrix with each row being a sample. Must have same number of rows as the size of 
Y 
a size n outcome vector. 
loss 
a character string specifying the loss function to be minimized. Must be one of "scad" (default), "mcp", or "lasso". You can specify just the initial letter. 
robust 
a boolean, specifying whether or not to use robust estimators for mean and variance. Default is TRUE. 
cv 
a boolean, specifying whether or not to run crossvalidation for the tuning parameter. Default is FALSE. Only used if 
tau 

lin.reg 
a boolean, specifying whether to assume a linear regression model (TRUE) or a logit model (FALSE). Default is TRUE. 
K.factors 
number of factors to use; must be greater than 0. If not specified, the number of factors is estimated internally. 
max.iter 
maximum number of iterations across the regularization path. Default is 10000. 
nfolds 
the number of crossvalidation folds. Default is ceiling(samplesize/3). 
eps 
Convergence threshold for model fitting using 
verbose 
a boolean specifying whether to print runtime updates to the console. Default is TRUE. 
For the formula describing how the covariates are adjusted for latent factors, see Section 3.2 in Fan et al. (2017).
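As a rough illustration of this kind of adjustment, the sketch below estimates loadings and factors by principal components and subtracts the fitted factor part from the centered covariates. It is an assumed, non-robust PCA version written in base R, not the package's implementation; the names Lambda_hat, F_hat and X_res are hypothetical.

```r
set.seed(1)
n <- 50; p <- 20; K <- 2                      # samples, dimension, number of factors

# Simulate covariates with a K-factor structure (rows are samples)
Lambda <- matrix(rnorm(p * K), p, K)          # true loadings
Fmat   <- matrix(rnorm(n * K), n, K)          # true factors
X      <- Fmat %*% t(Lambda) + matrix(rnorm(n * p), n, p)

# Center the columns, then take the top-K eigenpairs of the sample covariance
Xc  <- scale(X, center = TRUE, scale = FALSE)
eig <- eigen(cov(Xc), symmetric = TRUE)

# Estimated loadings (eigenvectors scaled by root eigenvalues)
Lambda_hat <- eig$vectors[, 1:K, drop = FALSE] %*% diag(sqrt(eig$values[1:K]), K)

# Estimated factors by least squares of each row of Xc on the loadings
F_hat <- Xc %*% Lambda_hat %*% solve(crossprod(Lambda_hat))

# Residual covariates after removing the estimated factor part
X_res <- Xc - F_hat %*% t(Lambda_hat)
```

After this step the leading K principal directions have been projected out, so the largest eigenvalue of the residual covariance equals the (K+1)-th eigenvalue of the original one.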
The tuning parameter is tau * sigma * (optimal rate), where the optimal rate is the optimal rate for the tuning parameter and sigma is the standard deviation of the data. For details, see Fan et al. (2017).
ncvreg is used to fit the model after decorrelation. This package may output its own warnings about failures to converge and model saturation.
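The following is a minimal sketch of a penalized fit with ncvreg on already-adjusted covariates. It assumes the ncvreg package is installed, uses simulated stand-in data (X_res here is plain Gaussian noise, not real factor-adjusted output), and picks the penalty level by cross-validation; how farm.select itself passes its arguments to ncvreg or chooses the penalty level is not shown here and may differ.

```r
library(ncvreg)

set.seed(100)
n <- 50; p <- 200
X_res <- matrix(rnorm(n * p), n, p)            # stand-in for factor-adjusted covariates
beta  <- c(rep(5, 3), rep(0, p - 3))           # sparse true coefficients
Y     <- drop(X_res %*% beta + rnorm(n))

# SCAD-penalized regularization path with explicit convergence controls
fit <- ncvreg(X_res, Y, penalty = "SCAD", max.iter = 10000, eps = 1e-4)

# Cross-validation over the path; coef() returns coefficients at the selected lambda
cvfit  <- cv.ncvreg(X_res, Y, penalty = "SCAD", nfolds = 5)
coefs  <- coef(cvfit)[-1]                      # drop the intercept
chosen <- which(coefs != 0)                    # indices of selected covariates
```

Any convergence or saturation warnings printed while running this come from ncvreg itself.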
A list with the following items
model.size 
the size of the model 
beta.chosen 
the indices of the covariates chosen in the model 
coef.chosen 
the coefficients of the chosen covariates 
X.residual 
the residual covariate matrix after adjusting for factors 
nfactors 
number of (estimated) factors 
n 
number of observations 
p 
number of dimensions 
robust 
whether robust parameters were used 
loss 
loss function used 
The number of rows and the number of columns of the covariate matrix must each be at least 4 in order to calculate latent factors.
Fan J., Ke Y., Wang K., "Decorrelation of Covariates for High Dimensional Sparse Regression." https://arxiv.org/abs/1612.08490
##linear regression
set.seed(100)
P = 200 #dimension
N = 50 #samples
K = 3 #nfactors
Q = 3 #model size
Lambda = matrix(rnorm(P*K, 0,1), P,K)
F = matrix(rnorm(N*K, 0,1), N,K)
U = matrix(rnorm(P*N, 0,1), P,N)
X = Lambda%*%t(F)+U
X = t(X)
beta_1 = rep(5,Q)
beta = c(beta_1, rep(0,P-Q))
eps = rt(N, 2.5)
Y = X%*%beta+eps
##with default options
output = farm.select(X,Y) #robust, no crossvalidation
output$beta.chosen #variables selected
output$coef.chosen #coefficients of selected variables
#examples of other robustification options
output = farm.select(X,Y,robust = FALSE) #nonrobust
output = farm.select(X,Y, tau = 3) #robust, no crossvalidation, specified tau
#output = farm.select(X,Y, cv= TRUE) #robust, crossvalidation: LONG RUNNING!
##changing the loss function and inputting factors
output = farm.select(X, Y,loss = "mcp", K.factors = 4)
##use a logistic regression model, a larger sample size is desired.
## Not run:
set.seed(100)
P = 400 #dimension
N = 300 #samples
K = 3 #nfactors
Q = 3 #model size
Lambda = matrix(rnorm(P*K, 0,1), P,K)
F = matrix(rnorm(N*K, 0,1), N,K)
U = matrix(rnorm(P*N, 0,1), P,N)
X = Lambda%*%t(F)+U
X = t(X)
beta_1 = rep(5, Q)
beta = c(beta_1, rep(0,P-Q))
eps = rnorm(N)
Prob = 1/(1+exp(-X%*%beta))
Y = rbinom(N, 1, Prob)
output = farm.select(X,Y, lin.reg=FALSE, eps=1e-3)
output$beta.chosen
output$coef.chosen
## End(Not run)
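Because the simulations above place the nonzero coefficients in the first Q positions, the selected indices can be checked against the known support. The helper below is a hypothetical convenience function, not part of the package, and the `chosen` vector is a made-up stand-in for `output$beta.chosen`.

```r
# Hypothetical helper: compare selected indices against a known true support
selection_summary <- function(chosen, truth) {
  c(true_positives  = length(intersect(chosen, truth)),
    false_positives = length(setdiff(chosen, truth)),
    missed          = length(setdiff(truth, chosen)))
}

truth  <- 1:3                # first Q = 3 coefficients are nonzero in the simulations
chosen <- c(1, 2, 3, 57)     # made-up stand-in for output$beta.chosen
res <- selection_summary(chosen, truth)
res
```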
