screen.pfc: Adaptive Screening of Predictors


View source: R/screen.pfc.R

Description

Given a set of p predictors and a response, this function selects all predictors that are statistically related to the response at a specified significance level, using a flexible basis function.

Usage

screen.pfc(X, fy, cutoff=0.1)

Arguments

X

Matrix or data frame with n rows of observations and p columns of predictors of continuous type.

fy

Function of y. Basis function used to capture the dependency between each individual predictor and the response. See bf for details.

cutoff

The level of significance to be used for the cutoff, by default 0.1.

Details

For each predictor X_j, write the equation

X_j= μ + φ f_y + ε

where f_y is a flexible basis function provided by the user, constructed using the function bf. The screening procedure tests the null hypothesis φ = 0 against the alternative φ ≠ 0. Given the r components of the basis function f_y, the above model is a linear model in which X_j is the response and the components of f_y are the predictors, so the test on φ is essentially an F-test. Specifically, given the data, let \hat{φ}_j be the ordinary least squares estimator of φ for predictor X_j. We consider the usual test statistic

F_j = \frac{n-r-1}{r} \cdot \frac{∑_{i=1}^n [(X_{ji}-\bar{X}_{j.})^2 - (X_{ji}-\bar{X}_{j.} - \hat{φ}_j \mathbf{f}_{y_i})^2]}{∑_{i=1}^n (X_{ji}-\bar{X}_{j.} - \hat{φ}_j \mathbf{f}_{y_i})^2}

where \bar{X}_{j.} = ∑_{i=1}^n X_{ji}/n. Under the null hypothesis, the statistic F_j follows an F distribution with (r, n-r-1) degrees of freedom. The sample size n must exceed r+1 for the denominator degrees of freedom to be positive.
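The statistic above can be reproduced by hand for a single predictor. The following is a minimal sketch in base R on simulated data, not the package's own code: the variable names (xj, xc, phi_hat, rss, tss) and the hand-built centered quadratic basis are assumptions made for illustration, and the basis is assumed to be centered as in the formula.

```r
# Simulated data; every name below is illustrative and not part of ldr.
set.seed(1)
n <- 50
y <- rnorm(n)

# Hand-built centered quadratic basis, so r = 2 components.
fy <- scale(cbind(y, y^2), center = TRUE, scale = FALSE)
r <- ncol(fy)

# One predictor that depends on y.
xj <- 1 + 2 * y + rnorm(n)

# Center the predictor and compute the OLS estimate of phi.
xc <- xj - mean(xj)
phi_hat <- solve(crossprod(fy), crossprod(fy, xc))

# Residual and total sums of squares, as in the formula for F_j.
rss <- sum((xc - fy %*% phi_hat)^2)
tss <- sum(xc^2)

F_j <- ((n - r - 1) / r) * (tss - rss) / rss
p_j <- pf(F_j, df1 = r, df2 = n - r - 1, lower.tail = FALSE)

# Cross-check against the overall F statistic reported by lm().
F_lm <- summary(lm(xj ~ fy))$fstatistic[["value"]]
all.equal(F_j, F_lm)
```

Because both the predictor and the basis are centered, the no-intercept fit used here gives the same residuals as a regression with an intercept, which is why the comparison with lm() is meaningful.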

Value

Returns a data frame with p rows, one per variable, and the following columns:

F

F statistic for testing the above hypotheses.

P-value

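The p-value of the test statistic. As described in Details, the F test has r and n-r-1 degrees of freedom, which reduces to 1 and n-2 when the basis function has a single component (r=1).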

Index

Index of the variable, as its position j.
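To make the shape of this table concrete, here is a rough base R sketch that builds a data frame with the same three kinds of columns and then filters it at a significance level. It is not the ldr implementation: the helper name f_table_sketch, the column name P.value, and the simulated data are all assumptions made for illustration.

```r
# Hypothetical helper, not the ldr implementation: one F test per column of X.
f_table_sketch <- function(X, fy) {
  X <- as.matrix(X)
  fy <- as.matrix(fy)
  n <- nrow(X)
  r <- ncol(fy)
  # Overall F statistic from regressing each predictor on the basis.
  Fstat <- apply(X, 2, function(xj) {
    summary(lm(xj ~ fy))$fstatistic[["value"]]
  })
  pval <- pf(Fstat, df1 = r, df2 = n - r - 1, lower.tail = FALSE)
  data.frame(F = Fstat, P.value = pval, Index = seq_len(ncol(X)))
}

# Simulated data in which only predictor 2 depends on y.
set.seed(2)
n <- 60
p <- 5
y <- rnorm(n)
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 2] + 3 * y

# Centered linear basis (r = 1), built by hand rather than with bf().
fy <- scale(cbind(y), center = TRUE, scale = FALSE)

tab <- f_table_sketch(X, fy)

# Keep the predictors whose p-value falls below the chosen level.
selected <- tab[tab$P.value < 0.1, , drop = FALSE]
selected$Index
```

The table is computed once and filtered afterwards so that the same results can be inspected at several significance levels without refitting.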

Author(s)

Kofi Placid Adragni <kofi@umbc.edu>

References

Adragni, K. P. and Cook, R. D. (2008). Discussion on the Sure Independence Screening for Ultrahigh Dimensional Feature Space of Jianqing Fan and Jinchi Lv (2007). Journal of the Royal Statistical Society, Series B, 70, Part 5, pp 1:35.

Examples

library(ldr)
data(OH)

# Predictors: every column except 1 and 295; response: column 295
X <- OH[, -c(1, 295)]
y <- OH[, 295]

# Correlation screening
out <- screen.pfc(X, fy=bf(y, case="poly", degree=1))
head(out)

# Special basis function
out1 <- screen.pfc(X, fy=scale(cbind(y, sqrt(y)), center=TRUE, scale=FALSE))
head(out1)

# Piecewise constant basis with 10 slices
out2 <- screen.pfc(X, fy=bf(y, case="pdisc", degree=0, nslices=10))
head(out2)

ldr documentation built on May 2, 2019, 2:13 p.m.
