np.reg.test: Nonparametric Tests of Regression Coefficients

View source: R/np.reg.test.R

Nonparametric Tests of Regression Coefficients

Description

Assuming a linear model of the form

Y = \alpha + X \beta + \epsilon

or

Y = \alpha + X \beta + Z \gamma + \epsilon

this function implements permutation tests of H_0: \beta = \beta_0, where \beta_0 is a user-specified null vector.

Usage

np.reg.test(x, y, z = NULL, method = NULL,
            beta = NULL, homosced = FALSE, lambda = 0, 
            R = 9999, parallel = FALSE, cl = NULL,
            perm.dist = TRUE, na.rm = TRUE)

Arguments

x

Matrix of predictor variables (n by p).

y

Response vector or matrix (n by m).

z

Optional matrix of nuisance variables (n by q).

method

Permutation method. See Details.

beta

Null hypothesis value for \beta (p by m). Defaults to a matrix of zeros.

homosced

Are the \epsilon terms homoscedastic? If FALSE (default), a robust Wald test statistic is used. Otherwise the classic F test statistic is used.

lambda

Scalar or vector of ridge parameter(s). Defaults to a vector of zeros.

R

Number of resamples for the permutation test (positive integer).

parallel

Logical indicating if the parallel package should be used for parallel computing (of the permutation distribution). Defaults to FALSE, which implements sequential computing.

cl

Cluster for parallel computing, which is used when parallel = TRUE. Note that if parallel = TRUE and cl = NULL, then the cluster is defined as makeCluster(2L) to use two cores. To make use of all available cores, use the code cl = makeCluster(detectCores()).

perm.dist

Logical indicating if the permutation distribution should be returned.

na.rm

If TRUE (default), the arguments x and y (and z if provided) are passed to the na.omit function to remove cases with missing data.

Details

With no nuisance variables in the model (i.e., z = NULL), there are three possible options for the method argument:

Method   Model
perm     P Y = \alpha + X \beta + \epsilon
flip     S Y = \alpha + X \beta + \epsilon
both     P S Y = \alpha + X \beta + \epsilon

where P is a permutation matrix and S is a sign-flipping matrix.
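As a rough illustration (base R only, not nptest's internal code), the three schemes transform the response as follows:

```r
# Base-R sketch (not nptest's implementation) of the three resampling
# schemes for the no-nuisance model.
set.seed(1)
n <- 5
y <- rnorm(n)
P <- diag(n)[sample(n), ]                       # random permutation matrix P
S <- diag(sample(c(-1, 1), n, replace = TRUE))  # random sign-flipping matrix S
y.perm <- P %*% y          # method = "perm": statistic recomputed from P Y
y.flip <- S %*% y          # method = "flip": S Y
y.both <- P %*% S %*% y    # method = "both": P S Y
```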

With nuisance variables in the model, there are eight possible options for the method argument:

Method   Name              Model
HJ       Huh-Jhun          P Q' R_z Y = \alpha + Q' R_z X \beta + \epsilon
KC       Kennedy-Cade      P R_z Y = \alpha + R_z X \beta + \epsilon
SW       Still-White       P R_z Y = \alpha + X \beta + \epsilon
TB       ter Braak         (P R_m + H_m) Y = \alpha + X \beta + Z \gamma + \epsilon
FL       Freedman-Lane     (P R_z + H_z) Y = \alpha + X \beta + Z \gamma + \epsilon
MA       Manly             P Y = \alpha + X \beta + Z \gamma + \epsilon
OS       O'Gorman-Smith    Y = \alpha + P R_z X \beta + Z \gamma + \epsilon
DS       Draper-Stoneman   Y = \alpha + P X \beta + Z \gamma + \epsilon

where P is a permutation matrix and Q is any matrix satisfying R_z = Q Q' with Q'Q = I.

Note that H_z is the hat matrix for the nuisance variable design matrix, and R_z = I - H_z is the corresponding residual forming matrix. Similarly, H_m and R_m are the hat and residual forming matrices for the full model including the predictor and nuisance variables.
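These matrices are easy to form directly. The following base-R sketch (with a simulated nuisance design; nptest builds these internally) also shows a Q of the kind used by the Huh-Jhun method:

```r
# Base-R sketch of H_z, R_z, and a Huh-Jhun Q for a simulated
# nuisance design Z (nptest computes these internally).
set.seed(1)
n <- 10
Z <- cbind(1, rnorm(n))                   # intercept plus one nuisance variable
H_z <- Z %*% solve(crossprod(Z), t(Z))    # hat matrix: Z (Z'Z)^{-1} Z'
R_z <- diag(n) - H_z                      # residual-forming matrix: I - H_z
# R_z is symmetric and idempotent; its eigenvectors with unit eigenvalue
# give a Q satisfying R_z = Q Q' and Q'Q = I
Q <- eigen(R_z, symmetric = TRUE)$vectors[, 1:(n - ncol(Z))]
```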

Value

statistic

Test statistic value.

p.value

p-value for testing H_0: \beta = \beta_0.

perm.dist

Permutation distribution of statistic.

method

Permutation method.

null.value

Null hypothesis value for \beta.

homosced

Homoscedastic errors?

R

Number of resamples.

exact

Exact permutation test? See Note.

coefficients

Least squares estimates of \alpha, \beta, and \gamma (if applicable).

univariate

Univariate test statistic value for the j-th variable (for multivariate inputs).

adj.p.value

Adjusted p-value for testing the significance of the j-th variable (for multivariate inputs).

Multivariate Tests

If the input y is a matrix with m > 1 columns, the multivariate test statistic is defined as statistic = max(univariate), which is well-defined because the univariate test statistics are non-negative.

The global null hypothesis (across all m variables) is tested by comparing the observed statistic to the permutation distribution perm.dist. This produces the p.value for testing the global null hypothesis.

The local null hypotheses (one for each variable) are tested by comparing each univariate test statistic to perm.dist. This produces the adjusted p-values (adj.p.value), which control the familywise Type I error rate across the m tests.
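The max-statistic adjustment can be sketched in base R with placeholder statistics (an illustration of the idea, not nptest's internal code):

```r
# Base-R sketch of max-statistic p-values (placeholder inputs, not
# nptest's implementation).
set.seed(1)
m <- 3                                     # number of response variables
R <- 999                                   # number of resamples
univariate <- c(1.2, 2.5, 0.7)             # placeholder observed statistics
perm.max <- apply(matrix(abs(rnorm(R * m)), R, m), 1, max)  # max over variables
# global p-value: compare max(univariate) to the distribution of the max
p.value <- (1 + sum(perm.max >= max(univariate))) / (R + 1)
# adjusted p-values: compare each univariate statistic to the same distribution
adj.p.value <- sapply(univariate, function(s) (1 + sum(perm.max >= s)) / (R + 1))
```

Because every univariate statistic is compared to the distribution of the maximum, the smallest adjusted p-value equals the global p-value, and the adjustment controls the familywise error rate.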

Note

If method = "flip", the permutation test will be exact when the requested number of resamples R is at least 2^n - 1. In this case, the permutation distribution perm.dist contains all 2^n possible values of the test statistic.

If method = "both", the permutation test will be exact when the requested number of resamples R is at least factorial(n) * 2^n - 1. In this case, the permutation distribution perm.dist contains all factorial(n) * 2^n possible values of the test statistic.

If method = "HJ", the permutation test will be exact when the requested number of resamples R is at least factorial(n - q - 1) - 1. In this case, the permutation distribution perm.dist contains all factorial(n - q - 1) possible values of the test statistic.

Otherwise, the permutation test will be exact when the requested number of resamples R is at least factorial(n) - 1. In this case, the permutation distribution perm.dist contains all factorial(n) possible values of the test statistic.
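For example, with n = 5 observations these enumeration thresholds are small enough to check directly:

```r
# Resample counts at which each scheme becomes exact for n = 5
# (a worked example of the thresholds above).
n <- 5
factorial(n) - 1        # "perm" and most methods: 119 resamples
2^n - 1                 # "flip": 31 resamples
factorial(n) * 2^n - 1  # "both": 3839 resamples
```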

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

DiCiccio, C. J., & Romano, J. P. (2017). Robust permutation tests for correlation and regression coefficients. Journal of the American Statistical Association, 112(519), 1211-1220. doi: 10.1080/01621459.2016.1202117

Draper, N. R., & Stoneman, D. M. (1966). Testing for the inclusion of variables in linear regression by a randomisation technique. Technometrics, 8(4), 695-699. doi: 10.2307/1266641

Freedman, D., & Lane, D. (1983). A nonstochastic interpretation of reported significance levels. Journal of Business and Economic Statistics, 1(4), 292-298. doi: 10.2307/1391660

Helwig, N. E. (2019a). Statistical nonparametric mapping: Multivariate permutation tests for location, correlation, and regression problems in neuroimaging. WIREs Computational Statistics, 11(2), e1457. doi: 10.1002/wics.1457

Helwig, N. E. (2019b). Robust nonparametric tests of general linear model coefficients: A comparison of permutation methods and test statistics. NeuroImage, 201, 116030. doi: 10.1016/j.neuroimage.2019.116030

Huh, M.-H., & Jhun, M. (2001). Random permutation testing in multiple linear regression. Communications in Statistics - Theory and Methods, 30(10), 2023-2032. doi: 10.1081/STA-100106060

Kennedy, P. E., & Cade, B. S. (1996). Randomization tests for multiple regression. Communications in Statistics - Simulation and Computation, 25(4), 923-936. doi: 10.1080/03610919608813350

Manly, B. (1986). Randomization and regression methods for testing for associations with geographical, environmental and biological distances between populations. Researches on Population Ecology, 28(2), 201-218. doi: 10.1007/BF02515450

Nichols, T. E., Ridgway, G. R., Webster, M. G., & Smith, S. M. (2008). GLM permutation: nonparametric inference for arbitrary general linear models. NeuroImage, 41(S1), S72.

O'Gorman, T. W. (2005). The performance of randomization tests that use permutations of independent variables. Communications in Statistics - Simulation and Computation, 34(4), 895-908. doi: 10.1080/03610910500308230

Still, A. W., & White, A. P. (1981). The approximate randomization test as an alternative to the F test in analysis of variance. British Journal of Mathematical and Statistical Psychology, 34(2), 243-252. doi: 10.1111/j.2044-8317.1981.tb00634.x

ter Braak, C. J. F. (1992). Permutation versus bootstrap significance tests in multiple regression and ANOVA. In K. H. Jöckel, G. Rothe, & W. Sendler (Eds.), Bootstrapping and Related Techniques: Lecture Notes in Economics and Mathematical Systems, Vol. 376 (pp. 79-86). Springer.

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(4), 817-838. doi: 10.2307/1912934

Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014). Permutation inference for the general linear model. NeuroImage, 92, 381-397. doi: 10.1016/j.neuroimage.2014.01.060

See Also

plot.np.reg.test: S3 plotting method for visualizing the results.

Examples


######******######   UNIVARIATE   ######******######

###***###   TEST ALL COEFFICIENTS   ###***###

# generate data
set.seed(1)
n <- 10
x <- cbind(rnorm(n), rnorm(n))
y <- rnorm(n)

# Wald test (method = "perm")
set.seed(0)
np.reg.test(x, y)

# F test (method = "perm")
set.seed(0)
np.reg.test(x, y, homosced = TRUE)


###***###   TEST SUBSET OF COEFFICIENTS   ###***###

# generate data
set.seed(1)
n <- 10
x <- rnorm(n)
z <- rnorm(n)
y <- 3 + 2 * z + rnorm(n)

# Wald test (method = "HJ")
set.seed(0)
np.reg.test(x, y, z)

# F test (method = "HJ")
set.seed(0)
np.reg.test(x, y, z, homosced = TRUE)


## Not run: 

######******######   MULTIVARIATE   ######******######

###***###   TEST ALL COEFFICIENTS   ###***###

# generate data
set.seed(1)
n <- 10
x <- cbind(rnorm(n), rnorm(n))
y <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# multivariate Wald test (method = "perm")
set.seed(0)
np.reg.test(x, y)

# multivariate F test (method = "perm")
set.seed(0)
np.reg.test(x, y, homosced = TRUE)


###***###   TEST SUBSET OF COEFFICIENTS   ###***###

# generate data
set.seed(1)
n <- 10
x <- rnorm(n)
z <- rnorm(n)
y <- cbind(1 + 3 * z + rnorm(n),
           2 + 2 * z + rnorm(n),
           3 + 1 * z + rnorm(n))
           
# multivariate Wald test (method = "HJ")
set.seed(0)
np.reg.test(x, y, z)

# multivariate F test (method = "HJ")
set.seed(0)
np.reg.test(x, y, z, homosced = TRUE)


## End(Not run)


nptest documentation built on April 15, 2023, 1:08 a.m.