
Tests for the significance of (potentially large) groups of predictors and
the presence of nonlinearity or heteroscedasticity in the context of both low-
and high-dimensional linear models. Outputs a p-value. Also allows for the
calibration of arbitrary goodness-of-fit tests via specification of
`RPfunction`.


`x` |
Input matrix with |

`y` |
Response vector. |

`resid_type` |
Type of residuals used for the test (see details below).
Use |

`test` |
Type of departure from the linear model to test for (see details
below). Ignored if |

`x_alt` |
If |

`RPfunction` |
A residual prediction (RP) function that must permit
calling as |

`B` |
The number of bootstrap samples to use - note the p-value produced will always be at least 1/B. |

`rand_gen` |
A function to generate the simulated errors up to an unknown
scale factor. It must permit calling as |

`noise_matrix` |
An optional matrix whose columns are the simulated errors to use.
Note that |

`mc.cores` |
The number of cores to use. Will always be 1 on Windows. |

`nfolds` |
Number of folds to use when performing cross-validation to
obtain |

`nperms` |
Number of permutations of the data for which |

`beta_est` |
An optional user-supplied estimate. |

`resid_only` |
If |

`output_all` |
In addition to the p-value, gives further output (see Value below). |

`verbose` |
Whether to print additional information. |

The function works by first computing residuals from a regression of
y on x. Next, `B` sets of errors generated through `rand_gen` are
added to a signal derived from `beta_est`, and artificial residuals
are computed. The option `resid_only=TRUE` then outputs these
residuals along with the original residuals, scaled to have squared
l_2-norm equal to `nobs`. The residuals in question are OLS residuals
when `resid_type=OLS` (case (a), for use when the null hypothesis is
low-dimensional, so that the number of columns of `x` is smaller than
`nobs-1`), and square-root / scaled Lasso residuals otherwise (case
(b)). The options for `test` then apply different functions to the
residuals as described below.
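The residual simulation described above can be sketched in base R for case (a). This is only an illustration under stated assumptions, not the package's implementation: the helper `scaled_ols_resid` and the error generator `sim_errors` are hypothetical names, standard normal errors are assumed, and an intercept is included in the OLS fit.

```r
# Illustrative sketch of case (a); base R only, not the RPtests code.
set.seed(1)
nobs <- 100; p <- 5; B <- 9
x <- scale(matrix(runif(nobs * p), nobs, p))
y <- drop(x[, 1:2] %*% c(1, 1)) + rnorm(nobs)

# Hypothetical helper: OLS residuals (with intercept), rescaled so that
# the squared l_2-norm equals the number of observations
scaled_ols_resid <- function(x, y) {
  r <- lm.fit(cbind(1, x), y)$residuals
  r * sqrt(length(r) / sum(r^2))
}

# Original scaled residuals
resid_obs <- scaled_ols_resid(x, y)

# Signal derived from a coefficient estimate (here plain OLS)
beta_est <- lm.fit(cbind(1, x), y)$coefficients
signal <- drop(cbind(1, x) %*% beta_est)

# Hypothetical error generator (assumption: standard normal errors)
sim_errors <- function(n) rnorm(n)

# B artificial responses, each giving one column of scaled residuals
resid_sim <- replicate(B, scaled_ols_resid(x, signal + sim_errors(nobs)))

# Squared norms of the original and simulated residuals
c(sum(resid_obs^2), colSums(resid_sim^2))
```

With OLS residuals the fitted signal is projected out exactly, so in this sketch the simulated residuals depend only on the simulated errors; the signal is kept to mirror the description.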

`nonlin`

In case (a), the test statistic is the RSS (residual sum of squares) of a `randomForest` fit from regressing the residuals on to `x`; case (b) is similar but the OOB error is used and the regression is carried out on the equicorrelation set rather than all of `x`.

`group`

`x_alt` is first residualised with respect to `x` by (a) OLS or (b) `sparse_proj`. Then the RSS from Lasso fits from regressions of the residuals on to `x_alt` are used.

`hetero`

Uses the RSS from Lasso fits from regressions of the squared residuals to the equicorrelation set (b) or all of `x` (a).
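As a rough illustration of the `group` option in case (a), the sketch below residualises `x_alt` with respect to `x` by OLS and then calibrates a statistic against simulated residuals. Several pieces are assumptions made for the sake of a self-contained example: an OLS fit stands in for the Lasso fits, `scaled_ols_resid` and `rss_stat` are hypothetical helpers, and the p-value uses a standard Monte Carlo formula that may differ from the package's calibration.

```r
# Illustrative sketch of a group test in case (a); base R only.
set.seed(1)
nobs <- 100; B <- 19
x <- scale(matrix(runif(nobs * 5), nobs, 5))
x_alt <- scale(matrix(runif(nobs * 3), nobs, 3))
y <- drop(x %*% rep(1, 5)) + x_alt[, 1] + rnorm(nobs)

X <- cbind(1, x)  # null-model design including an intercept

# Residualise each column of x_alt with respect to x by OLS
x_alt_res <- lm.fit(X, x_alt)$residuals

# Hypothetical helper: OLS residuals scaled to squared l_2-norm nobs
scaled_ols_resid <- function(y) {
  r <- lm.fit(X, y)$residuals
  r * sqrt(length(r) / sum(r^2))
}

# Stand-in statistic (assumption): RSS from an OLS fit of the residuals
# on the residualised x_alt, used here in place of Lasso fits
rss_stat <- function(r) sum(lm.fit(x_alt_res, r)$residuals^2)

test_obs <- rss_stat(scaled_ols_resid(y))
test_sim <- replicate(B, rss_stat(scaled_ols_resid(rnorm(nobs))))

# A small RSS means the residuals are predictable from x_alt, which is
# evidence against the null; assumed Monte Carlo p-value
p_value <- (1 + sum(test_sim <= test_obs)) / (B + 1)
p_value
```

Swapping `rss_stat` for a Lasso-based statistic would bring the sketch closer to the description above, at the cost of an extra package dependency.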

When `resid_only=FALSE` and `output_all=FALSE`, the output
is a single p-value. Otherwise, a list with some of the following
components is returned (`resid_only=FALSE` causes the last two
components to be omitted):

`p-value`

p-value

`beta_est`

estimated vector of regression coefficients `beta_est`

`sigma_est`

set to 1 when `resid_type=OLS`; otherwise the normalised root-RSS derived from `beta_est` used in generating the simulated errors

`resid`

scaled residuals

`resid_sim`

simulated scaled residuals

`test`

the test statistic(s) - may be a vector if multiple RP functions are being used such as when `test=group`

`test_sim`

a list of simulated test statistics

Shah, R. D., Buhlmann, P. (2016) *Goodness of fit tests for
high-dimensional linear models* http://arxiv.org/abs/1511.03334

```
library(RPtests)

# Testing for nonlinearity
set.seed(1)
x <- scale(matrix(runif(100*200), 100, 200))
y <- x[, 1] + x[, 1]^4 + rnorm(nrow(x))
out <- RPtest(x, y, test="nonlin", B=9L, nperms=2, resid_type = "Lasso")

# Testing significance of a group
y <- x[, 1:5] %*% rep(1, 5) + x[, 151] + rnorm(nrow(x))
(out <- RPtest(x[, 1:150], y, test="group", x_alt = x[, 151:200], B=9L, nperms=1))

# Testing for heteroscedasticity
x <- scale(matrix(runif(250*100), 250, 100))
hetero_sig <- x[, 1] + x[, 2]
var_vec <- hetero_sig - min(hetero_sig) + 0.01
var_vec <- var_vec / mean(var_vec)
sd_vec <- sqrt(var_vec)
y <- x[, 1:5] %*% rep(1, 5) + sd_vec*rnorm(nrow(x))
(out <- RPtest(x, y, test="hetero", B=9L, nperms=1))
```

```
[1] 0.1
[1] 0.1
```

RPtests documentation built on May 29, 2017, 9:06 a.m.
