Tests a low-dimensional null hypothesis against a potentially high-dimensional alternative in regression models (linear regression, logistic regression, Poisson regression, Cox proportional hazards model).
response |
The response vector of the regression model. May be
supplied as a vector or as a |
alternative |
The part of the design matrix corresponding to
the alternative hypothesis. The covariates of the null model do
not have to be supplied again here. May be given as a half
|
null |
The part of the design matrix corresponding to the null hypothesis. May be given as a design matrix or as a half |
).
data |
Only used when |
test.value |
An optional vector of regression coefficients to test. The default is to test the null hypothesis that all regression coefficients of the covariates of the alternative are zero. The |
model |
The type of regression model to be tested. If omitted, the function will try to determine the model from the class and values of the |
levels |
Only used if response is |
directional |
If set to |
standardize |
If set to |
permutations |
The number of permutations to use. The default, |
subsets |
Optional argument that can be used to test one or more subsets of the covariates in |
weights |
Optional argument that can be used to give certain covariates in |
alias |
Optional second label for each test. Should be a vector of the same length as |
x |
If |
trace |
If |
The Global Test tests a low-dimensional null hypothesis against a (potentially) high-dimensional alternative, using the locally most powerful test of Goeman et al. (2006). In this regression model implementation, it tests the null hypothesis response ~ null, that the covariates in alternative are not associated with the response, against the alternative model response ~ null + alternative that they are.
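To make the pair of nested models concrete, the sketch below fits response ~ null and response ~ null + alternative with lm and compares them with the classical F-test. This is only an illustration of the two hypotheses, not the global test itself: the F-test is not available once alternative has more columns than there are observations, which is the setting the global test is designed for. The data and the names n, Z, X and Y are made up for the example, and the final gt call assumes the globaltest package is installed.

```r
# Illustration only: classical F-test for the same pair of nested models.
set.seed(1)
n <- 20
Z <- rnorm(n)                                  # hypothetical covariate of the null model
X <- matrix(rnorm(n * 3), n, 3)                # hypothetical covariates of the alternative
colnames(X) <- c("A", "B", "C")
Y <- rnorm(n)                                  # response, generated under the null

fit.null <- lm(Y ~ Z)                          # response ~ null
fit.alt  <- lm(Y ~ Z + X)                      # response ~ null + alternative
cmp <- anova(fit.null, fit.alt)                # F-test of the 3 extra coefficients
print(cmp)

# The corresponding global test call, run only if the package is available.
if (requireNamespace("globaltest", quietly = TRUE)) {
  library(globaltest)
  print(gt(Y, X, null = ~Z))
}
```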
The test has a wide range of applications. In gene set testing in microarray data analysis, alternative may be a matrix of gene expression measurements, and the aim is to find which of a collection of predefined subsets of the genes (e.g. Gene Ontology terms or KEGG pathways) is most associated with the response. In penalized regression or other machine learning techniques, alternative may be a collection of predictor variables that may be used to predict a response, and the test may function as a useful pre-test to see if training the classifier is worthwhile. In goodness-of-fit testing, null may be a model with linear terms fitted to the response, and alternative may be a large collection of non-linear terms. The test may be used in this case to test the fit of the null model with linear terms against a non-linear alternative.
See the vignette for extensive examples of these applications.
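As a rough sketch of the goodness-of-fit use, the code below builds a small set of non-linear terms of a single covariate and passes them as alternative, with the linear term as null. The data are simulated from a purely linear model, and the names z, Y and nonlin, as well as the particular choice of non-linear terms, are assumptions made for the example. The gt call is run only if the globaltest package is installed.

```r
# Simulated data from a linear model, so the null model is correct here.
set.seed(2)
n <- 50
z <- runif(n, -2, 2)
Y <- 1 + 0.5 * z + rnorm(n)

# Hypothetical collection of non-linear terms to test the linear fit against.
nonlin <- cbind(z2 = z^2, z3 = z^3, sinz = sin(z))

# Null model: intercept plus linear term in z; alternative: the non-linear terms.
if (requireNamespace("globaltest", quietly = TRUE)) {
  library(globaltest)
  print(gt(Y, alternative = nonlin, null = ~z))
}
```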
The function returns an object of class gt.object. Several operations and diagnostic plots can be made from this object. See also Diagnostic plots.
If null is supplied as a formula object, an intercept is automatically included. As a consequence gt(Y, X, Z) will usually give a different result from gt(Y, X, ~Z). The first call is equivalent to gt(Y, X, ~0+Z), whereas the second call is equivalent to gt(Y, X, cbind(1,Z)).
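The intercept behaviour described here follows the usual R formula convention, which can be inspected with base R's model.matrix. The sketch below uses only base R and a made-up vector Z; it shows the design matrices implied by ~Z and ~0+Z, and does not call gt.

```r
# Design matrices implied by the two ways of writing the null model.
set.seed(1)
Z <- rnorm(5)                                  # hypothetical null covariate

with.intercept    <- model.matrix(~ Z)         # same columns as cbind(1, Z)
without.intercept <- model.matrix(~ 0 + Z)     # Z only, no intercept column

print(colnames(with.intercept))
print(colnames(without.intercept))
```

The first matrix has an extra column of ones, which is why supplying null as a formula and as a bare vector or matrix will usually lead to different test results.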
P-values from the asymptotic distribution are accurate to at least two decimal places up to a value of around 1e-12. Lower p-values are numerically less reliable.
Missing values are allowed in the alternative matrix only. Missing values are imputed conservatively (i.e. under the null hypothesis). Covariates with many missing values get reduced variance and therefore automatically carry less weight in the test result.
Jelle Goeman: j.j.goeman@lumc.nl; Jan Oosting
General theory and properties of the global test are described in
Goeman, Van de Geer and Van Houwelingen (2006) Journal of the Royal Statistical Society, Series B 68 (3) 477-493.
For references related to applications of the test, see the vignette GlobalTest.pdf included with this package.
Diagnostic plots: covariates, subjects.
The gt.object function and useful functions associated with that object.
Many more examples in the vignette!
# Simple examples with random data here
# Real data examples in the Vignette
# Random data: covariates A,B,C are correlated with Y
set.seed(1)
Y <- rnorm(20)
X <- matrix(rnorm(200), 20, 10)
X[,1:3] <- X[,1:3] + Y
colnames(X) <- LETTERS[1:10]
# Compare the global test with the F-test
gt(Y, X)
anova(lm(Y~X))
# Using formula input
res <- gt(Y, ~A+B, null=~C+E, data=data.frame(X))
summary(res)
# Beware: null models with and without intercept
Z <- rnorm(20)
summary(gt(Y, X, null=~Z))
summary(gt(Y, X, null=Z))
# Logistic regression
gt(Y>0, X)
# Subsets and weights (1)
my.sets <- list(c("A", "B"), c("C","D"), c("D", "E"))
gt(Y, X, subsets = my.sets)
my.weights <- list(1:2, 2:1, 3:2)
gt(Y, X, subsets = my.sets, weights=my.weights)
# Subsets and weights (2)
gt(Y, X, subsets = c("A", "B"))
gt(Y, X, subsets = c("A", "A", "B"))
gt(Y, X, subsets = c("A", "A", "B"), weights = c(.5,.5,1))
# Permutation testing
summary(gt(Y, X, permutations=1e4))
Loading required package: survival
p-value Statistic Expected Std.dev #Cov
1 7.34e-06 24.3 5.26 2.79 10
Analysis of Variance Table
Response: Y
Df Sum Sq Mean Sq F value Pr(>F)
X 10 13.8824 1.38824 6.3608 0.005159 **
Residuals 9 1.9642 0.21825
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
"gt.object" object from package globaltest
Call:
gt(response = Y, alternative = ~A + B, null = ~C + E, data = data.frame(X))
Model: linear regression.
Degrees of freedom: 20 total; 3 null; 3 + 2 alternative.
Null distibution: asymptotic.
p-value Statistic Expected Std.dev #Cov
1 5.57e-05 41.5 5.88 5.71 2
"gt.object" object from package globaltest
Call:
gt(response = Y, alternative = X, null = ~Z)
Model: linear regression.
Degrees of freedom: 20 total; 2 null; 2 + 10 alternative.
Null distibution: asymptotic.
p-value Statistic Expected Std.dev #Cov
1 2.23e-05 24.4 5.56 2.99 10
"gt.object" object from package globaltest
Call:
gt(response = Y, alternative = X, null = Z)
Model: linear regression.
Degrees of freedom: 20 total; 1 null; 1 + 10 alternative.
Null distibution: asymptotic.
p-value Statistic Expected Std.dev #Cov
1 2.55e-05 23.3 5.26 2.83 10
p-value Statistic Expected Std.dev #Cov
1 0.0295 11.4 5.26 2.79 10
p-value Statistic Expected Std.dev #Cov
1 2.05e-07 58.42 5.26 5.58 2
2 7.07e-03 27.54 5.26 5.81 2
3 2.38e-01 7.76 5.26 5.02 2
p-value Statistic Expected Std.dev #Cov
1 5.51e-07 62.43 5.26 6.00 2
2 6.30e-03 30.32 5.26 6.24 2
3 1.80e-01 9.06 5.26 5.23 2
p-value Statistic Expected Std.dev #Cov
1 2.05e-07 58.4 5.26 5.58 2
p-value Statistic Expected Std.dev #Cov
1 1.45e-06 53.8 5.26 5.52 3
p-value Statistic Expected Std.dev #Cov
1 2.05e-07 58.4 5.26 5.58 3
"gt.object" object from package globaltest
Call:
gt(response = Y, alternative = X, permutations = 10000)
Model: linear regression.
Degrees of freedom: 20 total; 1 null; 1 + 10 alternative.
Null distibution: 9999 random permutations.
p-value Statistic Expected Std.dev #Cov
1 1e-04 24.3 5.27 2.69 10