t-tests and F-tests for rows or columns of a matrix

Description

t-tests and F-tests for rows or columns of a matrix, intended to be speed efficient.

Usage

rowttests(x, fac, tstatOnly = FALSE) 
colttests(x, fac, tstatOnly = FALSE)
fastT(x, ig1, ig2, var.equal = TRUE)

rowFtests(x, fac, var.equal = TRUE)
colFtests(x, fac, var.equal = TRUE)

Arguments

x

Numeric matrix. The matrix must not contain NA values. For rowttests and colttests, x can also be an ExpressionSet.

fac

Factor which codes the grouping to be tested. There must be 1 or 2 groups for the t-tests (corresponding to one- and two-sample t-test), and 2 or more for the F-tests. If fac is missing, this is taken as a one-group test (i.e. is only allowed for the t-tests). The length of the factor needs to correspond to the sample size: for the row* functions, the length of the factor must be the same as the number of columns of x, for the col* functions, it must be the same as the number of rows of x.

If x is an ExpressionSet, then fac may also be a character vector of length 1 with the name of a covariate in x.

tstatOnly

A logical variable indicating whether to skip the calculation of p-values. If TRUE, only the t-statistics are returned, and no p-values are computed from the t-distribution with the appropriate degrees of freedom. This can be considerably faster.

ig1

The indices of the columns of x that correspond to group 1.

ig2

The indices of the columns of x that correspond to group 2.

var.equal

A logical variable indicating whether to treat the variances in the samples as equal. If 'TRUE', a simple F test for the equality of means in a one-way analysis of variance is performed. If 'FALSE', an approximate method of Welch (1951) is used, which generalizes the commonly known 2-sample Welch test to the case of arbitrarily many samples.

Details

If fac is specified, rowttests performs for each row of x a two-sided, two-class t-test with equal variances. fac must be a factor of length ncol(x) with two levels, corresponding to the two groups. The sign of the resulting t-statistic corresponds to "group 1 minus group 2". If fac is missing, rowttests performs for each row of x a two-sided one-class t-test against the null hypothesis 'mean=0'.
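
The two-class case can be illustrated with a minimal base R sketch. The helper row_t_sketch is hypothetical and is not the package's C implementation; it assumes x contains no NA values and fac has exactly two levels. It computes the pooled-variance t-statistic for every row at once and shows the "group 1 minus group 2" sign convention.

```r
## Hypothetical helper (not part of the package): row-wise two-sample
## t-tests with pooled (equal) variance, using base R only.
row_t_sketch <- function(x, fac) {
  stopifnot(is.matrix(x), is.factor(fac),
            length(fac) == ncol(x), nlevels(fac) == 2)
  g1 <- which(fac == levels(fac)[1])   # columns in group 1
  g2 <- which(fac == levels(fac)[2])   # columns in group 2
  n1 <- length(g1)
  n2 <- length(g2)
  m1 <- rowMeans(x[, g1, drop = FALSE])
  m2 <- rowMeans(x[, g2, drop = FALSE])
  ## within-group sums of squares, one value per row
  ss1 <- rowSums((x[, g1, drop = FALSE] - m1)^2)
  ss2 <- rowSums((x[, g2, drop = FALSE] - m2)^2)
  df <- n1 + n2 - 2
  pooled_var <- (ss1 + ss2) / df
  dm <- m1 - m2                        # "group 1 minus group 2"
  stat <- dm / sqrt(pooled_var * (1 / n1 + 1 / n2))
  data.frame(statistic = stat,
             dm = dm,
             p.value = 2 * pt(-abs(stat), df))  # two-sided p-value
}

set.seed(1)
x   <- matrix(rnorm(40), nrow = 4, ncol = 10)
fac <- factor(rep(c("a", "b"), each = 5))
res <- row_t_sketch(x, fac)

## check the first row against t.test
ref <- t.test(x[1, ] ~ fac, var.equal = TRUE)
stopifnot(isTRUE(all.equal(unname(ref$statistic), unname(res$statistic[1]))),
          isTRUE(all.equal(ref$p.value, res$p.value[1])))
```

Swapping the order of the factor levels flips the sign of statistic and dm but leaves the two-sided p-value unchanged.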

rowttests and colttests are implemented in C and should be reasonably fast and memory-efficient. fastT is an alternative implementation, in Fortran, possibly useful for certain legacy code. rowFtests and colFtests are currently implemented using matrix algebra in R. Compared to the rowttests and colttests functions, they are slower and use more memory.
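
To give an idea of what a matrix-algebra formulation of a row-wise F-test can look like, here is a minimal base R sketch for the equal-variance case. The helper row_F_sketch is hypothetical and is not the code used by rowFtests; it assumes x has no NA values and fac has no missing entries.

```r
## Hypothetical helper (not part of the package): row-wise one-way
## ANOVA F-tests with equal variances, via matrix products.
row_F_sketch <- function(x, fac) {
  stopifnot(is.matrix(x), is.factor(fac),
            length(fac) == ncol(x), !anyNA(fac))
  fac <- droplevels(fac)
  k <- nlevels(fac)
  n <- ncol(x)
  Z  <- model.matrix(~ fac - 1)        # n x k group indicator matrix
  ng <- colSums(Z)                     # group sizes
  group_means <- sweep(x %*% Z, 2, ng, "/")   # nrow(x) x k
  fitted <- group_means %*% t(Z)       # group mean for each sample
  ss_between <- rowSums((fitted - rowMeans(x))^2)
  ss_within  <- rowSums((x - fitted)^2)
  df1 <- k - 1
  df2 <- n - k
  stat <- (ss_between / df1) / (ss_within / df2)
  data.frame(statistic = stat,
             p.value = pf(stat, df1, df2, lower.tail = FALSE))
}

set.seed(3)
x   <- matrix(rnorm(48), nrow = 4, ncol = 12)
fac <- factor(rep(c("a", "b", "c"), times = c(3, 4, 5)))
resF <- row_F_sketch(x, fac)

## check every row against lm
for (j in seq_len(nrow(x))) {
  fj <- summary(lm(x[j, ] ~ fac))$fstatistic
  stopifnot(isTRUE(all.equal(unname(fj["value"]), unname(resF$statistic[j]))))
}
```

The sketch builds several intermediate matrices of the same size as x, which shows why an approach of this kind tends to need more memory than a compiled loop over rows.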

Value

A data.frame with columns statistic, p.value (optional in the case of the t-test functions) and dm, the difference of the group means (only in the case of the t-test functions). The row.names of the data.frame are taken from the corresponding dimension names of x.

The degrees of freedom are provided in the attribute df. For the F-tests, if var.equal is 'FALSE', nrow(x)+1 degrees of freedom are given: the first value is the first (numerator) degrees of freedom, which is the same for each row, and the remaining values are the second (denominator) degrees of freedom, one for each row.
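
The layout of the degrees of freedom in the Welch case can be sketched with base R's oneway.test, applied row by row. This only illustrates the shape described above (one shared first degree of freedom followed by one second degree of freedom per row). That the values agree numerically with the df attribute returned by rowFtests is an assumption that is not checked here.

```r
## Illustration with base R only; variable names are arbitrary.
set.seed(2)
x   <- matrix(rnorm(36), nrow = 3, ncol = 12)
fac <- factor(rep(c("a", "b", "c"), each = 4))

## Welch's approximate F-test for each row
welch <- lapply(seq_len(nrow(x)), function(i)
  oneway.test(x[i, ] ~ fac, var.equal = FALSE))

num_df   <- vapply(welch, function(w) unname(w$parameter["num df"]),
                   numeric(1))
denom_df <- vapply(welch, function(w) unname(w$parameter["denom df"]),
                   numeric(1))

## the first degree of freedom is (number of groups - 1) for every row
stopifnot(all(num_df == nlevels(fac) - 1))

## one shared first df, then one second df per row
df_layout <- c(num_df[1], denom_df)
stopifnot(length(df_layout) == nrow(x) + 1)
```

The second degree of freedom differs between rows because Welch's approximation depends on the estimated group variances of each row.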

Author(s)

Wolfgang Huber <whuber@embl.de>

References

B. L. Welch (1951), On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336.

See Also

mt.teststat

Examples

   ## the functions below are provided by the genefilter package
   library("genefilter")

   ##
   ## example data
   ##
   x  = matrix(runif(40), nrow=4, ncol=10)
   f2 = factor(floor(runif(ncol(x))*2))
   f4 = factor(floor(runif(ncol(x))*4))

   ##
   ## one- and two group row t-test; 4-group F-test
   ##
   r1 = rowttests(x)
   r2 = rowttests(x, f2)
   r4 = rowFtests(x, f4)

   ## approximate equality
   about.equal = function(x,y,tol=1e-10)
     stopifnot(is.numeric(x), is.numeric(y), length(x)==length(y), all(abs(x-y) < tol))

   ##
   ## compare with the implementation in t.test
   ##
   for (j in 1:nrow(x)) {
     s1 = t.test(x[j,])
     about.equal(s1$statistic, r1$statistic[j])
     about.equal(s1$p.value,   r1$p.value[j])

     s2 = t.test(x[j,] ~ f2, var.equal=TRUE)
     about.equal(s2$statistic, r2$statistic[j])
     about.equal(s2$p.value,   r2$p.value[j])

     dm = -diff(tapply(x[j,], f2, mean))
     about.equal(dm, r2$dm[j])

     s4 = summary(lm(x[j,] ~ f4))
     about.equal(s4$fstatistic["value"], r4$statistic[j])
   }

   ##
   ## colttests
   ##
   c2 = colttests(t(x), f2)
   stopifnot(identical(r2, c2))

   ##
   ## missing values
   ##
   f2n = f2
   f2n[sample(length(f2n), 3)] = NA
   r2n = rowttests(x, f2n)
   for(j in 1:nrow(x)) {
     s2n = t.test(x[j,] ~ f2n, var.equal=TRUE)
     about.equal(s2n$statistic, r2n$statistic[j])
     about.equal(s2n$p.value,   r2n$p.value[j])
   }

   ##
   ## larger sample size
   ##
   x  = matrix(runif(1000000), nrow=4, ncol=250000)
   f2 = factor(floor(runif(ncol(x))*2))
   r2 = rowttests(x, f2) 
   for (j in 1:nrow(x)) {
     s2 = t.test(x[j,] ~ f2, var.equal=TRUE)
     about.equal(s2$statistic, r2$statistic[j])
     about.equal(s2$p.value,   r2$p.value[j])
   }

   ## single row matrix
   rowFtests(matrix(runif(10),1,10),as.factor(c(rep(1,5),rep(2,5))))
   rowttests(matrix(runif(10),1,10),as.factor(c(rep(1,5),rep(2,5))))