discr.test.two_sample: Discriminability Two Sample Permutation Test


View source: R/discrPermutationTests.R

Description

A function that takes two paired datasets and tests whether the first is more, less, or non-equally discriminable compared with the second.

Usage

discr.test.two_sample(
  X1,
  X2,
  Y,
  dist.xfm = mgc.distance,
  dist.params = list(method = "euclidian"),
  dist.return = NULL,
  remove.isolates = TRUE,
  nperm = 500,
  no_cores = 1,
  alt = "greater"
)

Arguments

X1

is interpreted as a [n x d] data matrix with n samples in d dimensions. Should NOT be a distance matrix.

X2

is interpreted as a [n x d] data matrix with n samples in d dimensions. Should NOT be a distance matrix.

Y

[n] a vector containing the sample ids for our n samples. Should be matched such that Y[i] is the corresponding label for X1[i,] and X2[i,].

dist.xfm

a distance function used to transform X1 and X2. It should accept an [n x d] matrix of n samples in d dimensions and return an [n x n] distance matrix, either directly or as a named return argument such as $D (see dist.return). See mgc.distance for details.

dist.params

a list of trailing arguments to pass to the distance function specified in dist.xfm. Defaults to list(method='euclidean').

dist.return

the return argument for the specified dist.xfm containing the distance matrix. Defaults to NULL.

is.null(dist.return)

use the return argument directly from dist.xfm as the distance matrix. Should be a [n x n] matrix.

is.character(dist.return) | is.integer(dist.return)

use the dist.return element of the value returned by dist.xfm as the distance matrix. Should be an [n x n] matrix.

remove.isolates

remove isolated samples from the dataset. Isolated samples are samples with only one instance of their class appearing in the Y vector. Defaults to TRUE.

nperm

the number of permutations for the permutation test. Defaults to 500.

no_cores

the number of cores to use for the permutations. Defaults to 1.

alt

the alternative hypothesis. Can be that the first dataset is more discriminable (alt = 'greater'), less discriminable (alt = 'less'), or just non-equal (alt = 'neq'). Defaults to "greater".
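
The behaviour described for remove.isolates and for the dist.xfm / dist.return pair can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's code: my.distance is a hypothetical function written for this example, and the isolate filter only mirrors the definition given above (labels that occur once in Y).

```r
# Toy data: 5 samples in 2 dimensions; label "c" appears only once.
X <- matrix(c(0, 0,
              0, 1,
              3, 3,
              3, 4,
              9, 9), ncol = 2, byrow = TRUE)
Y <- c("a", "a", "b", "b", "c")

# Isolated samples are those whose label occurs exactly once in Y.
counts <- table(Y)
keep <- Y %in% names(counts[counts > 1])
X.kept <- X[keep, , drop = FALSE]
Y.kept <- Y[keep]

# Hypothetical distance function (not part of mgc): accepts an [n x d]
# matrix and returns a list whose D element is the [n x n] distance matrix.
my.distance <- function(X, method = "euclidean") {
  list(D = as.matrix(dist(X, method = method)))
}

# With dist.return = "D", the distance matrix would be read from that element.
res <- my.distance(X.kept, method = "manhattan")
D <- res[["D"]]
dim(D)
```

A function shaped like this would presumably be supplied as dist.xfm = my.distance, dist.params = list(method = "manhattan"), dist.return = "D"; a function that returns the bare matrix would instead be used with dist.return = NULL.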

Value

A list containing the following:

stat

the observed test statistic: the difference in discriminability between X1 and X2.

discr

the discriminabilities for each of the two data sets, as a list.

null

the null distribution of the test statistic, computed via permutation.

p.value

The p-value associated with the test.

alt

The alternative hypothesis for the test.
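
To make the stat and discr entries concrete, the following base R sketch computes a simplified sample discriminability and the difference between two datasets. It assumes the usual definition (the fraction of comparisons in which a same-label sample is closer than a different-label sample), uses Euclidean distance, and ignores ties; discr_sketch is a hypothetical helper, and the package's own estimator may weight comparisons or treat ties differently.

```r
# Simplified discriminability: for every ordered pair (i, j) sharing a label,
# the fraction of different-label samples k with D[i, k] > D[i, j],
# averaged over all such pairs. Assumes no isolated samples.
discr_sketch <- function(X, Y) {
  D <- as.matrix(dist(X))
  fracs <- c()
  for (i in seq_len(nrow(D))) {
    same <- setdiff(which(Y == Y[i]), i)   # other samples with the same label
    other <- which(Y != Y[i])              # samples with a different label
    for (j in same) {
      fracs <- c(fracs, mean(D[i, other] > D[i, j]))
    }
  }
  mean(fracs)
}

Y <- c(1, 1, 2, 2)
# Xa: same-label samples sit close together, so labels are well separated.
Xa <- rbind(c(0, 0), c(0, 0.1), c(10, 10), c(10, 10.1))
# Xb: same-label samples sit far apart, so labels are poorly separated.
Xb <- rbind(c(0, 0), c(10, 10), c(0, 0.1), c(10, 10.1))

discr <- list(X1 = discr_sketch(Xa, Y), X2 = discr_sketch(Xb, Y))
stat <- discr$X1 - discr$X2   # positive when the first dataset is more discriminable
```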

Details

Performs a two-sample test of whether the discriminability of one dataset differs from that of another, as described in Bridgeford et al. (2019). With Dhatx1 the sample discriminability of one approach and Dhatx2 the sample discriminability of another approach, the hypotheses are:

H0: Dx1 = Dx2

and:

Ha: Dx1 > Dx2.

Tests of < and != are also implemented (alt = 'less' and alt = 'neq').
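
As a rough illustration of how the three alternatives relate an observed statistic to a permutation null distribution, the sketch below computes a p-value for each. It is an assumption-laden example rather than the package's procedure: perm_p_value is a hypothetical helper, the add-one correction is a common convention chosen here, and the null values are made up for demonstration.

```r
# Hypothetical helper: p-value of an observed statistic against a vector of
# permutation statistics, for each of the three alternatives.
perm_p_value <- function(stat, null, alt = "greater") {
  count <- switch(alt,
    greater = sum(null >= stat),            # null values at least as large
    less    = sum(null <= stat),            # null values at least as small
    neq     = sum(abs(null) >= abs(stat)),  # at least as extreme in magnitude
    stop("alt must be 'greater', 'less', or 'neq'"))
  # Add-one correction (an assumption) keeps the p-value strictly positive.
  (count + 1) / (length(null) + 1)
}

null <- c(-0.1, 0, 0.1, 0.3)  # made-up null statistics, not real output
stat <- 0.2                   # made-up observed difference in discriminability

p.greater <- perm_p_value(stat, null, "greater")
p.less    <- perm_p_value(stat, null, "less")
p.neq     <- perm_p_value(stat, null, "neq")
```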

Author(s)

Eric Bridgeford

References

Eric W. Bridgeford, et al. "Optimal Decisions for Reference Pipelines and Datasets: Applications in Connectomics." bioRxiv (2019).

Examples

## Not run: 
library(mgc)
library(MASS)  # provides mvrnorm

n <- 50  # samples per subject

# true means for two subjects: subject 1 (column 1) and subject 2 (column 2)
mus <- cbind(c(0, 0), c(1, 1))
Sigma <- diag(2)  # dimensions are independent

# first dataset X1 contains less noise than X2
X1 <- do.call(rbind, lapply(seq_len(ncol(mus)),
  function(k) mvrnorm(n = n, mus[, k], 0.5 * Sigma)))
X2 <- do.call(rbind, lapply(seq_len(ncol(mus)),
  function(k) mvrnorm(n = n, mus[, k], 2 * Sigma)))
# subject labels, matched row-by-row to X1 and X2
Y <- rep(seq_len(ncol(mus)), each = n)

# X1 should be more discriminable, as it has less noise
discr.test.two_sample(X1, X2, Y, alt="greater")$p.value  # p-value should be small

## End(Not run)

neurodata/r-mgc documentation built on March 12, 2021, 9:45 a.m.