permutation.test: Two-sample permutation test

Description Usage Arguments Value References Examples

View source: R/functions.R

Description

Performs a permutation test to assess whether two samples of Persistence Flamelets are likely be random draw from the same distribution or if they come from different generating mechanisms, with p-value computed by means of bootstrap.

Usage

1
permutation.test(sample1, sample2, n.rep = NULL, seed = NULL)

Arguments

sample1

a k x m x n1 array corresponding to the Persistence Flamelet for the n1 subjects in sample 1

sample2

a k x m x n2 array corresponding to the Persistence Flamelet for the n2 subjects in sample 2

n.rep

an integer representing the number of bootstrap iterations. If NULL only the test statistics on the observed samples is returned

seed

an integer specifying a seed for the random shuffling

Value

a bootstrapped p-value

References

T. Padellini and P. Brutti (2017) Persistence Flamelets: multiscale Persistent Homology for kernel density exploration https://arxiv.org/abs/1709.07097

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
library(eegkit)
library(dplyr)

# import data from the eegkit package
data("eegdata")   # electroencephalography data
data("eegcoord")  # location of the electrodes
# add eeg channel name as variable and select only the 2d projection of the electrode location
eegcoord <- mutate(eegcoord, channel = rownames(eegcoord)) %>% select(channel, xproj, yproj)

# as EEG recordings are extremely unstable, they are typically averaged across repetitions
# here we average them across the 5 trials from each subject
eegmean <- eegdata %>% group_by(channel, time, subject) %>% summarise(mean = mean(voltage))
dim(eegmean) # 64 channels x 256 time points x 20 subjects

subjects <- unique(eegdata$subject)
# subjects 1:10 are alcoholic, 11:20 are control

# eegmean2 <- tapply(eegdata$voltage, list(eegdata$channel, eegdata$time, eegdata$subject), mean)
# Start by computing the list of Persistence Diagrams needed to build the flamelet for each subject
diag.list <- list()
t0 <- Sys.time()
for (sbj in subjects){

  # select signal for one subject and then remove channels for which there are no coordinates
  sbj.data = dplyr::filter(eegmean, subject == sbj, !(channel %in% c("X", "Y", "nd")  ))

  # add 2d projection of electrodes location
  sbj.data = left_join(sbj.data, eegcoord, by = "channel")

  # scale data
  sbj.data[, c(4:6)] = scale(sbj.data[,c(4:6)])


  # dsucc.List = list()

  diag.list.sbj = lapply(unique(sbj.data$time), function(time.idx){
    time.idx.data = filter(sbj.data, time == time.idx) %>% ungroup %>%
                    select(mean, xproj, yproj)
    time.idx.diag = ripsDiag(time.idx.data, maxdimension = 1, maxscale = 5,
                             library = "GUDHI", printProgress = F)
    return(time.idx.diag$diagram)
  })

  diag.list[[which(sbj== subjects)]] = diag.list.sbj

  print(paste("subject ", which(sbj == subjects), " of 20"))


}
t1 <- Sys.time()-t0
t1 # will take less than 5 minutes

tseq <- seq(0, 5, length = 500) # consider 5 as it is the
# same value as maxscale (hence largest possible persistence)

p_silh0 <- sapply(diag.list, FUN = build.flamelet,
base.type = "silhouette", dimension = 0,
tseq = tseq, precomputed.diagram = TRUE, simplify = 'array')

prova = permutation_test(p_sih0[,,1:10], p_silh0[,,11:20],  n.rep = 10, seed = 1)

pflamelet documentation built on Dec. 17, 2020, 5:08 p.m.