fanova.tests: Tests for FANOVA Problem
In fdANOVA: Analysis of Variance for Univariate and Multivariate Functional Data

Performs the testing procedures for the one-way analysis of variance for (univariate) functional data (FANOVA). See Section 2.1 of the vignette file (vignette("fdANOVA", package = "fdANOVA")) for details of the tests.

We consider l groups of independent random functions X_{ij}(t), i=1,…,l, j=1,…,n_i, defined over a closed and bounded interval I=[a,b]. Let n=n_1+…+n_l. The groups may differ in their mean functions; that is, we assume that X_{ij}(t), j=1,…,n_i, are stochastic processes with mean function μ_i(t), t\in I, and covariance function γ(s, t), s,t\in I, for i=1,…,l. The null hypothesis to be tested is

H_0:μ_1(t)=…=μ_l(t),\ t\in I.

The alternative is the negation of the null hypothesis. Each functional observation is assumed to be recorded on a common grid of \mathcal{T} design time points equally spaced in I (see Section 3.1 of the vignette file, vignette("fdANOVA", package = "fdANOVA")).
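The data layout described above can be sketched in base R as follows. This is only an illustration with simulated curves: the mean functions `mu`, the noise level and the group sizes are assumptions made for the example and are not part of the package.

```r
# Simulated example (assumed setup): l = 3 groups, n_i = 13 curves each,
# observed on a common grid of 20 equally spaced points in I = [0, 1].
set.seed(1)
n.points <- 20                                   # number of design time points
n.i <- c(13, 13, 13)                             # group sizes n_1, ..., n_l
group.label <- rep(seq_along(n.i), times = n.i)  # one label per curve
tt <- seq(0, 1, length.out = n.points)           # design time points

# Hypothetical group mean functions mu_i(t); they differ across groups.
mu <- function(t, i) sin(2 * pi * t) + 0.2 * i * t

# Each column of x is one discretized curve, each row one design time point.
x <- sapply(group.label, function(i) mu(tt, i) + rnorm(n.points, sd = 0.3))

# Sample mean function of each group, evaluated on the grid.
group.means <- sapply(seq_along(n.i), function(i) {
  rowMeans(x[, group.label == i, drop = FALSE])
})

dim(x)            # design time points by curves
dim(group.means)  # design time points by groups
```

A matrix built this way has the shape expected for the argument `x`, with `group.label` giving the group of each column.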

fanova.tests(x = NULL, group.label, test = "ALL",
             params = NULL,
             parallel = FALSE, nslaves = NULL)

# more detailed usage of params:
# params = list(paramFP = list(int, B.FP = 1000,
#                              basis = c("Fourier", "b-spline", "own"),
#                              own.basis, own.cross.prod.mat,
#                              criterion = c("BIC", "eBIC", "AIC", "AICc", "NO"),
#                              commonK = c("mode", "min", "max", "mean"),
#                              minK = NULL, maxK = NULL, norder = 4, gamma.eBIC = 0.5)
#               paramCH = 10000,
#               paramCS = 10000,
#               paramL2b = 10000,
#               paramFb = 10000,
#               paramFmaxb = 10000,
#               paramTRP = list(k = 30, projection = c("GAUSS", "BM"),
#                               permutation = FALSE, B.TRP = 10000,
#                               independent.projection.tests = TRUE))

`x`	a \mathcal{T}\times n matrix of data, in which each column is a discretized version of a function and the rows correspond to design time points. Its default value is `NULL` because, when only the FP test is used, a basis representation of the data can be supplied instead of the raw observations (see the list `paramFP` below). All other testing procedures require the raw data.
`group.label`	a vector containing group labels.
`test`	a character vector specifying which FANOVA tests to perform. Its default value, `"ALL"`, means that all testing procedures of Section 2.1 of the vignette file are used. To use only some of the tests, set `test` to a subvector of the following vector of test labels: `c("FP", "CH", "CS", "L2N", "L2B", "L2b", "FN", "FB", "Fb", "GPF", "Fmaxb", "TRP")`, where `"FP"` is the permutation test based on a basis function representation (Gorecki and Smaga, 2015); `"CH"` and `"CS"` are the L2-norm-based parametric bootstrap tests for homoscedastic and heteroscedastic samples, respectively (Cuevas et al., 2004); `"L2N"` and `"L2B"` are the L2-norm-based tests with the naive and bias-reduced methods of estimation, respectively (Faraway, 1997; Zhang and Chen, 2007; Zhang, 2013); `"L2b"` is the L2-norm-based bootstrap test (Zhang, 2013); `"FN"` and `"FB"` are the F-type tests with the naive and bias-reduced methods of estimation, respectively (Shen and Faraway, 2004; Zhang, 2011); `"Fb"` is the F-type bootstrap test (Zhang, 2013); `"GPF"` is the test globalizing the pointwise F-test (Zhang and Liang, 2014); `"Fmaxb"` is the Fmax bootstrap test (Zhang et al., 2018); `"TRP"` denotes the tests based on random projections (Cuesta-Albertos and Febrero-Bande, 2010).
`params`	a list of additional parameters for the FP, CH, CS, L^2b, Fb and Fmaxb tests and for the tests based on random projections. It can contain all or some of the elements `paramFP`, `paramCH`, `paramCS`, `paramL2b`, `paramFb`, `paramFmaxb` and `paramTRP`, which pass the parameters of the respective tests to `fanova.tests` and are described below. The default value, `NULL`, means that these tests are performed with their default parameter values.
`paramFP`	a list containing the parameters for the FP test.
`int`	a vector of two elements representing the interval I=[a,b]. When it is not specified, it is determined by the number of design time points.
`B.FP`	the number of permutation replicates for the FP test.
`basis`	a choice of basis of functions used in the basis function representation of the data.
`own.basis`	if `basis = "own"`, a K\times n matrix with columns containing the coefficients of the basis function representation of the observations.
`own.cross.prod.mat`	if `basis = "own"`, a K\times K cross product matrix corresponding to a basis used to obtain the matrix `own.basis`.
`criterion`	a choice of information criterion for selecting the optimum value of K. `criterion = "NO"` means that K is equal to the parameter `maxK` defined below. We have \mathrm{BIC}(X_{ij})=\mathcal{T}\log(\mathbf{e}_{ij}^{\top}\mathbf{e}_{ij}/\mathcal{T})+K\log\mathcal{T}, \mathrm{eBIC}(X_{ij})=\mathcal{T}\log(\mathbf{e}_{ij}^{\top}\mathbf{e}_{ij}/\mathcal{T})+K[\log\mathcal{T}+2γ\log(K_{\max})], \mathrm{AIC}(X_{ij})=\mathcal{T}\log(\mathbf{e}_{ij}^{\top}\mathbf{e}_{ij}/\mathcal{T})+2K and \mathrm{AICc}(X_{ij})=\mathrm{AIC}(X_{ij})+2K(K + 1)/(n-K-1), where \mathbf{e}_{ij}=(e_{ij1},…,e_{ij\mathcal{T}})^{\top}, e_{ijr}=X_{ij}(t_r)-∑_{m=1}^K\hat{c}_{ijm}\varphi_m(t_r), t_1,…,t_{\mathcal{T}} are the design time points, γ\in[0,1], K_{\max} is the maximum K considered and \log denotes the natural logarithm.
`commonK`	a choice of method for selecting the common value for all observations from the values of K corresponding to all processes.
`minK`	a minimum value of K. When `basis = "Fourier"`, it has to be an odd number. If `minK = NULL`, we take `minK = 3`. For `basis = "b-spline"`, `minK` has to be greater than or equal to `norder` defined below. If `minK = NULL` or `minK < norder`, then we take `minK = norder`.
`maxK`	a maximum value of K. When `basis = "Fourier"`, it has to be an odd number. If `maxK = NULL`, we take `maxK` equal to the largest odd number smaller than the number of design time points. If `maxK` is greater than or equal to the number of design time points, `maxK` is taken as above. For `basis = "b-spline"`, `maxK` has to be smaller than or equal to the number of design time points. If `maxK = NULL` or `maxK` is greater than the number of design time points, then we take `maxK` equal to the number of design time points.
`norder`	if `basis = "b-spline"`, an integer specifying the order of b-splines.
`gamma.eBIC`	a γ\in[0,1] parameter in the eBIC.
`paramCH`	the number of discretized artificial trajectories for generating Gaussian processes for the CH test.
`paramCS`	the number of discretized artificial trajectories for generating Gaussian processes for the CS test.
`paramL2b`	the number of bootstrap samples for the L^2b test.
`paramFb`	the number of bootstrap samples for the Fb test.
`paramFmaxb`	the number of bootstrap samples for the Fmaxb test.
`paramTRP`	a list containing the parameters of the tests based on random projections.
`k`	a vector of numbers of projections.
`projection`	a method of generating Gaussian processes in step 1 of the tests based on random projections presented in Section 2 of the vignette file. If `projection = "GAUSS"`, Gaussian white noise is generated as in the function `anova.RPm` from the R package fda.usc. If `projection = "BM"`, Brownian motion is generated.
`permutation`	a logical indicating whether to compute the p-values of the tests based on random projections by the permutation method.
`B.TRP`	the number of permutation replicates for the tests based on random projections.
`independent.projection.tests`	a logical indicating whether to generate the random projections independently or dependently for different elements of vector `k`. In the first case, the random projections for each element of vector `k` are generated separately, while in the second one, they are generated as chained subsets, e.g., for `k = c(5, 10)`, the first 5 projections are a subset of the last 10. The second way of generating random projections is faster than the first one.
`parallel`	a logical indicating whether to use parallelization.
`nslaves`	if `parallel = TRUE`, the number of slaves. Its default value means that it will be equal to the number of logical processors of the computer used.
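The information criteria listed under `criterion` can be evaluated for a single curve with a few lines of base R. The sketch below is an illustration of the formulas only and is not the package's implementation: the helpers `fourier_design` and `info_criteria` are hypothetical, the curve is simulated, and the value `n = 39` is an assumed total number of observations.

```r
# Hypothetical helper: Fourier design matrix with K (odd) basis functions
# evaluated at the design time points t in [a, b].
fourier_design <- function(t, K, a = min(t), b = max(t)) {
  stopifnot(K %% 2 == 1)
  u <- 2 * pi * (t - a) / (b - a)
  Phi <- matrix(1, nrow = length(t), ncol = 1)
  for (m in seq_len((K - 1) / 2)) {
    Phi <- cbind(Phi, sin(m * u), cos(m * u))
  }
  Phi
}

# Hypothetical helper: BIC, eBIC, AIC and AICc for one discretized curve y,
# following the formulas given for `criterion`.
info_criteria <- function(y, Phi, n, gamma = 0.5, Kmax = ncol(Phi)) {
  n.points <- length(y)                 # number of design time points
  K <- ncol(Phi)                        # number of basis functions
  e <- qr.resid(qr(Phi), y)             # least-squares residuals e_ij
  fit <- n.points * log(sum(e^2) / n.points)
  aic <- fit + 2 * K
  c(BIC  = fit + K * log(n.points),
    eBIC = fit + K * (log(n.points) + 2 * gamma * log(Kmax)),
    AIC  = aic,
    AICc = aic + 2 * K * (K + 1) / (n - K - 1))
}

# Simulated curve on 50 equally spaced points (assumed example data).
set.seed(1)
tt <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * tt) + 0.5 * cos(4 * pi * tt) + rnorm(50, sd = 0.1)

# Evaluate the criteria for odd K between minK = 3 and maxK = 11.
Ks <- seq(3, 11, by = 2)
crit <- sapply(Ks, function(K) {
  info_criteria(y, fourier_design(tt, K), n = 39, Kmax = max(Ks))
})
colnames(crit) <- paste0("K=", Ks)
K.bic <- Ks[which.min(crit["BIC", ])]   # K selected by BIC for this curve
```

Repeating this for every curve gives one selected K per observation; `commonK` then describes how a single common value is chosen from them.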

To perform step 3 of the projection procedure given in Section 2.1 of the vignette file, we use five tests: the standard (paramTRP$permutation = FALSE) and permutation (paramTRP$permutation = TRUE) tests based on the ANOVA F-test statistic and on the ANOVA-type statistic (ATS) proposed by Brunner et al. (1997), as well as the testing procedure based on the Wald-type permutation statistic (WTPS) of Pauly et al. (2015).
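The idea behind the projection-based tests can be sketched in base R: project each curve onto random directions and apply a univariate one-way ANOVA to the projected values. The code below is a simplified illustration under stated assumptions, not the package's code: the data are simulated, the helpers `f_stat`, `f_pvalue` and `perm_pvalue` are hypothetical, only the F statistic is shown (ATS and WTPS are omitted), and the combination of the per-projection p-values into a final p-value is not shown.

```r
# Simulated data (assumed): 3 groups of 13 curves on 20 design time points.
set.seed(1)
n.points <- 20
group.label <- rep(1:3, each = 13)
tt <- seq(0, 1, length.out = n.points)
x <- sapply(group.label, function(g) {
  sin(2 * pi * tt) + 0.2 * g * tt + rnorm(n.points, sd = 0.3)
})

k <- c(5, 10)   # numbers of projections

# Random directions: Gaussian white noise, one direction per column.
# A Brownian-motion variant could use apply(V, 2, cumsum) / sqrt(n.points).
V <- matrix(rnorm(n.points * max(k)), nrow = n.points)

# Projected data: an n x max(k) matrix of (discretized) inner products.
# Using the first k[1] columns for k[1] and all columns for k[2] gives
# chained subsets of projections.
Z <- crossprod(x, V) / n.points

# One-way ANOVA F statistic and its standard p-value for projected values z.
f_stat <- function(z, g) anova(lm(z ~ factor(g)))[["F value"]][1]
f_pvalue <- function(z, g) anova(lm(z ~ factor(g)))[["Pr(>F)"]][1]

# Permutation p-value: permute the group labels B times.
perm_pvalue <- function(z, g, B = 200) {
  obs <- f_stat(z, g)
  perm <- replicate(B, f_stat(z, sample(g)))
  mean(perm >= obs)
}

p.standard <- apply(Z, 2, f_pvalue, g = group.label)
p.perm <- apply(Z[, seq_len(k[1]), drop = FALSE], 2, perm_pvalue,
                g = group.label)
```

Here `p.standard` holds one standard p-value per projection and `p.perm` the permutation p-values for the first `k[1]` projections.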

A list with class "fanovatests" containing the following components (|k| denotes the length of vector k):

`FP`	a list containing the value of the test statistic `statFP`, the p-value `pvalueFP` and the parameters used for the FP test. The chosen optimal length of the basis expansion `K` is also given there.
`CH`	a list containing the value of the test statistic `statCH`, the p-value `pvalueCH` and the parameter `paramCH` used for the CH test.
`CS`	a list containing the value of the test statistic `statCS`, the p-value `pvalueCS` and the parameter `paramCS` used for the CS test.
`L2N`	a list containing the value of the test statistic `statL2`, the p-value `pvalueL2N` and the values of the estimators `betaL2N` and `dL2N` used in the approximation of the null distribution of the test statistic for the L^2N test.
`L2B`	a list containing the value of the test statistic `statL2`, the p-value `pvalueL2B` and the values of the estimators `betaL2B` and `dL2B` used in the approximation of the null distribution of the test statistic for the L^2B test.
`L2b`	a list containing the value of the test statistic `statL2`, the p-value `pvalueL2b` and the parameter `paramL2b` used for the L^2b test.
`FN`	a list containing the value of the test statistic `statF`, the p-value `pvalueFN` and the values of the estimators `d1FN` and `d2FN` used in the approximation of the null distribution of the test statistic for the FN test.
`FB`	a list containing the value of the test statistic `statF`, the p-value `pvalueFB` and the values of the estimators `d1FB` and `d2FB` used in the approximation of the null distribution of the test statistic for the FB test.
`Fb`	a list containing the value of the test statistic `statF`, the p-value `pvalueFb` and the parameter `paramFb` used for the Fb test.
`GPF`	a list containing the value of the test statistic `statGPF`, the p-value `pvalueGPF` and the values of the estimators `betaGPF` and `dGPF` used in the approximation of the null distribution of the test statistic for the GPF test.
`Fmaxb`	a list containing the value of the test statistic `statFmax`, the p-value `pvalueFmaxb` and the parameter `paramFmaxb` used for the Fmaxb test.
`TRP`	a list containing the following elements: vectors `pvalues.anova`, `pvalues.ATS` and `pvalues.WTPS` of length |`k`| containing the p-values of the tests based on random projections for the numbers of projections given in `k`; if `independent.projection.tests = TRUE`, a list `data.projections` of length |`k`|, whose ith element is an n\times `k[i]` matrix with columns being projections of the data; if `independent.projection.tests = FALSE`, an n\times \max(`k`) matrix `data.projections` with columns being projections of the data; and the parameters used for the tests based on random projections.

The list also contains the values of the other parameters used: data = x, group.label, etc.
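To give an idea of what the GPF and Fmax statistics summarize, the sketch below computes the pointwise one-way ANOVA F statistic at every design time point in base R, then averages it over the grid (the idea behind globalizing the pointwise F-test) and takes its maximum (the idea behind the Fmax test). This is an illustration on simulated data with a hypothetical helper `pointwise_F`; the values reported by the package in `statGPF` and `statFmax` may be normalized differently.

```r
# Simulated data (assumed): 3 groups of 13 curves on 20 design time points.
set.seed(1)
n.points <- 20
group.label <- rep(1:3, each = 13)
tt <- seq(0, 1, length.out = n.points)
x <- sapply(group.label, function(g) {
  sin(2 * pi * tt) + 0.2 * g * tt + rnorm(n.points, sd = 0.3)
})

# Hypothetical helper: one-way ANOVA F statistic at each design time point.
pointwise_F <- function(x, group.label) {
  g <- factor(group.label)
  l <- nlevels(g)
  n <- ncol(x)
  grand <- rowMeans(x)                       # overall mean function
  ssr <- 0                                   # between-group sum of squares
  sse <- 0                                   # within-group sum of squares
  for (lev in levels(g)) {
    xi <- x[, g == lev, drop = FALSE]
    mi <- rowMeans(xi)                       # group mean function
    ssr <- ssr + ncol(xi) * (mi - grand)^2
    sse <- sse + rowSums((xi - mi)^2)
  }
  (ssr / (l - 1)) / (sse / (n - l))
}

F.t <- pointwise_F(x, group.label)
gpf.like  <- mean(F.t)   # pointwise F averaged over the grid
fmax.like <- max(F.t)    # maximum of the pointwise F over the grid
```

At any single time point, `F.t` agrees with the F value from `anova(lm(...))` applied to that row of `x`, which is a convenient check of the helper.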

Tomasz Gorecki, Lukasz Smaga

Brunner E, Dette H, Munk A (1997). Box-Type Approximations in Nonparametric Factorial Designs. Journal of the American Statistical Association 92, 1494-1502.

Cuesta-Albertos JA, Febrero-Bande M (2010). A Simple Multiway ANOVA for Functional Data. Test 19, 537-557.

Cuevas A, Febrero M, Fraiman R (2004). An Anova Test for Functional Data. Computational Statistics & Data Analysis 47, 111-122.

Faraway J (1997). Regression Analysis for a Functional Response. Technometrics 39, 254-261.

Gorecki T, Smaga L (2015). A Comparison of Tests for the One-Way ANOVA Problem for Functional Data. Computational Statistics 30, 987-1010.

Gorecki T, Smaga L (2017). Multivariate Analysis of Variance for Functional Data. Journal of Applied Statistics 44, 2172-2189.

Pauly M, Brunner E, Konietschke F (2015). Asymptotic Permutation Tests in General Factorial Designs. Journal of the Royal Statistical Society Series B 77, 461-473.

Shen Q, Faraway J (2004). An F Test for Linear Models with Functional Responses. Statistica Sinica 14, 1239-1257.

Zhang JT (2011). Statistical Inferences for Linear Models with Functional Responses. Statistica Sinica 21, 1431-1451.

Zhang JT (2013). Analysis of Variance for Functional Data. Chapman & Hall, London.

Zhang JT, Chen JW (2007). Statistical Inferences for Functional Data. The Annals of Statistics 35, 1052-1079.

Zhang JT, Cheng MY, Wu HT, Zhou B (2018). A New Test for Functional One-way ANOVA with Applications to Ischemic Heart Screening. Computational Statistics & Data Analysis. https://doi.org/10.1016/j.csda.2018.05.004

Zhang JT, Liang X (2014). One-Way ANOVA for Functional Data via Globalizing the Pointwise F-Test. Scandinavian Journal of Statistics 41, 51-71.

fmanova.ptbfr, fmanova.trp, plotFANOVA, plot.fanovatests

# Some of the examples may take some time to run.

# gait data (the first feature)
library(fda)
gait.data.frame <- as.data.frame(gait)
x.gait <- as.matrix(gait.data.frame[, 1:39])

# vector of group labels
group.label.gait <- rep(1:3, each = 13)

# all FANOVA tests with default parameters
set.seed(123)
(fanova1 <- fanova.tests(x = x.gait, group.label = group.label.gait))
summary(fanova1)
# data projections generated in the test based on random projections
fanova1$TRP$data.projections

# only three tests with non-default parameters
set.seed(123)
fanova2 <- fanova.tests(x.gait, group.label.gait,
                        test = c("FP", "GPF", "Fmaxb"),
                        params = list(paramFP = list(int = c(0.025, 0.975),
                                                     B.FP = 1000, basis = "b-spline",
                                                     criterion = "eBIC",
                                                     commonK = "mean",
                                                     minK = 5, maxK = 20,
                                                     norder = 4, gamma.eBIC = 0.7),
                                      paramFmaxb = 1000))
summary(fanova2)

# the FP test with predefined basis function representation
library(fda)
fbasis <- create.bspline.basis(rangeval = c(0.025, 0.975), 19, norder = 4)
own.basis <- Data2fd(seq(0.025, 0.975, length.out = 20), x.gait, fbasis)$coefs
own.cross.prod.mat <- inprod(fbasis, fbasis)
set.seed(123)
fanova3 <- fanova.tests(group.label = group.label.gait, test = "FP",
                        params = list(paramFP = list(B.FP = 1000, basis = "own",
                                                     own.basis = own.basis,
                                                     own.cross.prod.mat = own.cross.prod.mat)))
summary(fanova3)

# the tests based on random projections with the Gaussian white noise generated for projections
set.seed(123)
fanova4 <- fanova.tests(x.gait, group.label.gait, test = "TRP",
                        parallel = TRUE, nslaves = 2,
                        params = list(paramTRP = list(k = c(10, 20, 30), B.TRP = 1000)))
summary(fanova4)
set.seed(123)
fanova5 <- fanova.tests(x.gait, group.label.gait, test = "TRP",
                        parallel = TRUE, nslaves = 2,
                        params = list(paramTRP = list(k = c(10, 20, 30),
                                                      permutation = TRUE, B.TRP = 1000)))
summary(fanova5)

# the tests based on random projections with the Brownian motion generated for projections
set.seed(123)
fanova6 <- fanova.tests(x.gait, group.label.gait, test = "TRP",
                        parallel = TRUE, nslaves = 2,
                        params = list(paramTRP = list(k = c(10, 20, 30), projection = "BM",
                                                      B.TRP = 1000)))
summary(fanova6)
set.seed(123)
fanova7 <- fanova.tests(x.gait, group.label.gait, test = "TRP",
                        parallel = TRUE, nslaves = 2,
                        params = list(paramTRP = list(k = c(10, 20, 30), projection = "BM",
                                                      permutation = TRUE, B.TRP = 1000)))
summary(fanova7)