run_drt | R Documentation |
Performs two (or more) sample doubly ranked tests on pre-processed functional data, formatted as either a matrix (for functions) or an array (for surfaces).
run_drt(X, G, method = c("suff.rank", "avg.rank"), data.names = NULL)
## Default S3 method:
run_drt(X, G, method = c("suff.rank", "avg.rank"), data.names = NULL)
## S3 method for class 'formula'
run_drt(formula, ...)
X |
an n by T matrix or an S by T by n array containing the functions (or surfaces) to analyze. |
G |
a vector of length n containing the grouping variable. |
method |
statistic for summarizing the ranks: 'suff.rank' for sufficient statistic (the default) or 'avg.rank' for arithmetic average. |
data.names |
a vector of length two containing names that describe |
formula |
a formula of the form |
... |
additional arguments to pass to |
Doubly ranked tests are non-parametric tests that first rank functions (or surfaces) by time (or location). Next, the procedure summarizes the observed ranks of each function using a statistic. The summarized ranks are then analyzed using either a Wilcoxon rank sum test or a Kruskal-Wallis test. To perform a doubly ranked test, realizations of functions must be stored in an n by T matrix, where n is the total number of observed functions and T is the number of realizations per function (commonly time points or locations). Surface data in an S by T by n array can be analyzed as well, although this feature has currently undergone only limited testing.
By default, run_drt() uses a sufficient statistic to summarize the ranks of each observed function across T, i.e., the argument method defaults to method = 'suff.rank'. This statistic has the form
t(z) = \frac{1}{T}\sum_{t=1}^T\log\left[ \left(\frac{z_t}{n}- \frac{1}{2n}\right)\bigg/\left(1-\frac{z_t}{n} + \frac{1}{2n}\right) \right],
where z_t is the observed rank at time t. See Meyer (2024) for additional details. The average rank may also be used by setting method = 'avg.rank', although this summary has not undergone testing in the doubly ranked context.
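As a minimal sketch of the formula above, the sufficient statistic for a single function can be computed in base R. The helper name suff_rank is hypothetical and not part of the package; it only illustrates the arithmetic, and the package's internal implementation may differ.

```r
# Hypothetical helper (not a package function): sufficient-statistic summary
# of one function's ranks z_1, ..., z_T among n observed functions.
suff_rank <- function(z, n) {
  p <- z / n - 1 / (2 * n)   # rescaled rank, strictly between 0 and 1
  mean(log(p / (1 - p)))     # average log-odds over the T time points
}

z <- c(1, 2, 3)  # ranks of one function at T = 3 time points
n <- 4           # total number of observed functions

t_z <- suff_rank(z, n)

# The same quantity written with the logit function from the stats package
t_z_logit <- mean(qlogis((z - 0.5) / n))
```

Here the rescaled ranks are 0.125, 0.375, and 0.625, so the two middle log-odds terms cancel and t(z) reduces to log(1/7)/3. Writing the summary as a mean of logits also makes clear that the 1/(2n) shift keeps every term finite, even for the largest rank z_t = n.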
Regardless of the statistic used, the summarized ranks are then analyzed using either wilcox.test() or kruskal.test(), depending on the number of groups in G.
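The three steps described above (rank by column, summarize by row, test the summaries) can be sketched with base R on simulated data. This is an illustration under stated assumptions rather than the package's code: the data are simulated, the object names are made up for the example, and ties are handled with the default of rank() (average ranks), which may not match what run_drt() does internally.

```r
set.seed(1)

n     <- 20                                   # number of observed functions
n_pts <- 50                                   # realizations per function (T)
G     <- factor(rep(c("a", "b"), each = n / 2))

# Simulated curves (assumption): white noise, with group "b" shifted up by 1
X <- matrix(rnorm(n * n_pts), nrow = n) + (G == "b")

# Step 1: rank the n functions within each column (time point)
Z <- apply(X, 2, rank)                        # n by T matrix of ranks

# Step 2: summarize each row of ranks with the sufficient statistic
s <- rowMeans(qlogis((Z - 0.5) / n))

# Step 3: test the summaries; two groups -> Wilcoxon, more -> Kruskal-Wallis
out <- if (nlevels(G) == 2) wilcox.test(s ~ G) else kruskal.test(s ~ G)
out$p.value
```

Replacing Step 2 with rowMeans(Z) gives the arithmetic average of the ranks, which corresponds to the summary described for method = 'avg.rank'.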
For functional data, Meyer (2024) suggests using refund::fpca.face() to pre-process the data, but X can be pre-processed using any functional data approach, or it can simply be the raw data. run_drt() itself performs no pre-processing and takes X as supplied.
A list with class "htest" containing the following components:
statistic | the value of the test statistic with a name describing it. |
parameter | the parameter(s) for the exact distribution of the test statistic. |
p.value | the p-value for the test. |
null.value | the location parameter. |
alternative | a character string describing the alternative hypothesis. |
data.name | a character string giving the names of the data. |
test_details | the output from the internally run Wilcoxon rank sum or Kruskal-Wallis test. |
method | character string giving the type of doubly ranked test performed. |
ranks | a list containing the ranks by column (if X is a matrix) and the summarized ranks. |
data | a list containing X and G. |
Meyer, MJ (2024). Doubly ranked tests for grouped functional data. Available on arXiv at https://arxiv.org/abs/2306.14761.
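#### Two Sample Problem: Resin Viscosity ####
library(FDboost)
data("viscosity")
# Store the viscosity curves as a plain n by T matrix (one curve per row)
Xv <- matrix(viscosity$visAll, nrow = nrow(viscosity$visAll), ncol = ncol(viscosity$visAll))
# Pre-process with FPCA via refund::fpca.face() and keep the fitted curves
fXv <- refund::fpca.face(Xv)
Yvis <- fXv$Yhat
# Grouping variable
TR <- viscosity$T_A
# Doubly ranked test using the formula interface
run_drt(Yvis ~ TR)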
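#### Four Sample Problem: Canadian Weather ####
# Grouping variable: region of each weather station
R <- fda::CanadianWeather$region
# Transpose so that each row is one station's daily temperature curve
XT <- t(fda::CanadianWeather$dailyAv[,,'Temperature.C'])
# Pre-process with FPCA via refund::fpca.face() and keep the fitted curves
fXT <- refund::fpca.face(XT)
YT <- fXT$Yhat
# Doubly ranked test using the formula interface
run_drt(YT ~ R)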