forder: Functional ordering

Description

Calculates different measures for ordering the functions (or vectors) from the most extreme to the least extreme one.

Usage

forder(
  curve_sets,
  measure = "erl",
  scaling = "qdir",
  alternative = c("two.sided", "less", "greater"),
  use_theo = TRUE,
  probs = c(0.025, 0.975),
  quantile.type = 7
)

Arguments

curve_sets

A curve_set object or a list of curve_set objects. Envelope objects of spatstat and fdata objects of fda.usc are also accepted in place of curve_set objects.

measure

The measure to use to order the functions from the most extreme to the least extreme one. Must be one of the following: 'rank', 'erl', 'cont', 'area', 'max', 'int', 'int2'. Default is 'erl'.

scaling

The name of the scaling to use if measure is 'max', 'int' or 'int2'. Options include 'none', 'q', 'qdir' and 'st', where 'qdir' is the default.

alternative

A character string specifying the alternative hypothesis. Must be one of the following: "two.sided" (default), "less" or "greater". The last two options are available only for the measures 'rank', 'erl', 'cont' and 'area'.

use_theo

Logical. For the measures 'max', 'int' and 'int2', should the theoretical function of the curve_set be used if its component 'theo' is provided? See deviation_test.

probs

A two-element vector containing the lower and upper quantiles for the scaling 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllymäki et al. (2015, 2017).

quantile.type

Same as the type argument of quantile: the algorithm used to calculate the quantiles for the scalings 'q' and 'qdir'.
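
How probs and quantile.type enter the 'q' and 'qdir' scalings can be sketched with base R's quantile. The matrix sims below is made-up data, not output of forder, and the sketch shows only how pointwise lower and upper quantiles are obtained, not the scaling itself.

```r
# Hypothetical data: 5 argument values (rows) x 200 curves (columns).
set.seed(1)
sims <- matrix(rnorm(5 * 200), nrow = 5)

probs <- c(0.025, 0.975)   # the default of the probs argument
quantile.type <- 7         # the default of the quantile.type argument

# Pointwise quantiles: one lower and one upper value per argument value.
# apply() returns a 2 x 5 matrix (rows: lower, upper; columns: argument values).
q <- apply(sims, 1, quantile, probs = probs, type = quantile.type)
lower <- q[1, ]
upper <- q[2, ]
```

Other values of quantile.type select the other sample quantile definitions described in ?quantile.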

Details

Given a curve_set object or an envelope object of spatstat, which contains curves T_1(r),\dots,T_s(r), the functions are ordered from the most extreme one to the least extreme one by one of the following measures (specified by the argument measure). Note that 'erl', 'cont' and 'area' were proposed as a refinement to the extreme ranks 'rank', because the extreme ranks can contain many ties. All of these completely non-parametric measures are smallest for the most extreme functions and largest for the least extreme ones, whereas the deviation measures ('max', 'int' and 'int2') obtain largest values for the most extreme functions.

  • 'rank': extreme rank (Myllymäki et al., 2017). The extreme rank R_i is defined as the minimum of the pointwise ranks of the curve T_i(r), where the pointwise rank is the rank of the value of the curve for a specific r-value among the corresponding values of the other curves, such that the lowest ranks correspond to the most extreme values of the curves. How exactly the pointwise ranks are determined depends on whether a one-sided test (alternative is "less" or "greater") or the two-sided test (alternative="two.sided") is chosen.

  • 'erl': extreme rank length (Myllymäki et al., 2017). Considering the vector of pointwise ordered ranks \mathbf{R}_i of the ith curve, the extreme rank length measure R_i^{erl} is equal to

    R_i^{erl} = \frac{1}{s}\sum_{j=1}^{s} \mathbf{1}(\mathbf{R}_j "<" \mathbf{R}_i)

    where \mathbf{R}_j "<" \mathbf{R}_i if and only if there exists n\leq d such that the first n-1 pointwise ordered ranks of \mathbf{R}_j and \mathbf{R}_i are equal and the n'th rank of \mathbf{R}_j is smaller than that of \mathbf{R}_i. Here d denotes the length of the rank vectors. The scaling by s is applied to normalize the ranks, following Mrkvička et al. (2019) and Narisetty and Nair (2016).

  • 'cont': continuous rank (Hahn, 2015; Mrkvička et al., 2019), based on the minimum of the continuous pointwise ranks.

  • 'area': area rank (Mrkvička et al., 2019), based on the area between the continuous pointwise ranks and the minimum pointwise ranks for those argument (r) values for which the pointwise ranks achieve the minimum (a combination of 'erl' and 'cont').

  • 'max', 'int' and 'int2': further options for the measure argument that can be used together with scaling. See the help of deviation_test for these options of measure and scaling. These measures are largest for the most extreme functions and smallest for the least extreme ones. The arguments use_theo and probs are relevant for these measures only (otherwise they are ignored).

For details, see Myllymäki and Mrkvička (2020, Section 2).
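
The definitions of 'rank' and 'erl' above can be traced on toy data with base R alone. This is a simplified sketch, not the implementation used by forder: the data are made up, ties in the curve values are assumed absent, and the ERL is computed by applying the displayed formula literally with a strict comparison, so the values returned by forder may follow a different tie or normalisation convention. The helper lex_less is a hypothetical name introduced here.

```r
# Hypothetical toy data: s = 6 curves (columns) at d = 4 argument values (rows).
set.seed(42)
curves <- matrix(rnorm(4 * 6), nrow = 4, ncol = 6)
s <- ncol(curves)

# Two-sided pointwise ranks: rank from below and from above and keep the
# smaller one, so that low ranks mark extreme values.
lo <- t(apply(curves, 1, rank))   # d x s
hi <- s + 1 - lo
pw <- pmin(lo, hi)

# 'rank': the extreme rank is the minimum pointwise rank of each curve.
extreme_rank <- apply(pw, 2, min)

# 'erl': sort each curve's pointwise ranks increasingly, then compare the
# sorted vectors lexicographically.
sorted <- apply(pw, 2, sort)      # d x s, each column sorted

# TRUE if a is lexicographically smaller than b (hypothetical helper).
lex_less <- function(a, b) {
  k <- which(a != b)
  length(k) > 0 && a[k[1]] < b[k[1]]
}

# Share of curves whose sorted rank vector is strictly smaller than curve i's.
erl <- sapply(seq_len(s), function(i) {
  sum(sapply(seq_len(s), function(j) lex_less(sorted[, j], sorted[, i]))) / s
})
```

Because the first element of each sorted vector is the extreme rank, a curve with a smaller extreme rank always receives a smaller ERL value; the ERL only breaks the ties that the extreme rank leaves.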

Value

A vector containing one of the above-mentioned measures k for each of the functions in the curve set. If the component obs in the curve set is a vector, then its measure is the first component (named 'obs') of the returned vector.
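
Since the direction of extremeness differs between the two groups of measures, the returned vector has to be sorted accordingly. A minimal sketch, using made-up values rather than real forder output (the names sim1 to sim4 are hypothetical; only 'obs' is described above):

```r
# Hypothetical measure values for five curves.
m_rank_based <- c(obs = 0.20, sim1 = 0.80, sim2 = 0.40, sim3 = 1.00, sim4 = 0.60)
m_deviation  <- c(obs = 3.1,  sim1 = 0.4,  sim2 = 1.7,  sim3 = 0.2,  sim4 = 0.9)

# 'rank', 'erl', 'cont', 'area': the smallest value is the most extreme.
by_rank <- names(m_rank_based)[order(m_rank_based)]

# 'max', 'int', 'int2': the largest value is the most extreme.
by_deviation <- names(m_deviation)[order(m_deviation, decreasing = TRUE)]
```

In both cases the first element of the result is the most extreme curve.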

References

Hahn, U. (2015). A note on simultaneous Monte Carlo tests. Technical report, Centre for Stochastic Geometry and advanced Bioimaging, Aarhus University.

Mrkvička, T., Myllymäki, M., Jilek, M. and Hahn, U. (2020). A one-way ANOVA test for functional data with graphical interpretation. Kybernetika 56(3), 432-458. doi: 10.14736/kyb-2020-3-0432

Mrkvička, T., Myllymäki, M., Kuronen, M. and Narisetty, N. N. (2022). New methods for multiple testing in permutation inference for the general linear model. Statistics in Medicine 41(2), 276-297. doi: 10.1002/sim.9236

Myllymäki, M., Grabarnik, P., Seijo, H. and Stoyan, D. (2015). Deviation test construction and power comparison for marked spatial point patterns. Spatial Statistics 11, 19-34. doi: 10.1016/j.spasta.2014.11.004

Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79, 381-404. doi: 10.1111/rssb.12172

Narisetty, N. N. and Nair, V. N. (2016). Extremal depth for functional data and applications. Journal of the American Statistical Association 111, 1705-1714.

See Also

partial_forder

Examples

if(requireNamespace("fda", quietly = TRUE)) {
  # Consider ordering of the girls in the Berkeley Growth Study data
  # available from the R package fda, see ?growth, according to their
  # annual heights or/and changes within years.
  # First create sets of curves (vectors), for raw heights and
  # for the differences within the years
  years <- paste(1:18)
  curves <- fda::growth[['hgtf']][years,]
  cset1 <- curve_set(r = as.numeric(years),
                     obs = curves)
  cset2 <- curve_set(r = as.numeric(years[-1]),
                     obs = curves[-1,] - curves[-nrow(curves),])

  # Order the girls from the most extreme one to the least extreme one,
  # below using the 'area' measure
  # a) according to their heights
  forder(cset1, measure = 'area')
  # Print the 10 most extreme girl indices
  order(forder(cset1, measure = 'area'))[1:10]
  # b) according to the changes (print indices)
  order(forder(cset2, measure = 'area'))[1:10]
  # c) simultaneously with respect to heights and changes (print indices)
  csets <- list(Height = cset1, Change = cset2)
  order(forder(csets, measure = 'area'))[1:10]
}

myllym/GET documentation built on May 5, 2024, 2:16 a.m.