# excessmass: Excess mass (in multimode: Mode Testing and Exploring)

## Description

This function computes the excess mass statistic.

## Usage

```r
excessmass(data, mod0 = 1, approximate = FALSE, gridsize = NULL, full.result = FALSE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `data` | Sample for computing the excess mass. |
| `mod0` | Number of modes for which the excess mass is calculated. Default `mod0=1`. |
| `approximate` | If this argument is TRUE then the excess mass value is approximated. Default `approximate=FALSE`. |
| `gridsize` | When `approximate=TRUE`, number of endpoints at which the C_m(λ) sets are estimated (first element) and number of possible values of λ (second element). Default is `gridsize=c(20,20)`. |
| `full.result` | If this argument is TRUE then the full result list is returned, see below. Default `full.result=FALSE`. |

## Details

With `excessmass`, the excess mass test statistic, introduced by Müller and Sawitzki (1991), for the integer number of modes specified in `mod0` is computed.

The excess mass test statistic for k modes is defined as \max_{λ} \{D_{n,k+1}(λ)\}, where D_{n,k+1}(λ) = E_{n,k+1}(P_n,λ) - E_{n,k}(P_n,λ). The empirical excess mass function for k modes is defined as E_{n,k}(P_n,λ) = \sup_{C_1(λ),…,C_k(λ)} \{∑_{m=1}^k [P_n(C_m(λ)) - λ ||C_m(λ)||]\}, where P_n is the empirical distribution of the sample, ||C_m(λ)|| is the length of C_m(λ), and the sets C_m(λ) are closed intervals whose endpoints are data points.
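To make the definition concrete, the following is a minimal brute-force sketch of the empirical excess mass function for a single mode, E_{n,1}(P_n,λ), obtained by checking every closed interval with data points as endpoints. The function name `excess_mass_one_mode` is hypothetical and is not part of the package; the sketch is not the algorithm used by `excessmass`, and it does not compute the test statistic itself, which also requires E_{n,k+1}.

```r
# Hypothetical helper (not part of multimode): brute-force E_{n,1}(P_n, lambda).
# For the sorted sample x_(1) <= ... <= x_(n), the interval [x_(i), x_(j)]
# has empirical mass (j - i + 1) / n and length x_(j) - x_(i).
excess_mass_one_mode <- function(x, lambda) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  best <- -Inf
  for (i in seq_len(n)) {
    j <- i:n
    # Excess mass of every interval starting at x_(i)
    vals <- (j - i + 1) / n - lambda * (x[j] - x[i])
    best <- max(best, vals)
  }
  best
}

# Usage on a small illustrative sample
excess_mass_one_mode(c(0, 1, 2, 10), lambda = 0.1)
```

The double loop over endpoints costs O(n^2) evaluations, so this sketch is only meant for small samples.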

When `mod0>1` and the sample size is large, a two-step approximation (`approximate=TRUE`) can be used to reduce the computing time. First, since the candidate values of λ for maximizing D_{n,k+1}(λ) can be obtained directly from the sets that maximize E_{n,k+1} and E_{n,k} (see Section SM5 of the Supplementary Material in Ameijeiras-Alonso et al., 2019), the candidate values of λ are computed by evaluating the empirical excess mass function at `gridsize[1]` candidate endpoints for C_m(λ), together with the λ values associated with the empirical excess mass for one mode. Second, once a λ maximizing the approximated values of D_{n,k+1}(λ) has been chosen, a grid of `gridsize[2]` values of λ is created in its neighborhood, and the exact value of D_{n,k+1}(λ) is calculated at these values (using the algorithm proposed by Müller and Sawitzki, 1991), which yields the approximation of the excess mass test statistic.
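As an illustration of the approximation described above, the following sketch calls `excessmass` with `approximate=TRUE` for two modes. It assumes the multimode package is installed; the simulated bimodal sample and the object name `em_approx` are chosen only for this example.

```r
# Assumes multimode is installed, e.g. via install.packages("multimode")
library(multimode)

set.seed(2016)
# Illustrative bimodal sample (an assumption, not taken from the package docs)
data <- c(rnorm(500, mean = -2), rnorm(500, mean = 2))

# Approximated excess mass statistic for two modes:
# gridsize[1] = 20 candidate endpoints, gridsize[2] = 20 values of lambda
em_approx <- excessmass(data, mod0 = 2, approximate = TRUE, gridsize = c(20, 20))
em_approx
```

Larger `gridsize` values should give a finer approximation at the cost of more computing time.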

If the sample contains repeated values, or if the distances between different pairs of data points show ties, a data perturbation is applied. This modification avoids the discretization of the data, which has important effects on the computation of the test statistic. The perturbed sample is obtained by adding, to the original data, a sample from the uniform distribution on the interval from minus to plus one half of the minimum positive distance between two sample points.
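The perturbation can be sketched as follows. This is an illustration of the description above under stated assumptions, not the package's internal code: the helper name `perturb_sample` is hypothetical, and the sketch always perturbs rather than first checking for ties.

```r
# Hypothetical helper (not part of multimode): add uniform noise on
# (-d/2, d/2), where d is the minimum positive distance between two points.
perturb_sample <- function(x) {
  x <- x[!is.na(x)]
  gaps <- diff(sort(x))
  gaps <- gaps[gaps > 0]
  if (length(gaps) == 0) stop("all observations are identical")
  # The smallest positive pairwise distance is attained by neighbours
  # in the sorted sample, so consecutive gaps are enough.
  d <- min(gaps)
  x + runif(length(x), min = -d / 2, max = d / 2)
}

# Usage on a sample with a repeated value
set.seed(2016)
perturb_sample(c(1, 1, 2, 3.5))
```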

Missing values (NAs) are removed automatically.

## Value

Depending on `full.result`, either a number (the excess mass statistic for `mod0` modes) or an object of class `"estmod"`, which is a `list` containing the following components:

| Component | Description |
| --- | --- |
| `nmodes` | The specified hypothesized value of the number of modes. |
| `sample.size` | The number of non-missing observations in the sample used for computing the excess mass. |
| `excess.mass` | Value of the excess mass test statistic. |
| `approximate` | A logical value indicating if the excess mass was approximated. |
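A short sketch of how the full result might be inspected, assuming the multimode package is installed; the object name `res` is chosen only for this example, and the components are accessed by the names listed above.

```r
# Assumes multimode is installed, e.g. via install.packages("multimode")
library(multimode)

set.seed(2016)
data <- rnorm(50)

# Request the full "estmod" result instead of the bare statistic
res <- excessmass(data, full.result = TRUE)

res$nmodes        # hypothesized number of modes
res$sample.size   # number of non-missing observations
res$excess.mass   # value of the excess mass test statistic
res$approximate   # whether the statistic was approximated
```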

## Author(s)

Jose Ameijeiras-Alonso, Rosa M. Crujeiras and Alberto Rodríguez-Casal

## References

Ameijeiras-Alonso, J., Crujeiras, R.M. and Rodríguez-Casal, A. (2019). Mode testing, critical bandwidth and excess mass, Test, 28, 900–919.

Ameijeiras-Alonso, J., Crujeiras, R.M. and Rodríguez-Casal, A. (2021). multimode: An R Package for Mode Assessment, Journal of Statistical Software, 97, 1–32.

Müller, D. W. and Sawitzki, G. (1991). Excess mass estimates and tests for multimodality, Journal of the American Statistical Association, 86, 738–746.

## Examples

```r
# Excess mass statistic for one mode
set.seed(2016)
data=rnorm(50)
excessmass(data)
```

### Example output

```
[1] 0.07918527
```

multimode documentation built on March 21, 2021, 1:06 a.m.