multivariate_outliergram: Outliergram for multivariate functional datasets


View source: R/outliergram.R

Description

This function computes the outliergram of a multivariate functional dataset and, optionally, displays it.

Usage

multivariate_outliergram(
  mfData,
  MBD_data = NULL,
  MEI_data = NULL,
  weights = "uniform",
  p_check = 0.05,
  Fvalue = 1.5,
  shift = TRUE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL
)

Arguments

mfData

the multivariate functional dataset whose outliergram has to be determined;

MBD_data

a vector containing the MBD for each element of the dataset. If not provided, MBDs are computed with the specified choice of weights;

MEI_data

a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed with the specified choice of weights;

weights

the weights choice to be used to compute multivariate MBDs and MEIs;

p_check

fraction of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset). Default is 0.05;

Fvalue

the F value to be used in the procedure that finds the shape outliers by looking at the lower parabolic limit in the outliergram. Default is 1.5;

shift

whether to apply the shifting algorithm to properly manage observations having low or high MEI. Default is TRUE;

display

either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed;

xlab

the label to use on the x axis in the outliergram plot;

ylab

the label to use on the y axis in the outliergram plot;

main

the title to use in the outliergram;
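The depth-related arguments can be combined as in the following minimal sketch, which precomputes the multivariate MBD and MEI with non-uniform weights and passes them in. It assumes that roahd's `multiMBD` and `multiMEI` accept an `mfData` object and a numeric `weights` vector with one weight per component; those two functions are not described on this page, so check their help pages before relying on this. The sample sizes and the weights `c(0.7, 0.3)` are arbitrary illustrative choices.

```r
library(roahd)

set.seed(1)

# Small bivariate functional dataset (sizes chosen only to keep the sketch fast)
N = 50
P = 50
grid = seq(0, 1, length.out = P)
C = exp_cov_function(grid, alpha = 0.3, beta = 0.2)
Data = generate_gauss_mfdata(N, L = 2,
                             centerline = matrix(c(sin(2 * pi * grid),
                                                   cos(2 * pi * grid)),
                                                 nrow = 2, byrow = TRUE),
                             listCov = list(C, C), correlations = 0.1)
mfD = mfData(grid, Data)

# Assumption: one weight per component, summing to one
w = c(0.7, 0.3)

# Assumption: multiMBD / multiMEI take an mfData object and a weights vector
mbd = multiMBD(mfD, weights = w)
mei = multiMEI(mfD, weights = w)

# Pass the same weights as well, so that any depths recomputed internally
# (for instance during the shift step) stay consistent with mbd and mei
out = multivariate_outliergram(mfD, MBD_data = mbd, MEI_data = mei,
                               weights = w, display = FALSE)
out$ID_outliers
```

With `display = FALSE` no plot is drawn, which is convenient when only the outlier indices are needed.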

Details

The method extends the univariate outliergram to multivariate functional datasets. Unlike the function for the univariate case, it displays only the outliergram plot.
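To make the role of the parabolic limit and of `Fvalue` more concrete, here is a base-R sketch of the boxplot-type rule used by the univariate outliergram in the literature. It assumes that the multivariate version applies the same rule to the multivariate MBD and MEI values; the package internals may differ, and the shift step is not reproduced. The helper `flag_shape_outliers` and the toy values are hypothetical and not part of roahd.

```r
# Hypothetical helper, not part of roahd: flags observations lying far
# below the parabola that bounds MBD as a function of MEI.
flag_shape_outliers = function(mbd, mei, Fvalue = 1.5) {
  N = length(mbd)
  # Parabola coefficients (assumed, from the univariate outliergram)
  a0 = -2 / (N * (N - 1))
  a1 = 2 * (N + 1) / (N - 1)
  # Vertical distance of each point below the parabola
  d = a0 + a1 * mei + a0 * N^2 * mei^2 - mbd
  # Boxplot-type fence: third quartile plus Fvalue times the IQR
  q = quantile(d, c(0.25, 0.75), names = FALSE)
  which(d >= q[2] + Fvalue * (q[2] - q[1]))
}

# Toy values: ten points slightly below the parabola, one of them much lower
N = 10
mei = seq(0.1, 1, length.out = N)
parabola = -2 / (N * (N - 1)) + 2 * (N + 1) / (N - 1) * mei -
  2 / (N * (N - 1)) * N^2 * mei^2
offsets = c(0.01, 0.02, 0.03, 0.02, 0.3, 0.01, 0.02, 0.03, 0.02, 0.01)
mbd = parabola - offsets

flag_shape_outliers(mbd, mei, Fvalue = 1.5)  # flags observation 5
```

A larger `Fvalue` raises the fence, so fewer observations are flagged; with these toy values, `Fvalue = 30` flags none.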

Adjustment

Unlike the univariate case, this function does not tune the F parameter automatically, because the tuning procedure would be too computationally expensive for general datasets. To find a good value of F, run the outliergram with several candidate values and select the best one manually.
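The manual search described above could be organised as in the following sketch, which runs the outliergram for a few candidate values of F and counts the flagged observations. The candidate grid and the dataset sizes are arbitrary illustrative choices, and the count of outliers is only one possible criterion; inspecting the plots for each candidate is equally reasonable.

```r
library(roahd)

set.seed(1)

# Small bivariate functional dataset (sizes chosen only to keep the sketch fast)
N = 50
P = 50
grid = seq(0, 1, length.out = P)
C = exp_cov_function(grid, alpha = 0.3, beta = 0.2)
Data = generate_gauss_mfdata(N, L = 2,
                             centerline = matrix(c(sin(2 * pi * grid),
                                                   cos(2 * pi * grid)),
                                                 nrow = 2, byrow = TRUE),
                             listCov = list(C, C), correlations = 0.1)
mfD = mfData(grid, Data)

# Candidate F values (illustrative choice)
F_candidates = c(1.5, 2, 2.5, 3)

# Run the outliergram without plotting and count the flagged observations
n_outliers = sapply(F_candidates, function(Fv) {
  out = multivariate_outliergram(mfD, Fvalue = Fv, shift = TRUE,
                                 display = FALSE)
  length(out$ID_outliers)
})

data.frame(Fvalue = F_candidates, n_outliers = n_outliers)
```

Once a value has been chosen, the function can be called again with `display = TRUE` to inspect the corresponding outliergram.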

References

Ieva, F. & Paganoni, A.M. Stat Papers (2017). https://doi.org/10.1007/s00362-017-0953-1.

See Also

outliergram, mfData, MBD, MEI

Examples

library(roahd)

# Number of observations and number of grid points
N = 2e2
P = 1e2

t0 = 0
t1 = 1

set.seed(1)

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.2)

# Simulating the measurements of a bivariate functional dataset with
# required centerlines and covariance functions
Data = generate_gauss_mfdata( N, L = 2,
                              centerline = matrix(c(sin(2 * pi * grid),
                                                    cos(2 * pi * grid)), nrow=2, byrow=TRUE),
                              listCov = list(C, C), correlations = 0.1 )

# Building the mfData object
mfD = mfData( grid, Data )

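
# Computing and displaying the outliergram on a new graphical device
dev.new()
out = multivariate_outliergram(mfD, Fvalue = 2., shift=TRUE)

# Building colors: translucent hues for regular curves, vivid hues for outliers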
col_non_outlying = scales::hue_pal( h = c( 180, 270 ),
                                    l = 60 )( N - length( out$ID_outliers ) )
col_non_outlying = set_alpha( col_non_outlying, 0.5 )
col_outlying = scales::hue_pal( h = c( - 90, 180  ),
                                c = 150 )( length( out$ID_outliers ) )
colors = rep('black', N)
colors[out$ID_outliers] = col_outlying
colors[colors == 'black'] = col_non_outlying

lwd = rep(1, N)
lwd[out$ID_outliers] = 2

# Plotting the dataset with outliers highlighted by color and line width
dev.new()
plot(mfD, col=colors, lwd=lwd)

ntarabelloni/roahd documentation built on Feb. 10, 2022, 1:41 a.m.