multivariate_outliergram: Outliergram for multivariate functional datasets
In ntarabelloni/roahd: Robust Analysis of High Dimensional Data


This function builds the outliergram of a multivariate functional dataset.

multivariate_outliergram(
  mfData,
  MBD_data = NULL,
  MEI_data = NULL,
  weights = "uniform",
  p_check = 0.05,
  Fvalue = 1.5,
  shift = TRUE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL
)

`mfData`	the multivariate functional dataset whose outliergram has to be determined;
`MBD_data`	a vector containing the MBD for each element of the dataset. If missing, MBDs are computed with the specified choice of weights;
`MEI_data`	a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed with the specified choice of weights;
`weights`	the weights choice to be used to compute multivariate MBDs and MEIs;
`p_check`	percentage of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset). Default is `0.05`;
`Fvalue`	the F value to be used in the procedure that finds the shape outliers by looking at the lower parabolic limit in the outliergram. Default is `1.5`;
`shift`	whether to apply the shifting algorithm to properly manage observations having low or high MEI. Default is `TRUE`;
`display`	either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed;
`xlab`	the label to use on the x axis in the outliergram plot;
`ylab`	the label to use on the y axis in the outliergram plot;
`main`	the title to use in the outliergram plot.

The method extends the univariate outliergram to multivariate functional datasets. Unlike the function for the univariate case, it displays only the outliergram plot.
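To make the quantities behind the plot concrete, here is a minimal base-R sketch of how a multivariate outliergram can be assembled. It is not the package's implementation. It assumes that the multivariate MBD and MEI are weighted averages of the component-wise values, that `weights = "uniform"` means equal weights, and that no ties occur at any grid point. The helper `mbd_mei_sketch` and the simulated matrices `X1` and `X2` are hypothetical names introduced for illustration only.

```r
# Sketch only: MBD and MEI of one component stored as an n x P matrix
# (rows = curves, columns = grid points). Assumes no ties.
mbd_mei_sketch = function(X) {
  n = nrow(X)
  R = apply(X, 2, rank)                    # pointwise ranks, n x P
  # Number of pairs of curves whose band contains curve i at each point
  n_bands = (R - 1) * (n - R) + (n - 1)
  list(MBD = rowMeans(n_bands) / choose(n, 2),
       # Share of curves lying on or above curve i, averaged over the grid
       MEI = rowMeans(n - R + 1) / n)
}

set.seed(1)
n = 30
P = 50
grid = seq(0, 1, length.out = P)

# Two hypothetical components: shifted, noisy sine and cosine curves
X1 = rep(sin(2 * pi * grid), each = n) + rnorm(n) +
  matrix(rnorm(n * P, sd = 0.2), n, P)
X2 = rep(cos(2 * pi * grid), each = n) + rnorm(n) +
  matrix(rnorm(n * P, sd = 0.2), n, P)

c1 = mbd_mei_sketch(X1)
c2 = mbd_mei_sketch(X2)

# Assumed meaning of uniform weights: 1 / L for each of the L components
w = c(0.5, 0.5)
MBD = w[1] * c1$MBD + w[2] * c2$MBD
MEI = w[1] * c1$MEI + w[2] * c2$MEI

# Parabolic upper limit of the outliergram and distance of each point from it
a0 = -2 / (n * (n - 1))
a1 = 2 * (n + 1) / (n - 1)
a2 = a0
bound = a0 + a1 * MEI + a2 * n^2 * MEI^2
d = bound - MBD

plot(MEI, MBD)
lines(sort(MEI), bound[order(MEI)], lty = 2)
```

Every point lies on or below the dashed parabola. Curves whose distance `d` from it is unusually large are the candidates for shape outliers.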

Unlike the univariate case, this function does not automatically tune the F parameter, since the tuning procedure would be computationally too heavy for general datasets. To find a good value of F, run the outliergram several times with different values and select the best one manually.
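The effect of F can be illustrated without the package. The sketch below assumes the boxplot-style rule of the univariate outliergram, which flags an observation when its distance from the parabolic limit is at least Q3 + F * IQR of all distances. The distances `d` are synthetic and the helper `flag_outliers_sketch` is a hypothetical name, not a package function.

```r
set.seed(1)

# Hypothetical distances from the parabolic limit:
# 97 regular observations plus 3 clearly distant ones (IDs 98, 99, 100)
d = c(abs(rnorm(97, sd = 0.01)), 0.08, 0.10, 0.15)

# Assumed rule: flag distances at or above Q3 + Fvalue * IQR
flag_outliers_sketch = function(d, Fvalue) {
  q = quantile(d, probs = c(0.25, 0.75), names = FALSE)
  which(d >= q[2] + Fvalue * (q[2] - q[1]))
}

# Scanning a few candidate values of F and counting the flagged observations
F_grid = c(1, 1.5, 2, 3)
n_flagged = sapply(F_grid, function(Fv) length(flag_outliers_sketch(d, Fv)))
print(data.frame(Fvalue = F_grid, n_flagged = n_flagged))
```

With a real dataset, the same scan could be run by calling `multivariate_outliergram` with `display = FALSE` for each candidate `Fvalue` and comparing the returned `ID_outliers`. Larger values of F can only shrink the flagged set.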

Ieva, F. & Paganoni, A.M. Stat Papers (2017). https://doi.org/10.1007/s00362-017-0953-1.

outliergram, mfData, MBD, MEI

N = 2e2
P = 1e2

t0 = 0
t1 = 1

set.seed(1)

# Defining the measurement grid
grid = seq( t0, t1, length.out = P )

# Generating an exponential covariance matrix to be used in the simulation of
# the functional datasets (see the related help for details)
C = exp_cov_function( grid, alpha = 0.3, beta = 0.2)

# Simulating the measurements of a bivariate functional dataset (L = 2
# components) with the required centerlines and covariance functions
Data = generate_gauss_mfdata( N, L = 2,
                              centerline = matrix(c(sin(2 * pi * grid),
                                                    cos(2 * pi * grid)), nrow=2, byrow=TRUE),
                              listCov = list(C, C), correlations = 0.1 )

# Building the mfData object
mfD = mfData( grid, Data )


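# Building the outliergram in a new graphical device
dev.new()
out = multivariate_outliergram(mfD, Fvalue = 2., shift=TRUE)

# Semi-transparent colors for the non-outlying curves
col_non_outlying = scales::hue_pal( h = c( 180, 270 ),
                                    l = 60 )( N - length( out$ID_outliers ) )
col_non_outlying = set_alpha( col_non_outlying, 0.5 )

# More saturated colors for the outlying curves
col_outlying = scales::hue_pal( h = c( - 90, 180  ),
                                c = 150 )( length( out$ID_outliers ) )

# Assigning one color per curve: outliers first, then all the others
colors = rep('black', N)
colors[out$ID_outliers] = col_outlying
colors[colors == 'black'] = col_non_outlying

# Drawing the outlying curves with thicker lines
lwd = rep(1, N)
lwd[out$ID_outliers] = 2

# Plotting the dataset with the outliers highlighted
dev.new()
plot(mfD, col=colors, lwd=lwd)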