This function computes the outliergram of a univariate functional dataset, optionally adjusting the true positive rate of the outliers detected under the assumption of Gaussianity.
fData |
the univariate functional dataset whose outliergram has to be determined. |
MBD_data |
a vector containing the MBD for each element of the dataset. If missing, MBDs are computed. |
MEI_data |
a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed. |
p_check |
percentage of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset). |
Fvalue |
the F value to be used in the procedure that finds the shape outliers by looking at the lower parabolic limit in the outliergram. Default is 1.5. |
adjust |
either
|
display |
either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed. |
xlab |
a list of two labels to use on the x axis when displaying the functional dataset and the outliergram |
ylab |
a list of two labels to use on the y axis when displaying the functional dataset and the outliergram; |
main |
a list of two titles to be used on the plot of the functional dataset and the outliergram; |
... |
additional graphical parameters to be used only in the plot of the functional dataset |
Even when it is used only to display the outliergram, the function returns a list containing:
Fvalue: the value of the parameter F used;
d: the vector of values of the parameter d for each observation (distance to the parabolic border of the outliergram);
ID_outliers: the vector of IDs of the observations flagged as outliers.
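To make the quantity d more concrete, the following base-R sketch computes MBD, MEI and the distance to the parabolic border for a small simulated dataset. It assumes the parabola coefficients given in Arribas-Gil and Romo (2014) and a boxplot-style flagging rule (d > Q3 + F * IQR). The helper name `outliergram_distances` and the toy data are hypothetical, and this is not the package's own implementation.

```r
# Illustrative helper (hypothetical name): rows of X are curves,
# columns are grid points.
outliergram_distances <- function(X) {
  n <- nrow(X)
  R <- apply(X, 2, rank)  # rank of each curve at each grid point (n x P)
  # MBD: average fraction of pairs of curves whose band contains the curve
  mbd <- rowMeans((R - 1) * (n - R) + (n - 1)) / choose(n, 2)
  # MEI: average fraction of curves lying on or above the curve
  mei <- rowMeans(n - R + 1) / n
  # Parabolic border: MBD <= a0 + a1 * MEI + a2 * n^2 * MEI^2, with a2 = a0
  a0 <- -2 / (n * (n - 1))
  a1 <- 2 * (n + 1) / (n - 1)
  d <- a0 + a1 * mei + a0 * n^2 * mei^2 - mbd
  list(MBD = mbd, MEI = mei, d = d)
}

set.seed(42)
grid <- seq(0, 1, length.out = 50)
# Toy data: 30 vertically shifted sine curves, which never cross each other
X <- t(sapply(rnorm(30), function(shift) sin(4 * pi * grid) + shift))
# Add one phase-shifted curve (row 31) that crosses many of the others
X <- rbind(X, sin(4 * pi * grid + pi / 2))

res <- outliergram_distances(X)

# Assumed flagging rule: boxplot-style upper fence on d with inflation factor F
Fvalue <- 1.5
q <- quantile(res$d, c(0.25, 0.75), names = FALSE)
flagged <- which(res$d > q[2] + Fvalue * (q[2] - q[1]))
```

Curves that keep roughly the same rank over the whole grid lie close to the parabola (d near zero), while a curve whose rank changes a lot, such as the phase-shifted one, has a large d.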
When the adjustment option is selected, the value of F is optimized for the univariate functional dataset provided with fData. In practice, a synthetic population of size adjust$trial_size, with the same covariance (robustly estimated from the data) and centerline as fData, is simulated without outliers adjust$N_trials times. Each time, an optimized value F_i is computed so that a given proportion (adjust$TPR) of observations is flagged as outliers. The final value of F for the outliergram is the average of F_1, F_2, …, F_{N_{trials}}. In each trial the optimization problem is solved using stats::uniroot (Brent's method).
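The adjustment loop can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's code: the covariance is taken as known (an exponential kernel) rather than robustly estimated from data, the flagging rule is assumed to be d > Q3 + F * IQR, and the helper names and the root-bracketing interval [0, 20] are hypothetical.

```r
set.seed(1)

# Distance of each curve (row of X) to the parabolic border of the
# outliergram, using the coefficients from Arribas-Gil and Romo (2014).
dist_to_parabola <- function(X) {
  n <- nrow(X)
  R <- apply(X, 2, rank)
  mbd <- rowMeans((R - 1) * (n - R) + (n - 1)) / choose(n, 2)
  mei <- rowMeans(n - R + 1) / n
  a0 <- -2 / (n * (n - 1))
  a1 <- 2 * (n + 1) / (n - 1)
  a0 + a1 * mei + a0 * n^2 * mei^2 - mbd
}

# Proportion of observations flagged for a given F (assumed rule).
prop_flagged <- function(d, Fv) {
  q <- quantile(d, c(0.25, 0.75), names = FALSE)
  mean(d > q[2] + Fv * (q[2] - q[1]))
}

# Assumed "known" model standing in for the robust estimates from fData
P <- 50
grid <- seq(0, 1, length.out = P)
centerline <- sin(4 * pi * grid)
Sigma <- 0.2 * exp(-0.8 * abs(outer(grid, grid, "-")))
U <- chol(Sigma)  # upper triangular factor, t(U) %*% U == Sigma

N_trials <- 10
trial_size <- 500
TPR <- 0.01

F_trials <- replicate(N_trials, {
  # Simulate an outlier-free Gaussian population with covariance Sigma
  Z <- matrix(rnorm(trial_size * P), trial_size, P)
  X <- sweep(Z %*% U, 2, centerline, "+")
  d <- dist_to_parabola(X)
  # F_i: value of F at which the flagged proportion crosses TPR
  uniroot(function(Fv) prop_flagged(d, Fv) - TPR,
          lower = 0, upper = 20)$root
})

# Final F: average over the trials
F_adjusted <- mean(F_trials)
```

Because the flagged proportion is a step function of F, `uniroot` here converges to the point where it crosses the target rather than to an exact zero; averaging over the trials smooths out this discreteness.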
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
library(roahd)

set.seed(1618)

N <- 200
P <- 200
N_extra <- 4

grid <- seq(0, 1, length.out = P)

Cov <- exp_cov_function(grid, alpha = 0.2, beta = 0.8)

# Main population: Gaussian functional data around a sine centerline
Data <- generate_gauss_fdata(
  N = N,
  centerline = sin(4 * pi * grid),
  Cov = Cov
)

# Four extra curves with a shifted phase, acting as shape outliers
Data_extra <- array(0, dim = c(N_extra, P))

Data_extra[1, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 2),
  Cov = Cov
)

Data_extra[2, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 2),
  Cov = Cov
)

Data_extra[3, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 3),
  Cov = Cov
)

Data_extra[4, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 3),
  Cov = Cov
)

Data <- rbind(Data, Data_extra)

fD <- fData(grid, Data)

# Outliergram with default Fvalue = 1.5
outliergram(fD, display = TRUE)

# Outliergram with Fvalue enforced to 2.5
outliergram(fD, Fvalue = 2.5, display = TRUE)

# Outliergram with estimated Fvalue to ensure TPR of 1%
outliergram(
  fData = fD,
  adjust = list(
    N_trials = 10,
    trial_size = 5 * nrow(Data),
    TPR = 0.01,
    VERBOSE = FALSE
  ),
  display = TRUE
)