View source: R/sequential_transformations.R
seq_transform | R Documentation |
This method finds and classifies outliers using the sequential transformations proposed in Algorithm 1 of Dai et al. (2020) <doi:10.1016/j.csda.2020.106960>. A sequence of transformations is applied to the functional data; after each transformation, a functional boxplot is applied to the transformed data and the outliers flagged by the functional boxplot are recorded. Several of the transformations described in Dai et al. (2020) are supported, including vertical alignment ("T1(X)(t)"), normalization ("T2(X)(t)"), one order of differencing ("D1(X)(t)" and "D2(X)(t)") and point-wise outlyingness data ("O(X)(t)"). The feature alignment transformation based on warping/curve registration is not yet supported.
seq_transform(
dts,
sequence = c("T0", "T1", "T2"),
depth_method = c("mbd", "tvd", "extremal", "dirout", "linfinity", "bd", "erld", "dq"),
save_data = FALSE,
emp_factor = 1.5,
central_region = 0.5,
erld_type = NULL,
dq_quantiles = NULL,
n_projections = 200L,
seed = NULL
)
dts |
A matrix for univariate functional data (of size |
sequence |
A character vector usually of length between 1 and 6 containing any of the strings:
Examples of sequences of transformations include: |
depth_method |
A character value specifying the depth/outlyingness method to use in the functional boxplot applied after each transformation. The same depth/outlyingness method is used for every transformation in the sequence. The following methods are currently supported:
|
save_data |
A logical. If TRUE, the intermediate transformed data are returned in a list. |
emp_factor |
The empirical factor for functional boxplot. Defaults to 1.5. |
central_region |
A value between 0 and 1 indicating the central region probability for functional_boxplot. Defaults to 0.5. |
erld_type |
If |
dq_quantiles |
If |
n_projections |
An integer indicating the number of random projections to use in computing the point-wise outlyingness if a 3-d array
is specified in |
seed |
The random seed to set when generating the random directions in the computation of the point-wise outlyingness. Defaults to NULL, in which case no seed is set. |
This function implements outlier detection using the sequential transformations
described in Algorithm 1 of Dai et al. (2020) <doi:10.1016/j.csda.2020.106960>.
The transformations are applied consecutively, and the functional boxplot is
applied to the transformed data after each transformation. The following
example sequences (and their meanings), suggested in Dai et al. (2020),
can be passed to the argument sequence.
"T0"
Apply functional boxplot on raw data (no transformation is applied).
c("T0", "T1", "D1")
Apply functional boxplot on raw data, then apply vertical alignment on data followed by applying functional boxplot again. Finally apply one order of differencing on the vertically aligned data and apply functional boxplot again.
c("T0", "T1", "T2")
Apply functional boxplot on raw data, then apply vertical alignment on data followed by applying functional boxplot again. Finally apply normalization using L-2 norm on the vertically aligned data and apply functional boxplot again.
c("T0", "D1", "D2")
Apply functional boxplot on raw data, then apply one order of differencing on the data followed by applying functional boxplot again. Finally apply another order of differencing on the differenced data and apply functional boxplot again.
Note that this sequence of transformations can equivalently be specified as c("T0", "D1", "D1"), c("T0", "D2", "D2"), or c("T0", "D2", "D1"), since "D1" and "D2" do the same thing: each applies one order of lag-1 differencing to the data.
"O"
Find the pointwise outlyingness of the multivariate or univariate functional data and then apply functional boxplot on the resulting univariate functional data of pointwise outlyingness. Care must be taken to specify a one sided ordering function (i.e. "one_sided_right" extreme rank length depth) in the functional boxplot used on the data of point-wise outlyingness. This is because only large values should be considered extreme in the data of the point-wise outlyingness.
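The "T1", "T2" and "D1"/"D2" transformations named in the sequences above can be illustrated with a minimal base R sketch. It assumes that rows are curves and columns are evaluation points, and that the L-2 norm is approximated by a plain sum of squares over the grid. The helper names shift_curves, normalize_curves and difference_curves are hypothetical and are not part of the package; its internal computation may differ.

```r
# Hypothetical helpers (not the package's code). Assumption: x is an
# n x p matrix with one curve per row and one evaluation point per column.

# "T1": vertical alignment, i.e. subtract each curve's own mean level.
shift_curves <- function(x) x - rowMeans(x)

# "T2": normalization by a discrete approximation of each curve's L-2 norm.
normalize_curves <- function(x) x / sqrt(rowSums(x^2))

# "D1"/"D2": one order of lag-1 differencing along the evaluation points.
difference_curves <- function(x) {
  x[, -1, drop = FALSE] - x[, -ncol(x), drop = FALSE]
}

# Two toy curves with the same shape but different levels.
x <- rbind(c(1, 2, 4), c(10, 11, 13))
t1 <- shift_curves(x)                 # both rows coincide after the shift
t2 <- normalize_curves(t1)            # each row now has unit norm
d1 <- difference_curves(x)            # n x (p - 1) matrix of increments
d2 <- difference_curves(d1)           # differencing twice, as in c("T0", "D1", "D1")
```

In this toy example the two curves differ only by a constant level, so they become indistinguishable after the shift or after differencing; a curve flagged on the raw data purely because of its level would no longer stand out after either transformation.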
For multivariate functional data (when a 3-d array is supplied to dts), the sequence of transformations must always begin with "O", so that the multivariate data are replaced with the univariate data of point-wise outlyingness. This is required because the functional_boxplot function only supports univariate functional data.
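To make the reduction from multivariate to univariate data concrete, the following base R sketch computes a projection-based point-wise outlyingness. It assumes a Stahel-Donoho-type outlyingness (largest absolute deviation from the median, scaled by the MAD, over random unit directions) and an array laid out as curves by evaluation points by dimensions. The function pointwise_outlyingness is hypothetical; the package's own computation may differ in its details.

```r
# Hypothetical sketch (not the package's code). Assumption: dts is an
# n x p x d array (n curves, p evaluation points, d dimensions).
pointwise_outlyingness <- function(dts, n_projections = 200L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- dim(dts)[1]
  p <- dim(dts)[2]
  d <- dim(dts)[3]
  # Random unit directions, one per column (d x n_projections).
  u <- matrix(rnorm(d * n_projections), nrow = d)
  u <- sweep(u, 2, sqrt(colSums(u^2)), "/")
  out <- matrix(0, nrow = n, ncol = p)
  for (j in seq_len(p)) {
    xj <- matrix(dts[, j, ], nrow = n)        # n x d values at point j
    proj <- xj %*% u                          # n x n_projections projections
    centre <- apply(proj, 2, median)
    spread <- apply(proj, 2, mad)
    z <- abs(sweep(proj, 2, centre, "-"))
    z <- sweep(z, 2, spread, "/")
    out[, j] <- apply(z, 1, max)              # worst case over directions
  }
  out                                         # n x p univariate functional data
}

# Toy data: 20 bivariate curves on 5 points; curve 1 is shifted far away.
set.seed(1)
toy <- array(rnorm(20 * 5 * 2), dim = c(20, 5, 2))
toy[1, , ] <- 10
o <- pointwise_outlyingness(toy, n_projections = 50L, seed = 42)
```

The result is a non-negative n by p matrix in which only large values are extreme, which is why a one-sided ordering is recommended for the functional boxplot applied to it.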
If a transformation is repeated in the sequence (e.g. sequence = c("T0", "D1", "D1")), a warning is thrown and the labels of the output list are changed: for sequence = c("T0", "D1", "D1"), the labels become "T0", "D1_1", "D1_2", so that outliers are accessed with output$outliers$D1_1 and output$outliers$D1_2.
See examples for more.
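The relabelling rule for repeated transformations can be mimicked with a short base R helper. This is only an illustration of the naming pattern described above; relabel_sequence is a hypothetical name and is not how the package is known to implement it.

```r
# Hypothetical helper: append "_1", "_2", ... to every label that occurs
# more than once in the sequence, leaving unique labels untouched.
relabel_sequence <- function(sequence) {
  counts <- table(sequence)
  repeated <- sequence %in% names(counts)[counts > 1]
  # Running index of each label within its own group.
  idx <- ave(seq_along(sequence), sequence, FUN = seq_along)
  labels <- sequence
  labels[repeated] <- paste0(sequence[repeated], "_", idx[repeated])
  labels
}

labels <- relabel_sequence(c("T0", "D1", "D1"))
print(labels)  # "T0" "D1_1" "D1_2"
```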
A list containing two lists is returned. The contents of the returned list are:
outliers: |
A named list of length |
transformed_data |
If |
# same as running a functional boxplot
dt1 <- simulation_model1()
seqobj <- seq_transform(dt1$data, sequence = "T0", depth_method = "mbd")
seqobj$outliers$T0
functional_boxplot(dt1$data, depth_method = "mbd")$outliers
# more sequences
dt4 <- simulation_model4()
seqobj <- seq_transform(dt4$data, sequence = c("T0", "D1", "D2"), depth_method = "mbd")
seqobj$outliers$T0 # outliers found in raw data
seqobj$outliers$D1 # outliers found after differencing data the first time
seqobj$outliers$D2 # outliers found after differencing the data the second time
# saving transformed data
seqobj <- seq_transform(dt4$data, sequence = c("T0", "D1", "D2"),
depth_method = "mbd", save_data = TRUE)
seqobj$outliers$T0 # outliers found in raw data
head(seqobj$transformed_data$T0) # the raw data
head(seqobj$transformed_data$D1) # the first order differenced data
head(seqobj$transformed_data$D2) # the 2nd order differenced data
# double transforms e.g. c("T0", "D1", "D1")
seqobj <- seq_transform(dt4$data, sequence = c("T0", "D1", "D1"),
depth_method = "mbd", save_data = TRUE) # throws warning
seqobj$outliers$T0 # outliers found in raw data
seqobj$outliers$D1_1 # outliers found after differencing data the first time
seqobj$outliers$D1_2 # outliers found after differencing data the second time
head(seqobj$transformed_data$T0) # the raw data
head(seqobj$transformed_data$D1_1) # the first order differenced data
head(seqobj$transformed_data$D1_2) # the 2nd order differenced data
# multivariate data
dtm <- array(0, dim = c(dim(dt1$data), 2))
dtm[,,1] <- dt1$data
dtm[,,2] <- dt1$data
seqobj <- seq_transform(dtm, sequence = "O", depth_method = "erld",
erld_type = "one_sided_right", save_data = TRUE)
seqobj$outliers$O # multivariate outliers
head(seqobj$transformed_data$O) # univariate outlyingness data
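Since the outliers are returned as a named list with one element per transformation, a common follow-up is to pool them. The sketch below uses a made-up list with the same shape as the outliers element (the indices are invented for illustration and are not output from the package) and collects every index flagged at any step.

```r
# Invented stand-in for seqobj$outliers; the indices are NOT real results.
flagged <- list(T0 = c(3L, 17L), D1 = c(17L, 42L), D2 = integer(0))

# All curves flagged by at least one functional boxplot in the sequence.
all_outliers <- sort(unique(unlist(flagged, use.names = FALSE)))

# Number of outliers flagged at each step.
n_per_step <- lengths(flagged)
```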