do_clean_multiv: Cleaning multivariate functional outliers


Description

Removes the most remarkable multivariate functional outliers. This improves the performance of the archetypoid algorithm, which is then not affected by spurious points.

Usage

do_clean_multiv(data, num_pts, range = 1.5, out_perc = 80, nbasis, nvars)

Arguments

data

Data frame with (temporal) points in the rows and observations in the columns.

num_pts

Number of temporal points.

range

Same parameter as in function boxplot. A value of 1.5 is enough to detect amplitude and shift outliers, while a value of 3 is needed to detect isolated outliers.
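
The effect of range can be illustrated with base R's boxplot.stats, whose coef argument plays the same role as range in boxplot. This is a minimal sketch on made-up numbers, not the package's internal code; the vector x is hypothetical.

```r
# Hypothetical values: eight typical points, one moderate and one extreme value.
x <- c(10, 11, 12, 13, 14, 15, 16, 17, 25, 60)

# coef has the same meaning as range in boxplot(): points lying farther than
# coef times the hinge spread from the box are reported as outliers.
out_15 <- boxplot.stats(x, coef = 1.5)$out
out_3  <- boxplot.stats(x, coef = 3)$out

out_15  # 25 60
out_3   # 60
```

With the larger value only the most extreme point is reported, which is why a stricter range suits isolated outliers.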

out_perc

Minimum percentage of temporal points at which an observation must be flagged for it to be considered an outlier. Needed when range = 1.5.
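
One way to read out_perc is as a threshold on the share of temporal points at which a curve is flagged. The sketch below illustrates that idea only; the flags matrix and the flagged_share vector are hypothetical, and the rule the package actually applies may differ.

```r
# Hypothetical flag matrix: rows are temporal points, columns are observations.
# TRUE means the observation was flagged as outlying at that temporal point.
flags <- cbind(
  obs1 = rep(c(TRUE, FALSE), c(9, 1)),  # flagged at 9 of 10 points
  obs2 = rep(c(TRUE, FALSE), c(2, 8)),  # flagged at 2 of 10 points
  obs3 = rep(TRUE, 10)                  # flagged at every point
)

out_perc <- 80

# Percentage of temporal points at which each observation is flagged.
flagged_share <- 100 * colMeans(flags)

# Keep the observations whose share reaches the threshold.
outliers <- names(flagged_share)[flagged_share >= out_perc]
outliers
```

Under this reading, obs2 is flagged at too few points to count as an outlier, while obs1 and obs3 pass the threshold.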

nbasis

Number of basis functions.

nvars

Number of variables.
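
The Examples section builds the data argument by stacking the variables along the rows, so that the number of rows equals num_pts times nvars. The following sketch shows that layout on a small hypothetical array (arr, n_obs and stacked are illustrative names, not part of the package).

```r
# Hypothetical array: 4 temporal points, 3 observations, 2 variables.
num_pts <- 4
n_obs   <- 3
nvars   <- 2
arr <- array(seq_len(num_pts * n_obs * nvars), dim = c(num_pts, n_obs, nvars))

# Stack the variables along the rows: rows 1..num_pts hold variable 1,
# rows (num_pts + 1)..(2 * num_pts) hold variable 2.
stacked <- do.call(rbind, lapply(seq_len(nvars), function(v) arr[, , v]))

dim(stacked)             # 8 3
nrow(stacked) / nvars    # 4, which recovers num_pts
```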

Value

List with the outliers for each variable.

Author(s)

Irene Epifanio

See Also

boxplot

Examples

## Not run: 
library(fda)
?growth
str(growth)
hgtm <- growth$hgtm
hgtf <- growth$hgtf[,1:39]

# Create array:
nvars <- 2
data.array <- array(0, dim = c(dim(hgtm), nvars))
data.array[,,1] <- as.matrix(hgtm)
data.array[,,2] <- as.matrix(hgtf)
rownames(data.array) <- 1:nrow(hgtm)
colnames(data.array) <- colnames(hgtm)
str(data.array)

# Create basis:
nbasis <- 10
basis_fd <- create.bspline.basis(c(1,nrow(hgtm)), nbasis)
PM <- eval.penalty(basis_fd)
# Make fd object:
temp_points <- 1:nrow(hgtm)
temp_fd <- Data2fd(argvals = temp_points, y = data.array, basisobj = basis_fd)

# Arrange the basis coefficients as observations x coefficients x variables:
X <- array(0, dim = c(dim(t(temp_fd$coefs[,,1])), nvars))
X[,,1] <- t(temp_fd$coefs[,,1])
X[,,2] <- t(temp_fd$coefs[,,2])

# Standardize the variables:
Xs <- X
Xs[,,1] <- scale(X[,,1])
Xs[,,2] <- scale(X[,,2])

# Stack the variables along the rows, then transpose so that
# observations are in the rows:
x1 <- t(Xs[,,1])
for (i in 2:nvars) {
 x12 <- t(Xs[,,i])
 x1 <- rbind(x1, x12)
}
data_all <- t(x1)

num_pts <- ncol(data_all) / nvars
range <- 3
out_perc <- 80 # Default value of out_perc (see Usage).
outl <- do_clean_multiv(t(data_all), num_pts, range, out_perc, nbasis, nvars)
outl

## End(Not run)
                  

adamethods documentation built on Aug. 4, 2020, 5:08 p.m.