This package implements a general algorithm to cluster curves of effects obtained from a quantile regression (qrcm; Frumento and Bottai, 2015) in which the coefficients are described by flexible parametric functions of the order of the quantile. The algorithm can also be used to cluster curves observed over time, as in functional data analysis.
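To make the idea of clustering curves concrete, the following is a minimal sketch in base R only. It is not the algorithm implemented by clustEff: the Euclidean dissimilarity, the average linkage and the fixed number of clusters are all assumptions chosen for illustration, and the names `curves`, `hc` and `groups` are hypothetical.

```r
# Illustrative only: generic hierarchical clustering of curves with base R.
# This is NOT the clustEff algorithm; dissimilarity and linkage are assumptions.
set.seed(1)
x <- seq(0, 1, length.out = 100)   # common evaluation grid

# Two groups of noisy curves, stored column-wise (one curve per column)
curves <- cbind(
  sapply(1:5, function(i) sin(2 * pi * x) + rnorm(length(x), sd = 0.1)),
  sapply(1:5, function(i) cos(2 * pi * x) + rnorm(length(x), sd = 0.1))
)

d <- dist(t(curves))                  # Euclidean distance between curves
hc <- hclust(d, method = "average")   # agglomerative clustering
groups <- cutree(hc, k = 2)           # cut the dendrogram into two clusters
table(groups)
```

Here each column is treated as one curve evaluated on a shared grid, which mirrors the layout of the `Y` matrices in the examples.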
Package: clustEff
Type: Package
Version: 1.0
Date: 2017-03-06
License: GPL-2
The function clustEff lets the user specify the type of curves to which the clustering algorithm is applied. The function extract.object extracts, in the case of a multivariate response, the coefficient matrices obtained through quantile regression coefficient modeling; these matrices are the input to the main algorithm. The auxiliary functions summary.clustEff and plot.clustEff can be used to summarize and plot the results of the main algorithm.
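The examples call summary() and plot() on the fitted object rather than calling the methods by name. This relies on R's S3 dispatch, presumably because the object returned by clustEff carries the class "clustEff", as the method names suggest. The sketch below illustrates the mechanism with a hypothetical class "demoFit" that is not part of the package.

```r
# Hypothetical class "demoFit", used only to illustrate S3 dispatch
fit <- structure(list(clusters = c(1, 1, 2)), class = "demoFit")

# A summary method for the hypothetical class
summary.demoFit <- function(object, ...) {
  # Return the number of distinct clusters stored in the object
  length(unique(object$clusters))
}

# The generic summary() dispatches to summary.demoFit based on class(fit)
n_clusters <- summary(fit)
```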
Gianluca Sottile
Maintainer: Gianluca Sottile <gianluca.sottile@unipa.it>
Sottile, G and Adelfio, G (2017). Clustering of effects through quantile regression. Proceedings International Workshop of Statistical Modeling.
Frumento, P., and Bottai, M. (2015). Parametric modeling of quantile regression coefficient functions. Biometrics, doi: 10.1111/biom.12410.
# use simulated data
# CURVES EFFECTS CLUSTERING
set.seed(1234)
n <- 1000
q <- 2
k <- 5
x1 <- runif(n, 0, 5)
x2 <- runif(n, 0, 5)
X <- cbind(x1, x2)
rownames(X) <- 1:n
colnames(X) <- paste0("X", 1:q)
theta1 <- matrix(c(1, 1, 0, 0, 0, .5, 0, .5, 1, 2, .5, 0, 2, 1, .5),
                 ncol=k, byrow=TRUE)
theta2 <- matrix(c(1, 1, 0, 0, 0, -.3, 0, .5, 1, .5, -1.5, 0, -1, -.5, 1),
                 ncol=k, byrow=TRUE)
theta3 <- matrix(c(1, 1, 0, 0, 0, .3, 0, -.5, -1, 2, -.5, 0, 1, -.5, -1),
                 ncol=k, byrow=TRUE)
rownames(theta3) <- rownames(theta2) <- rownames(theta1) <-
  c("(intercept)", paste("X", 1:q, sep=""))
colnames(theta3) <- colnames(theta2) <- colnames(theta1) <-
  c("(intercept)", "qnorm(p)", "p", "p^2", "p^3")
Theta <- list(theta1, theta2, theta3)
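# B(p, k): basis functions of the quantile order p, as a k x length(p) matrix
B <- function(p, k){matrix(cbind(1, qnorm(p), p, p^2, p^3), nrow=k, byrow=TRUE)}
# Q: quantile function of the response given X, evaluated at p
Q <- function(p, theta, B, k, X){rowSums(X * t(theta %*% B(p, k)))}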
s <- matrix(1, q+1, k)
s[2:(q+1), 2] <- 0
s[1, 3:k] <- 0
Y <- matrix(NA, nrow(X), 15)
# Responses 1-5, 6-10 and 11-15 are generated from Theta[[1]], Theta[[2]] and Theta[[3]]
for(i in 1:15){
  if(i <= 5) Y[, i] <- Q(runif(n), Theta[[1]], B, k, cbind(1, X))
  if(i > 5 && i <= 10) Y[, i] <- Q(runif(n), Theta[[2]], B, k, cbind(1, X))
  if(i > 10) Y[, i] <- Q(runif(n), Theta[[3]], B, k, cbind(1, X))
}
XX <- extract.object(Y, X, intercept=TRUE, formula.p= ~ I(p) + I(p^2) + I(p^3))
seqP <- XX$p
obj <- clustEff(XX$X$X1, seqP, Beta.lower=XX$Xl$X1, Beta.upper=XX$Xr$X1)
summary(obj)
plot(obj, xvar="clusters", add=TRUE)
plot(obj, xvar="clusters", add=FALSE)
plot(obj, xvar="dendrogram")
plot(obj, xvar="boxplot")
# CURVES CLUSTERING IN FUNCTIONAL DATA ANALYSIS
set.seed(1234)
n <- 1000
x <- 1:n/n
Y <- matrix(0, n, 30)
sigma2 <- 4*pmax(x-.2, 0) - 8*pmax(x-.5, 0) + 4*pmax(x-.8, 0)
mu <- sin(3*pi*x)
for(i in 1:10) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))
mu <- cos(3*pi*x)
for(i in 11:23) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))
mu <- sin(3*pi*x)*cos(pi*x)
for(i in 24:28) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))
mu <- 0 #sin(1/3*pi*x)*cos(2*pi*x)
for(i in 29:30) Y[, i] <- mu + rnorm(length(x), 0, pmax(sigma2, 0))
obj2 <- clustEff(Y, x, cluster.effects=FALSE)
summary(obj2)
plot(obj2, xvar="clusters", add=TRUE)
plot(obj2, xvar="clusters", add=FALSE)
plot(obj2, xvar="dendrogram")
plot(obj2, xvar="boxplot")