clustEff: Cluster Effects Algorithm


Description

This function implements an algorithm for clustering curves of effects obtained from quantile regression coefficient modeling (qrcm; Frumento and Bottai, 2015), in which the coefficients are described by flexible parametric functions of the order of the quantile. The algorithm can also be used to cluster curves observed over time, as in functional data analysis.

Usage

clustEff(Beta, p, alpha, k, ask=FALSE, k.min=1, k.max=min(10, (ncol(Beta)-1)),
        cluster.effects=TRUE, Beta.lower=NULL, Beta.upper=NULL,
        step=c("both", "shape", "distance"), plot=TRUE,
        method=c("ward.D", "ward.D2", "single", "complete", "average",
                 "mcquitty", "median", "centroid"))

Arguments

Beta

An n x q matrix, where q is the number of curves to cluster and n is either the number of percentiles used in the quantile regression or the length of the time vector.

p

The percentiles used in the quantile regression, or the time vector.

alpha

The alpha-percentile used to compute the dissimilarity matrix. If not specified, the algorithm chooses alpha=.25 (cluster.effects=TRUE) or alpha=.5 (cluster.effects=FALSE).

k

If specified, the number of clusters.

ask

If TRUE, after the dendrogram is plotted, the user chooses how many clusters to use.

k.min

The minimum number of clusters considered when the algorithm chooses the best number of clusters.

k.max

The maximum number of clusters considered when the algorithm chooses the best number of clusters.

cluster.effects

Selects the framework in which the clustering algorithm is applied: quantile regression if TRUE, curve clustering otherwise.

Beta.lower

An n x q matrix of the lower interval limits of the q curves to cluster, where n is the number of percentiles used in the quantile regression. Used only if cluster.effects=TRUE.

Beta.upper

An n x q matrix of the upper interval limits of the q curves to cluster, where n is the number of percentiles used in the quantile regression. Used only if cluster.effects=TRUE.

step

The steps used in computing the dissimilarity matrix. The default, "both", uses both "shape" and "distance".

plot

If TRUE, the dendrogram, boxplot, and clusters are plotted.

method

The agglomeration method to be used.
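
To make the roles of Beta, p, alpha, k and method concrete, the following base R sketch clusters simulated curves. It assumes that the dissimilarity between two curves is the alpha-percentile of their pointwise absolute differences along p, and that method is passed to stats::hclust, whose agglomeration methods have the same names. Both are assumptions made for illustration, not the package's actual implementation; the simulated curves and the helper alpha_diss are hypothetical.

```r
set.seed(1)
p <- seq(0.05, 0.95, by = 0.05)   # n = 19 percentiles
n <- length(p)

# Simulated n x q matrix: three increasing and three decreasing curves (q = 6)
Beta <- cbind(
  sapply(1:3, function(j)  2 * p + rnorm(n, sd = 0.05)),
  sapply(1:3, function(j) -2 * p + rnorm(n, sd = 0.05))
)

# Hypothetical helper (an assumption, not a clustEff function):
# alpha-percentile of |curve_i - curve_j| taken along p
alpha_diss <- function(Beta, alpha) {
  q <- ncol(Beta)
  D <- matrix(0, q, q)
  for (i in seq_len(q)) for (j in seq_len(q)) {
    D[i, j] <- quantile(abs(Beta[, i] - Beta[, j]), probs = alpha)
  }
  as.dist(D)
}

D  <- alpha_diss(Beta, alpha = 0.25)
hc <- hclust(D, method = "ward.D2")   # any of the listed methods could be used
clusters <- cutree(hc, k = 2)         # k fixed by the user
```

With curves this well separated, the increasing and the decreasing curves should end up in different clusters whichever agglomeration method is chosen.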

Details

Quantile regression models the conditional quantiles of a response variable, given a set of covariates. Assume that each coefficient can be expressed as a parametric function of p of the form:

β(p | θ) = θ0 + θ1*b1(p) + θ2*b2(p) + …

where b1(p), b2(p), … are known functions of p.
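
As a minimal sketch of the formula above, a coefficient curve can be evaluated on a grid of p in base R. The basis functions b1(p) = p and b2(p) = p^2, the values of θ, and the helper name beta_curve are illustrative assumptions, not part of the package.

```r
# Hypothetical helper: evaluate beta(p | theta) = theta0 + theta1*b1(p) + theta2*b2(p) + ...
# 'basis' is a list of known functions of p (assumed here, not taken from clustEff).
beta_curve <- function(p, theta, basis) {
  # Design matrix: an intercept column followed by b1(p), b2(p), ...
  B <- cbind(1, sapply(basis, function(b) b(p)))
  drop(B %*% theta)
}

p     <- seq(0.01, 0.99, by = 0.01)                 # grid of quantile orders
basis <- list(function(p) p, function(p) p^2)       # illustrative b1, b2
theta <- c(1, 2, -1)                                # illustrative theta0, theta1, theta2

beta <- beta_curve(p, theta, basis)
```

Evaluating q such curves on the same grid and binding them column-wise gives an n x q matrix of the shape expected for Beta.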

Value

An object of class “clustEff”, a list containing the following items:

call

The matched call.

X

The curves matrix.

X.mean

The mean curves matrix of dimension n x k.

X.mean.dist

The within-cluster distance from the mean curve.

X.lower

The lower interval matrix.

X.mean.lower

The mean lower interval of dimension n x k.

X.upper

The upper interval matrix.

X.mean.upper

The mean upper interval of dimension n x k.

k

The number of selected clusters.

p

The percentiles used in quantile regression coefficient modeling, or the time vector otherwise.

diss.matrix

The dissimilarity matrix.

oggSilhouette

An object of class “silhouette”.

oggHclust

An object of class “hclust”.

clusters

The vector of clusters.

alpha.dist

The vector of alpha-distances corresponding to the alpha-percentile of the distances along the percentiles.

distance

A vector of goodness measures used to select the best number of clusters.

step

The selected step.

method

The agglomeration method used.

cut.method

The method used to select the best number of clusters.

alpha

The selected alpha-percentile.
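
The oggSilhouette and oggHclust items are described as objects of class “silhouette” and “hclust”, so they can be examined with the usual tools for those classes. As a self-contained sketch that does not call clustEff, the code below uses simulated curves and a plain Euclidean distance as a stand-in for diss.matrix, builds objects of the same two classes, and compares the average silhouette width across candidate numbers of clusters. Whether the package selects k in this way is not stated here, so the selection rule is an assumption.

```r
library(cluster)  # provides silhouette(); a recommended package shipped with R

set.seed(2)
p <- seq(0.05, 0.95, by = 0.05)
# Simulated curves (one per column): two groups with opposite slopes
X <- cbind(
  sapply(1:4, function(j)  p + rnorm(length(p), sd = 0.05)),
  sapply(1:4, function(j) -p + rnorm(length(p), sd = 0.05))
)

D  <- dist(t(X))                      # stand-in dissimilarity between curves
hc <- hclust(D, method = "average")   # object of class "hclust"

# Average silhouette width for each candidate k (silhouettes need k >= 2)
k_grid <- 2:4
avg_width <- sapply(k_grid, function(k) {
  sil <- silhouette(cutree(hc, k = k), D)   # object of class "silhouette"
  mean(sil[, "sil_width"])
})
best_k <- k_grid[which.max(avg_width)]
```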

Author(s)

Gianluca Sottile gianluca.sottile@unipa.it

References

Sottile, G., and Adelfio, G. (2017). Clustering of effects through quantile regression. Proceedings: International Workshop of Statistical Modeling.

Frumento, P., and Bottai, M. (2015). Parametric modeling of quantile regression coefficient functions. Biometrics, doi: 10.1111/biom.12410.

See Also

summary.clustEff and plot.clustEff for summary and plotting; extract.object to extract the objects needed by the clustering algorithm from a quantile regression coefficient model in the multivariate case.

Examples

  ##### Using simulated data in all examples

  # see the documentation for 'clustEff-package'
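
A minimal call might look like the sketch below. It uses the argument names from the Usage section with simulated curves that are purely illustrative. The call is guarded because an installed version of the package may have a different signature, and the shape of the output is not shown since it depends on that version.

```r
set.seed(3)
p <- seq(0.01, 0.99, by = 0.01)

# Simulated effects: each column is one coefficient curve evaluated at p
# (two increasing and two decreasing curves, plus a little noise)
Beta <- cbind(p, p + 0.05, 1 - p, 0.95 - p) +
  matrix(rnorm(4 * length(p), sd = 0.01), ncol = 4)

# Call clustEff only if the package is installed; argument names follow the Usage section
fit <- NULL
if (requireNamespace("clustEff", quietly = TRUE)) {
  fit <- tryCatch(
    clustEff::clustEff(Beta = Beta, p = p, k = 2, plot = FALSE),
    error = function(e) NULL   # e.g. an installed version with a different signature
  )
}

# 'clusters' is listed under Value as the vector of clusters
if (!is.null(fit)) print(fit$clusters)
```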

gianluca-sottile/clustEff documentation built on May 8, 2019, 9:22 a.m.