DSC_SHC.behavioral: Statistical Hierarchical Clusterer - behavioral construction

View source: R/SHC_Wrappers.R

Description

The Statistical Hierarchical Clusterer (SHC) class, constructed from behavioral parameters.

Usage

DSC_SHC.behavioral(dimensions, aggloType = AgglomerationType$NormalAgglomeration,
                   driftType = DriftType$NormalDrift, decaySpeed = 10,
                   sharedAgglomerationThreshold = 1, recStats = FALSE,
                   sigmaIndex = FALSE, sigmaIndexNeighborhood = 3,
                   sigmaIndexPrecisionSwitch = TRUE)

Arguments

dimensions

(integer) - The number of space dimensions.

aggloType

(list, AgglomerationType) - Agglomeration type: NormalAgglomeration, AggresiveAgglomeration, RelaxedAgglomeration.

driftType

(list, DriftType) - Drift type: NormalDrift, FastDrift, SlowDrift, NoDrift, UltraFastDrift.

decaySpeed

(integer) - Component decay speed. 0 means no decay; for values greater than 0, a higher number means slower decay.

sharedAgglomerationThreshold

(integer) - The number of data instances shared between components that causes them to be agglomerated under the same cluster.

recStats

(logical) - A flag that indicates whether the SH clusterer should return statistics about the number of components and outliers generated during data processing.

sigmaIndex

(logical) - A flag that indicates whether the SH clusterer should use the Sigma-index to speed up the statistical space query.

sigmaIndexNeighborhood

(integer) - A multiplier for the statistical neighborhood used to manage the Sigma-index. This multiplier is used in conjunction with the SH clusterer statistical threshold theta, which is determined by the aggloType parameter.

sigmaIndexPrecisionSwitch

(logical) - A flag that indicates whether the Sigma-index should favor precision over speed.

Details

Instantiates an SHC DSC object that represents an extension of abstract classes in the stream package, namely DSC_SinglePass, DSC_Outlier, DSC_Macro, and DSC_R. This object can be used in all stream package methods.
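As a rough sketch of how the object plugs into the stream package, the snippet below builds a clusterer with non-default settings and passes it to stream generics. It assumes the package is installed under the name SHClus, alongside a stream version that still accepts the DSD_Gaussians arguments used in the Examples section; the guard only keeps the snippet from failing when either package is missing. The choice of SlowDrift and n = 1000 is arbitrary and purely illustrative.

```r
# Sketch only: runs when both packages are installed, otherwise does nothing.
have_pkgs <- requireNamespace("stream", quietly = TRUE) &&
  requireNamespace("SHClus", quietly = TRUE)

if (have_pkgs) {
  library(stream)
  library(SHClus)

  # Same generator settings as in the Examples section of this page.
  dsd <- DSD_Gaussians(k = 2, d = 2, outliers = 2,
                       separation_type = "Mahalanobis", separation = 4,
                       space_limit = c(0, 15),
                       outlier_options = list(outlier_horizon = 10000))

  # Non-default behavior: SlowDrift drift type, no decay, statistics recording on.
  shc <- DSC_SHC.behavioral(2, driftType = DriftType$SlowDrift,
                            decaySpeed = 0, recStats = TRUE)

  # Standard stream generic: feed 1000 points from the stream to the clusterer.
  update(shc, dsd, n = 1000)

  print(get_microclusters(shc))  # component centers
  print(get_macroclusters(shc))  # cluster centers
  print(get_stats(shc))          # requested via recStats = TRUE
}
```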

Methods

All methods here are detailed in the stream package.

get_stats(dsc,...)

Returns component and outlier statistics for the last data processing run. Available only when recStats=TRUE.

get_microclusters(dsc,...)

Returns a list of current component centers.

get_microweights(dsc,...)

Returns the list of current component weights, based on the component population divided by the total population.

get_macroclusters(dsc,...)

Returns a list of current cluster centers.

get_macroweights(dsc,...)

Returns a list of current cluster weights, based on the cluster population (sum of component populations) divided by the total population.

microToMacro(dsc,micro=NULL,...)

Returns a mapping between components and clusters.

get_assignment(dsc, points, type=c("auto", "micro", "macro"), method=c("auto", "model", "nn"), ...)

Returns assignments of points to components or clusters, depending on the value of the type parameter. Returns a data frame with assignment values and, possibly, additional attributes that indicate outliers and their SHC internal identifiers.

get_outlier_positions(dsc, ...)

Returns a list of outliers and their positions.

recheck_outlier_positions(dsc, outlier_correlated_id, ...)

Re-checks a previously encountered outlier. An SHC outlier identifier must be supplied. Returns TRUE when the outlier still stands (or has decayed), or FALSE if it has been assimilated by some component in the meantime.

clearEigenMPSupport(dsc, ...)

Clears OpenMP usage by the Eigen linear algebra library. Introduced for reproducibility purposes only.
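The weight definitions given for get_microweights and get_macroweights can be illustrated with plain arithmetic. The sketch below uses base R only and does not call the package; the populations and the component-to-cluster mapping are hypothetical values chosen for illustration, with the mapping standing in for the kind of result microToMacro describes.

```r
# Hypothetical component populations (points absorbed by each component).
component_population <- c(comp1 = 50, comp2 = 30, comp3 = 20)

# Hypothetical component-to-cluster mapping:
# components 1 and 2 belong to cluster 1, component 3 to cluster 2.
component_cluster <- c(comp1 = 1, comp2 = 1, comp3 = 2)

total_population <- sum(component_population)

# Component weights: component population / total population.
micro_weights <- component_population / total_population

# Cluster weights: sum of component populations per cluster / total population.
macro_weights <- tapply(component_population, component_cluster, sum) /
  total_population

print(micro_weights)
print(macro_weights)
```

With these numbers the component weights are 0.5, 0.3 and 0.2, and the cluster weights are 0.8 and 0.2; both sets sum to 1 because they share the same denominator.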

References

[1] Krleža D, Vrdoljak B, and Brčić M, Statistical hierarchical clustering algorithm for outlier detection in evolving data streams, Machine Learning, Sep. 2020

Examples

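library(stream)
library(SHClus)

# Two Gaussian clusters in two dimensions, with two outliers
d <- DSD_Gaussians(k=2,d=2,outliers=2,separation_type="Mahalanobis",separation=4,
                   space_limit=c(0,15),outlier_options=list(outlier_horizon=10000))
# SHC clusterer with no drift and no decay (named shc to avoid masking base c())
shc <- DSC_SHC.behavioral(2,driftType = DriftType$NoDrift,decaySpeed = 0)
update(shc,d,n=10000)
# Rewind the stream so the plot shows the same points the clusterer saw
reset_stream(d)
plot(shc,d,n=10000)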

dkrleza/SHClus documentation built on Feb. 25, 2021, 10:30 p.m.