ClusterApply: Applies a function over grouped data


ClusterApply    R Documentation

Applies a function over grouped data

Description

Applies a given function to each of the d dimensions (columns) of the data, separately for each cluster.

Usage

ClusterApply(DataOrDistances,FUN,Cls,Simple=FALSE,...)

Arguments

DataOrDistances

[1:n,1:d] matrix. If d=n and the matrix is symmetric, a distance matrix is assumed. Otherwise:

[1:n,1:d] matrix defining the dataset, which consists of n cases (d-dimensional data points). Every case has d attributes, variables, or features.

FUN

Function to be applied to each column of the data within each cluster

Cls

[1:n] numerical vector with n numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers representing the arbitrary labels of the clustering.

Simple

Boolean, if TRUE, simplifies output

...

Additional parameters to be passed on to FUN

Details

Applies a given function to each feature of each cluster of data, using the clustering stored in Cls, which holds the cluster identifier of every row in the data. If Cls is missing, all data are assigned to the first cluster. The main output is FUNPerCluster[i], the result of FUN for the data points in cluster UniqueClusters[i]; this list element is named after the function used.
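The per-cluster, per-feature application described above can be sketched in base R. This is only an illustration under stated assumptions, not the FCPS implementation: apply_per_cluster is a hypothetical helper, FUN is assumed to return a single number per column, and sorting the cluster labels is a choice made for this sketch.

```r
# Hypothetical helper (not part of FCPS): applies FUN to every column of
# Data within every cluster defined by Cls.
apply_per_cluster <- function(Data, FUN, Cls, ...) {
  Data <- as.matrix(Data)
  # Without Cls, all rows are treated as one single cluster
  if (missing(Cls)) Cls <- rep(1, nrow(Data))
  # Sorting the labels is an assumption of this sketch
  UniqueClusters <- sort(unique(Cls))
  # One row per cluster, one column per feature
  PerCluster <- do.call(rbind, lapply(UniqueClusters, function(cl) {
    apply(Data[Cls == cl, , drop = FALSE], 2, FUN, ...)
  }))
  rownames(PerCluster) <- UniqueClusters
  list(UniqueClusters = UniqueClusters, PerCluster = PerCluster)
}

# Usage with the built-in iris data: mean of each feature per species
Res <- apply_per_cluster(iris[, 1:4], mean, as.integer(iris$Species))
Res$PerCluster
```

The result has one row per cluster and one column per feature, which corresponds to the [1:k,1:d] layout described under Value.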

If a distance matrix is given, the distances are automatically transformed to data by classical multidimensional scaling. The number of dimensions is selected by minimal stress with respect to the possible output dimensions of cmdscale.
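To illustrate the transformation of distances to data, the following minimal sketch calls stats::cmdscale directly. It does not reproduce the stress-based selection of the number of dimensions described above; OutputDimension is a hypothetical, fixed choice made only for this example.

```r
# Sketch only: classical MDS of a distance matrix with a fixed number of
# output dimensions (the stress-based selection is NOT reproduced here).
Data <- as.matrix(iris[, 1:4])
Distances <- as.matrix(dist(Data))   # [1:n,1:n] Euclidean distances

# A square, symmetric input is what gets interpreted as distances
isSymmetric(unname(Distances))

OutputDimension <- 2                 # hypothetical choice for this sketch
ProjectedPoints <- stats::cmdscale(Distances, k = OutputDimension)
dim(ProjectedPoints)                 # n rows, OutputDimension columns
```

Classical MDS returns coordinates centered at the origin, so statistics such as per-cluster means computed on them generally differ from those computed on the original data.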

If FUN has no function name, the result is returned as ResultPerCluster.

Value

if(Simple==FALSE) List with

UniqueClusters

The unique clusters in Cls

FUNPerCluster

A [1:k,1:d] matrix of d features and k clusters. The list element is named after the function FUN used (for example, identityPerCluster for FUN=identity).

if(Simple==TRUE)

a matrix of [1:k,1:d] of d features and k clusters

Author(s)

Felix Pape, Michael Thrun

Examples

##one dataset
data(Hepta)
Data=Hepta$Data
Cls=Hepta$Cls
#mean per cluster
ClusterApply(Data,mean,Cls)

#Simplified
ClusterApply(Data,mean,Cls,Simple=TRUE)

# Mean per cluster of MDS transformation
# Beware, this is not the same!

ClusterApply(as.matrix(dist(Data)),mean,Cls)


## Not run: 
Iris=datasets::iris
#the four numeric columns are data, not distances
Data=as.matrix(Iris[,1:4])
SomeFactors=Iris$Species
V=ClusterCreateClassification(SomeFactors)
Cls=V$Cls
V$ClusterNames
ClusterApply(Data,mean,Cls)

## End(Not run)
#special case of identity
## Not run: 
suppressPackageStartupMessages(library('prabclus',quietly = TRUE))
data(tetragonula)
#Generated Specific Distance Matrix
ta <- alleleconvert(strmatrix=as.matrix(tetragonula[1:236,]))
tai <- alleleinit(allelematrix=ta,distance="none")
Distance=alleledist((unbuild.charmatrix(tai$charmatrix,236,13)),236,13)

MDStrans=ClusterApply(Distance,identity)$identityPerCluster

## End(Not run)

Mthrun/FCPS documentation built on June 28, 2023, 9:29 a.m.