wkmeans: Weighted k-means for mixed-type data
In kamila: Methods for Clustering Mixed-Type Data


Weighted k-means for mixed continuous and categorical variables. A user-specified weight conWeight controls the relative contribution of the variable types to the cluster solution.

wkmeans(conData, catData, conWeight, nclust, ...)

`conData`	The continuous variables. Must be coercible to a data frame.
`catData`	The categorical variables, either as factors or dummy-coded variables. Must be coercible to a data frame.
`conWeight`	The continuous weight; must be between 0 and 1. The categorical weight is `1-conWeight`.
`nclust`	The number of clusters.
`...`	Optional arguments passed to `kmeans`.

A simple adaptation of stats::kmeans to mixed-type data. Continuous variables are multiplied by the input parameter conWeight, and categorical variables are multiplied by 1-conWeight. If factor variables are supplied in catData, they are first converted to 0-1 dummy-coded variables with the function dummyCodeFactorDf.
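
The weighting described above can be sketched in base R without the package. This is a minimal illustration, not the package's implementation: `dummy_code` is a hypothetical stand-in for dummyCodeFactorDf (assumed here to produce one 0-1 column per factor level), the toy data are made up, and dividing the fitted centers by the weights is only one plausible reading of how centers might be adjusted for the weighting.

```r
# Hypothetical helper: one 0-1 indicator column per factor level.
# Stands in for kamila's dummyCodeFactorDf; the real function may differ.
dummy_code <- function(df) {
  do.call(cbind, lapply(names(df), function(nm) {
    f <- factor(df[[nm]])
    m <- outer(as.integer(f), seq_len(nlevels(f)), "==") * 1
    colnames(m) <- paste0(nm, "_", levels(f))
    m
  }))
}

# Made-up toy data: two continuous variables and one three-level factor.
set.seed(1)
n <- 100
conDf <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
catDf <- data.frame(g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))

# Scale each variable type by its weight, then run ordinary k-means.
conWeight <- 0.7
weighted <- cbind(conWeight * as.matrix(conDf),
                  (1 - conWeight) * dummy_code(catDf))
fit <- stats::kmeans(weighted, centers = 2, nstart = 5)

# Assumption: express centers on the unweighted scale by dividing the
# weights back out.
catCols <- setdiff(colnames(weighted), names(conDf))
con_centers <- fit$centers[, names(conDf), drop = FALSE] / conWeight
cat_centers <- fit$centers[, catCols, drop = FALSE] / (1 - conWeight)
```

With full indicator coding, each row of `cat_centers` gives the within-cluster proportion of each level of `g`, so a conWeight near 1 makes those proportions matter little to the distance and a conWeight near 0 makes them dominate.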

A stats::kmeans results object, with additional elements conCenters and catCenters giving the cluster centers for the continuous and categorical variables, adjusted for the weighting process.

dummyCodeFactorDf

kmeans

# Generate toy data set with poor quality categorical variables and good
# quality continuous variables.
set.seed(1)
dat <- genMixedData(200, nConVar=2, nCatVar=2, nCatLevels=4, nConWithErr=2,
  nCatWithErr=2, popProportions=c(.5,.5), conErrLev=0.3, catErrLev=0.8)
catDf <- data.frame(apply(dat$catVars, 2, factor), stringsAsFactors = TRUE)
conDf <- data.frame(scale(dat$conVars), stringsAsFactors = TRUE)

# A clustering that emphasizes the continuous variables
r1 <- with(dat,wkmeans(conDf, catDf, 0.9, 2))
table(r1$cluster, dat$trueID)

# A clustering that emphasizes the categorical variables; note argument
# passed to the underlying stats::kmeans function
r2 <- with(dat,wkmeans(conDf, catDf, 0.1, 2, nstart=4))
table(r2$cluster, dat$trueID)

kamila documentation built on March 13, 2020, 9:08 a.m.
