lambdaest: Compares Variability of Variables
In clustMixType: k-Prototypes Clustering for Mixed Variable-Type Data

lambdaest

R Documentation

Description

Investigates the variances of the numeric variables and the concentrations of the factor variables to support the specification of lambda for k-prototypes clustering.

Usage

lambdaest(
  x,
  num.method = 1,
  fac.method = 1,
  outtype = "numeric",
  verbose = TRUE
)

Arguments

`x`	Data.frame with both numerics and factors.
`num.method`	Integer 1 or 2. Specifies the heuristic used for numeric variables.
`fac.method`	Integer 1 or 2. Specifies the heuristic used for factor variables.
`outtype`	Specifies the desired output: either 'numeric', 'vector' or 'variation'.
`verbose`	Logical; whether additional information about the process should be printed.

Details

For numeric variables, the variance (num.method = 1) or the standard deviation (num.method = 2) is computed. For factors, 1-\sum_i p_i^2 (fac.method = 1) or 1-\max_i p_i (fac.method = 2) is computed, where p_i denotes the relative frequency of the factor's i-th level.
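
As a rough illustration of these four quantities, the following base R sketch computes them by hand for a small made-up data frame. The helper names `num_variation` and `fac_variation` are hypothetical and not part of clustMixType, and p_i is assumed to be the relative frequency of factor level i. It is a sketch of the heuristics as described here, not the package's own implementation.

```r
# Toy data, made up for illustration only
d <- data.frame(
  f = factor(c("A", "A", "A", "B")),
  v = c(1, 2, 3, 4)
)

# Hypothetical helper: variation of a numeric variable
# method = 1: variance, method = 2: standard deviation
num_variation <- function(z, method = 1) {
  if (method == 1) var(z) else sd(z)
}

# Hypothetical helper: variation of a factor
# method = 1: 1 - sum_i p_i^2, method = 2: 1 - max_i p_i
fac_variation <- function(z, method = 1) {
  p <- as.numeric(table(z)) / length(z)  # relative frequencies p_i (assumption)
  if (method == 1) 1 - sum(p^2) else 1 - max(p)
}

num_variation(d$v, 1)  # variance of v
num_variation(d$v, 2)  # standard deviation of v
fac_variation(d$f, 1)  # 1 - (0.75^2 + 0.25^2) = 0.375
fac_variation(d$f, 2)  # 1 - 0.75 = 0.25
```

Both factor measures are 0 when every observation falls into a single level and grow as the levels become more evenly distributed.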

Value

lambda

For outtype = "numeric" (the default), a single value is returned: the ratio of the average variation over all numeric variables to the average variation over all factor variables. For outtype = "vector", a separate lambda is returned for each variable, computed as the inverse of that variable's variation as specified by the num.method and fac.method arguments. outtype = "variation" returns these variations themselves; this output is not meant to be passed directly to kproto().
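
The relation between the three output types can be sketched in base R as follows. The per-variable variations are made-up numbers, the object names are hypothetical, and reading the ratio as numeric average divided by factor average is an assumption based on the description above.

```r
# Made-up per-variable variations (what outtype = "variation" is described to return)
variation  <- c(x1 = 0.5, x2 = 0.25, x3 = 2, x4 = 4)
is_numeric <- c(x1 = FALSE, x2 = FALSE, x3 = TRUE, x4 = TRUE)

# Assumed reading of outtype = "numeric":
# average numeric variation divided by average factor variation
lambda_single <- mean(variation[is_numeric]) / mean(variation[!is_numeric])

# Assumed reading of outtype = "vector":
# inverse of each variable's own variation
lambda_vector <- 1 / variation

lambda_single  # 3 / 0.375 = 8
lambda_vector  # x1 = 2, x2 = 4, x3 = 0.5, x4 = 0.25
```

In this reading, a variable with large variation receives a small weight in the vector output, so that no single variable dominates the distance.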

Author(s)

gero.szepannek@web.de

Examples

library(clustMixType)

# generate toy data with factors and numerics

n   <- 100
prb <- 0.9
muk <- 1.5
clusid <- rep(1:4, each = n)

x1 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x1 <- c(x1, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x1 <- as.factor(x1)

x2 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x2 <- c(x2, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x2 <- as.factor(x2)

x3 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))
x4 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))

x <- data.frame(x1,x2,x3,x4)

# estimate lambda from the data and use it for k-prototypes clustering
lambdaest(x)
res <- kproto(x, 4, lambda = lambdaest(x))

clustMixType documentation built on July 1, 2024, 5:08 p.m.