makeClusterTask: Creates a ClusterTask Object


Description

A Task encapsulates the data together with some additional information.
Currently, clustering is only performed on numerical data. Support for categorical data is work in progress.

Usage

makeClusterTask(id, data, cluster.cols = NULL, method = "cluster.kmeans",
  random.seed = set.seed(Sys.time()), scale.num.data = TRUE,
  par.vals = list(), show.NA.msg = FALSE, ...)

Arguments

id

[character(1)]
ID of the Task Object

data

[data.frame]
A data frame containing the variables to cluster

cluster.cols

[character()]
Named character vector specifying the column combinations to cluster. This
only applies to non-hierarchical clustering methods. Default is cluster.cols = NULL.
In default mode, for datasets with at most 10 numeric columns, a cluster
analysis is run on the column combinations: choose(5,2).
If there are more than 10 numeric columns, only the clustering on the PCA
of the numeric columns is calculated, and out of the (k,2) combinations,
10 tuples are selected at random.
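
The pair selection described above can be sketched in base R. This is only an illustration under assumptions, not the package's implementation: pick.column.pairs, max.cols and n.sample are hypothetical names, and the PCA step is left out.

```r
# Hypothetical helper (not part of the package): enumerate pairs of numeric
# columns and, above max.cols numeric columns, keep a random sample of pairs.
pick.column.pairs = function(data, max.cols = 10L, n.sample = 10L) {
  # names of all numeric columns
  num.cols = names(data)[vapply(data, is.numeric, logical(1))]
  # all 2-element combinations, one character vector per pair
  pairs = combn(num.cols, 2, simplify = FALSE)
  if (length(num.cols) > max.cols) {
    # too many columns: draw n.sample pairs at random
    pairs = pairs[sample(length(pairs), n.sample)]
  }
  pairs
}

pairs = pick.column.pairs(iris)
length(pairs)  # choose(4, 2) = 6 pairs for the four numeric iris columns
```

For a data frame with 12 numeric columns the same helper would return 10 of the choose(12, 2) = 66 possible pairs.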

method

[character(1)]
Defines the clustering method. Possible choices are:
For Hierarchical Clustering:

  • cluster.h - for more information see hclust

  • cluster.agnes - for more information see agnes

  • cluster.diana - for more information see diana

For Partitioning Clustering:

  • cluster.kmeans - for more information see kmeans

  • cluster.kkmeans - for more information see kkmeans

  • cluster.pam - for more information see pam

For Model-Based Clustering:

  • cluster.dbscan - for more information see dbscan

  • cluster.mod - for more information see Mclust

Default is method = "cluster.kmeans"

random.seed

[integer(1)]
Random seed used to make the clustering reproducible.
Default is random.seed = set.seed(Sys.time())

scale.num.data

[logical(1)]
Logical indicating whether to scale the numeric data.
Default is scale.num.data = TRUE
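
As a rough illustration of what a fixed seed and scaled numeric data mean for a k-means run, the following base R sketch may help. It does not call the package; the use of scale(), the choice of centers = 3L and the seed value are assumptions made for the example.

```r
# Keep only the numeric columns of iris and standardise them
# (mean 0, standard deviation 1 per column).
num.data = iris[, vapply(iris, is.numeric, logical(1))]
scaled = scale(num.data)

# Two runs with the same seed give the same cluster assignment.
set.seed(89L)
fit1 = kmeans(scaled, centers = 3L, iter.max = 15L)
set.seed(89L)
fit2 = kmeans(scaled, centers = 3L, iter.max = 15L)

identical(fit1$cluster, fit2$cluster)
```

Without scaling, columns measured on larger ranges would dominate the Euclidean distances that k-means uses.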

par.vals

[list]
Additional arguments handed over to the clustering algorithm selected by method.
Default is an empty list, par.vals = list()
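
One common way to hand a list of extra arguments to an algorithm in R is do.call. The sketch below shows that idiom with kmeans; whether the package forwards par.vals this way is an assumption, and base.args is a name introduced only for this example.

```r
# Extra arguments as they would be given in par.vals
par.vals = list(iter.max = 15L)

# Arguments the caller always supplies (assumed here for illustration)
base.args = list(x = scale(iris[, 1:4]), centers = 3L)

# Combine both lists and call kmeans with the merged argument list
set.seed(89L)
fit = do.call(kmeans, c(base.args, par.vals))

fit$iter  # number of iterations used, at most iter.max
```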

show.NA.msg

[logical(1)]
Logical indicating whether to show a message about missing values.
Default is FALSE.

...

Currently unused

Value

ClusterTask Object

Examples

library(AEDA)
# k-means task with the default cluster.cols; iter.max is passed on via par.vals
cluster.task = makeClusterTask(id = "iris", data = iris,
  method = "cluster.kmeans",
  random.seed = 89L, par.vals = list(iter.max = 15L))
# k-means task restricted to two named column combinations
cluster.task2 = makeClusterTask(id = "iris", data = iris,
  method = "cluster.kmeans", random.seed = 89L,
  cluster.cols = c("Sepal.Length" = "Petal.Length",
  "Sepal.Width" = "Petal.Width"))

ptl93/AEDA documentation built on May 7, 2019, 3:20 p.m.