new_node: Create A New Node for Split Data Frame

View source: R/utils.R

Description

This helper ensures that columns of the split data frame receive default values of the correct type when they are left unspecified. It helps reduce type errors, especially when working with dplyr, which is stricter about data types.
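To illustrate the kind of type error the description refers to, here is a minimal sketch, assuming the tibble and dplyr packages are installed. The column names and values are made up for illustration and are not taken from the package; the point is only that `dplyr::bind_rows()` refuses to stack rows whose columns have incompatible types, whereas typed placeholders such as `-99L` stack cleanly.

```r
library(tibble)
library(dplyr)

# Hypothetical rows: one uses a character placeholder for "cut" by mistake.
bad_row  <- tibble(number = 1L, cut = "none")
good_row <- tibble(number = 2L, cut = 5L)

# bind_rows() cannot combine a character column with an integer column,
# so the error handler is triggered.
result <- tryCatch(
  bind_rows(bad_row, good_row),
  error = function(e) "type error"
)

# With an integer placeholder the two rows combine without complaint.
typed_row <- tibble(number = 1L, cut = -99L)
stacked   <- bind_rows(typed_row, good_row)
```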

Usage

new_node(
  number,
  var,
  cut = -99L,
  n,
  inertia,
  bipartsplitrow = -99L,
  bipartsplitcol = -99L,
  inertiadel = 0,
  inertia_explained = -99,
  medoid,
  loc,
  split.order = -99L,
  alt = list(tibble::tibble(bipartsplitrow = numeric(), bipartsplitcol = numeric()))
)
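The signature above can be mimicked with a small stand-in to show how the defaults behave. This is a sketch under stated assumptions: `new_node_sketch` is a hypothetical name, its body is a guess at the simplest implementation consistent with the documented arguments, and the values passed in (including the `"<leaf>"` label) are illustrative. The actual `new_node()` in R/utils.R may differ.

```r
library(tibble)

# Hypothetical stand-in mirroring the documented signature; not the
# package's own code.
new_node_sketch <- function(number, var, cut = -99L, n, inertia,
                            bipartsplitrow = -99L, bipartsplitcol = -99L,
                            inertiadel = 0, inertia_explained = -99,
                            medoid, loc, split.order = -99L,
                            alt = list(tibble(bipartsplitrow = numeric(),
                                              bipartsplitcol = numeric()))) {
  # All arguments have length one, so the result is a one-row tibble;
  # `alt` becomes a list column holding a zero-row tibble by default.
  tibble(number = number, var = var, cut = cut, n = n, inertia = inertia,
         bipartsplitrow = bipartsplitrow, bipartsplitcol = bipartsplitcol,
         inertiadel = inertiadel, inertia_explained = inertia_explained,
         medoid = medoid, loc = loc, split.order = split.order, alt = alt)
}

# Supply only the arguments that have no default (illustrative values).
leaf <- new_node_sketch(number = 1L, var = "<leaf>", n = 50L,
                        inertia = 12.5, medoid = 7L, loc = 0)
```

Because the unspecified columns fall back to typed placeholders (`-99L` for integers, `-99` or `0` for doubles), rows built this way should share column types and can be stacked without coercion problems.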

Arguments

number

Row index of the data frame.

var

Whether it is a leaf, or the name of the next split variable.

cut

The splitting value: observations whose value of var is smaller than cut go to the left branch, and those with a greater value go to the right branch.
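The rule for cut can be sketched in base R as follows. The vector `x` and the cut value are made-up illustrations; the example deliberately avoids a value exactly equal to the cut, since the description does not say how ties are handled.

```r
# Hypothetical values of the split variable for five observations.
x <- c(2.1, 7.4, 3.3, 9.8, 5.0)
cut_value <- 6.2

# Observations below the cut go left, those above it go right.
left  <- which(x < cut_value)
right <- which(x > cut_value)
```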

n

Cluster size. Number of observations in that cluster.

inertia

Inertia value of the cluster at that node.

bipartsplitrow

Position of the next split row in the data set. That row belongs to the left (smaller) node.

bipartsplitcol

Position of the next split variable in the data set.

inertiadel

Ratio of the inertia of the cluster at that node to the inertia of the root.

inertia_explained

Percent inertia explained, as described in Chavent (2007).

medoid

Position of the data point regarded as the medoid of its cluster.

loc

y-coordinate of the splitting node to facilitate showing on the tree. See plot.MonoClust() for details.

split.order

Order of the splits, starting at 0 for the root and increasing with each subsequent split.

alt

Alternative cuts yielding the same reduction in inertia at that split, stored as a list holding a tibble with bipartsplitrow and bipartsplitcol columns.

Value

A tibble with a single row in which every variable, including unspecified ones, has the correct default data type.

References


vinhtantran/monoClust documentation built on March 12, 2021, 11:11 p.m.