skm_mls: skm_mls


Description

a selective k-means problem solver - wrapper over skm_mls_cpp

Usage

skm_mls(x, k = 1L, s_colname = "s", t_colname = "t", d_colname = "d",
  w_colname = NULL, s_ggrp = integer(0L), s_must = integer(0L),
  max_it = 100L, max_at = 100L, auto_create_ggrp = TRUE,
  extra_immaculatism = TRUE, extra_at = 10L)

Arguments

x

data.table with s - t - d(s, t): s<source> - t<target> - d<distance>, where s<source> and t<target> must be character and d<distance> must be numeric. be aware that d<distance> need not be euclidean or any true distance, and need not even be symmetric - d(s, t) can be unequal to d(t, s) - view d as a measure of the cost of assigning one to the other.

k

number of centers

s_colname

s<source>

t_colname

t<target>

d_colname

d<distance> - view d as the cost of assigning t into s. a problem with a different fixed cost on using each s as source can also be solved, either by modifying the input data or by building it into the algorithm - i prefer to modify the data so that the algorithm stays clean and clear - i will show a how-to in the vignette.

w_colname

w<weighting> - optional: when not NULL, the objective minimized is d = d * w, i.e., the weighted cost of assigning t into s.
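a minimal base R sketch of how a weighting can change the selected source. it only illustrates the d = d * w objective described above and does not call skm_mls; the toy data, the column name dw and the choice of weights that depend only on t are assumptions made for the example.

```r
# Toy long-format input with a weight column (weights here depend only on t).
x <- data.frame(
  s = c("a", "a", "b", "b"),
  t = c("t1", "t2", "t1", "t2"),
  d = c(1, 5, 3, 2),
  w = c(10, 1, 10, 1),
  stringsAsFactors = FALSE
)

# Weighted cost of assigning t into s: the row-wise product d * w.
x$dw <- x$d * x$w

# With k = 1, the cost of a source is the sum of its costs over all targets.
cost_unweighted <- tapply(x$d, x$s, sum)
cost_weighted <- tapply(x$dw, x$s, sum)

# The cheapest source under each objective.
best_unweighted <- names(which.min(cost_unweighted))
best_weighted <- names(which.min(cost_weighted))
```

here the unweighted objective prefers b (5 versus 6), while the heavy weight on t1 makes a the cheaper choice under the weighted objective (15 versus 32).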

s_ggrp

s_init is drawn by stratified sampling from s w.r.t. s_ggrp.

s_must

s that must be in the result, length <= k - 1: conditional optimizing.

max_it

max number of iterations that can be run for optimizing the result.

max_at

max number of attempts/repeats on running for the optimal.

auto_create_ggrp

boolean indicator of whether to auto create the group structure using the first letter of s when s_ggrp is integer(0).

extra_immaculatism

boolean indicator of whether to make extra runs for improving result consistency when multiple successive k are specified, e.g., k = c(9L, 10L).

extra_at

an integer specifying the number of extra runs when argument extra_immaculatism is TRUE.

Details

a selective k-means problem is defined as finding a subset of k rows from an m x n matrix such that the sum of the column minima over the selected rows is minimized.

skm_mls takes a data.table (data.frame) as input rather than a matrix: given a data.table of s - t - d(s, t) for all combinations of s and t, it chooses k of s that minimize sum(min(d(s, t) over selected k of s) over t).
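the objective above can be sketched in base R by exhaustive search over a toy input. this is only an illustration of the definition, not how skm_mls or skm_mls_cpp search for a solution; the data and the helper skm_objective are hypothetical, and enumerating every subset is only feasible for very small problems.

```r
# Toy long-format input: every (s, t) pair with its assignment cost d.
x <- data.frame(
  s = rep(c("a", "b", "c"), each = 4L),
  t = rep(c("t1", "t2", "t3", "t4"), times = 3L),
  d = c(1, 3, 6, 9,   # costs of serving t1..t4 from a
        5, 1, 2, 8,   # costs of serving t1..t4 from b
        9, 7, 3, 1),  # costs of serving t1..t4 from c
  stringsAsFactors = FALSE
)

# Reshape to an m x n cost matrix: rows are sources, columns are targets.
d_mat <- tapply(x$d, list(x$s, x$t), sum)

# Hypothetical helper: sum over targets of the cheapest selected source.
skm_objective <- function(d_mat, s_sel) {
  sum(apply(d_mat[s_sel, , drop = FALSE], 2L, min))
}

# Enumerate every subset of k sources and keep the cheapest one.
k <- 2L
cand <- combn(rownames(d_mat), k, simplify = FALSE)
cost <- vapply(cand, function(s_sel) skm_objective(d_mat, s_sel), numeric(1L))
best_s <- cand[[which.min(cost)]]
best_o <- min(cost)
```

for this input the subsets {a, b}, {a, c} and {b, c} cost 12, 8 and 9, so the search selects a and c with objective 8.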

Value

data.table

o - objective - based on d_colname

w - weighting - based on w_colname

k - k<k-list> - based on k - input

s - s<source> - based on s_colname

d - weighted average value of d_colname, weighted by w_colname, when s are selected.


skm documentation built on May 1, 2019, 10:10 p.m.