start_groups: Function to assign starting groups.
In clustra: Clustering Longitudinal Trajectories

start_groups

R Documentation

Description

Assigns starting groups in one of two ways: either a random assignment into k clusters of approximately equal size, or a FastMap-like algorithm that sequentially selects k distant ids from among those that have more than the median number of observations. In the second case, thin plate spline (TPS) fits to the selected ids are used as cluster centers for the starting group assignment. A user-supplied starting assignment is also possible.
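The random option can be pictured with a minimal base R sketch. The helper name `random_start` and its arguments are hypothetical and only illustrate the idea of approximately equal-size random groups; the package's own implementation may differ.

```r
# Hypothetical helper, not part of clustra.
random_start <- function(n_id, k) {
  # Recycle 1..k until there are n_id labels, so group sizes differ by at most one
  labels <- rep_len(seq_len(k), n_id)
  # Shuffle the labels; sample.int() gives a random permutation of positions
  labels[sample.int(n_id)]
}

set.seed(1)
g <- random_start(10, 3)  # one group label per id
table(g)                  # group sizes
```

With 10 ids and 3 groups the sizes are 4, 3 and 3, in a random arrangement over the ids.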

Usage

start_groups(k, data, starts, maxdf, conv, mccores = 1, verbose = FALSE)

Arguments

`k`	Number of clusters (groups).
`data`	A data.table with one response measurement per observation (row). Column names are id, time, response, group. Note that `id`s are assumed to be sequential, starting from 1, because this affects how group numbers are expanded to ids.
`starts`	Type of start groups generated. See `clustra`.
`maxdf`	Fitting parameters. See `trajectories`.
`conv`	Fitting parameters. See `trajectories`.
`mccores`	See `trajectories`.
`verbose`	Turn on more output for debugging. Values 0, 1, 2, 3 produce progressively more output. Values 2 and 3 also produce graphs during iterations, so use them carefully!
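The note on sequential `id`s under `data` can be illustrated with a small base R sketch. The objects `group_by_id` and `obs` are made-up examples, and a plain data.frame stands in for a data.table; this only shows why ids that run 1, 2, 3, ... make the expansion from per-id groups to per-observation groups a simple indexing step.

```r
# One group label per unique id, ordered by id (hypothetical values)
group_by_id <- c(2L, 1L, 2L)

# Long-format observations; ids are sequential starting from 1
obs <- data.frame(id   = c(1, 1, 2, 3, 3, 3),
                  time = c(0, 1, 0, 0, 1, 2))

# Because each id doubles as a position in group_by_id,
# indexing by id expands the group labels to every observation
obs$group <- group_by_id[obs$id]
```

If ids had gaps or did not start at 1, the same indexing would pick the wrong labels or return missing values.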

Value

An integer vector corresponding to unique ids, giving group number assignments.

For distant starts, each sequential selection takes the id that has the largest minimum distance from the smooth TPS fits (<= 5 deg) of the previous selections. The distance of an id to a single TPS fit is the median absolute error across the time points of that id. The distance of an id to a set of TPS fits is the minimum of the individual distances. The id picked at each step is the one that maximizes this minimum of medians.
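The max-min selection described above can be sketched in base R under simplifying assumptions. The helper `maximin_select` is hypothetical and not part of clustra: it takes a precomputed matrix `d`, where `d[i, j]` stands for the distance of id `i` to the smooth fit of id `j`, instead of fitting TPS curves, and the first selection is supplied by the caller because the text does not say how it is chosen.

```r
# Hypothetical helper, not part of clustra.
# d[i, j]: distance of id i to the smooth fit of candidate id j
# (in the text, the median absolute error over the time points of id i).
maximin_select <- function(d, k, first = 1L) {
  stopifnot(is.matrix(d), nrow(d) == ncol(d), k <= nrow(d))
  picked <- as.integer(first)
  while (length(picked) < k) {
    # Distance of every id to the current set: minimum over the picked fits
    to_set <- apply(d[, picked, drop = FALSE], 1, min)
    to_set[picked] <- -Inf  # never pick the same id twice
    # Take the id with the largest minimum distance
    picked <- c(picked, which.max(to_set))
  }
  picked
}

# Toy example: four ids at positions 0, 1, 5, 10 on a line
pos <- c(0, 1, 5, 10)
d <- abs(outer(pos, pos, "-"))
sel <- maximin_select(d, k = 3)  # selects ids 1, 4, 3 in that order

# Starting groups: each id joins its nearest selected center
group <- apply(d[, sel, drop = FALSE], 1, which.min)
```

In the toy example, id 4 is picked second because it is farthest from id 1, and id 3 is picked third because its minimum distance to ids 1 and 4 (5 in both cases) is larger than that of id 2.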

clustra documentation built on May 29, 2024, 4:25 a.m.