clustra: Cluster longitudinal trajectories over time

View source: R/trajectories.R

Description

The usual top-level function for clustering longitudinal trajectories. After initial setup, it calls trajectories to perform k-means clustering on a continuous response measured over time, where each cluster mean is a thin plate spline fit to all points in that cluster. See clustra_vignette.Rmd for usage examples.
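The idea of k-means with spline-valued cluster means can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the names toy, group_by_id, and loss, the basis size k = 8, and the use of mgcv::gam with squared-error loss are all choices made for this sketch.

```r
library(mgcv)  # recommended R package; provides gam() and thin plate smooths

set.seed(1)
# Hypothetical toy data: 20 ids, 15 observations each, two trajectory shapes
n_id <- 20
m_obs <- 15
toy <- data.frame(
  id   = rep(seq_len(n_id), each = m_obs),
  time = rep(seq(0, 1, length.out = m_obs), times = n_id)
)
toy$response <- ifelse(toy$id <= 10, sin(2 * pi * toy$time), 2 * toy$time) +
  rnorm(nrow(toy), sd = 0.2)

k <- 2
# Random, balanced initial cluster assignment, one entry per id
group_by_id <- sample(rep_len(seq_len(k), n_id))

for (iter in 1:10) {
  obs_group <- group_by_id[toy$id]  # expand id-level groups to observations
  # Fit step: one thin plate spline per cluster, using all points in it
  fits <- lapply(seq_len(k), function(g) {
    gam(response ~ s(time, bs = "tp", k = 8), data = toy[obs_group == g, ])
  })
  # Assign step: squared error of every observation to every cluster mean
  sq_err <- sapply(fits, function(f) {
    (toy$response - as.numeric(predict(f, newdata = toy)))^2
  })
  loss <- rowsum(sq_err, toy$id)  # rows: ids (sorted), columns: clusters
  new_group <- max.col(-loss, ties.method = "first")  # smallest loss wins
  changes <- sum(new_group != group_by_id)
  group_by_id <- new_group
  if (changes == 0) break  # stop when no id switches cluster
}
table(group_by_id)
```

The loop alternates between refitting the cluster means and reassigning each id to its closest mean, which is the same alternation as ordinary k-means with a smooth curve in place of a centroid.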

Usage

clustra(
  data,
  k,
  starts = c(1, 0),
  group = NULL,
  maxdf = 30,
  conv = c(10, 0),
  mccores = 1,
  verbose = FALSE
)

Arguments

data

A data frame or, preferably, a data.table with response measurements, one response per observation. Required variables are id, time, and response; other variables are ignored.
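As a small illustration of the expected long format, the following sketch builds a data frame with one row per observation. The values and the extra site column are hypothetical, chosen only to show the layout; the check on required column names is an assumption about how one might validate input before calling clustra, not something the package is known to require.

```r
# Hypothetical measurements: 3 subjects observed at irregular times
long <- data.frame(
  id       = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  time     = c(0, 30, 90, 0, 45, 0, 10, 60, 120),
  response = c(120, 118, 115, 140, 142, 131, 130, 127, 125),
  site     = "A"  # extra variable; per the description above it is ignored
)

# Verify that the three required variables are present
required <- c("id", "time", "response")
stopifnot(all(required %in% names(long)))

# Number of observations per id (ids need not have equal counts)
obs_per_id <- table(long$id)
obs_per_id
```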

k

Number of clusters

starts

A vector of length two. See start_groups.

group

A vector of initial cluster assignments for the unique ids in data. Normally this is NULL, and good starting assignments are provided by start_groups.

maxdf

Fitting parameters. See trajectories.

conv

Fitting parameters. See trajectories.

mccores

See trajectories.

verbose

Logical to turn on more output during fit iterations.

Value

A list returned by trajectories plus one more element ido, giving the original id numbers.

Examples

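# Fix the random seed so the simulated data are reproducible
set.seed(13)
# Simulate trajectory data; n_id gives the number of ids in each true group
data = gen_traj_data(n_id = c(50, 100), m_obs = 20, s_range = c(-365, -14),
                     e_range = c(0.5*365, 2*365))
# Cluster into k = 2 groups, printing progress during the fit iterations
cl = clustra(data, k = 2, maxdf = 20, conv = c(5, 0), verbose = TRUE)
# Compare observation counts per assigned cluster with the true groups
tabulate(data$group)
tabulate(data$true_group)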

clustra documentation built on Jan. 16, 2022, 9:06 a.m.