Performs k-means clustering on a continuous response measured over time, where each mean is defined by a thin plate spline fit to all points in a cluster. Typically, this function is called by clustra.
trajectories(
  data,
  k,
  group,
  maxdf,
  conv = c(10, 0),
  mccores = 1,
  verbose = FALSE
)

data 
Data table or data frame with response measurements, one per observation.
Column names are 
k 
Number of clusters (groups) 
group 
Vector of initial group numbers corresponding to 
maxdf 
Integer. Basis dimension of smooth term. See 
conv 
A vector of length two, 
mccores 
Integer number of cores to use by 
verbose 
Logical, whether to produce debug output. 
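As a hedged illustration of the inputs, the sketch below builds a small long-format data frame with one row per observation and a starting group label for each id, then collects the arguments under the names shown in the usage. The column names id, time, and response, the per-id length of group, and the value chosen for maxdf are assumptions made for this example only. The simulated data is far too small for a meaningful fit.

```r
set.seed(42)

n_id  <- 6
n_obs <- c(4, 5, 3, 6, 4, 5)   # observations per id may differ

# Hypothetical long-format input: one row per observation.
# The column names id, time, and response are assumptions.
dat <- data.frame(id = rep(seq_len(n_id), times = n_obs))
dat$time     <- unlist(lapply(n_obs, function(n) sort(runif(n, 0, 100))))
dat$response <- rnorm(nrow(dat))

k <- 2
# One starting label per id (assumed), drawn at random from 1..k.
group <- sample(k, n_id, replace = TRUE)

# Arguments named as in the usage; defaults written out explicitly.
args <- list(data = dat, k = k, group = group, maxdf = 3,
             conv = c(10, 0), mccores = 1, verbose = FALSE)

# With the clustra package installed and a realistically sized data set,
# the call would take the form:
# fit <- do.call(clustra::trajectories, args)
```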
A list with components
deviance
The final deviance in each cluster, added across clusters.
group
Integer vector of group assignments corresponding to unique ids.
loss
Numeric matrix with rows corresponding to unique ids and one column for each cluster. Each entry is the mean squared loss for the data in the id relative to the cluster model.
k
An integer giving the requested number of clusters.
k_cl
An integer giving the converged number of clusters. Can be smaller than k when some clusters become too small for the degrees of freedom during convergence.
data_group
An integer vector giving the group assignment expanded to all time points of each id.
tps
A list with k_cl elements, each an object returned by the mgcv::bam fit of a cluster thin plate spline model.
iterations
An integer giving the number of iterations taken.
counts
An integer vector giving the number of ids in each cluster.
counts_df
An integer vector giving the total number of observations in each cluster (the sum of the number of observations for ids belonging to the cluster).
changes
An integer giving the number of ids that changed clusters in the last iteration. This is zero if converged.
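To make the components above more concrete, the following sketch runs a single refit-and-reassign pass on simulated data. It fits one spline per cluster with mgcv::bam, builds the per-id loss matrix, and derives quantities analogous to group, data_group, counts, counts_df, changes, and deviance. It is not the package's implementation. The column names id, time, and response, the simulated trajectories, and all variable names are assumptions made for the example.

```r
library(mgcv)  # provides bam() and s()

set.seed(1)
n_id <- 40; n_obs <- 15; k <- 2; maxdf <- 5

# Hypothetical long-format data; column names are assumptions.
dat <- data.frame(
  id   = rep(seq_len(n_id), each = n_obs),
  time = rep(seq(0, 1, length.out = n_obs), times = n_id)
)
# Two hidden trajectory shapes: rising for odd ids, falling for even ids.
shape <- rep(rep(1:2, length.out = n_id), each = n_obs)
dat$response <- ifelse(shape == 1, 2 * dat$time, 2 - 2 * dat$time) +
  rnorm(nrow(dat), sd = 0.3)

# Random initial group for each id, expanded to every observation.
group      <- sample(k, n_id, replace = TRUE)
data_group <- group[dat$id]

# Fit one thin plate spline (the default basis of s()) per cluster.
tps <- lapply(seq_len(k), function(g) {
  bam(response ~ s(time, k = maxdf), data = dat[data_group == g, ])
})

# Loss matrix: mean squared error of each id's data under each cluster model.
loss <- vapply(tps, function(fit) {
  sq_err <- (dat$response - as.numeric(predict(fit, newdata = dat)))^2
  as.numeric(tapply(sq_err, dat$id, mean))
}, numeric(n_id))

# Reassign each id to the cluster with the smallest loss.
new_group <- apply(loss, 1, which.min)

changes        <- sum(new_group != group)             # ids that moved
counts         <- tabulate(new_group, nbins = k)      # ids per cluster
counts_df      <- tabulate(new_group[dat$id], nbins = k)  # observations per cluster
total_deviance <- sum(vapply(tps, deviance, numeric(1)))  # summed over clusters
```

Repeating the fit and reassignment steps until changes reaches zero, or an iteration limit is hit, gives the k-means style loop that the description refers to.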
George Ostrouchov and David Gagnon
