'kmlCov' re-launches the algorithm implemented in glmClust, which clusters longitudinal data (trajectories), several times with different starting conditions and various numbers of clusters.
formula |
A symbolic description of the model. In the parametric case one writes, for example, 'y ~ clust(time+time2) + pop(sex)': here 'time' and 'time2' have a different effect in each cluster, while the 'sex' effect is the same for all clusters. In the non-parametric case only one covariate is allowed. |
data |
A [data.frame] in long format with no missing values: each line corresponds to one measurement of the observed phenomenon, and one individual may have multiple measurements (lines), identified by an identity column. In the non-parametric case every individual must have a measurement at each of the fixed times. |
nClust |
The number of clusters, at least 2 and at most 26. |
nRedraw |
The number of times the algorithm is re-run with different starting conditions. |
ident |
The name of the identity column. |
timeVar |
Specify the column name of the time variable. |
family |
A description of the error distribution and link function to be used in the model, by default 'gaussian'. This can be a character string naming a family function, a family function or the result of a call to a family function. (See 'family' for details of family functions). |
effectVar |
An effect, which may or may not be a cluster-level effect. |
weights |
Vector of 'prior weights' to be used in the fitting process, by default the weights are equal to one. |
timeParametric |
By default [TRUE], i.e. the model is parametric in time. If [FALSE], only one covariate is allowed in the formula and the algorithm used is k-means. |
separateSampling |
By default [TRUE], which means that the cluster proportions are assumed equal in the classification step; the log-likelihood maximised at each step of the algorithm is then \sum_{k=1}^{K} \sum_{y_i \in P_k} \log(f(y_i, \theta_k)). Otherwise the cluster proportions are taken into account and the log-likelihood is \sum_{k=1}^{K} \sum_{y_i \in P_k} \log(\lambda_k f(y_i, \theta_k)). |
max_itr |
The maximum number of iterations, fixed at 100. |
verbose |
Print the output in the console. |
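To illustrate the requirements on 'data' described above, here is a minimal base R sketch that builds a small long-format data.frame and checks it for missing values and for a balanced design (every individual measured at every time, as the non-parametric case requires). The column names 'id', 'time', 'time2' and 'y' and the simulated values are hypothetical; nothing here calls the package itself.

```r
# Hypothetical long-format data: 3 individuals, 4 measurement times each.
set.seed(1)
long_df <- data.frame(
  id   = rep(1:3, each = 4),   # identity column (one value per individual)
  time = rep(0:3, times = 3)   # time variable
)
long_df$time2 <- long_df$time^2  # quadratic term, as in 'clust(time+time2)'
long_df$y <- 1 + 0.5 * long_df$time + rnorm(nrow(long_df), sd = 0.1)

# The data must not contain missing values.
has_no_na <- !anyNA(long_df)

# Balanced-design check for the non-parametric case:
# each individual must appear exactly once at each time point.
counts <- table(long_df$id, long_df$time)
is_balanced <- all(counts == 1)
```

With such a table, 'ident' would be 'id' and 'timeVar' would be 'time'.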
The purpose of kmlCov, like that of glmClust, is to cluster longitudinal data; in addition, it automates the procedure of re-launching the algorithm from different starting conditions, the number of runs being specified by nRedraw.
The algorithm depends greatly on the starting conditions
(the initial assignment of the trajectories/individuals to clusters), so
it is recommended to run the algorithm multiple times in
order to explore the space of solutions.
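The effect of the starting conditions can be sketched with base R's kmeans() as a stand-in, since k-means is the algorithm named for the non-parametric case. This is not the package's code: the trajectories are simulated, 'n_redraw' is a hypothetical counterpart of nRedraw, and the criterion here is the total within-cluster sum of squares (to be minimised), not the classification log-likelihood.

```r
set.seed(42)
n_times <- 5

# 20 simulated trajectories from two hypothetical groups (flat vs. increasing),
# stored in wide form: one row per individual, one column per time point.
traj <- rbind(
  matrix(rnorm(10 * n_times, mean = 0, sd = 0.3), nrow = 10),
  t(replicate(10, seq(0, 2, length.out = n_times) + rnorm(n_times, sd = 0.3)))
)

# Re-run the clustering from several random starting conditions.
n_redraw <- 5
fits <- lapply(seq_len(n_redraw), function(i) {
  set.seed(i)                          # a different random start for each run
  kmeans(traj, centers = 2, nstart = 1)
})

# Compare the runs and keep the one with the best (lowest) criterion.
crit <- vapply(fits, function(f) f$tot.withinss, numeric(1))
best <- fits[[which.min(crit)]]
```

Runs that end with different values of 'crit' correspond to different local optima, which is why several redraws are worth the extra computation.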
'kmlCov' returns a list of lists of GlmCLuster objects. The
partitions are compared using the classification log-likelihood
as criterion: the higher the log-likelihood, the better the
partition.
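As a rough illustration of comparing partitions by classification log-likelihood, the base R sketch below evaluates the criterion for two candidate partitions of simulated scalar observations. It assumes a Gaussian model with one mean per cluster and a common standard deviation; the helper 'class_loglik' and its 'with_proportions' switch are hypothetical names that mirror the two formulas given for separateSampling, and the sketch is not the package's implementation.

```r
# Classification log-likelihood of a partition under a simple Gaussian model
# (assumption: one mean per cluster, common standard deviation).
class_loglik <- function(y, cluster, with_proportions = FALSE) {
  key   <- as.character(cluster)
  mu    <- tapply(y, cluster, mean)          # theta_k: cluster means
  mu_i  <- as.numeric(mu[key])               # mean of each observation's cluster
  sigma <- sqrt(mean((y - mu_i)^2))          # pooled maximum-likelihood estimate
  ll <- sum(dnorm(y, mean = mu_i, sd = sigma, log = TRUE))
  if (with_proportions) {
    lambda <- table(cluster) / length(cluster)   # lambda_k: cluster proportions
    ll <- ll + sum(log(as.numeric(lambda[key])))
  }
  ll
}

set.seed(7)
y <- c(rnorm(15, mean = 0), rnorm(15, mean = 4))
good_partition <- rep(1:2, each = 15)    # matches the simulated groups
bad_partition  <- rep(1:2, times = 15)   # alternates, ignoring the groups

ll_good <- class_loglik(y, good_partition)
ll_bad  <- class_loglik(y, bad_partition)
# The partition with the higher log-likelihood is preferred.
```

In this toy setting the partition that follows the simulated groups obtains the higher value, which is the sense in which higher classification log-likelihoods indicate better partitions.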
An object of class KmlCovList.
glmClust
which_best
2 Clusters : Running ..End
3 Clusters : Running ..End