# Clustering longitudinal data from different starting conditions

### Description

`kmlCov` re-runs the algorithm implemented in glmClust for clustering longitudinal data (trajectories) several times, with different starting conditions and various numbers of clusters.

### Usage


### Arguments

`formula` |
A symbolic description of the model. In the parametric case we write, for example, 'y ~ clust(time+time2) + pop(sex)'; here 'time' and 'time2' have a different effect in each cluster, while the 'sex' effect is the same for all clusters. In the non-parametric case only one covariate is allowed. |

`data` |
A [data.frame] in long format (no missing values), which means that each row corresponds to one measurement of the observed phenomenon, and one individual may have multiple measurements (rows) identified by an identity column. In the non-parametric case all patients must have measurements at all of the fixed times. |

`nClust` |
The number of clusters, at least 2 and at most 26. |

`nRedraw` |
The number of times the algorithm is re-run with different starting conditions. |

`ident` |
The name of the identity column. |

`timeVar` |
Specify the column name of the time variable. |

`family` |
A description of the error distribution and link function to be used in the model, by default 'gaussian'. This can be a character string naming a family function, a family function or the result of a call to a family function. (See 'family' for details of family functions). |

`effectVar` |
An effect, which may or may not be a cluster-level effect. |

`weights` |
Vector of 'prior weights' to be used in the fitting process; by default all weights are equal to one. |

`timeParametric` |
By default [TRUE], meaning the model is parametric in time. If [FALSE], only one covariate is allowed in the formula and the algorithm used is k-means. |

`separateSampling` |
By default [TRUE], which means that the proportions of the clusters are assumed equal in the classification step; the log-likelihood maximised at each step of the algorithm is |

`max_itr` |
The maximum number of iterations, fixed at 100. |

`verbose` |
Print the output in the console. |

### Details

The purpose of `kmlCov` is to cluster longitudinal data, as glmClust does, and to automate the procedure of re-launching the algorithm from different starting conditions by specifying `nRedraw`.

The algorithm depends greatly on the starting conditions
(the initial assignment of the trajectories/individuals to clusters), so
it is recommended to run the algorithm multiple times in
order to explore the space of solutions.
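
The restart strategy described above can be illustrated with a minimal base R sketch. This is not the `kmlCov` implementation: `stats::kmeans` stands in for glmClust, the simulated trajectories and the names `traj`, `n_clust`, `n_redraw`, `runs` and `best_per_k` are hypothetical, and the negative within-cluster sum of squares is used as a stand-in criterion (higher is better).

```r
# Illustration only: base R k-means stands in for glmClust.
set.seed(1)
n_per_group <- 30
times <- 0:9

# Two simulated groups of trajectories (wide format: one row per individual)
flat <- matrix(rnorm(n_per_group * length(times), sd = 0.5),
               nrow = n_per_group)
rising <- matrix(rnorm(n_per_group * length(times), sd = 0.5),
                 nrow = n_per_group) +
  matrix(times, nrow = n_per_group, ncol = length(times), byrow = TRUE)
traj <- rbind(flat, rising)

n_clust  <- 2:4  # candidate numbers of clusters (hypothetical values)
n_redraw <- 5    # restarts per number of clusters (hypothetical value)

runs <- list()
for (k in n_clust) {
  for (r in seq_len(n_redraw)) {
    set.seed(100 * k + r)  # a different starting condition for each run
    fit <- kmeans(traj, centers = k, nstart = 1)
    runs[[length(runs) + 1]] <- list(
      k = k,
      redraw = r,
      criterion = -fit$tot.withinss,  # stand-in criterion, higher is better
      cluster = fit$cluster
    )
  }
}

# For each number of clusters, keep the restart with the highest criterion
best_per_k <- lapply(n_clust, function(k) {
  same_k <- Filter(function(run) run$k == k, runs)
  scores <- vapply(same_k, function(run) run$criterion, numeric(1))
  same_k[[which.max(scores)]]
})
```

Each element of `best_per_k` holds the best partition found for one number of clusters, which mirrors the idea of exploring several starting conditions before comparing solutions.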

`kmlCov` returns a list of lists of `GlmCluster` objects. The
partitions are compared using the
**classification log-likelihood** as the criterion: the higher the value, the
better the partition.
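
For orientation, a common textbook form of the classification log-likelihood for a partition $P_1, \dots, P_K$ is sketched below. The notation is an assumption made for illustration and is not taken from the package documentation: $f$ denotes the density of an observation $y_i$, $\theta_k$ the parameters of cluster $k$, and $\lambda_k$ its proportion. The first expression corresponds to equal cluster proportions; the second takes the proportions into account.

```latex
\[
\mathrm{CL}_{\text{equal}} = \sum_{k=1}^{K} \sum_{i \in P_k} \log f(y_i;\theta_k),
\qquad
\mathrm{CL}_{\text{prop}} = \sum_{k=1}^{K} \sum_{i \in P_k} \log\bigl(\lambda_k\, f(y_i;\theta_k)\bigr)
\]
```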

### Value

An object of class `KmlCovList`.

### See Also

glmClust

which_best

### Examples

