regclustcurves: Clustering multiple regression curves
In clustcurv: Determining Groups in Multiples Curves

Description

Function for grouping regression curves based on the k-means or k-medians algorithm. It returns the number of groups and the assignment of each curve to a group.

Usage

regclustcurves(
  y,
  x,
  z,
  kvector = NULL,
  kbin = 50,
  h = -1,
  nboot = 100,
  algorithm = "kmeans",
  alpha = 0.05,
  cluster = FALSE,
  ncores = NULL,
  seed = NULL,
  multiple = FALSE,
  multiple.method = "holm"
)

Arguments

`y`	Response variable.
`x`	Independent variable (covariate).
`z`	Categorical variable indicating the population to which each observation belongs.
`kvector`	A vector specifying the numbers of groups of curves to be checked.
`kbin`	Size of the grid over which the regression curves are to be estimated.
`h`	The kernel bandwidth smoothing parameter.
`nboot`	Number of bootstrap repeats.
`algorithm`	A character string specifying which clustering algorithm is used, i.e., k-means (`"kmeans"`) or k-medians (`"kmedians"`).
`alpha`	Significance level of the testing procedure. Defaults to 0.05.
`cluster`	A logical value. If `TRUE`, the testing procedure is parallelized; the default shown in Usage is `FALSE`. Parallelization does not always pay off: in some cases (e.g., a low number of bootstrap repetitions) serial computation is faster, because R needs time to distribute tasks across the processors and to bind the results together afterwards. If the time for distributing and gathering the pieces is greater than the time needed for single-thread computing, parallelizing is not worthwhile.
`ncores`	An integer value specifying the number of cores to be used in the parallelized procedure. If `NULL` (default), the number of cores used is the number of cores of the machine minus one.
`seed`	Seed to be used in the procedure.
`multiple`	A logical value. If `TRUE` (not the default), the resulting p-values are adjusted using one of several methods for multiple comparisons.
`multiple.method`	Correction method. See Details.
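
The default described for `ncores` can be sketched with the base `parallel` package. This is only an illustration of the "number of cores minus one" rule; whether the package computes it this way internally is an assumption, and the variable names are hypothetical.

```r
library(parallel)

# Number of cores reported by the machine; detectCores() can return NA
# on platforms where the count cannot be determined.
n_avail <- detectCores()

# "Cores minus one", never dropping below a single core (assumed fallback).
ncores <- if (is.na(n_avail)) 1L else max(1L, n_avail - 1L)
ncores
```

With only a handful of bootstrap repetitions, a value like this may not help at all, which is why `cluster = FALSE` can be the faster choice.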

Details

The adjustment methods include the Bonferroni correction (`"bonferroni"`), in which the p-values are multiplied by the number of comparisons. Less conservative corrections are also available: Holm (1979) (`"holm"`), Hochberg (1988) (`"hochberg"`), Hommel (1988) (`"hommel"`), Benjamini & Hochberg (1995) (`"BH"` or its alias `"fdr"`), and Benjamini & Yekutieli (2001) (`"BY"`). A pass-through option (`"none"`) is also included.
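
The method names above match those accepted by `stats::p.adjust()` in base R, so the effect of each correction can be previewed on its own. The p-values below are made up for illustration, and it is an assumption (not stated on this page) that the package relies on `p.adjust()` internally.

```r
# Hypothetical raw p-values from four tests.
pvals <- c(0.010, 0.020, 0.030, 0.040)

# Method names as listed in Details ("fdr" is an alias for "BH").
methods <- c("none", "bonferroni", "holm", "hochberg", "hommel", "BH", "BY")

# One column of adjusted p-values per method.
adjusted <- sapply(methods, function(m) p.adjust(pvals, method = m))
round(adjusted, 3)
```

Comparing the `"bonferroni"` column with the others shows how much less conservative the alternative corrections are for the same input.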

Value

A list containing the following items:

`table`	A data frame containing the null hypothesis tested, the values of the test statistic and the obtained p-values.
`levels`	Original levels of the variable `z`.
`cluster`	A vector of integers (from 1:k) indicating the cluster to which each curve is allocated.
`centers`	An object containing the centroids (mean of the curves pertaining to the same group).
`curves`	An object containing the fitted curves for each population.
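
To make the meaning of `cluster` and `centers` concrete, here is a minimal base-R sketch that groups four made-up curves with `stats::kmeans()` and recomputes each centroid as the mean of the curves in its group. The curves, the grid, and the one-row-per-population layout are assumptions for illustration only; they do not reproduce the package's estimation or testing procedure, and the layout of the returned `curves` object may differ.

```r
set.seed(1)

# Hypothetical curves evaluated on a common grid (one row per population).
grid <- seq(0, 1, length.out = 50)
curves <- rbind(
  A = sin(2 * pi * grid),
  B = sin(2 * pi * grid) + 0.05,
  C = grid^2 + 2,
  D = grid^2 + 2.05
)

# Plain k-means with k = 2 groups on the discretized curves.
fit <- kmeans(curves, centers = 2, nstart = 10)
fit$cluster  # integer labels in 1:k, one per curve

# Centroid of each group: pointwise mean of the curves assigned to it.
centers_manual <- t(sapply(sort(unique(fit$cluster)), function(k) {
  colMeans(curves[fit$cluster == k, , drop = FALSE])
}))
```

Here `centers_manual` agrees with `fit$centers`, which is the sense in which a centroid is the mean of the curves belonging to the same group.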

Author(s)

Nora M. Villanueva and Marta Sestelo.

Examples

library(clustcurv)

# Regression framework
res <- regclustcurves(y = barnacle5$DW, x = barnacle5$RC, z = barnacle5$F,
                      algorithm = 'kmeans', nboot = 2, cluster = TRUE, ncores = 2)

clustcurv documentation built on Dec. 6, 2025, 5:10 p.m.