get_thresholds: Split surrogate into 3 ordered categories based on the number...

View source: R/get_thresholds.R

get_thresholds    R Documentation

Split surrogate into 3 ordered categories based on the number of feature occurrences

Description

Either fits a 3-component Gaussian mixture to determine cutpoints for the surrogate variable, or uses the most extreme quantiles.

Usage

get_thresholds(x, method = c("quantiles", "GaussianMixture"), p = 0.1,
  ...)

Arguments

x

a vector of feature occurrence counts to be split into 3 components

method

a character string indicating which splitting method is used: either "GaussianMixture" or "quantiles". Default is "quantiles".

p

the probability of extremes used in the quantile splitting method (ignored if method is "GaussianMixture"). Default is 0.1.

...

further arguments, not used here

Value

a list with the following components:

  • thresholds: a list containing the lower threshold lowthres, which marks the boundary between class 0 (the fewest occurrences) and class 0.5, and the upper threshold upthres, which marks the boundary between class 0.5 and class 1 (the most occurrences)

  • surrogate: a vector containing the class assigned to each observation, as an ordered factor whose 3 levels are ordered by the number of occurrences: 0 < 0.5 < 1
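
To make the quantile method and the shape of the return value concrete, here is a minimal sketch in base R. It is not the package's implementation: split_by_quantiles is a hypothetical helper written for illustration, and it assumes that the cutpoints are the p and 1 - p quantiles of x and that observations at or below lowthres fall in class 0 while observations at or above upthres fall in class 1. The documentation above does not say whether the boundaries are inclusive.

```r
# Hypothetical illustration of the quantile splitting method.
# Assumptions (not confirmed by the package documentation):
#   - cutpoints are the p and 1 - p quantiles of x
#   - both boundaries are inclusive
split_by_quantiles <- function(x, p = 0.1) {
  stopifnot(is.numeric(x), length(x) > 0, p > 0, p < 0.5)

  # Lower and upper cutpoints from the extreme quantiles
  lowthres <- stats::quantile(x, probs = p, names = FALSE)
  upthres <- stats::quantile(x, probs = 1 - p, names = FALSE)

  # Start everyone in the middle class, then assign the extremes
  surrogate <- rep(0.5, length(x))
  surrogate[x <= lowthres] <- 0
  surrogate[x >= upthres] <- 1

  list(
    thresholds = list(lowthres = lowthres, upthres = upthres),
    surrogate = factor(surrogate, levels = c(0, 0.5, 1), ordered = TRUE)
  )
}

# Made-up occurrence counts, for illustration only
counts <- c(0, 1, 1, 2, 3, 3, 4, 5, 8, 12)
res <- split_by_quantiles(counts, p = 0.1)
print(res$thresholds)
print(table(res$surrogate))
```

With a small p, most observations stay in the middle class 0.5 and only the extremes are assigned to 0 or 1. Increasing p widens the two outer classes.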


borishejblum/phenotypr documentation built on May 2, 2022, 11:04 p.m.