splitMedian: Discretise continuous data in multiple granularities


View source: R/ecpc.R

Description

Discretise continuous co-data by making groups of covariates of various sizes. The first group contains all covariates. Each group is then recursively split in two at the median co-data value, until a user-specified minimum group size is reached. The resulting groups are used for adaptive discretisation of continuous co-data.
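The recursive splitting described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the function name median_split_sketch, its argument names, the stopping rule and the choice to send values equal to the median to the lower group are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of ecpc: recursive median split in base R.
median_split_sketch <- function(values, index = seq_along(values), min_size = 5) {
  groups <- list(index)            # keep the current group itself
  med <- median(values)
  lower <- values <= med           # assumption: ties go to the lower group
  # assumption: stop when either half would be smaller than min_size
  if (sum(lower) < min_size || sum(!lower) < min_size) {
    return(groups)
  }
  # otherwise keep this group and recurse into both halves
  c(groups,
    median_split_sketch(values[lower], index[lower], min_size),
    median_split_sketch(values[!lower], index[!lower], min_size))
}

toy <- seq(0, 1, length.out = 20)  # toy continuous co-data
sketch_groups <- median_split_sketch(toy, min_size = 5)
lengths(sketch_groups)             # group sizes, coarsest group first
```

In this sketch the first element holds all 20 covariates, followed by the groups obtained from the lower half and then those from the upper half.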

Usage

splitMedian(values, index=NULL, depth=NULL, minGroupSize = 50, first = TRUE, 
  split = c("both","lower","higher"))

Arguments

values

Vector with the continuous co-data values to be discretised.

index

Index of the covariates corresponding to the values supplied. Useful if part of the continuous co-data is missing and only the non-missing part should be discretised.

depth

Optional. If given, the returned discretisation has 'depth' levels of granularity.

minGroupSize

Minimum group size that each group of covariates should have.

split

"both", "lower" or "higher": should both split groups of covariates be further split, or only the group of covariates that corresponds to the lower or higher continuous co-data group?

first

Helper variable used internally by the recursion; do not change.
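To illustrate why the index argument is useful when part of the co-data is missing, the following base-R sketch shows how positions within the non-missing subset map back to covariate positions. The toy values and all variable names are illustrative assumptions; the sketch does not call splitMedian.

```r
# Toy co-data with missing entries (illustrative values only).
codata <- c(0.8, NA, 0.1, 0.5, NA, 0.3)
observed <- which(!is.na(codata))  # covariate positions that have co-data
vals <- codata[observed]           # the non-missing values

# A split on the subset yields positions within `vals` ...
lower_local <- which(vals <= median(vals))
# ... and indexing into `observed` maps them back to covariate positions.
lower_covariates <- observed[lower_local]
lower_covariates
```

Without such a mapping, the groups would refer to positions in the subset rather than to the original covariates.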

Value

A list of groups of covariates, which may be used as a group set in ecpc.
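As a rough sketch of how such a list of groups can be inspected, the base-R snippet below uses a hand-written stand-in for a group set, assuming each element is a vector of covariate indices. The object groupset_toy is hypothetical and was not produced by splitMedian.

```r
# Hand-written stand-in for a group set: a list of covariate index vectors.
groupset_toy <- list(1:8, 1:4, 5:8, 1:2, 3:4)

group_sizes <- lengths(groupset_toy)  # number of covariates per group
# number of groups that each of the 8 covariates belongs to
membership <- tabulate(unlist(groupset_toy), nbins = 8)

group_sizes
membership
```

Because the groups are nested, a covariate typically belongs to several groups of different granularity.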

See Also

Use obtainHierarchy to obtain a group set on group level defining the hierarchy for adaptive discretisation of continuous co-data.

Examples

library(ecpc)

cont.codata <- seq(0, 1, length.out = 20)  # continuous co-data

# full tree with minimum group size 5
groupset1 <- splitMedian(values = cont.codata, minGroupSize = 5)
# only split the lower continuous co-data group further
groupset2 <- splitMedian(values = cont.codata, split = "lower", minGroupSize = 5)

# discretise only part of the continuous co-data
part <- sample(seq_along(cont.codata), 15)
cont.codata[-part] <- NaN  # suppose the rest is missing
# make group set of the non-missing values
groupset3 <- splitMedian(values = cont.codata[part], index = part, minGroupSize = 5)
# add a group holding the covariates with missing co-data
groupset3 <- c(groupset3, list(which(is.nan(cont.codata))))

ecpc documentation built on May 3, 2021, 9:08 a.m.