dgp_twoclass: Data-Generating Function for Two-Class Problem

dgp_twoclass    R Documentation

Data-Generating Function for Two-Class Problem

Description

Generates artificial data sets for a classification problem with two response classes, denoted as "A" and "B".

Usage

  dgp_twoclass(n = 100, p = 4, noise = 16, rho = 0, 
    b0 = 0, b = rep(1, p), fx = identity)

Arguments

n

integer. Number of observations. The default is 100.

p

integer. Number of signal predictors. The default is 4.

noise

integer. Number of noise predictors. The default is 16.

rho

numeric value between -1 and 1 specifying the correlation between the signal predictors. The correlations follow a Toeplitz structure: the correlation between two signal predictors is rho^k, where k is an integer determined by that structure. The default is 0 (no correlation between predictors).
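
As an illustration of the Toeplitz structure described for rho, the following minimal sketch builds such a correlation matrix in base R. It assumes that k is the distance between predictor positions (k = |i - j|); this is an assumption made for illustration, not a statement about the package internals. The names p, rho and sigma are local to the example.

```r
# Illustrative sketch only: a Toeplitz correlation matrix for p signal predictors.
# Assumption: the exponent k is the lag |i - j| between predictors i and j.
p   <- 4
rho <- 0.5

# First row is rho^0, rho^1, ..., rho^(p - 1); toeplitz() fills in the rest.
sigma <- toeplitz(rho^(0:(p - 1)))
print(sigma)

# With rho = 0 the matrix reduces to the identity (no correlation),
# because 0^0 evaluates to 1 in R.
print(toeplitz(0^(0:(p - 1))))
```

Under this assumption, neighbouring predictors are correlated with rho, predictors two positions apart with rho^2, and so on.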

b0

numeric value. Baseline probability for class "B" on the logit scale. The default is 0.

b

numeric vector. Slope parameters for the predictors on the logit scale. The default is rep(1, p), i.e., a slope of 1 for each signal predictor.
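
To make the phrase "on the logit scale" concrete, the sketch below converts b0 and b into a probability for class "B" with the base R function plogis(). The predictor values in x are hypothetical, and the additive form b0 + sum(b * x) is an assumed standard logistic model used only for illustration.

```r
# Illustrative sketch only: mapping logit-scale parameters to a probability.
b0 <- 0            # baseline on the logit scale (documented default)
b  <- rep(1, 4)    # slopes for p = 4 signal predictors (documented default)

# Baseline probability of class "B" when all predictors are zero.
p_baseline <- plogis(b0)
print(p_baseline)  # 0.5 for b0 = 0

# Hypothetical predictor values for a single observation.
x   <- c(0.5, -1, 2, 0)
eta <- b0 + sum(b * x)   # assumed linear predictor on the logit scale
p_B <- plogis(eta)       # probability of class "B" for this observation
print(p_B)
```

A b0 of 0 therefore corresponds to equal baseline odds for the two classes; positive values of b0 shift the baseline towards class "B".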

fx

a function that is used to transform the predictors. The default is identity (equivalent to no transformation).

Value

A data.frame including a column named class, a factor with the two levels "A" and "B". All other columns contain the predictor variables (signal predictors followed by noise predictors) and are named "x1", "x2", etc.

See Also

stability

Examples

dgp_twoclass(n = 200, p = 6, noise = 4)
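
The call above can be extended to inspect the returned data set. The following sketch assumes the stablelearner package is installed and skips the call otherwise; the object name dat and the seed are arbitrary choices for the example.

```r
# Illustrative sketch only: generate a data set and inspect its structure.
if (requireNamespace("stablelearner", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed so the simulated data are reproducible
  dat <- stablelearner::dgp_twoclass(n = 200, p = 6, noise = 4)

  # Column types and names: the class factor plus predictors "x1", "x2", ...
  str(dat)

  # Number of observations simulated in each response class.
  print(table(dat$class))
}
```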

stablelearner documentation built on Oct. 23, 2025, 3 a.m.