gen_bin_data: generate the data used for the model experiment


View source: R/gen_bin_data.R

Description

gen_bin_data generates the data used for the model experiment.

Usage

gen_bin_data(beta, N, nclass, seed)

Arguments

beta

A numeric vector of the true coefficients used to generate the synthesized data.

N

An integer specifying the number of synthesized data points to generate.

nclass

An integer specifying how many clusters the original data are grouped into.

seed

The seed for the random number generator.

Details

The function gen_bin_data generates N data points. The first column of the design matrix is 1, the second column follows a normal distribution with mean 1 and variance 1, and the remaining columns follow a normal distribution with mean 0 and variance 1. The points are then clustered into classes to reduce the computational cost. The number of classes is set by the parameter nclass.
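The design described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helper name sketch_design is hypothetical, the logistic link used to draw the 0/1 response is an assumption, and k-means is only a stand-in for whatever clustering method gen_bin_data actually applies.

```r
# Hypothetical helper, not part of seqest: mimics the design described above.
sketch_design <- function(beta, N, nclass, seed) {
  set.seed(seed)
  p <- length(beta)                       # number of columns, intercept included
  stopifnot(p >= 3, nclass <= N)
  X <- cbind(
    1,                                    # column 1: constant 1 (intercept)
    rnorm(N, mean = 1, sd = 1),           # column 2: normal, mean 1, variance 1
    matrix(rnorm(N * (p - 2)), nrow = N)  # remaining columns: mean 0, variance 1
  )
  # Assumption: a logistic model links X and beta to the binary response.
  prob <- plogis(drop(X %*% beta))
  y <- rbinom(N, size = 1, prob = prob)
  # Assumption: k-means on the non-constant columns stands in for the
  # package's clustering step.
  cluster <- kmeans(X[, -1, drop = FALSE], centers = nclass)$cluster
  list(X = X, y = y, cluster = cluster)
}

out <- sketch_design(beta = c(-1, 1, 0.5, 0), N = 200, nclass = 10, seed = 123)
```

Here out$cluster assigns each of the 200 points to one of 10 groups, which plays the role that nclass plays in gen_bin_data. The element names in the returned list are chosen for this sketch only and do not match the function's documented return value.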

Value

a list of seven elements:

data.clust

A list with the clustering results. Samples in the same list element are close to each other.

X

The samples with the smallest variance from each cluster. Note that the length of X is the same as the length of data.clust.

y

The target values (0 or 1) corresponding to X.

References

Wang Z, Kwon Y, Chang YcI (2019). Active learning for binary classification with variable selection. arXiv preprint arXiv:1901.10079.

See Also

gen_multi_data for the categorical and ordinal cases.

gen_GEE_data for the generalized estimating equations case.

Examples

# For an example, see example(seq_bin_model)

seqest documentation built on July 2, 2020, 2:28 a.m.
