gen_multi_data: Generate the training data and testing data for the...
In seqest: Sequential Method for Classification and Generalized Estimating Equations Problem


gen_multi_data generates the data used for multiple-class classification problems.

gen_multi_data(beta0, N, type, test_ratio)

`beta0`	A numeric matrix giving the true coefficients used to generate the synthetic data.
`N`	The number of synthetic observations to generate. It should be an integer and should not be too small; a value of 10000 is recommended.
`type`	A character string that determines which type of data is generated; it must match one of 'ord' or 'cat'.
`test_ratio`	The proportion of all data assigned to the test set. It should be a number between 0 and 1 and should not be too large; a value between 0.2 and 0.3 is recommended.

gen_multi_data creates a training dataset and a testing dataset. beta0 is a p * k matrix, where p is the length of the true coefficient vector and (k + 1) is the number of categories. The value of 'type' can be 'ord' or 'cat'. If it is 'ord', the classes have an ordinal relation, which is common in applications (e.g., the label indicates the severity of a disease or a product preference). If it is 'cat', there is no such ordinal relation among the classes. The explanatory variables x are generated from a multivariate normal distribution with mean vector 0 and identity covariance matrix, and the response variable y is then generated from a multinomial distribution.
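The generating process described above can be sketched in base R for the 'cat' case. This is an illustration under stated assumptions, not the package's source: the documentation does not say which link function maps x and beta0 to class probabilities, so the sketch assumes a baseline-category logit with class 0 as the reference. The values of beta0, N and test_ratio are hypothetical, and the 'ord' case is not covered.

```r
set.seed(1)

# Hypothetical inputs: p = 3 coefficients per class, k = 2 non-baseline
# classes, so there are k + 1 = 3 categories in total.
beta0 <- matrix(c(1, 2, 1, -1, 1, 2), ncol = 2)
N <- 1000
test_ratio <- 0.2
p <- nrow(beta0)

# x ~ N(0, I_p): with an identity covariance matrix, the p coordinates
# are independent standard normals.
x <- matrix(rnorm(N * p), nrow = N, ncol = p)

# ASSUMPTION (not confirmed by the documentation): baseline-category logit,
# with class 0 as the reference class whose linear predictor is fixed at 0.
eta <- cbind(0, x %*% beta0)           # N x (k + 1) linear predictors
prob <- exp(eta) / rowSums(exp(eta))   # each row sums to 1

# One multinomial draw of size 1 per observation, stored as a label in 0..k.
y <- apply(prob, 1, function(pr) {
  sample.int(length(pr), size = 1, prob = pr) - 1L
})

# Hold out a fraction test_ratio of the rows as the test set.
test_idx <- sample.int(N, size = round(N * test_ratio))
train <- data.frame(y = y[-test_idx], x[-test_idx, , drop = FALSE])
test <- data.frame(y = y[test_idx], x[test_idx, , drop = FALSE])
```

With these hypothetical settings, train holds 800 rows and test holds 200. The layout of the data frames here is a choice made for the sketch and may differ from what gen_multi_data returns.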

A list containing the following components:

`train_id`	The ids of the training samples.
`train`	The training dataset. Note that the ids of the observations in train are the same as train_id.
`test`	The testing dataset.

Li, J., Chen, Z., Wang, Z., & Chang, Y. I. (2020). Active learning in multiple-class classification problems via individualized binary models. Computational Statistics & Data Analysis, 145, 106911. doi:10.1016/j.csda.2020.106911

gen_bin_data for the binary classification case.

gen_GEE_data for the generalized estimating equations case.

# For an example, see example(seq_ord_model)
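A direct call might look like the following minimal sketch. It uses only the arguments documented in the usage line; the beta0 values are hypothetical, and the call is guarded so that it is skipped when the seqest package is not installed. The internal layout of train and test is not described on this page, so the sketch only inspects the top level of the returned list.

```r
# Hypothetical true coefficients: p = 3 rows and k = 2 columns,
# i.e. k + 1 = 3 categories.
beta0 <- matrix(c(1, 2, 1, -1, 1, 2), ncol = 2)

# Only call the package if it is installed.
if (requireNamespace("seqest", quietly = TRUE)) {
  res <- seqest::gen_multi_data(beta0, N = 10000, type = "cat",
                                test_ratio = 0.2)
  # Inspect the documented components: train_id, train, test.
  str(res, max.level = 1)
}
```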

seqest documentation built on July 2, 2020, 2:28 a.m.
