
View source: R/gen_multi_data.R

`gen_multi_data`

Generate the data used for multiple-class
classification problems.

```
gen_multi_data(beta0, N, type, test_ratio)
```

`beta0` |
A numeric matrix representing the true coefficients used to generate the synthesized data. |

`N` |
A number specifying the number of synthesized observations. It should be an integer. Note that the value should not be too small; we recommend 10000. |

`type` |
A character string that determines which type of data will be generated, matching one of 'ord' or 'cat'. |

`test_ratio` |
A number specifying the proportion of all data assigned to the test set. It should be between 0 and 1. Note that test_ratio should not be too large; a value between 0.2 and 0.3 works best. |

gen_multi_data creates training and testing datasets. beta0 is a p * k matrix, where p is the length of the true coefficient vector and (k + 1) is the number of categories. The value of 'type' can be 'ord' or 'cat'. If it is 'ord', the classes have an ordinal relation, which is common in applications (e.g., the label indicates the severity of a disease or a product preference). If it is 'cat', there is no such ordinal relation among the classes. The explanatory variables x are generated from a multivariate normal distribution with a zero mean vector and an identity covariance matrix, and the response variable y is then generated from a multinomial distribution.
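The generating scheme described above can be sketched in base R for the 'cat' case. This is only an illustration, not the package's implementation: the baseline-category logit link, the coefficient values, and the label coding 0 to k are all assumptions, since the text does not state how the class probabilities are derived from beta0.

```r
# Illustrative sketch only; not the code used inside gen_multi_data.
set.seed(1)
N <- 1000   # number of observations
p <- 3      # number of explanatory variables
k <- 2      # beta0 has k columns, so there are k + 1 categories

# Hypothetical true coefficients: a p * k matrix (filled column by column).
beta0 <- matrix(c(1, -1, 0.5,
                  -0.5, 1, 2), nrow = p, ncol = k)

# Explanatory variables: multivariate normal with mean 0 and identity
# covariance, which is the same as independent standard normal columns.
x <- matrix(rnorm(N * p), nrow = N, ncol = p)

# Assumed link: baseline-category logit with category 0 as the reference.
eta <- cbind(0, x %*% beta0)          # N * (k + 1) linear predictors
prob <- exp(eta) / rowSums(exp(eta))  # each row sums to 1

# Draw one multinomial response per row; labels run from 0 to k (assumed).
y <- apply(prob, 1, function(pr) sample(0:k, size = 1, prob = pr))

table(y)
```

An ordinal ('ord') variant would replace the assumed link with one that respects the class ordering, such as a cumulative logit; the page does not say which form the package uses.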

A list containing the following components:

`train_id` |
The ids of the training samples. |

`train` |
The training dataset. Note that the ids of the samples in the training dataset are the same as train_id. |

`test` |
The testing dataset. |
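To make the shape of the returned list concrete, the following base R sketch builds a list with the same three component names from a toy data frame. The random sampling of test rows, the rounding of the test set size, and the column names `id`, `x1`, and `y` are hypothetical choices for illustration; the package may split and store the data differently.

```r
# Illustrative sketch only; the toy data and splitting rule are assumptions.
set.seed(2)
N <- 10
test_ratio <- 0.3

# Hypothetical toy data with an id column, one covariate, and a label.
data <- data.frame(id = seq_len(N),
                   x1 = rnorm(N),
                   y = sample(0:2, N, replace = TRUE))

# Hold out a test_ratio share of the rows at random.
n_test <- round(N * test_ratio)
test_rows <- sample(seq_len(N), n_test)
train_rows <- setdiff(seq_len(N), test_rows)

# Same component names as the list documented above.
split <- list(
  train_id = train_rows,
  train = data[train_rows, ],
  test = data[test_rows, ]
)

str(split, max.level = 1)
```

In this sketch the ids in `split$train` match `split$train_id` by construction, mirroring the note on the `train` component.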

Li, J., Chen, Z., Wang, Z., & Chang, Y. I. (2020). Active learning in
multiple-class classification problems via individualized binary models.
*Computational Statistics & Data Analysis*, 145, 106911.
doi:10.1016/j.csda.2020.106911

`gen_bin_data`

for the binary classification case.

`gen_GEE_data`

for the generalized estimating equations case.

```
# For an example, see example(seq_ord_model)
```
