create_testset: Create a list of test datasets

View source: R/data_interface.R

create_testset R Documentation

Create a list of test datasets

Description

The create_testset function creates test datasets for either benchmarking or curve evaluation.

Usage

create_testset(test_type, set_names = NULL)

Arguments

test_type

A single string to specify the type of dataset generated by this function.

"bench"

Create test datasets for benchmarking

"curve"

Create test datasets for curve evaluation

set_names

A character vector to specify the names of test datasets.

  1. For benchmarking (test_type = "bench")

    This function uses a naming convention for randomly generated benchmarking data. The format is a prefix ('b' or 'i') followed by the total number of observations in the dataset. The prefix 'b' indicates a balanced dataset, whereas 'i' indicates an imbalanced dataset. The number can take a suffix 'k' or 'm', indicating a multiplier of 1000 or 1 million, respectively.

    Below are some examples.

    "b100"

    A balanced data set with 50 positives and 50 negatives.

    "b10k"

    A balanced data set with 5000 positives and 5000 negatives.

    "b1m"

    A balanced data set with 500,000 positives and 500,000 negatives.

    "i100"

    An imbalanced data set with 25 positives and 75 negatives.

    The function returns a list of TestDataB objects.

  2. For curve evaluation (test_type = "curve")

    The following four predefined datasets can be specified for curve evaluation.

    set name    R6 object    data source
    c1 or C1    TestDataC    C1DATA
    c2 or C2    TestDataC    C2DATA
    c3 or C3    TestDataC    C3DATA
    c4 or C4    TestDataC    C4DATA

    The function returns a list of TestDataC objects.
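
The benchmarking naming convention described in item 1 can be sketched in base R as follows. The helper decode_bench_name is hypothetical and is not part of prcbench. It assumes that an imbalanced set always uses a 1:3 ratio of positives to negatives, which is inferred only from the "i100" example above, and it does not handle totals that fail to divide evenly.

```r
# Hypothetical helper (not part of prcbench): decode a benchmark set name
# such as "b10k" or "i100" into the class counts implied by the convention.
decode_bench_name <- function(set_name) {
  # Prefix 'b' or 'i', then digits, then an optional 'k' or 'm' suffix
  m <- regmatches(set_name, regexec("^([bi])([0-9]+)([km]?)$", set_name))[[1]]
  if (length(m) == 0) {
    stop("Unrecognized benchmark set name: ", set_name)
  }

  # 'k' multiplies by 1000, 'm' by 1 million; no suffix means no scaling
  multipliers <- c(k = 1e3, m = 1e6)
  multiplier <- if (nzchar(m[4])) multipliers[[m[4]]] else 1
  total <- as.numeric(m[3]) * multiplier

  # Assumption: balanced = 50% positives; imbalanced = 25% positives
  # (the 25% figure is inferred from the documented "i100" example only)
  balanced <- m[2] == "b"
  positives <- total * if (balanced) 0.5 else 0.25

  list(
    balanced = balanced,
    total = total,
    positives = positives,
    negatives = total - positives
  )
}

# Usage: inspect the counts implied by a few documented names
str(decode_bench_name("b10k"))
str(decode_bench_name("i100"))
```

A helper like this only mirrors the documented naming scheme; the datasets themselves are still generated by create_testset.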

Value

A list of R6 test dataset objects.

See Also

run_benchmark and run_evalcurve require the list of the datasets generated by this function. TestDataB for benchmarking test data. TestDataC, C1DATA, C2DATA, C3DATA, and C4DATA for curve evaluation test data. create_usrdata for creating a user-defined test set.

Examples

## Create a balanced data set with 50 positives and 50 negatives
tset1 <- create_testset("bench", "b100")
tset1

## Create an imbalanced data set with 25 positives and 75 negatives
tset2 <- create_testset("bench", "i100")
tset2

## Create C1 dataset
tset3 <- create_testset("curve", "c1")
tset3

## Create C1 and C2 datasets
tset4 <- create_testset("curve", c("c1", "c2"))
tset4
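
Since set_names is a character vector, several benchmarking datasets can presumably be requested in one call, just as in the curve example above. The following is a minimal sketch of inspecting the result. It assumes that the returned list has one element per requested name and that each element inherits from the TestDataB class; the requireNamespace guard is added here only so the snippet can run when prcbench is not installed.

```r
# Sketch only: the prcbench calls run when the package is installed.
if (requireNamespace("prcbench", quietly = TRUE)) {
  # Request one balanced and one imbalanced benchmarking dataset
  tsets <- prcbench::create_testset("bench", c("b100", "i100"))

  # Assumption: one list element per requested set name
  print(length(tsets))

  # Assumption: every element is a TestDataB object
  print(vapply(tsets, inherits, logical(1), what = "TestDataB"))
}
```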


prcbench documentation built on March 31, 2023, 5:27 p.m.