mmdata: Reformat input data for performance evaluation calculation
In takayasaito/precrec: Calculate Accurate Precision-Recall and ROC (Receiver Operator Characteristics) Curves

mmdata

R Documentation

Reformat input data for performance evaluation calculation

Description

The mmdata function takes predicted scores and labels and returns an mdat object. The evalmod function takes an mdat object as input data to calculate evaluation measures.
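
As a minimal sketch of the two-step workflow described above, the snippet below builds an mdat object from the package's P10N10 sample data and passes it to evalmod. The `auc` call and the variable names are illustrative choices, not something this section prescribes.

```r
library(precrec)

## Load the sample data shipped with precrec (10 positives, 10 negatives)
data(P10N10)

## Step 1: reformat the scores and labels into an mdat object
mdat <- mmdata(P10N10$scores, P10N10$labels)

## Step 2: pass the mdat object to evalmod
## (the default mode calculates ROC and Precision-Recall curves)
curves <- evalmod(mdat)

## Extract the AUC scores of the calculated curves as a data frame
aucs <- auc(curves)
aucs
```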

Usage

mmdata(
  scores,
  labels,
  modnames = NULL,
  dsids = NULL,
  posclass = NULL,
  na_worst = TRUE,
  ties_method = "equiv",
  expd_first = NULL,
  mode = "rocprc",
  nfold_df = NULL,
  score_cols = NULL,
  lab_col = NULL,
  fold_col = NULL,
  ...
)

Arguments

`scores`	A numeric dataset of predicted scores. It can be a vector, a matrix, an array, a data frame, or a list. The `join_scores` function can be used to combine scores from multiple datasets.
`labels`	A numeric, character, logical, or factor dataset of observed labels. It can be a vector, a matrix, an array, a data frame, or a list. The `join_labels` function can be used to combine labels from multiple datasets.
`modnames`	A character vector for the names of the models. The `evalmod` function automatically generates default names as "m1", "m2", "m3", and so on when it is `NULL`.
`dsids`	A numeric vector for test dataset IDs. The `evalmod` function automatically generates the default ID as `1` when it is `NULL`.
`posclass`	A scalar value to specify the label of positives in `labels`. It must be the same data type as `labels`. For example, `posclass = -1` changes the positive label from `1` to `-1` when `labels` contains `1` and `-1`. The positive label will be automatically detected when `posclass` is `NULL`.
`na_worst`	A Boolean value for controlling the treatment of NAs in `scores`.
	`TRUE`	All NAs are treated as the worst scores
	`FALSE`	All NAs are treated as the best scores
`ties_method`	A string for controlling ties in `scores`.
	"equiv"	Ties are equivalently ranked
	"first"	Ties are ranked in increasing order of appearance
	"random"	Ties are ranked in random order
`expd_first`	A string that indicates which of the two variables, model names or test dataset IDs, should be expanded first when they are automatically generated.
	"modnames"	Model names are expanded first. For example, the `mmdata` function generates `modnames` as `c("m1", "m2")` and `dsids` as `c(1, 1)` when two vectors are passed as input and `modnames` and `dsids` are unspecified.
	"dsids"	Test dataset IDs are expanded first. For example, the `mmdata` function generates `modnames` as `c("m1", "m1")` and `dsids` as `c(1, 2)` when two vectors are passed as input and `modnames` and `dsids` are unspecified.
`mode`	A string that specifies the types of evaluation measures that the `evalmod` function calculates.
	"rocprc"	ROC and Precision-Recall curves
	"prcroc"	Same as above
	"basic"	Normalized ranks vs. accuracy, error rate, specificity, sensitivity, precision, Matthews correlation coefficient, and F-score.
	"aucroc"	Fast AUC (ROC) calculation with the U statistic
`nfold_df`	A data frame that contains at least one score column, a label column, and a fold column.
`score_cols`	A character/numeric vector that specifies score columns of `nfold_df`.
`lab_col`	A number/string that specifies the label column of `nfold_df`.
`fold_col`	A number/string that specifies the fold column of `nfold_df`.
`...`	Not used by this method.
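
The effect of several of the arguments above can be sketched on a small toy dataset. The data and variable names below are made up for illustration; the calls only use arguments listed in this section, and printing the resulting objects is the simplest way to see which model names and dataset IDs were generated.

```r
library(precrec)

## Toy data (hypothetical): one missing score and one tie (0.4 appears twice)
scores <- c(0.9, 0.8, NA, 0.4, 0.4, 0.1)
labels <- c(1, 1, -1, 1, -1, -1)

## Defaults: positive label detected automatically,
## NAs treated as the worst scores, ties equivalently ranked
md_default <- mmdata(scores, labels)

## Treat -1 as the positive label (same data type as labels)
md_neg <- mmdata(scores, labels, posclass = -1)

## Treat NAs as the best scores instead of the worst
md_na_best <- mmdata(scores, labels, na_worst = FALSE)

## Rank tied scores in their order of appearance
md_first <- mmdata(scores, labels, ties_method = "first")

## Two score vectors with modnames and dsids left unspecified
s <- join_scores(c(1, 2, 3, 4), c(5, 6, 7, 8))
l <- join_labels(c(1, 0, 1, 1), c(1, 0, 1, 1))

## Expand model names first: two models on one test dataset
md_models <- mmdata(s, l, expd_first = "modnames")

## Expand dataset IDs first: one model on two test datasets
md_dsets <- mmdata(s, l, expd_first = "dsids")

md_models
md_dsets
```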

Value

The mmdata function returns an mdat object that contains formatted labels and score ranks. The object can be used as input data for the evalmod function.
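
Because the returned object is only reformatted input data, one mdat object can in principle be reused for more than one evaluation. The sketch below assumes that `evalmod` accepts a `mode` argument alongside an mdat object and that the result can be converted with `as.data.frame`; the variable names are illustrative.

```r
library(precrec)

## Build the mdat object once from the bundled sample data
data(P10N10)
mdat <- mmdata(P10N10$scores, P10N10$labels)

## Reuse it for ROC and Precision-Recall curves (the default mode)
curves <- evalmod(mdat)

## Reuse it again for basic evaluation measures
measures <- evalmod(mdat, mode = "basic")

## Convert the calculated curves to a data frame for further processing
curves_df <- as.data.frame(curves)
head(curves_df)
```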

Examples


##################################################
### Single model & single test dataset
###

## Load the precrec package
library(precrec)

## Load a dataset with 10 positives and 10 negatives
data(P10N10)

## Generate mdat object
ssmdat1 <- mmdata(P10N10$scores, P10N10$labels)
ssmdat1
## Use fixed labels so that both classes are always present
ssmdat2 <- mmdata(1:8, rep(c(0, 1), 4))
ssmdat2


##################################################
### Multiple models & single test dataset
###

## Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(1, 100, 100, "all")

## Multiple models & single test dataset
msmdat1 <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]]
)
msmdat1

## Use join_scores and join_labels
s1 <- c(1, 2, 3, 4)
s2 <- c(5, 6, 7, 8)
scores <- join_scores(s1, s2)

l1 <- c(1, 0, 1, 1)
l2 <- c(1, 0, 1, 1)
labels <- join_labels(l1, l2)

msmdat2 <- mmdata(scores, labels, modnames = c("ms1", "ms2"))
msmdat2


##################################################
### Single model & multiple test datasets
###

## Create 10 sample datasets, each with 100 positives and 100 negatives
samps <- create_sim_samples(10, 100, 100, "good_er")

## Single model & multiple test datasets
smmdat <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)
smmdat


##################################################
### Multiple models & multiple test datasets
###

## Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(10, 100, 100, "all")

## Multiple models & multiple test datasets
mmmdat <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)
mmmdat


##################################################
### N-fold cross validation datasets
###

## Load test data
data(M2N50F5)
head(M2N50F5)

## Specify necessary columns to create mdat
cvdat1 <- mmdata(
  nfold_df = M2N50F5, score_cols = c(1, 2),
  lab_col = 3, fold_col = 4,
  modnames = c("m1", "m2"), dsids = 1:5
)
cvdat1

## Use column names
cvdat2 <- mmdata(
  nfold_df = M2N50F5, score_cols = c("score1", "score2"),
  lab_col = "label", fold_col = "fold",
  modnames = c("m1", "m2"), dsids = 1:5
)
cvdat2

takayasaito/precrec documentation built on Oct. 19, 2023, 7:28 p.m.
