cvo_test_bs: [+] Test if data in folds is stratified and blocked


View source: R/cvo_test_bs.R

Description

This function runs tests that help evaluate whether the data in folds is (a) stratified and (b) blocked.
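
As a rough illustration of what these two checks look at, the sketch below builds the same kind of cross-tabulations in base R. It does not use or reproduce the internals of cvo_test_bs(); the toy data and the hand-made fold assignment (`fold`) are assumptions made only for this example.

```r
# Toy data: 8 subjects (ID), 2 rows each, 2 groups (gr).
toy <- data.frame(
  ID = rep(1:8, each = 2),
  gr = factor(rep(c("A", "B"), each = 8))
)

# Hypothetical fold assignment: all rows of a subject share one fold.
# IDs 1..8 go to folds 1, 2, 1, 2, 1, 2, 1, 2.
toy$fold <- rep(rep(1:2, times = 4), each = 2)

# Stratification: group counts and within-fold proportions per fold.
strat_counts <- table(Fold = toy$fold, Group = toy$gr)
strat_props  <- prop.table(strat_counts, margin = 1)

# Blocking: rows per ID in each fold; in a blocked design every
# column has exactly one non-zero entry.
block_counts <- table(Fold = toy$fold, ID = toy$ID)

print(strat_counts)
print(round(strat_props, 2))
print(block_counts)
```

In this toy case the proportions are equal across folds and each ID column is non-zero in a single row, which is the pattern the function's printed tables are meant to reveal.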

Usage

cvo_test_bs(
  obj,
  stratify_by = NULL,
  block_by = NULL,
  data = NULL,
  n_col_show = 10
)

Arguments

obj

A list with the validation/test set indices of each fold. Note: if the indices come from the training set, the result will be incorrect.
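
If the folds at hand hold training-set indices instead, they can be converted to the complementary test-set indices before running the check. The helper below is a minimal base-R sketch; `train_to_test()` and the toy folds are hypothetical names introduced here and are not part of manyROC.

```r
# Convert training-set row indices to the complementary test-set indices.
# `train_folds` is a list of integer vectors; `n` is the number of rows.
train_to_test <- function(train_folds, n) {
  lapply(train_folds, function(idx) setdiff(seq_len(n), idx))
}

# Toy example: 6 rows split into 3 folds (training indices).
train_folds <- list(
  Fold1 = c(3L, 4L, 5L, 6L),
  Fold2 = c(1L, 2L, 5L, 6L),
  Fold3 = c(1L, 2L, 3L, 4L)
)

test_folds <- train_to_test(train_folds, n = 6)
str(test_folds)
```

Each element of `test_folds` then holds the rows left out of the corresponding training set, which is the form of input described above.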

stratify_by

The name of the variable used for stratification.

block_by

The name of the variable used for blocking.

data

The data frame for which obj was created.

n_col_show

Number of columns of the blocking variable's cross-tabulation to show. Default is 10.

Value

Prints tables that help evaluate whether the data is (a) stratified and (b) blocked.

Author(s)

Vilmantas Gegzna

Examples

library(manyROC)

# Create example data: 20 subjects (ID) with 2 rows each, 4 groups (gr)
DataSet1 <- data.frame(ID = rep(1:20, each = 2),
  gr = gl(4, 10, labels = LETTERS[1:4]),
  .row = 1:40)

obj <- cvo_create_folds(data = DataSet1,
  stratify_by = "gr",
  block_by = "ID",
  returnTrain = FALSE)

cvo_test_bs(obj,
  stratify_by = "gr",
  block_by = "ID",
  data = DataSet1)



# >  ************************************************************
# >      Test for STRATIFICATION
# >
# >        A B C D      <<<     >>>              A    B    C    D
# >  Fold1 2 2 2 2  <-Counts | Proportions->  0.25 0.25 0.25 0.25
# >  Fold2 2 2 2 2  <-Counts | Proportions->  0.25 0.25 0.25 0.25
# >  Fold3 2 2 2 2  <-Counts | Proportions->  0.25 0.25 0.25 0.25
# >  Fold4 2 2 2 2  <-Counts | Proportions->  0.25 0.25 0.25 0.25
# >  Fold5 2 2 2 2  <-Counts | Proportions->  0.25 0.25 0.25 0.25
# >
# >  If stratified, the proportions of each group in each fold
# >  (row) should be (approximately) equal and with no zero values.
# >  ____________________________________________________________
# >  Test for BLOCKING: BLOCKED
# >
# >        1 2 3 4 5 6 7 8 9 10 ..
# >  Fold1 0 0 0 0 2 0 0 2 0  0 ..
# >  Fold2 2 0 0 0 0 2 0 0 0  0 ..
# >  Fold3 0 0 2 0 0 0 0 0 2  0 ..
# >  Fold4 0 0 0 2 0 0 2 0 0  0 ..
# >  Fold5 0 2 0 0 0 0 0 0 0  2 ..
# >
# >  Number of unique IDs in each fold (first 10 columns).
# >  If blocked, the same ID appears just in one fold.
# >  ************************************************************
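
When the blocking table is too wide to inspect by eye, the rule printed above (the same ID appears in just one fold) can also be checked programmatically. The sketch below uses base R only; `is_blocked()` is a hypothetical helper, and it assumes that `folds` is a plain list of test-set row indices like the one described for `obj`.

```r
# TRUE if every value of the blocking variable occurs in exactly one fold.
# `folds`: list of test-set row indices.
# `block`: blocking variable, one value per row of the data.
is_blocked <- function(folds, block) {
  # Unique block values contained in each fold.
  ids_per_fold <- lapply(folds, function(idx) unique(block[idx]))
  all_ids <- unlist(ids_per_fold, use.names = FALSE)
  # A duplicate means the same block is spread over several folds.
  anyDuplicated(all_ids) == 0L
}

ID <- rep(1:4, each = 2)  # 4 subjects, 2 rows each

good <- list(Fold1 = c(1, 2, 5, 6), Fold2 = c(3, 4, 7, 8))  # subjects kept together
bad  <- list(Fold1 = c(1, 3, 5, 7), Fold2 = c(2, 4, 6, 8))  # subjects split across folds

is_blocked(good, ID)  # TRUE
is_blocked(bad, ID)   # FALSE
```

A single logical result is convenient in automated checks, while the printed tables remain more informative when a violation has to be located.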

GegznaV/multiROC documentation built on Sept. 15, 2020, 10:33 a.m.