genderizeBootstrapError: Gender prediction errors on bootstrap samples


View source: R/genderizeBootstrapError.R

Description

genderizeBootstrapError calculates the Apparent Error Rate, the Leave-One-Out bootstrap error rate, and the .632+ error rate from Efron and Tibshirani (1997). The code is a modified version of several functions from the sortinghat package by John A. Ramey.
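As a rough illustration of the first two quantities, the sketch below computes an apparent error rate and a leave-one-out bootstrap error rate for a toy majority-class classifier. The helper names (majority_train, apparent_error, loo_boot_error) are hypothetical and are not part of genderizeR or sortinghat; the package trains on name data rather than on this stand-in, and its exact averaging scheme may differ.

```r
# Hypothetical toy classifier: always predicts the most frequent training label.
majority_train <- function(y) names(which.max(table(y)))

# Apparent error: train and evaluate on the same observations.
apparent_error <- function(y) {
  mean(y != majority_train(y))
}

# Leave-one-out bootstrap error: each observation is evaluated only by
# models fitted on bootstrap samples that do not contain it.
loo_boot_error <- function(y, num_bootstraps = 50) {
  n <- length(y)
  # errs[i, b] is 0/1 when observation i was left out of sample b, NA otherwise
  errs <- matrix(NA_real_, nrow = n, ncol = num_bootstraps)
  for (b in seq_len(num_bootstraps)) {
    idx <- sample.int(n, n, replace = TRUE)  # bootstrap sample (with replacement)
    out <- setdiff(seq_len(n), idx)          # observations not drawn
    pred <- majority_train(y[idx])
    errs[out, b] <- as.numeric(y[out] != pred)
  }
  # Average per observation first, then across observations that were
  # left out at least once.
  mean(rowMeans(errs, na.rm = TRUE), na.rm = TRUE)
}

set.seed(23)
y_toy <- c(rep("female", 3), rep("male", 7))
apparent_error(y_toy)
loo_boot_error(y_toy, num_bootstraps = 20)
```

The apparent error tends to be optimistic because the model has already seen the observations it is scored on, which is the motivation for the bootstrap-based estimates.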

Usage

genderizeBootstrapError(x, y, givenNamesDB, probs, counts,
  num_bootstraps = 50, parallel = FALSE)

Arguments

x

A text vector that we want to genderize

y

A text vector of true gender labels ('female' or 'male') for the x vector

givenNamesDB

A dataset with gender data (for example, the output of the findGivenNames function)

probs

A numeric vector of probability values, used to subset the givenNamesDB dataset

counts

A numeric vector of count values, used to subset the givenNamesDB dataset

num_bootstraps

Number of bootstrap samples. Default is 50.

parallel

Passed to the genderizeTrain function. If TRUE, errors are computed using the parallel package and the available cores. Default is FALSE.
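To make the role of probs and counts more concrete, the sketch below builds every (probability, count) combination and counts how many rows of a toy dataset survive each pair of thresholds. The toy_db data frame, its column names (name, gender, probability, count), and the use of >= comparisons are assumptions made for illustration; the real givenNamesDB layout and the exact filtering rule are not specified on this page.

```r
# Toy stand-in for givenNamesDB; the column names are assumed, not confirmed.
toy_db <- data.frame(
  name        = c("alex", "kale", "robin"),
  gender      = c("male", "male", "female"),
  probability = c(0.87, 0.55, 0.62),
  count       = c(5000, 12, 300),
  stringsAsFactors = FALSE
)

probs  <- c(0.5, 0.6, 0.9)
counts <- c(1, 100)

# Each row of the grid is one candidate pair of thresholds.
grid <- expand.grid(prob = probs, count = counts)

# Number of names kept under each pair of thresholds
# (assumes a row is kept when both values meet the threshold).
grid$n_names <- mapply(
  function(p, cn) sum(toy_db$probability >= p & toy_db$count >= cn),
  grid$prob, grid$count
)

print(grid)
```

Longer probs and counts vectors enlarge this grid multiplicatively, so the number of candidate subsets grows quickly.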

Value

A list of bootstrap errors:

apparent

Apparent Error Rate

loo_boot

LOO-Boot Error Rate

errorRate632plus

.632+ Error Rate
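The three values are related: in Efron and Tibshirani (1997), the .632+ estimate is a weighted average of the apparent error and the leave-one-out bootstrap error, with a weight that depends on the no-information error rate. A minimal sketch of that combination follows. The function names error_632plus and no_info_rate are hypothetical, and the sketch omits the clipping adjustments described in the paper; whether genderizeBootstrapError applies them is not stated here.

```r
# No-information error rate: expected error if predictions were
# independent of the true labels.
no_info_rate <- function(labels, predictions) {
  classes <- union(labels, predictions)
  p <- sapply(classes, function(k) mean(labels == k))       # observed class shares
  q <- sapply(classes, function(k) mean(predictions == k))  # predicted class shares
  sum(p * (1 - q))
}

# Hypothetical helper combining the pieces into the .632+ estimate.
error_632plus <- function(apparent, loo_boot, gamma) {
  # Relative overfitting rate
  R <- (loo_boot - apparent) / (gamma - apparent)
  # Weight on the leave-one-out bootstrap error
  w <- 0.632 / (1 - 0.368 * R)
  (1 - w) * apparent + w * loo_boot
}

gamma <- no_info_rate(
  labels      = c("female", "female", "male", "male"),
  predictions = c("female", "male", "male", "male")
)
gamma  # 0.5

error_632plus(apparent = 0.1, loo_boot = 0.3, gamma = gamma)  # about 0.255
```

When the leave-one-out bootstrap error equals the apparent error, the weight is 0.632 and the estimate equals the apparent error; when it equals the no-information rate, the weight is 1 and the estimate equals the leave-one-out bootstrap error.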

See Also

In the sortinghat package.

Examples

## Not run: 

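# Names to genderize and their true labels (26 elements each)
x <- c('Alex', 'Darrell', 'Kale', 'Lee', 'Robin', 'Terry', rep('Robin', 20))
y <- c(rep('female', 6), rep('male', 20))

# Build the names database and check prediction errors on the full data
givenNamesDB <- findGivenNames(x)
pred <- genderize(x, givenNamesDB)
classificationErrors(labels = y, predictions = pred$gender)

# Grid of probability and count thresholds to evaluate
probs <- seq(from = 0.5, to = 0.9, by = 0.05)
counts <- c(1)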

set.seed(23)
genderizeBootstrapError(x = x, y = y, 
                         givenNamesDB = givenNamesDB, 
                         probs = probs, counts = counts, 
                         num_bootstraps = 20, 
                         parallel = TRUE)


# $apparent
# [1] 0.9615385

# $loo_boot
# [1] 0.965812

# $errorRate632plus
# [1] 0.964225



## End(Not run)

genderizeR documentation built on Aug. 4, 2019, 5:02 p.m.