knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Description

Bag of Little Bootstraps (blb) allows users to utilize bootstrapping to create regression models, specifically linear regression models or logistic regression models. Users are also given the option to use more than one CPU when fitting a model to the data. Note, if users wish to utilize this parallelization feature, they first must run furrr::plan(multisession, worker = cl) in the console (cl being the number of CPU's they wish to use.)

The library has two main functions, each of which come with a set of methods.

\newline

Functions and Methods

Functions
  1. blblm(formula, data, m = 10, B = 5000, parallel = FALSE, cl = NULL) : Utilize bootsrapping to fit a linear regression model to the data, as specified in the data argument. If parallel = TRUE, then parallelization is used. Then the user can define the number of desired number CPUs in the cl argument. However, as noted in the description, the user must run furrr::plan(multisession, worker = cl) in the console before using this parallelization feature. The arguments B and m are the number of bootstrapped samples, and the number of sugroups in which to split the data during the bootstrapping process.

    • Methods:

      • print(x, ...) : Prints the model in the form of $Y$ ~ $X_{1}$ + $X_{2}$

      • sigma(object, confidence = FALSE, level = 0.95, ...): Returns estimate of $\sigma$, the standard deviation of the response variable. If confidence = TRUE, then a confidence interval of is $\sigma$ included in the output. The default confidence level is 95%, and this can be changed using the level argument.

      • coef(object, ...): Returns a numeric vector containing the coefficients of the parameters in the model. For example, in the model $Y$ ~ $B_{1}X_{1} + B_{2}X_{2}$, this function would return a numeric vector containing $B_{1}$ and $B_{2}$.

      • confint(object, parm = NULL, level = 0.95, ...): By default, returns 95% confidence intervals for each parameter in the model (the predictor variables). User can request specific parameters using the parm argument, and can change the confidence level using the level argument.

      • predict(object, new_data, confidence = FALSE, level = 0.95, ...): Utilize the model to predict the response values of a new dataset. If confidence = TRUE a nx3 matrix is returned in which the first column is a the predicted value, and the second and third rows are the lower and upper endpoints of the corresponding 95% confidence interval for that prediction. The user can change the confidence level with the level argument.

  2. blb_logreg(formula, data, m = 10, B = 5000, parallel = FALSE, cl = NULL): Utilize bootsrapping to fit a logistic regression model, to the data as specified in the data argument. If parallel = TRUE, then parallelization is used. Then the user can define the number of desired number CPUs in the cl argument. However, as noted in the description, the user must run furrr::plan(multisession, worker = cl) in the console before using this parallelization feature. The arguments B and m are the number of bootstrapped samples, and the number of sugroups in which to split the data during the bootstrapping process.

    • Methods:

      • print(x, ...): Prints the model in the form of $Y$ ~ $X_{1}$ + $X_{2}$. Note, this is not the equation used to calculate the response variable, as we are using logistic regression. However, given that we have p parameters, the log-odds of a given observation of $Y$ is equal to $B_{1}X_{1} + B_{2}X_{2}$ + ... + $B_{p}X_{p}$.

      • coef(object, ...): Returns a numeric vector containing the coefficients of the parameters in the model. For example, if the log-odds were equal to $B_{1}X_{1} + B_{2}X_{2}$, this function would return a numeric vector containing $B_{1}$ and $B_{2}$.

      • confint(object, parm = NULL, level = 0.95, ...): By default, returns 95% confidence intervals for each parameter in the model (the predictor variables). User can request specific parameters using the parm argument, and can change the confidence level using the level argument.

      • predict(object, new_data, p = FALSE,...): Utilize the model to predict the response values of a new dataset. If p = TRUE, then the function returns P(Y = 1|X = x), instead of binary {0,1} prediction.

\newline

Examples

library(blb)
Examples concerning blblm()

We will use the iris dataset from the dataset library. See following link for more information. https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/iris

library(datasets)
data(iris)

train = iris[1:75,]
test = iris[76:150,]

Example 1: blblm()

fit = blblm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = train)
fit
fit = blblm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = train, m = 8, B = 100, parallel = TRUE, cl = 2)
fit

\newline

Example 2: print() method:

print(fit)

\newline

Example 3: sigma() method

*default

sigma(fit)
sigma(fit, confidence = TRUE, level = 0.9)

\newline

Example 4: coef() method

coef(fit)

\newline

Example 5: confint()

confint(fit)
confint(fit, parm = "Sepal.Width", level = 0.90)



confint(fit, parm = c("Sepal.Width", "Petal.Width"), level = 0.90)

Example 6: predict()

*default

predicted_values = predict(fit, test)
head(predicted_values)
predicted_values = predict(fit, test, confidence = TRUE, level = 0.90)
head(predicted_values)
Examples concerning blb_logreg()

We will use data from the titanic library. See following link for more information. https://www.rdocumentation.org/packages/titanic/versions/0.1.0

library(titanic)

train = titanic_train[1:550,]
test = titanic_train[551:891,]

Example 1: blb_logreg()

fit_logreg = blb_logreg(Survived ~ Age + Sex + Pclass, data = train)
fit_logreg
fit_logreg = blb_logreg(Survived ~ Age + Sex + Pclass, data = train, m = 8, B = 100, parallel = TRUE, cl = 2)
fit_logreg

\newline

Example 2: print() method

print(fit_logreg)

\newline Example 2: coef() method

coef(fit_logreg)

\newline

Example 3: confint() method

confint(fit_logreg)
confint(fit_logreg, parm = c("Age", "Sexmale"), level = 0.9)



confint(fit_logreg, parm = c("Age"), level = 0.9)

Example 4: predict() method

predicted_values = predict(fit_logreg, test)
head(predicted_values)
predicted_values = predict(fit_logreg, test, p = TRUE)
head(predicted_values)


jwebb1197/Bag_of_Little_Bootstraps documentation built on June 9, 2020, 12:13 a.m.