lantern_mlp: Fit a single layer neural network

View source: R/lantern_mlp-fit.R

Description

lantern_mlp() fits a feed-forward neural network model for regression or classification.

Usage

lantern_mlp(x, ...)

## Default S3 method:
lantern_mlp(x, ...)

## S3 method for class 'data.frame'
lantern_mlp(
  x,
  y,
  epochs = 100L,
  hidden_units = 3L,
  activation = "relu",
  penalty = 0,
  dropout = 0,
  validation = 0.1,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'matrix'
lantern_mlp(
  x,
  y,
  epochs = 100L,
  hidden_units = 3L,
  activation = "relu",
  penalty = 0,
  dropout = 0,
  validation = 0.1,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'formula'
lantern_mlp(
  formula,
  data,
  epochs = 100L,
  hidden_units = 3L,
  activation = "relu",
  penalty = 0,
  dropout = 0,
  validation = 0.1,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'recipe'
lantern_mlp(
  x,
  data,
  epochs = 100L,
  hidden_units = 3L,
  activation = "relu",
  penalty = 0,
  dropout = 0,
  validation = 0.1,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

Arguments

x

Depending on the context:

  • A data frame of predictors.

  • A matrix of predictors.

  • A recipe specifying a set of preprocessing steps created from recipes::recipe().

The predictor data should be standardized (e.g., centered and scaled).

...

Not currently used, but required for extensibility.

y

When x is a data frame or matrix, y is the outcome specified as:

  • A data frame with 1 numeric column.

  • A matrix with 1 numeric column.

  • A numeric vector.

epochs

An integer for the number of epochs of training.

hidden_units

An integer for the number of hidden units, or a vector of integers. If a vector of integers is given, the model will have length(hidden_units) hidden layers, the i-th with hidden_units[i] hidden units.

activation

A string for the activation function. Possible values are "relu", "elu", "tanh", and "linear". If hidden_units is a vector, activation can be a character vector with length equal to length(hidden_units), specifying the activation for each hidden layer.

penalty

The amount of weight decay (i.e., L2 regularization).

dropout

The proportion of parameters randomly set to zero during training.

validation

The proportion of the data randomly assigned to a validation set.

learn_rate

A positive number for the learning rate (usually less than 0.1).

momentum

A number on [0, 1] for the momentum parameter in gradient descent.

batch_size

An integer for the number of training set points in each batch.

conv_crit

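A non-negative number for the convergence criterion. The default, -Inf, disables the check so that training runs for the specified number of epochs.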

verbose

A logical; if TRUE, the iteration history is printed.

formula

A formula specifying the outcome terms on the left-hand side, and the predictor terms on the right-hand side.

data

When a recipe or formula is used, data is specified as:

  • A data frame containing both the predictors and the outcome.

Details

This function fits single layer, feed-forward neural network models for regression (when the outcome is a number) or classification (when the outcome is a factor). The loss function is the mean squared error for regression and cross-entropy for classification.
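As a rough illustration of the two loss functions named above, the following base R sketch computes the mean squared error for a numeric outcome and the cross-entropy for a factor outcome. The helper names mse_loss() and cross_entropy_loss() are hypothetical and are not part of lantern; the sketch only shows what each quantity measures, not how the package computes it.

```r
# Hypothetical helpers for illustration only; not part of lantern.

# Mean squared error between observed and predicted numeric values.
mse_loss <- function(obs, pred) {
  mean((obs - pred)^2)
}

# Cross-entropy for a factor outcome. `prob` is a matrix of predicted class
# probabilities with one column per level of `obs`, in level order.
cross_entropy_loss <- function(obs, prob) {
  # Pick, for each row, the probability assigned to the observed class.
  idx <- cbind(seq_along(obs), as.integer(obs))
  -mean(log(prob[idx]))
}

# Regression: made-up observed and predicted values.
obs_num  <- c(1.0, 2.0, 3.0)
pred_num <- c(1.5, 2.0, 2.5)
mse_val  <- mse_loss(obs_num, pred_num)

# Classification: made-up class probabilities for a two-level factor.
obs_cls <- factor(c("a", "b", "a"), levels = c("a", "b"))
prob <- matrix(c(0.9, 0.1,
                 0.2, 0.8,
                 0.6, 0.4),
               ncol = 2, byrow = TRUE)
ce_val <- cross_entropy_loss(obs_cls, prob)
```

Both quantities are zero for perfect predictions and grow as predictions move away from the observed values.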

The predictor data should all be numeric and encoded in the same units (e.g., standardized to the same range or distribution). If there are factor predictors, use a recipe or formula to create indicator variables (or some other method) to make them numeric.
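For readers preparing a matrix by hand, one possible way to turn factor predictors into numeric indicator columns is base R's model.matrix(), followed by scale(). This is only a sketch with a made-up data frame (dat, size, and type are illustrative names); note that model.matrix() with the default contrasts creates one fewer indicator than there are factor levels, unlike the one-hot encoding used in the recipe in the Examples section.

```r
# A small made-up data frame with one numeric and one factor predictor.
dat <- data.frame(
  size = c(10, 20, 30, 40),
  type = factor(c("a", "b", "c", "a"))
)

# model.matrix() expands the factor into indicator (dummy) columns.
# Dropping the intercept column leaves a purely numeric predictor matrix.
x_num <- model.matrix(~ size + type, data = dat)[, -1, drop = FALSE]

# Center and scale each column so the predictors share a common scale.
x_std <- scale(x_num)
```

The resulting x_std is a numeric matrix of the kind the matrix method expects for x.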

When the outcome is a number, the function internally standardizes the outcome data to have mean zero and a standard deviation of one. The prediction function creates predictions on the original scale.
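The outcome standardization and back-transformation described above can be sketched in base R as follows. The data and the standardized predictions are made up for illustration; lantern does this internally, so users do not need to run this code themselves.

```r
set.seed(1)
y <- rnorm(50, mean = 5, sd = 2)

# Standardize the outcome to mean zero and standard deviation one.
y_mean <- mean(y)
y_sd   <- sd(y)
y_std  <- (y - y_mean) / y_sd

# Suppose a model produced these predictions on the standardized scale
# (made-up values for illustration).
pred_std <- c(-1, 0, 1)

# Undo the standardization to return predictions on the original scale.
pred_orig <- pred_std * y_sd + y_mean
```

A standardized prediction of zero maps back to the mean of the original outcome, and each unit on the standardized scale corresponds to one standard deviation of the original outcome.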

If conv_crit is used, training stops when the difference in the loss function is below conv_crit or when the loss gets worse. With the default (-Inf), the model is trained for the specified number of epochs.
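The stopping rule can be illustrated with a small base R sketch. The helper stop_epoch() is hypothetical and is not part of lantern. It assumes that the difference is the absolute decrease in loss between consecutive epochs; the text above does not say whether the package uses an absolute or a relative difference.

```r
# Hypothetical helper, not part of lantern: returns the epoch at which
# training would stop for a given sequence of per-epoch loss values.
stop_epoch <- function(loss, conv_crit = -Inf) {
  for (i in seq_along(loss)[-1]) {
    improvement <- loss[i - 1] - loss[i]
    # A negative improvement (the loss got worse) is below any non-negative
    # conv_crit, so this one comparison covers both stopping conditions.
    if (improvement < conv_crit) {
      return(i)
    }
  }
  # No stopping condition was met: all epochs were used.
  length(loss)
}

# Made-up loss history for illustration.
loss <- c(1.00, 0.60, 0.45, 0.449, 0.30)

# Default: no early stopping, so every epoch is run.
stop_default <- stop_epoch(loss)

# Stops once the improvement falls below the criterion.
stop_small <- stop_epoch(loss, conv_crit = 0.01)

# Stops as soon as the loss increases.
stop_worse <- stop_epoch(c(1.0, 0.8, 0.9, 0.5), conv_crit = 0)
```

Under these assumptions, the default of -Inf can never be undercut by any finite improvement, which matches the documented behavior of training for all epochs.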

Value

A lantern_mlp object with elements:

Examples

if (torch::torch_is_installed()) {

 ## -----------------------------------------------------------------------------
 # regression examples (increase # epochs to get better results)

 data(ames, package = "modeldata")

 ames$Sale_Price <- log10(ames$Sale_Price)

 set.seed(122)
 in_train <- sample(1:nrow(ames), 2000)
 ames_train <- ames[ in_train,]
 ames_test  <- ames[-in_train,]


 # Using matrices
 set.seed(1)
 lantern_mlp(x = as.matrix(ames_train[, c("Longitude", "Latitude")]),
             y = ames_train$Sale_Price,
             penalty = 0.10, epochs = 20, batch_size = 32)

 # Using recipe
 library(recipes)

 ames_rec <-
  recipe(Sale_Price ~ Bldg_Type + Neighborhood + Year_Built + Gr_Liv_Area +
         Full_Bath + Year_Sold + Lot_Area + Central_Air + Longitude + Latitude,
         data = ames_train) %>%
   # Transform some highly skewed predictors
   step_BoxCox(Lot_Area, Gr_Liv_Area) %>%
   # Lump some rarely occurring categories into "other"
   step_other(Neighborhood, threshold = 0.05)  %>%
   # Encode categorical predictors as binary.
   step_dummy(all_nominal(), one_hot = TRUE) %>%
   # Add an interaction effect:
   step_interact(~ starts_with("Central_Air"):Year_Built) %>%
   step_zv(all_predictors()) %>%
   step_normalize(all_predictors())

 set.seed(2)
 fit <- lantern_mlp(ames_rec, data = ames_train, hidden_units = 20,
                    dropout = 0.05, epochs = 20, batch_size = 32)
 fit

 # Load ggplot2 first so that the autoplot() generic is available
 library(ggplot2)

 autoplot(fit)

 predict(fit, ames_test) %>%
   bind_cols(ames_test) %>%
   ggplot(aes(x = .pred, y = Sale_Price)) +
   geom_abline(col = "green") +
   geom_point(alpha = .3) +
   lims(x = c(4, 6), y = c(4, 6)) +
   coord_fixed(ratio = 1)

 library(yardstick)
 predict(fit, ames_test) %>%
   bind_cols(ames_test) %>%
   rmse(Sale_Price, .pred)
 }

tidymodels/lantern documentation built on March 8, 2021, 8:53 a.m.