lantern_linear_reg: Fit a linear regression model


View source: R/lantern_linear_reg-fit.R

Description

lantern_linear_reg() fits a linear regression model by gradient descent, with optional weight decay (L2 regularization).

Usage

lantern_linear_reg(x, ...)

## Default S3 method:
lantern_linear_reg(x, ...)

## S3 method for class 'data.frame'
lantern_linear_reg(
  x,
  y,
  epochs = 20L,
  penalty = 0.001,
  validation = 0,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'matrix'
lantern_linear_reg(
  x,
  y,
  epochs = 20L,
  penalty = 0.001,
  validation = 0,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'formula'
lantern_linear_reg(
  formula,
  data,
  epochs = 20L,
  penalty = 0.001,
  validation = 0,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

## S3 method for class 'recipe'
lantern_linear_reg(
  x,
  data,
  epochs = 20L,
  penalty = 0.001,
  validation = 0,
  learn_rate = 0.01,
  momentum = 0,
  batch_size = NULL,
  conv_crit = -Inf,
  verbose = FALSE,
  ...
)

Arguments

x

Depending on the context:

  • A data frame of predictors.

  • A matrix of predictors.

  • A recipe specifying a set of preprocessing steps created from recipes::recipe().

The predictor data should be standardized (e.g. centered or scaled).

...

Not currently used, but required for extensibility.

y

When x is a data frame or matrix, y is the outcome specified as:

  • A data frame with 1 numeric column.

  • A matrix with 1 numeric column.

  • A numeric vector.

epochs

An integer for the number of epochs of training.

penalty

The amount of weight decay (i.e., L2 regularization).

validation

The proportion of the data randomly assigned to a validation set.

learn_rate

A positive number for the learning rate, i.e., the step size of each gradient update (usually less than 0.1).

momentum

A number in [0, 1] for the momentum parameter in gradient descent.

batch_size

An integer for the number of training set points in each batch.

conv_crit

A non-negative number for the convergence criterion (see Details). The default, -Inf, turns the criterion off so that all epochs are run.

verbose

A logical; if TRUE, the iteration history is printed.

formula

A formula specifying the outcome terms on the left-hand side, and the predictor terms on the right-hand side.

data

When a recipe or formula is used, data is specified as:

  • A data frame containing both the predictors and the outcome.

Details

The predictor data should all be numeric and encoded in the same units (e.g. standardized to the same range or distribution). If there are factor predictors, use a recipe or formula to create indicator variables (or some other method) to make them numeric.
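As an illustration of putting numeric predictors on the same scale, here is a minimal base R sketch. The toy data and the names train, test, centers, and scales are assumptions made for this example only; with a recipe, step_normalize() (as in the Examples) plays the same role.

```r
# Hypothetical toy data; these columns are illustrative only.
set.seed(1)
train <- data.frame(area = runif(50, 500, 3000),
                    year = sample(1950:2010, 50, replace = TRUE))
test  <- data.frame(area = runif(10, 500, 3000),
                    year = sample(1950:2010, 10, replace = TRUE))

# Estimate the centering and scaling values from the training set only.
centers <- colMeans(train)
scales  <- sapply(train, sd)

# Apply the same values to both sets so they share one scale.
train_std <- scale(as.matrix(train), center = centers, scale = scales)
test_std  <- scale(as.matrix(test),  center = centers, scale = scales)

# Each training column now has mean (approximately) zero and unit sd.
colMeans(train_std)
apply(train_std, 2, sd)
```

The resulting matrices could then be passed as x to the matrix method.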

The function internally standardizes the outcome data to have mean zero and a standard deviation of one. The prediction function creates predictions on the original scale.
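The round trip described above can be sketched in base R. This is not the package's internal code: lm() stands in for the actual fitting routine, and the simulated data and all object names are assumptions for illustration.

```r
# Hypothetical data with the outcome on its original scale.
set.seed(2)
x <- rnorm(100)
y <- 5 + 2 * x + rnorm(100, sd = 0.5)

# Standardize the outcome, keeping the constants needed to undo it later.
y_mean <- mean(y)
y_sd   <- sd(y)
y_std  <- (y - y_mean) / y_sd

# Stand-in fit on the standardized scale (lm() is used only for illustration).
fit_std <- lm(y_std ~ x)

# Predictions are produced on the standardized scale, then mapped back.
pred_std  <- predict(fit_std, data.frame(x = x))
pred_orig <- pred_std * y_sd + y_mean

# For least squares, this agrees with fitting on the original scale directly.
all.equal(unname(pred_orig), unname(fitted(lm(y ~ x))))
```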

If conv_crit is used, training stops when the improvement in the loss function falls below conv_crit or when the loss gets worse. The default trains the model over the specified number of epochs.
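The stopping rule can be sketched with a small full-batch gradient descent loop in base R. The helper fit_gd() is hypothetical and is not lantern's implementation: it ignores batch_size and validation, penalizes the intercept for simplicity, and does not use torch. It only shows how epochs, penalty, learn_rate, momentum, and conv_crit could interact.

```r
# Hypothetical data: an intercept column plus two standardized predictors.
set.seed(3)
n <- 200
x <- cbind(1, rnorm(n), rnorm(n))
y <- drop(x %*% c(0, 0.8, -0.5)) + rnorm(n, sd = 0.3)

# Hypothetical helper; not part of lantern.
fit_gd <- function(x, y, epochs = 20L, penalty = 0.001, learn_rate = 0.01,
                   momentum = 0, conv_crit = -Inf) {
  beta     <- rep(0, ncol(x))
  velocity <- rep(0, ncol(x))
  # Mean squared error plus an L2 (weight decay) penalty.
  loss_fn  <- function(b) mean((y - drop(x %*% b))^2) + penalty * sum(b^2)
  loss     <- loss_fn(beta)
  for (epoch in seq_len(epochs)) {
    resid    <- y - drop(x %*% beta)
    grad     <- -2 * drop(crossprod(x, resid)) / nrow(x) + 2 * penalty * beta
    # Momentum update: blend the previous step with the new gradient step.
    velocity <- momentum * velocity - learn_rate * grad
    beta     <- beta + velocity
    new_loss <- loss_fn(beta)
    improvement <- loss[length(loss)] - new_loss
    loss <- c(loss, new_loss)
    # A worse loss gives a negative improvement, which is also below any
    # non-negative conv_crit; with conv_crit = -Inf this never triggers.
    if (improvement < conv_crit) break
  }
  list(coefs = beta, loss = loss, epochs_run = length(loss) - 1L)
}

full  <- fit_gd(x, y, epochs = 200L, learn_rate = 0.1)
early <- fit_gd(x, y, epochs = 200L, learn_rate = 0.1, conv_crit = 1e-6)
c(full = full$epochs_run, early = early$epochs_run)
```

With the default conv_crit the loop always runs every epoch, while a small non-negative value lets it stop once the loss has stopped improving meaningfully.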

Value

A lantern_linear_reg object with elements:

Examples

if (torch::torch_is_installed()) {

 ## -----------------------------------------------------------------------------

 data(ames, package = "modeldata")

 ames$Sale_Price <- log10(ames$Sale_Price)

 set.seed(122)
 in_train <- sample(1:nrow(ames), 2000)
 ames_train <- ames[ in_train,]
 ames_test  <- ames[-in_train,]


 # Using matrices
 set.seed(1)
 lantern_linear_reg(x = as.matrix(ames_train[, c("Longitude", "Latitude")]),
                    y = ames_train$Sale_Price,
                    penalty = 0.10, epochs = 20, batch_size = 32)

 # Using recipe
 library(recipes)

 ames_rec <-
  recipe(Sale_Price ~ Bldg_Type + Neighborhood + Year_Built + Gr_Liv_Area +
         Full_Bath + Year_Sold + Lot_Area + Central_Air + Longitude + Latitude,
         data = ames_train) %>%
    # Transform some highly skewed predictors
    step_BoxCox(Lot_Area, Gr_Liv_Area) %>%
    # Lump some rarely occurring categories into "other"
    step_other(Neighborhood, threshold = 0.05)  %>%
    # Encode categorical predictors as binary.
    step_dummy(all_nominal(), one_hot = TRUE) %>%
    # Add an interaction effect:
    step_interact(~ starts_with("Central_Air"):Year_Built) %>%
    step_zv(all_predictors()) %>%
    step_normalize(all_predictors())

 set.seed(2)
 fit <- lantern_linear_reg(ames_rec, data = ames_train,
                           epochs = 20, batch_size = 32)
 fit

 # Attach ggplot2 first so that the autoplot() generic is available
 library(ggplot2)

 autoplot(fit)

 predict(fit, ames_test) %>%
   bind_cols(ames_test) %>%
   ggplot(aes(x = .pred, y = Sale_Price)) +
   geom_abline(col = "green") +
   geom_point(alpha = .3) +
   lims(x = c(4, 6), y = c(4, 6)) +
   coord_fixed(ratio = 1)

 library(yardstick)
 predict(fit, ames_test) %>%
   bind_cols(ames_test) %>%
   rmse(Sale_Price, .pred)

 }

tidymodels/lantern documentation built on March 8, 2021, 8:53 a.m.