s_GLM    R Documentation
Description

Train a Generalized Linear Model for Regression or Classification (i.e. Logistic Regression) using stats::glm. If outcome y has more than two classes, Multinomial Logistic Regression is performed using nnet::multinom.

Usage
s_GLM(
x,
y = NULL,
x.test = NULL,
y.test = NULL,
x.name = NULL,
y.name = NULL,
family = NULL,
interactions = NULL,
class.method = NULL,
weights = NULL,
ifw = TRUE,
ifw.type = 2,
upsample = FALSE,
downsample = FALSE,
resample.seed = NULL,
intercept = TRUE,
polynomial = FALSE,
poly.d = 3,
poly.raw = FALSE,
print.plot = FALSE,
plot.fitted = NULL,
plot.predicted = NULL,
plot.theme = rtTheme,
na.action = na.exclude,
removeMissingLevels = TRUE,
question = NULL,
verbose = TRUE,
trace = 0,
outdir = NULL,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE),
...
)
Arguments

x: Numeric vector or matrix / data frame of features, i.e. independent variables.
y: Numeric vector of outcome, i.e. dependent variable.
x.test: Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x.
y.test: Numeric vector of testing set outcome.
x.name: Character: Name for feature set.
y.name: Character: Name for outcome.
family: Error distribution and link function. See stats::glm.
interactions: List of character pairs denoting column names in x to be entered as interaction terms.
class.method: Character: Define "logistic" or "multinom" for classification. The only purpose of this is so you can try nnet::multinom for binary classification.
weights: Numeric vector: Weights for cases. For classification, weights take precedence over ifw.
ifw: Logical: If TRUE, apply inverse frequency weighting (for Classification only; see the sketch after this list). Note: If weights are provided, ifw is not used.
ifw.type: Integer {0, 1, 2}. 1: class.weights as in 0, divided by min(class.weights); 2: class.weights as in 0, divided by max(class.weights).
upsample: Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsampling will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness.
downsample: Logical: If TRUE, downsample majority class to match size of minority class.
resample.seed: Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed).
intercept: Logical: If TRUE, fit an intercept term.
polynomial: Logical: if TRUE, fit the model on poly(x, poly.d).
poly.d: Integer: degree of polynomial.
poly.raw: Logical: if TRUE, use raw polynomials. Default is FALSE and should not normally be changed.
print.plot: Logical: if TRUE, produce plot using mplot3.
plot.fitted: Logical: if TRUE, plot True (y) vs Fitted.
plot.predicted: Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test.
plot.theme: Character: "zero", "dark", "box", "darkbox".
na.action: How to handle missing values. See ?na.fail.
removeMissingLevels: Logical: If TRUE, finds factors in x.test with levels not present in x and substitutes those levels with NA (see Details).
question: Character: the question you are attempting to answer with this model, in plain language.
verbose: Logical: If TRUE, print summary to screen.
trace: Integer: If higher than 0, will print more information to the console.
outdir: Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE.
save.mod: Logical: If TRUE, save all output to an RDS file in outdir.
...: Additional arguments.
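To illustrate the class-balancing arguments above, the following sketch (toy data; it assumes the rtemis package is installed and loaded) fits a logistic model on an imbalanced binary outcome, once with inverse frequency weighting and once with upsampling:

library(rtemis)

## Toy imbalanced binary classification data
set.seed(2020)
x <- data.frame(a = rnorm(300), b = rnorm(300))
y <- factor(ifelse(x$a + rnorm(300) > 1, "pos", "neg"))  # "pos" is the minority class

## Inverse frequency weighting (ifw = TRUE is the default)
mod.ifw <- s_GLM(x, y, ifw = TRUE)

## Alternatively, balance classes by upsampling the minority class
mod.up <- s_GLM(x, y, ifw = FALSE, upsample = TRUE, resample.seed = 2020)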
Details

A common problem with glm arises when the testing set contains a predictor with more levels than the same predictor in the training set, which results in an error and no prediction. This can happen when training on resamples of a data set, especially after stratifying against a different outcome. s_GLM automatically finds such cases and substitutes levels present in x.test but not in x with NA.
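As a minimal sketch of this behavior (toy data; assumes the rtemis package is installed and loaded), the level "c" below appears only in the test set and is replaced with NA before prediction:

library(rtemis)

## Toy data: the factor level "c" occurs only in the test set
set.seed(2021)
x.train <- data.frame(num = rnorm(50),
                      grp = factor(rep(c("a", "b"), length.out = 50)))
y.train <- x.train$num + rnorm(50)
x.test <- data.frame(num = rnorm(10),
                     grp = factor(c(rep("a", 4), rep("b", 3), rep("c", 3))))
y.test <- x.test$num + rnorm(10)

## s_GLM substitutes the unseen level "c" in x.test with NA instead of erroring
mod <- s_GLM(x.train, y.train, x.test = x.test, y.test = y.test)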
Value

An rtMod object.

Author(s)

E.D. Gennatas
See Also

train_cv for external cross-validation

Other Supervised Learning: s_AdaBoost(), s_AddTree(), s_BART(), s_BRUTO(), s_BayesGLM(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GAM(), s_GBM(), s_GLMNET(), s_GLMTree(), s_GLS(), s_H2ODL(), s_H2OGBM(), s_H2ORF(), s_HAL(), s_KNN(), s_LDA(), s_LM(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MARS(), s_MLRF(), s_NBayes(), s_NLA(), s_NLS(), s_NW(), s_PPR(), s_PolyMARS(), s_QDA(), s_QRNN(), s_RF(), s_RFSRC(), s_Ranger(), s_SDA(), s_SGD(), s_SPLS(), s_SVM(), s_TFN(), s_XGBoost(), s_XRF()

Other Interpretable models: s_AddTree(), s_C50(), s_CART(), s_GLMNET(), s_GLMTree(), s_LMTree()
Examples

x <- rnorm(100)
y <- .6 * x + 12 + rnorm(100) / 2
mod <- s_GLM(x, y)
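A few further sketches extending the example above to classification and polynomial regression (toy data; assumes rtemis is loaded; the multinomial case additionally requires the nnet package):

## Binary classification (logistic regression): y is a two-level factor
x2 <- data.frame(a = rnorm(200), b = rnorm(200))
y2 <- factor(ifelse(x2$a - x2$b + rnorm(200) > 0, "yes", "no"))
mod.logistic <- s_GLM(x2, y2)

## An outcome with more than two classes triggers Multinomial Logistic
## Regression via nnet::multinom
y3 <- factor(sample(c("low", "mid", "high"), 200, replace = TRUE))
mod.multinom <- s_GLM(x2, y3)

## Polynomial regression on a single feature
x4 <- rnorm(200)
y4 <- x4^2 + rnorm(200) / 2
mod.poly <- s_GLM(x4, y4, polynomial = TRUE, poly.d = 2)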