View source: R/runUnivariate.R
getUnivariate | R Documentation |
Based on the input model (or input formula), runs a series of models predicting the dependent variable with one independent variable at a time (univariate).
getUnivariate(mod = NULL,
full_formula = NULL,
df = NULL,
model_class = NULL,
model_family = NULL,
returnIntercept = FALSE)
# If starting with a fitted model:
getUnivariate(mod,
returnIntercept = FALSE)
# If starting with a formula and data:
getUnivariate(full_formula,
df,
model_class,
model_family, # only required for glm models
returnIntercept = FALSE)
mod |
A model object. Supported types include lm, glm, polr. |
full_formula |
A formula object, for use when starting without a fitted model |
df |
A data frame, for use when starting without a fitted model |
model_class |
A string describing the model class e.g. "lm", "glm", or "polr". For use when starting without a fitted model |
model_family |
A string describing the model family required for a glm model, e.g. "binomial". For use when starting without a fitted model |
returnIntercept |
(optional) if set to |
Loops over each independent variable (IV) in the input model, predicting the dependent variable using that IV alone, i.e. "univariate". Hence each row represents a separate model*.
Input should be a fitted model with all the variables you want to test OR a formula object with all the variables you want to test. If a formula is supplied, you also need to supply the data and the class of model, e.g. "lm" (plus the model family for glm models).
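The loop described above can be sketched in base R. This is an illustration only, not the package's implementation: univariate_sketch is a hypothetical helper, it assumes an lm model, and it returns only the estimate and p-value for each non-intercept term.

```r
# Hypothetical helper, for illustration only (not part of the package).
univariate_sketch <- function(full_formula, df) {
  dv  <- all.vars(full_formula)[1]                 # dependent variable name
  ivs <- attr(terms(full_formula), "term.labels")  # one label per IV term
  rows <- lapply(ivs, function(iv) {
    # Fit a model with this IV alone, e.g. mpg ~ gear
    fit <- lm(reformulate(iv, response = dv), data = df)
    est <- coef(summary(fit))
    est <- est[rownames(est) != "(Intercept)", , drop = FALSE]
    data.frame(model    = iv,
               term     = rownames(est),
               estimate = est[, "Estimate"],
               p_value  = est[, "Pr(>|t|)"],
               row.names = NULL)
  })
  do.call(rbind, rows)  # one row per coefficient, grouped by univariate model
}

res <- univariate_sketch(mpg ~ gear + factor(carb) + hp, mtcars)
print(res)
```

Each value in the model column corresponds to one fitted model, so a factor such as factor(carb) contributes several rows from a single fit.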
This is useful for comparing against a multivariate model containing the same IVs. The univariate results show the individual impact of each IV, while the full regression shows how they might interact. For example, if two IVs are correlated, they might both come out as significant predictors in a univariate regression, but in a full regression their betas may affect each other in unpredictable ways (they may be suppressed or exaggerated depending on the nature of their interaction). For lm and glm regressions you should also refer to the "tolerance" output by getOutput to identify collinearity.
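As a rough illustration of the comparison above, the base R sketch below fits two correlated IVs from mtcars (hp and disp) separately and then together. The tolerance line uses the textbook definition, 1 minus the R-squared from regressing one IV on the other IVs. Whether getOutput computes tolerance in exactly this way is an assumption.

```r
# Univariate models: each IV on its own
uni_hp   <- coef(lm(mpg ~ hp,   data = mtcars))["hp"]
uni_disp <- coef(lm(mpg ~ disp, data = mtcars))["disp"]

# Full model: both IVs together
full <- coef(lm(mpg ~ hp + disp, data = mtcars))

# Side-by-side betas: univariate vs full regression
betas <- data.frame(term       = c("hp", "disp"),
                    univariate = unname(c(uni_hp, uni_disp)),
                    full       = unname(full[c("hp", "disp")]))
print(betas)

# Textbook tolerance for hp: 1 - R^2 of hp regressed on the other IVs.
# Values near 0 suggest collinearity; values near 1 suggest little overlap.
tol_hp <- 1 - summary(lm(hp ~ disp, data = mtcars))$r.squared
print(tol_hp)
```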
* NB: For factor and categorical variables, each level except the baseline is output on its own line, but it's important to remember that only one model is actually run. For example, consider predicting age using an "Income" variable with levels "Low", "Med" and "High". The univariate model run would be lm(age ~ Income), and two betas would be added to the result table, one for "IncomeMed" and one for "IncomeHigh". In these cases, if returnIntercept is set to TRUE, there is only one model intercept to output, so this value is repeated for each level of the factor.
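The factor behaviour in the note above can be seen with base R alone. The sketch below uses factor(cyl) from mtcars as a stand-in for the hypothetical "Income" variable. The table layout is an assumption for illustration and is not the function's actual output format.

```r
# One model, three factor levels (4, 6, 8); level 4 is the baseline
fit <- lm(mpg ~ factor(cyl), data = mtcars)
est <- coef(fit)

# One row per non-baseline level, with the single intercept repeated
lev <- est[names(est) != "(Intercept)"]
tab <- data.frame(term      = names(lev),
                  estimate  = unname(lev),
                  intercept = unname(est["(Intercept)"]))
print(tab)
```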
See also runUnivariate, which returns the result as a data frame rather than copying it to the clipboard.
# Running a linear regression using the built in mtcars data set...
m1 = lm(mpg ~ gear + factor(carb) + hp, mtcars)
# to run univariate results for each IV and copy the result to the clipboard:
getUnivariate(m1)
# to also return the intercept for each univariate model, run:
getUnivariate(m1, returnIntercept = TRUE)
# Running a version with a formula input...
getUnivariate(full_formula = mpg ~ gear + factor(carb) + hp,
df = mtcars,
model_class = "lm")