# regress: General Regression for an Arbitrary Functional In uwIntroStats: Descriptive Statistics, Inference, Regression, and Plotting in an Introductory Statistics Course

## Description

Produces point estimates, interval estimates, and p-values for an arbitrary functional (mean, geometric mean, proportion, median, quantile, odds) of a variable of class `integer`, `numeric`, or `Surv` when regressed on an arbitrary number of covariates. Multiple-partial F-tests can be specified using the `U` function.

## Usage

```
regress(fnctl, formula, data, intercept = fnctl != "hazard",
        strata = rep(1, n), weights = rep(1, n), id = 1:n,
        ties = "efron", subset = rep(TRUE, n), robustSE = TRUE,
        conf.level = 0.95, exponentiate = fnctl != "mean",
        replaceZeroes, useFdstn = TRUE, suppress = FALSE,
        na.action, method = "qr", model.f = TRUE, model.x = FALSE,
        model.y = FALSE, qr = TRUE, singular.ok = TRUE,
        contrasts = NULL, offset, control = list(...), init, ...,
        version = FALSE)
```

## Arguments

- `fnctl`: a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include `"mean"`, `"geometric mean"`, `"odds"`, `"rate"`, `"hazard"`. The character string may be shortened to a unique substring. Hence `"mea"` will suffice for `"mean"`.
- `formula`: an object of class `formula` as might be passed to `lm`, `glm`, or `coxph`.
- `data`: a data frame, matrix, or other data structure with names matching those entered in `formula`.
- `intercept`: a logical value indicating whether an intercept exists or not.
- `strata`: a vector indicating a variable to be used for stratification in proportional hazards regression.
- `weights`: a vector indicating optional weights for weighted regression.
- `id`: a vector of ids. If any ids are repeated, a clustered regression is run.
- `ties`: one of `"efron"` (the default), `"breslow"`, or `"exact"`. Determines the method used to handle ties in proportional hazards regression.
- `subset`: a vector indicating a subset to be used for all inference.
- `robustSE`: a logical indicator that standard errors are to be computed using the Huber-White sandwich estimator.
- `conf.level`: a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.
- `exponentiate`: a logical indicator that the regression parameters should be exponentiated. This is `TRUE` by default for all functionals except the mean.
- `replaceZeroes`: if not `FALSE`, this indicates a value to be used in place of zeroes when computing a geometric mean. If `TRUE`, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used.
- `useFdstn`: a logical indicator that the F distribution should be used for test statistics instead of the chi-squared distribution, even in logistic and proportional hazards regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as they would be in a linear regression model.
- `suppress`: if `TRUE`, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned: one with the original unexponentiated coefficients, and one with the exponentiated coefficients.
- `na.action, method, model.f, model.x, model.y, qr, singular.ok, offset, contrasts, control`: optional arguments that are passed to the functionality of `lm` or `glm`.
- `init`: an optional argument that is passed to the functionality of `coxph`.
- `...`: other arbitrary parameters.
- `version`: if `TRUE`, returns the version of the function. No other computation is performed.
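The `replaceZeroes` rule described above can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helper name `replace_zeroes` and the toy vector `y` are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not part of uwIntroStats) that mirrors the documented
# behaviour of the replaceZeroes argument.
replace_zeroes <- function(y, replaceZeroes = TRUE) {
  # FALSE: leave the data untouched
  if (identical(replaceZeroes, FALSE)) return(y)
  # TRUE: use one-half the lowest nonzero (positive) value;
  # a numeric value: use that value directly
  fill <- if (isTRUE(replaceZeroes)) {
    min(y[y > 0], na.rm = TRUE) / 2
  } else {
    replaceZeroes
  }
  y[!is.na(y) & y == 0] <- fill
  y
}

y <- c(0, 2, 4, 0, 8)
replace_zeroes(y)         # zeroes replaced by 1, half of the lowest nonzero value (2)
replace_zeroes(y, 0.5)    # zeroes replaced by the supplied value 0.5
replace_zeroes(y, FALSE)  # unchanged

# A geometric mean is only defined once the zeroes are gone
exp(mean(log(replace_zeroes(y))))
```

Without a replacement, `log(0)` is `-Inf` and the geometric mean collapses to zero, which is why a substitute value is needed for this functional.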

## Details

Regression models include linear regression (for the `"mean"` functional), logistic regression (for the `"odds"` functional), and Poisson regression (for the `"rate"` functional). Proportional hazards regression is currently not supported in the `regress` function.

Objects created using the `U` function can also be passed in. If the `U` call involves a partial formula of the form `~ var1 + var2`, then `regress` will return a multiple-partial F-test involving `var1` and `var2`. The multiple-partial tests must be the last terms specified in the model (i.e. no other predictors can follow them).
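For orientation, the model families named above have rough base R analogues in `lm` and `glm`. The sketch below uses simulated data; the data frame `d` and its column names are made up for illustration. The nested-model comparison with `anova` is the classical multiple-partial F-test, so its result should not be expected to match `regress` output exactly, since `regress` defaults to `robustSE = TRUE`.

```r
# Simulated data; all names here are illustrative only
set.seed(1)
n <- 200
d <- data.frame(age  = rnorm(n, 70, 5),
                male = rbinom(n, 1, 0.5),
                chf  = rbinom(n, 1, 0.2))
d$atrophy <- 5 + 0.4 * d$age + 2 * d$male + rnorm(n, sd = 5)
d$event   <- rbinom(n, 1, plogis(-7 + 0.1 * d$age))
d$count   <- rpois(n, exp(-2 + 0.03 * d$age))

# Base R analogues of the three supported functionals
fit_mean <- lm(atrophy ~ age + male + chf, data = d)                    # "mean"
fit_odds <- glm(event ~ age + male + chf, family = binomial, data = d)  # "odds"
fit_rate <- glm(count ~ age + male + chf, family = poisson, data = d)   # "rate"

# Exponentiated coefficients: odds ratios and rate ratios
exp(coef(fit_odds))
exp(coef(fit_rate))

# Classical multiple-partial F-test of male and chf jointly,
# comparing the full model against the model without those two terms
reduced   <- lm(atrophy ~ age, data = d)
partial_F <- anova(reduced, fit_mean)
partial_F
```

In `regress` the same joint test would be requested inside the formula with `U`, placed after the other predictors.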

## Value

An object of class `uRegress` is returned. Parameter estimates, confidence intervals, and p-values are contained in a matrix `$augCoefficients`.
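A short sketch of inspecting the returned object follows. It assumes the uwIntroStats package is installed (the code is skipped otherwise) and uses the built-in `mtcars` data purely as a stand-in. The column layout of `$augCoefficients` is not documented here, so the code only prints the matrix.

```r
# Runs only when uwIntroStats is available; mtcars is a stand-in dataset
if (requireNamespace("uwIntroStats", quietly = TRUE)) {
  library(uwIntroStats)

  # Linear regression of fuel economy on weight
  fit <- regress("mean", mpg ~ wt, data = mtcars)

  # Class of the returned object (documented as uRegress)
  print(class(fit))

  # Matrix of estimates, confidence intervals, and p-values
  print(fit$augCoefficients)
}
```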

## Author(s)

Scott S. Emerson, M.D., Ph.D., Andrew J. Spieker, Brian D. Williamson, Travis Hee Wai

## See Also

Functions for fitting linear models (`lm`), generalized linear models (`glm`), proportional hazards models (`coxph`), and generalized estimating equations (`geeglm`). Also see the function to specify multiple-partial F-tests, `U`.

## Examples

```
# Loading required libraries
library(survival)
library(sandwich)
library(uwIntroStats)

# Reading in a dataset
mri <- read.table("http://www.emersonstatistics.com/datasets/mri.txt", header = TRUE)

# Creating a Surv object to reflect time to death
mri$ttodth <- Surv(mri$obstime, mri$death)

# Attaching the mri dataset
attach(mri)

# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)

## Linear regression of atrophy on male, race, age, and the race-age interaction,
## with a multiple-partial F-test (named ra) on the terms in race*age
regress("mean", atrophy ~ male + U(ra = ~race*age), data = mri)

## Linear regression of atrophy on age, male, race (as a dummy variable), chf,
## and diabetes. There are two multiple-partial F-tests and both are named
regress("mean", atrophy ~ age + male + U(rc = ~dummy(race) + chf) + U(md = ~male + diabetes), data = mri)
```