regress: General Regression for an Arbitrary Functional

View source: R/regress.R

regress R Documentation

General Regression for an Arbitrary Functional

Description

Produces point estimates, interval estimates, and p values for an arbitrary functional (mean, geometric mean, odds, rate, hazard) of a variable of class integer or numeric when regressed on an arbitrary number of covariates. Multiple-partial F-tests can be specified using the U function.

Usage

regress(
  fnctl,
  formula,
  data,
  intercept = TRUE,
  weights = rep(1, nrow(data.frame(data))),
  subset = rep(TRUE, nrow(data.frame(data))),
  robustSE = TRUE,
  conf.level = 0.95,
  exponentiate = fnctl != "mean",
  replaceZeroes,
  useFdstn = TRUE,
  suppress = FALSE,
  na.action,
  method = "qr",
  qr = TRUE,
  singular.ok = TRUE,
  contrasts = NULL,
  init = NULL,
  ties = "efron",
  offset,
  control = list(...),
  ...
)

Arguments

fnctl

a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include "mean", "geometric mean", "odds", "rate", "hazard".

formula

an object of class formula as might be passed to lm, glm, or coxph. Functions of variables, specified using dummy or polynomial, may also be included in formula.

data

a data frame, matrix, or other data structure with matching names to those entered in formula.

intercept

a logical value indicating whether an intercept is included in the model. The default is TRUE for all functionals. The intercept can also be removed by including "-1" in formula; if "-1" is present in formula, the model is fit without an intercept even when intercept = TRUE is specified. Note that when fnctl = "hazard", the intercept is always set to FALSE because Cox proportional hazards regression models do not explicitly estimate an intercept.

weights

vector indicating optional weights for weighted regression.

subset

vector indicating a subset to be used for all inference.

robustSE

a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.
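To make the idea behind robustSE concrete, the following base R sketch computes a Huber-White sandwich variance by hand for an ordinary linear model. It uses the built-in mtcars data purely for illustration and the plain "HC0" form of the estimator; whether regress applies a finite-sample adjustment on top of this is not stated here, so treat the numbers as illustrative rather than as a reproduction of regress output.

```r
# Illustrative only: mtcars is a built-in dataset, not part of this package.
fit <- lm(mpg ~ wt, data = mtcars)

X <- model.matrix(fit)   # design matrix, one row per observation
e <- residuals(fit)      # ordinary least squares residuals

# "Bread": inverse of X'X
bread <- solve(crossprod(X))

# "Meat": sum over observations of e_i^2 * x_i x_i'
# (X * e scales row i of X by e_i)
meat <- crossprod(X * e)

# Sandwich variance (HC0 form, assumed here for simplicity)
vcov_hc0 <- bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc0))   # robust standard errors
model_se  <- sqrt(diag(vcov(fit)))  # classical model-based standard errors

print(cbind(model_se, robust_se))
```

Comparing the two columns shows how much the standard errors change when the constant-variance assumption is dropped.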

conf.level

a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.

exponentiate

a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.
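As a rough illustration of what exponentiation means for a model fit on the log scale, the base R sketch below back-transforms a slope and its confidence interval into a ratio. The mtcars data and the use of lm and confint are assumptions made for the example; the interval here is the classical model-based one, not the robust interval that regress reports by default.

```r
# Illustrative only: a log-scale linear model on the built-in mtcars data.
fit <- lm(log(mpg) ~ wt, data = mtcars)

est <- coef(fit)["wt"]                    # slope on the log scale
ci  <- confint(fit, "wt", level = 0.95)   # model-based 95% interval

# Exponentiating gives a ratio of geometric means per one-unit
# difference in wt, with the interval transformed the same way.
ratio    <- exp(est)
ratio_ci <- exp(ci)

print(ratio)
print(ratio_ci)
```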

replaceZeroes

if not FALSE, this indicates a value to be used in place of zeroes when computing a geometric mean. If TRUE, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used. Defaults to TRUE when fnctl = "geometric mean". This parameter is always FALSE for all other values of fnctl.
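The replacement rule described for replaceZeroes = TRUE can be sketched in a few lines of base R. The vector y below is made-up illustrative data, and the code only mirrors the rule as stated above; it is not taken from the package source.

```r
# Hypothetical outcome vector containing zeroes (illustrative data only)
y <- c(0, 2, 4, 0, 8)

# One-half of the lowest nonzero value, as described for replaceZeroes = TRUE
replacement <- min(y[y > 0]) / 2

# Substitute the replacement for each zero so that log() is defined
y_filled <- ifelse(y == 0, replacement, y)

# Geometric mean of the adjusted values
geo_mean <- exp(mean(log(y_filled)))

print(replacement)
print(y_filled)
print(geo_mean)
```

Supplying a numeric value for replaceZeroes would correspond to setting replacement directly instead of computing it from the data.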

useFdstn

a logical indicator that the F distribution should be used for test statistics instead of the chi-squared distribution, even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as they would be in a linear regression model.
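The practical effect of useFdstn can be seen by comparing the two reference distributions for the same test statistic. In the base R sketch below, the statistic, sample size, and parameter count are hypothetical values chosen for illustration.

```r
# Hypothetical single-parameter test statistic and model dimensions
stat <- 4.2   # assumed value of the test statistic
n    <- 50    # assumed sample size
p    <- 3     # assumed number of regression parameters

# Chi-squared reference distribution with 1 degree of freedom
p_chisq <- pchisq(stat, df = 1, lower.tail = FALSE)

# F reference distribution with denominator df = n - p,
# matching the degrees of freedom described for useFdstn = TRUE
p_f <- pf(stat, df1 = 1, df2 = n - p, lower.tail = FALSE)

print(c(chi_squared = p_chisq, F = p_f))
```

The F-based p value is the larger of the two, and the gap shrinks as n - p grows.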

suppress

if TRUE, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned - one with the original unexponentiated coefficients, and one with the exponentiated coefficients.

na.action, qr, singular.ok, offset, contrasts, control

optional arguments that are passed to lm or glm.

method

the method to be used in fitting the model. The default value for fnctl = "mean" and fnctl = "geometric mean" is "qr", and the default value for fnctl = "odds" and fnctl = "rate" is "glm.fit". This argument is passed into the lm() or glm() function, respectively. You may optionally specify method = "model.frame", which returns the model frame and does no fitting.

init

a numeric vector of initial values for the regression parameters for the hazard regression. Default initial value is zero for all variables.

ties

a character string describing the method for handling ties in hazard regression. Only "efron", "breslow", or "exact" is accepted. See the documentation for this argument in survival::coxph for details. Defaults to "efron".

...

additional arguments to be passed to the lm function call.

Details

Regression models include linear regression (for the “mean” functional), logistic regression with logit link (for the “odds” functional), Poisson regression with log link (for the “rate” functional), linear regression of a log-transformed outcome (for the “geometric mean” functional), and Cox proportional hazards regression (for the “hazard” functional).
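For orientation, the correspondence between functionals and model families listed above can be sketched with base R fitting functions. This is only a sketch of the model types using the built-in mtcars data (an assumption made for illustration); it does not reproduce the inference that regress layers on top, such as robust standard errors. The hazard functional corresponds to survival::coxph and is omitted here to keep the example free of extra packages.

```r
# Illustrative only: model families corresponding to each functional,
# fit on the built-in mtcars data.

# "mean": linear regression
fit_mean <- lm(mpg ~ wt, data = mtcars)

# "geometric mean": linear regression of the log-transformed outcome
fit_geo <- lm(log(mpg) ~ wt, data = mtcars)

# "odds": logistic regression with logit link (am is a 0/1 variable)
fit_odds <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# "rate": Poisson regression with log link (carb is a count)
fit_rate <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)

# Link functions actually used by the two generalized linear models
print(family(fit_odds)$link)
print(family(fit_rate)$link)
```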

Currently, for the hazard functional, only 'coxph' syntax is supported; in other words, using the 'dummy', 'polynomial', and 'U' functions will result in an error when 'fnctl = "hazard"'.

Note that the only possible link function in 'regress' with 'fnctl = "odds"' is the logit link. Similarly, the only possible link function in 'regress' with 'fnctl = "rate"' is the log link.

Objects created using the U function can also be passed in. If the U call involves a partial formula of the form ~ var1 + var2, then regress will return a multiple-partial F-test involving var1 and var2. If an F-statistic will already be calculated regardless of the U specification, then any naming convention specified via name ~ var1 will be ignored. The multiple partial tests must be the last terms specified in the model (i.e. no other predictors can follow them).
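The idea of a multiple-partial F-test, testing several coefficients jointly, can be sketched in base R by comparing nested linear models. This uses the built-in mtcars data and the classical model-based F-test from anova, both assumptions made for illustration; regress with robustSE = TRUE bases its test on robust standard errors, so its statistic will generally differ.

```r
# Illustrative only: joint test of two predictors via nested models.
reduced <- lm(mpg ~ wt, data = mtcars)               # without hp and qsec
full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)   # with hp and qsec

# Classical F-test of H0: the coefficients of hp and qsec are both zero
cmp <- anova(reduced, full)

f_stat  <- cmp$F[2]           # F statistic for the joint test
num_df  <- cmp$Df[2]          # numerator degrees of freedom (2 here)
p_value <- cmp$`Pr(>F)`[2]    # p value for the joint test

print(cmp)
```

In regress, the analogous request would place a term such as U(~ hp + qsec) at the end of the formula, following the ordering rule described above.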

Value

An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.
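A short sketch of inspecting the returned object follows. It assumes the rigr package and its mri dataset are installed, and it is guarded so that it does nothing otherwise; only the class name and the $augCoefficients component described above are used.

```r
# Runs only if the rigr package is available
if (requireNamespace("rigr", quietly = TRUE)) {
  data(mri, package = "rigr")

  fit <- rigr::regress("mean", atrophy ~ age, data = mri)

  # The returned object has class "uRegress"
  print(class(fit))

  # Estimates, confidence intervals, and p values
  print(fit$augCoefficients)
}
```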

See Also

Functions for fitting linear models (lm) and generalized linear models (glm). Also see U, the function used to specify multiple-partial F-tests.

Examples

# Loading dataset
data(mri)

# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)

# Linear regression of atrophy on sex and height and their interaction, 
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)

# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)

# Cox regression of survival on age
library(survival)
regress("hazard", Surv(obstime, death) ~ age, data = mri)

rigr documentation built on Sept. 7, 2022, 1:05 a.m.