mdmb_regression: Several Regression Models with Prior Distributions and...
In mdmb: Model Based Treatment of Missing Data

View source: R/mdmb_regression.R

mdmb_regression

R Documentation

Several Regression Models with Prior Distributions and Sampling Weights

Description

Several regression functions which allow for sampling weights and prior distributions.

The function yjt_regression performs a linear regression in which the response variable is transformed according to the Yeo-Johnson transformation (Yeo & Johnson, 2000; see yjt_dist) and the residuals are distributed following the scaled t distribution. The degrees of freedom of the t distribution can be fixed or estimated (est_df=TRUE). The function bct_regression has same functionality like the Yeo-Johnson transformation but employs a Box-Cox transformation of the outcome variable.

The Yeo-Johnson transformation can be extended by a probit transformation (probit=TRUE) to cover the case of bounded variables on [0,1].

The function logistic_regression performs logistic regression for dichotomous data.

The function oprobit_regression performs ordinal probit regression for ordinal polytomous data.

Usage

#---- linear regression with Yeo-Johnson transformed scaled t distribution
yjt_regression(formula, data, weights=NULL, beta_init=NULL, beta_prior=NULL,
        df=Inf, lambda_fixed=NULL, probit=FALSE, est_df=FALSE, df_min=0.5, df_max=100,
        use_grad=2, h=1e-5, optimizer="optim", maxiter=300, control=NULL)

## S3 method for class 'yjt_regression'
coef(object, ...)
## S3 method for class 'yjt_regression'
logLik(object, ...)
## S3 method for class 'yjt_regression'
predict(object, newdata=NULL, trafo=TRUE,  ...)
## S3 method for class 'yjt_regression'
summary(object, digits=4, file=NULL, ...)
## S3 method for class 'yjt_regression'
vcov(object, ...)

#---- linear regression with Box-Cox transformed scaled t distribution
bct_regression(formula, data, weights=NULL, beta_init=NULL, beta_prior=NULL,
        df=Inf, lambda_fixed=NULL, est_df=FALSE, use_grad=2, h=1e-5,
        optimizer="optim", maxiter=300, control=NULL)

## S3 method for class 'bct_regression'
coef(object, ...)
## S3 method for class 'bct_regression'
logLik(object, ...)
## S3 method for class 'bct_regression'
predict(object, newdata=NULL, trafo=TRUE, ...)
## S3 method for class 'bct_regression'
summary(object, digits=4, file=NULL, ...)
## S3 method for class 'bct_regression'
vcov(object, ...)

#---- logistic regression
logistic_regression(formula, data, weights=NULL, beta_init=NULL,
         beta_prior=NULL, use_grad=2, h=1e-5, optimizer="optim", maxiter=300,
         control=NULL)

## S3 method for class 'logistic_regression'
coef(object, ...)
## S3 method for class 'logistic_regression'
logLik(object, ...)
## S3 method for class 'logistic_regression'
predict(object, newdata=NULL, ...)
## S3 method for class 'logistic_regression'
summary(object, digits=4, file=NULL, ...)
## S3 method for class 'logistic_regression'
vcov(object, ...)

#---- ordinal probit regression
oprobit_regression(formula, data, weights=NULL, beta_init=NULL,
        use_grad=2, h=1e-5, optimizer="optim", maxiter=300,
        control=NULL, control_optim_fct=NULL)

## S3 method for class 'oprobit_regression'
coef(object, ...)
## S3 method for class 'oprobit_regression'
logLik(object, ...)
## S3 method for class 'oprobit_regression'
predict(object, newdata=NULL, ...)
## S3 method for class 'oprobit_regression'
summary(object, digits=4, file=NULL, ...)
## S3 method for class 'oprobit_regression'
vcov(object, ...)

Arguments

`formula`	Formula
`data`	Data frame. The dependent variable must be coded as 0 and 1.
`weights`	Optional vector of sampling weights
`beta_init`	Optional vector of initial regression coefficients
`beta_prior`	Optional list containing priors of all parameters (see Examples for definition of this list).
`df`	Fixed degrees of freedom for scaled `t` distribution
`lambda_fixed`	Optional fixed value for `\lambda` for scaled `t` distribution with Yeo-Johnson transformation
`probit`	Logical whether probit transformation should be employed for bounded outcome in `yjt_regression`
`est_df`	Logical indicating whether degrees of freedom in `t` distribution should be estimated.
`df_min`	Minimum value for estimated degrees of freedom
`df_max`	Maximum value for estimated degrees of freedom
`use_grad`	Computation method for gradients in `stats::optim`. The value `0` is the internal approximation of `stats::optim` and applies the settings in mdmb (`\le`0.3). The specification `use_grad=1` uses the calculation of the gradient in `CDM::numerical_Hessian`. The value `2` is usually the most efficient calculation of the gradient.
`h`	Numerical differentiation parameter.
`optimizer`	Type of optimizer to be chosen. Options are `"nlminb"` (`stats::nlminb`) and the default `"optim"` (`stats::optim`)
`maxiter`	Maximum number of iterations
`control`	Optional arguments to be passed to optimization function (`stats::nlminb`) or `stats::optim`
`control_optim_fct`	Optional control argument for gradient in optimization
`object`	Object of class `logistic_regression`
`newdata`	Design matrix for `predict` function
`trafo`	Logical indicating whether fitted values should be on the transformed metric (`trafo=TRUE`) or the original metric (`trafo=FALSE`)
`digits`	Number of digits for rounding
`file`	File name if the `summary` output should be sunk into a file.
`...`	Further arguments to be passed.

Value

List containing values

`coefficients`	Estimated regression coefficients
`vcov`	Estimated covariance matrix
`partable`	Parameter table
`y`	Vector of values of dependent variable
`Xdes`	Design matrix
`weights`	Sampling weights
`fitted.values`	Fitted values in metric of probabilities
`linear.predictor`	Fitted values in metric of logits
`loglike`	Log likelihood value
`logprior`	Log prior value
`logpost`	Log posterior value
`deviance`	Deviance
`loglike_case`	Case-wise likelihood
`ic`	Information criteria
`R2`	Pseudo R-square value according to McKelvey and Zavoina

Author(s)

Alexander Robitzsch

References

McKelvey, R., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4(1), 103-120. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/0022250X.1975.9989847")}

Yeo, I.-K., & Johnson, R. (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87(4), 954-959. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1093/biomet/87.4.954")}

Examples

#############################################################################
# EXAMPLE 1: Simulated example logistic regression
#############################################################################

#--- simulate dataset
set.seed(986)
N <- 500
x <- stats::rnorm(N)
y <- 1*( stats::runif(N) < stats::plogis( -0.8 + 1.2 * x ) )
data <- data.frame( x=x, y=y )

#--- estimate logistic regression with mdmb::logistic_regression
mod1 <- mdmb::logistic_regression( y ~ x, data=data )
summary(mod1)

## Not run: 
#--- estimate logistic regression with stats::glm
mod1b <- stats::glm( y ~ x, data=data, family="binomial")
summary(mod1b)

#--- estimate logistic regression with prior distributions
b0 <- list( "dnorm", list(mean=0, sd=100) )  # first parameter
b1 <- list( "dcauchy", list(location=0, scale=2.5) )   # second parameter
beta_priors <- list( b0, b1 )  # order in list defines priors for parameters
#* estimation
mod2 <- mdmb::logistic_regression( y ~ x, data=data, beta_prior=beta_priors )
summary(mod2)

#############################################################################
# EXAMPLE 2: Yeo-Johnson transformed scaled t regression
#############################################################################

#*** create simulated data
set.seed(9865)
n <- 1000
x <- stats::rnorm(n)
y <- .5 + 1*x + .7*stats::rt(n, df=8 )
y <- mdmb::yj_antitrafo( y, lambda=.5 )
dat <- data.frame( y=y, x=x )
# display data
graphics::hist(y)

#--- Model 1: fit regression model with transformed normal distribution (df=Inf)
mod1 <- mdmb::yjt_regression( y ~ x, data=dat )
summary(mod1)

#--- Model 2: fit regression model with transformed scaled t distribution (df=10)
mod2 <- mdmb::yjt_regression( y ~ x, data=dat, df=10)
summary(mod2)

#--- Model 3: fit regression model with transformed normal distribution (df=Inf)
#             and fixed transformation parameter lambda of .5
mod3 <- mdmb::yjt_regression( y ~ x, data=dat, lambda_fixed=.5)
summary(mod3)

#--- Model 4: fit regression model with transformed normal distribution (df=Inf)
#             and fixed transformation parameter lambda of 1
#             -> This model corresponds to least squares regression
mod4 <- mdmb::yjt_regression( y ~ x, data=dat, lambda_fixed=1)
summary(mod4)

# fit with lm function
mod4b <- stats::lm( y ~ x, data=dat )
summary(mod4b)

#--- Model 5: fit regression model with estimated degrees of freedom
mod5 <- mdmb::yjt_regression( y ~ x, data=dat, est_df=TRUE)
summary(mod5)

#** compare log-likelihood values
logLik(mod1)
logLik(mod2)
logLik(mod3)
logLik(mod4)
logLik(mod4b)
logLik(mod5)

#############################################################################
# EXAMPLE 3: Regression with Box-Cox and Yeo-Johnson transformations
#############################################################################

#*** simulate data
set.seed(985)
n <- 1000
x <- stats::rnorm(n)
y <- .5 + 1*x + stats::rnorm(n, sd=.7 )
y <- mdmb::bc_antitrafo( y, lambda=.5 )
dat <- data.frame( y=y, x=x )

#--- Model 1: fit regression model with Box-Cox transformation
mod1 <- mdmb::bct_regression( y ~ x, data=dat )
summary(mod1)
#--- Model 2: fit regression model with Yeo-Johnson transformation
mod2 <- mdmb::yjt_regression( y ~ x, data=dat )
summary(mod2)
#--- compare fit
logLik(mod1)
logLik(mod2)

#############################################################################
# EXAMPLE 4: Ordinal probit regression
#############################################################################

#--- simulate data
set.seed(987)
N <- 1500
x <- stats::rnorm(N)
z <- stats::rnorm(N)
# regression coefficients
b0 <- -.5 ; b1 <- .6 ; b2 <- .1
# vector of thresholds
thresh <- c(-1, -.3, 1)
yast <- b0 + b1 * x + b2*z + stats::rnorm(N)
y <- as.numeric( cut( yast, c(-Inf,thresh,Inf) ) ) - 1
dat <- data.frame( x=x, y=y, z=z )

#--- probit regression
mod <- mdmb::oprobit_regression( formula=y ~ x + z + I(x*z), data=dat)
summary(mod)

## End(Not run)

mdmb documentation built on Sept. 11, 2024, 5:23 p.m.