
---
title: 'BetterReg: An R package for Useful Regression Statistics'
tags:
  - R
  - OLS Regression
  - Logistic Regression
authors:
  - name: Christopher L. Aberson
    orcid: 0000-0003-3481-7177
    affiliation: 1
affiliations:
  - name: Cal Poly Humboldt
    index: 1
date: 10 May 2022
bibliography: paper.bib
---

# Summary

Statistics such as squared semipartial correlations, tolerance, and Mahalanobis Distances are useful for reporting the results of OLS (Ordinary Least Squares) regression [@tabachnick_using_2019], as are the Likelihood Ratio Chi-square [@cohen_applied_2002] and Likelihood Ratio R-square [@menard_logistic_2010] for logistic regression. Such statistics are not part of base R [@r_core_team_r_2022] or popular packages such as car [@fox_r_2019]. To fill these gaps, the BetterReg package was developed to provide these statistics and measures.

Squared semipartial correlations provide a measure of uniquely explained variance that is on the same scale as $R^2$ values. Tolerance values address multicollinearity by quantifying the proportion of a predictor's variance that is unexplained by the other predictors. Mahalanobis Distance is a popular measure of multivariate outliers, presented on a $\chi^2$ scale. The Likelihood Ratio $\chi^2$ provides a significance test that is more stable than the commonly presented Wald test, and the Likelihood Ratio $R^2$ is the most widely recommended pseudo-$R^2$ statistic for logistic regression.
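As a concrete illustration, a squared semipartial is the drop in $R^2$ when a single predictor is removed from the model. A minimal base-R sketch (variable and data names assume the testreg example used below):

    # squared semipartial for x1: R-squared lost when x1 is dropped
    full    <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = testreg)
    reduced <- lm(y ~ x2 + x3 + x4 + x5, data = testreg)
    summary(full)$r.squared - summary(reduced)$r.squared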

The target audience for this package is researchers using OLS and logistic regression. At present, no other R package provides these statistics, so obtaining them requires researchers to write their own code. These statistics are, however, widely available in commercial programs such as SAS, SPSS, and Stata.

# Usage

BetterReg functions require existing regression models (either OLS or logistic, for most statistics), dataset names (for some approaches), the number of predictors (for some functions), and the desired amount of output (for the Mahal function).

## parts function for squared semipartial correlations

The parts function requires an existing lm model and the number of predictors:

    library(BetterReg)
    mymodel <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = testreg)
    parts(model = mymodel, pred = 5)
## Predictor 1: semi partial = 0.032; squared semipartial = 0.001
## Predictor 2: semi partial = 0.307; squared semipartial = 0.094
## Predictor 3: semi partial = 0.268; squared semipartial = 0.072
## Predictor 4: semi partial = 0.134; squared semipartial = 0.018
## Predictor 5: semi partial = 0.241; squared semipartial = 0.058

## R2change function for testing improvement in $R^2$ between models

The R2change function requires two nested models fit to the same cases; each model must have the same number of rows.
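A minimal nested-model setup (the predictor split here is an illustrative assumption) might be:

    mymodel1 <- lm(y ~ x1 + x2, data = testreg)
    mymodel2 <- lm(y ~ x1 + x2 + x3 + x4, data = testreg)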

    R2change(model1 = mymodel1, model2 = mymodel2)
## R-square change = 0.09
## F(2,995) = 54.764, p = 2.73174803699611e-23

## depbcomp function for comparing dependent regression coefficients

The depbcomp function takes the data, the outcome and predictor variable names, the number of predictors (numpred), and the type of comparison (comps) as arguments. Dependent coefficients are coefficients from the same regression model:

    depbcomp(data = testreg, y = "y", x1 = "x1",
             x2 = "x2", x3 = "x3", x4 = "x4", x5 = "x5",
             numpred = 5, comps = "abs")
## Pred 1 vs. Pred 2  : t = 7.004, p = 4.57522908448027e-12
## Pred 1 vs. Pred 3  : t = 6.21, p = 7.79647457704868e-10
## Pred 1 vs. Pred 4  : t = 2.751, p = 0.00604702058333784
## Pred 1 vs. Pred 5  : t = 5.31, p = 1.3508334650858e-07
## Pred 2 vs. Pred 3  : t = 0.681, p = 0.495955077475793
## Pred 2 vs. Pred 4  : t = 4.189, p = 3.05299716290008e-05
## Pred 2 vs. Pred 5  : t = 1.612, p = 0.107363700946729
## Pred 3 vs. Pred 4  : t = 3.444, p = 0.000596991746199649
## Pred 3 vs. Pred 5  : t = 0.891, p = 0.373356929374812
## Pred 4 vs. Pred 5  : t = 2.553, p = 0.0108146623166698
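For reference, a difference between two coefficients from the same model can be tested using the coefficient covariance matrix. A base-R sketch of that idea (depbcomp's exact computation, including its handling of comps = "abs", may differ):

    b <- coef(mymodel)  # coefficients from a single model
    V <- vcov(mymodel)  # their covariance matrix
    # t for Pred 1 vs. Pred 2: difference over the SE of the difference
    (b["x1"] - b["x2"]) / sqrt(V["x1", "x1"] + V["x2", "x2"] - 2 * V["x1", "x2"])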

## indbcomp function for comparing independent regression coefficients

The indbcomp function requires two models of the same form fit to two different samples. Independent coefficients are the coefficients obtained from different samples using the same regression model.
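For illustration, the same model might be fit to two independent samples (the sample data frame names here are assumptions):

    model1_2 <- lm(y ~ x1 + x2, data = sample1)  # fit in sample 1
    model2_2 <- lm(y ~ x1 + x2, data = sample2)  # same model, sample 2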

    indbcomp(model1 = model1_2, model2 = model2_2, comps = "abs")
## Predictor 1:  t = 0.362, p = 0.718
## Predictor 2:  t = 0.265, p = 0.792

## tolerance function for assessing multicollinearity

The tolerance function requires only a model:

    mymodel <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = testreg)
    tolerance(model = mymodel)
##        x1        x2        x3        x4        x5 
## 0.9976977 0.9990479 0.9931082 0.9953317 0.9980628
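Because tolerance is the reciprocal of the variance inflation factor, these values can be cross-checked against car::vif [@fox_r_2019], assuming the car package is installed:

    library(car)
    1 / vif(mymodel)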

## Mahal function for detecting multivariate outliers

The Mahal function requires a model, the number of predictors, and the desired number of largest values to display:

    mymodel <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = testreg)
    Mahal(model = mymodel, pred = 5, values = 10)
##      537      770      342      760      299      982      446      174 
## 14.56342 15.03188 15.56224 15.60986 16.52869 16.80958 17.38597 18.11072 
##      458      530 
## 20.02762 25.09934
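A common convention [@tabachnick_using_2019] flags cases whose distance exceeds the $\chi^2$ critical value at $p < .001$, with degrees of freedom equal to the number of predictors:

    qchisq(0.999, df = 5)  # critical value, roughly 20.52

By that criterion, only case 530 above would be flagged.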

## LRchi function for logistic regression likelihood ratio chi-square

The LRchi function takes the dependent variable name (y), up to 10 predictors (x1, x2, etc.), and the number of predictors:

    LRchi(data = testlog, y = "dv", x1 = "iv1", x2 = "iv2", numpred = 2)
## Predictor: iv1; LR squared 34.09, p= 0
## Predictor: iv2; LR squared 0.19, p= 0.67
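Each statistic is a likelihood ratio test of dropping that predictor from the full model; an equivalent check is available in base R using the same names as above:

    fullmodel <- glm(dv ~ iv1 + iv2, data = testlog, family = binomial())
    drop1(fullmodel, test = "Chisq")  # LR chi-square test for each predictor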

## pseudo function for logistic regression effect sizes

The pseudo function takes an existing model as input:

    mymodel <- glm(dv ~ iv1 + iv2 + iv3 + iv4, data = testlog, family = binomial())
    pseudo(model = mymodel)
## Likelihood Ratio R-squared (McFadden, Recommended) = 0.26
## Cox-Snell R-squared = 0.301
## Nagelkerke R-squared = 0.402
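The recommended McFadden value is one minus the ratio of the model and null (intercept-only) log-likelihoods, which can be verified in base R:

    nullmodel <- update(mymodel, . ~ 1)  # intercept-only model
    1 - as.numeric(logLik(mymodel)) / as.numeric(logLik(nullmodel))  # ~0.26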

# References


