Chris Aberson May 25, 2022
This package provides statistics for linear and logistic regression that are not included in base R. Functions provide squared semipartial correlations, tolerance, Mahalanobis distances, likelihood ratio chi-square tests, and pseudo R-squared measures.
I built this under R 4.2.0
car (>= 3.0-0), stats (>= 3.5.0), dplyr (>= 0.8.0)
Please post issues using the link above (titled "issues"). Those interested in contributing to further development should create a pull request. For feature requests, submit an issue.
This project is licensed under GNU General Public License version 3.
The package is available on CRAN; install it with install.packages("BetterReg"). This repository will usually include a beta version of the next release. To install from GitHub, run the command below in R.
devtools::install_github("chrisaberson/BetterReg")
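After installation, load the package so the functions in the examples below are available:
library(BetterReg)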
Using data from Aberson (2007), the analyses that follow predict support for Affirmative Action (AA) in year 4 of college from incoming attitudes, personal experience with discrimination, liberality, gender, and economic concern for the future. In a later example, I add perceptions of the prevalence of discrimination, endorsement of meritocracy, and participation in campus diversity events.
The parts function requires an existing lm model and the number of predictors.
First, build the model:
xx<-lm(formula = AA_DV ~ AA_Initial+pers_exp+liberal+female+economic, data = hand5)
Then, using the parts command, provide the model name and the number of predictors:
parts(model=xx, pred=5)
##
Predictor 1: semi partial = 0.333; squared semipartial = 0.111
Predictor 2: semi partial = 0.032; squared semipartial = 0.001
Predictor 3: semi partial = 0.197; squared semipartial = 0.039
Predictor 4: semi partial = 0.095; squared semipartial = 0.009
Predictor 5: semi partial = 0.032; squared semipartial = 0.001
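As a cross-check outside the package, a predictor's squared semipartial equals the drop in R-square when that predictor is removed from the full model. A minimal sketch for the first predictor, assuming the hand5 data have no missing values on these variables:
# Squared semipartial for AA_Initial: full-model R-square minus the R-square
# of the model without AA_Initial (illustrative check only)
reduced<-lm(formula = AA_DV ~ pers_exp+liberal+female+economic, data = hand5)
summary(xx)$r.squared - summary(reduced)$r.squared  # approximately 0.111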
The Mahal function provides Mahalanobis distance values. It requires the model and the number of predictors, as well as the number of values to return (10 is the default).
Mahal(model=xx, pred=5, values=10)
##
60 247 639 703 133 157 129 431 24 655
11.08698 11.08698 11.77189 11.93773 12.34983 13.68620 14.15117 14.50515 14.72140 15.27508
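For reference, base R's stats::mahalanobis() computes the same kind of distance directly from the predictor columns; the largest values should correspond to those above. A sketch, assuming the five predictors in hand5 are numeric and complete:
# Mahalanobis distances from the raw predictor columns (illustrative)
X<-hand5[, c("AA_Initial","pers_exp","liberal","female","economic")]
d2<-mahalanobis(X, center = colMeans(X), cov = cov(X))
sort(d2, decreasing = TRUE)[1:10]  # ten largest distances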
The tolerance command requires only the model name.
tolerance(model=xx)
##
AA_Initial pers_exp liberal female economic
0.9464682 0.9904156 0.9058256 0.9418196 0.9910559
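Tolerance is 1 minus the R-square obtained from regressing each predictor on the remaining predictors, i.e., the reciprocal of the variance inflation factor. Because car is already a dependency, a quick cross-check is:
# Tolerance as the reciprocal of the variance inflation factor
1/car::vif(xx)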
The R2change function compares two models. Below, I add merit, discrimination, and diversity participation to the model (xx2). The R2change command takes model1 (xx) and compares it to model2 (xx2). Note that this approach is only for nested models, where the second model adds variables to the first.
xx2<-lm(formula = AA_DV ~ AA_Initial+pers_exp+liberal+female+economic+merit+discrim+div_part, data = hand5)
R2change(model1=xx, model2=xx2)
##
R-square change = 0.181
F(3,704) = 70.537, p = 6.81538788796511e-40
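The same nested comparison can be verified with base R: anova() supplies the F test, and the R-square change is the difference between the two models' R-square values.
# F test for adding merit, discrim, and div_part; R-square change computed directly
anova(xx, xx2)
summary(xx2)$r.squared - summary(xx)$r.squared  # approximately 0.181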
The depbcomp function compares dependent coefficients, that is, coefficients from the same model.
depbcomp(data=sample1,y="AA_DV",x1="div_part",x2="merit", x3="discrim",numpred=3,comps="abs")
##
Pred 1 vs. Pred 2 : t = 4.633, p = 4.28730081880602e-06
Pred 1 vs. Pred 3 : t = 9.614, p = 0
Pred 2 vs. Pred 3 : t = 5.371, p = 1.0627416191511e-07
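For context (not the package's own method), equality of two coefficients from the same model can also be tested as a linear hypothesis with car::linearHypothesis(); note that this tests the signed coefficients, whereas comps="abs" above compares absolute values. A sketch using the sample1 data:
# Hypothetical cross-check: are two coefficients in the same model equal?
m<-lm(formula = AA_DV ~ div_part+merit+discrim, data = sample1)
car::linearHypothesis(m, "div_part = merit")  # roughly Pred 1 vs. Pred 2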
The indbcomp function compares predictors from two identical models fit to different samples. Note that each model object should be a summary of the model.
model1<-summary(lm(AA_DV~ div_part+merit+ discrim, data=hand5))
model2<-summary(lm(AA_DV~ div_part+merit+ discrim, data=sample2))
indbcomp(model1=model1, model2=model2, pred=3, comp="abs")
##
Predictor 1: t = 110.812, p = 0
Predictor 2: t = 13.623, p = 0
Predictor 3: t = 23.958, p = 0
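A common hand calculation for comparing a coefficient across two independent samples divides the difference between the coefficients by the square root of the sum of their squared standard errors. A minimal sketch for the first predictor, using the summary objects above (illustrative; not necessarily identical to indbcomp's internals, and comp="abs" works with absolute values):
# Compare the div_part coefficient across the two samples (illustrative)
b1<-coef(model1)["div_part","Estimate"]; b2<-coef(model2)["div_part","Estimate"]
se1<-coef(model1)["div_part","Std. Error"]; se2<-coef(model2)["div_part","Std. Error"]
(b1-b2)/sqrt(se1^2+se2^2)  # approximate test statistic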
In this example, taken from Cohen, Cohen, West, and Aiken (2015), women's compliance with mammography recommendations (i.e., yearly screening in this case) is predicted from physician recommendation, knowledge of mammography, perceived barriers, and perceived benefits.
First, run a logistic regression model.
Model4<-glm(comply~physrec+knowledg+benefits+barriers, data=logistic2, family = binomial())
The LRchi command requires the name of the dataset, definitions of all variables in the model (y, x1, x2, etc.), and the number of predictors.
LRchi(data=logistic2, y="comply",x1="physrec", x2="knowledg", x3="benefits",x4="barriers", numpred=4)
##
Predictor: physrec; LR squared 16.67, p= 0
Predictor: knowledg; LR squared 0.01, p= 0.94
Predictor: benefits; LR squared 5.29, p= 0.02
Predictor: barriers; LR squared 13.77, p= 0
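These per-predictor likelihood ratio tests should parallel what base R's drop1() reports when each predictor is dropped from the full model:
# Likelihood ratio (chi-square) tests for dropping each predictor in turn
drop1(Model4, test = "LRT")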
The pseudo function requires only an existing model as input.
pseudo(model = Model4)
##
Likelihood Ratio R-squared (McFadden, Recommended) = 0.26
Cox-Snell R-squared = 0.301
Nagelkerk R-squared = 0.402
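The McFadden value can be reproduced from the fitted and intercept-only log-likelihoods. A minimal sketch, assuming logistic2 has no missing data on these variables:
# McFadden pseudo R-squared = 1 - logLik(model) / logLik(intercept-only model)
null_model<-glm(comply ~ 1, data = logistic2, family = binomial())
1 - as.numeric(logLik(Model4)/logLik(null_model))  # approximately 0.26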