library(mylm) library(car) data(SLID, package = "carData") SLID <- SLID[complete.cases(SLID),]
The theory mentioned earlier has already been provided in vector notation, and thus applies to both simple and multiple linear regression. Here we show the functions applied on a model with multiple covariates.
model1 <- mylm(wages ~ education, data = SLID) model2 <- mylm(wages ~ education + age, data = SLID) print(model2)
summary(model2)
The estimated parameters for the multiple linear regression case are calculated as described in part 1. For the model with education
and age
as predictors for the response wage
, the estimates for the coefficients are given in the summary above. They are r model2$coefficients[2]
and r model2$coefficients[3]
for education
and age
respectively.
With a similar approach as described in part 2, we calculate the observed $z$-statistic for each of the coefficients under the null hypothesis that the true parameter values are 0. The test statistics can be found in the summary, and the p-values for both the coefficients, and the intercept, are highly significant. Hence, we can with some confidence say that both education
and age
have some predictive power with respect to the wage
.
The interpretation of the parameters are simply that as one of the covariates increase by one unit, the expected wage increase by the value of the coefficient. This means that as a person grows one year older, he/she is expected to increase his/her income by r model2$coefficients[3]
. Similarly, as a person takes one year with education he/she is expected to grow his/her income by r model2$coefficients[2]
.
In order to investigate the effect on age
and education
both separately, and together, we fit three models - one with only age
, one with only education
and one with both. We begin by looking at the estimated coefficients in all three models again.
print(model1) print(model2) model3 = mylm(wages~age, data = SLID) print(model3)
We observe that the parameter estimates differ, due to the fact that the covariates are dependent of each other. This we also observe by studying the covariance matrix for the estimated coefficients in the model with both age
and education
, where the elements off the diagonal are non-zero. It means that as one of the covariates varies, it will affect the other covariates that it depends on.
On the other hand, if the covariates were completely independent, then the coefficients would remain the same, even if another covariate was added or removed. This is not the case here, which is quite intuitive considering that as one grows older one would typically have more education - and the other way around.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.