mylm
packagelibrary(car) data(SLID, package = "carData") SLID <- SLID[complete.cases(SLID),]
The parameter estimates are calculated by using the maximum likelihood estimator: $$ \hat{\mathbf{\beta}} = (X^TX)^{-1}X^T\mathbf{Y}. $$
This estimator has the distribution $\hat{\beta}\sim N_{p}(\beta,\sigma^2({\bf X}^T{\bf X})^{-1})$.
This parameter estimation is implemented in the mylm
module and the results can be shown with print.mylm
on the resulting model object.
library(mylm) model1 <- mylm(wages ~ education, data = SLID) print(model1)
We can compare these results with the lm
module in R's standard library
model1b <- lm(wages ~ education, data = SLID) print(model1b)
The results are numerically identical, as intended.
The restricted maximum likelihood estimator for $\sigma^2$, denoted as $\hat{\sigma}^2$, is
$$ \hat{\sigma}^2=\frac{1}{n-p}({\bf Y}-{\bf X}\hat{\beta})^T({\bf Y}-{\bf X}\hat{\beta})=\frac{\text{SSE}}{n-p}, $$
with the following chi-squared distribution
$$ \frac{(n-p)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}. $$
We can therefore estimate the covariance matrix for the parameter estimates as
$$ \widehat{\text{Cov}}(\mathbf{\hat\beta}) = \hat\sigma^2({\bf X}^T{\bf X})^{-1}. $$
We will use the residuals as estimators for the standard errors
$$ \hat{\mathbf{e}} = \mathbf{Y}-X^T\hat{\beta}. $$
Now let's define the following notation
$$ c_{jj} := (X^T X)^{-1}_{jj}, $$
i.e. $c_{jj}$ being the $j$th diagonal element in $(X^T X)^{-1}_{jj}$.
We can now formulate a test statistic for the model parameter estimates
$$ T_j=\frac{\hat{\beta}j-\beta_j}{\sqrt{c{jj}}\hat{\sigma}}\sim t_{n-p}. $$
In our implementation we will approximate this $T_j$ test statistic with the asymptotic distribution
$$ T_j=\frac{\hat{\beta}j-\beta_j}{\sqrt{c{jj}}\hat{\sigma}}\approx N(0,1), $$
and denote it as $Z_j$ from here on.
The distribution of $\hat{\sigma}^2$, it is actually distributed according to a $t$-distribution $n-p$ degrees of freedom. However, when $n$ is large it is reasonable to approximate this distribution by a normal distribution.
The corresponding p-value for the test statistic is the probability of observing the test statistic or a more extreme outcome, i.e. in a world where the null hypothesis holds true, and there is really no dependence of the $j$th covariate on the response $Y$.
This will also make the extension to generalized linear models in the next report simpler.
To obtain the test statistic for each of the parameters, we calculate the test statistic under the null hypothesis that the true parameter value is 0. That is
$$ z_i = \frac{\hat{\beta}i-0}{\hat{\sigma} \sqrt{c{ii}}}, $$
The $p$-values for the parameter estimates can therefore be calculated as
$$ p_i = P(Z_i >|z_i|). $$
The formula for $\widehat{\text{Cov}}(\mathbf{\hat\beta})$ has been implemented in the mylm
package, and can be accessed from the returned model object as follows
model1$covmatrix
By implementing the summary
function, using the formulas explained above, we are able to calculate the estimates and standard errors of the intercept and regression coefficient for this model, and test their significance.
summary(model1)
The interpretation of the parameter estimates is that as the corresponding $x_i$ increases by one, $y$ is expected to increase by $\hat{\beta}_i$.
We see that both z
-values are very large for a standard normal distribution, and correspondingly, the $p$-values are very small, which seems quite reasonable.
We may also compare to the model implemented with the regular lm
function:
summary(model1b)
We see that the results are the same, except for the z-values, which are not supposed to be the same.
We then implement the plot function, in order to plot the fitted values against the residuals.
plot(model1)
It seems that the mean of the residuals is approximately 0 - that is good, as this is one of our assumptions. However, there is asymmetry, as there is a thicker positive tail. We notice a hint of multiplicative residuals, which means the residuals are not independent of the fitted values.
Q: What is the residual sum of squares (SSE) and the degrees of freedom for this model?
A: SSE is r toString(round(model1$SSE, 0))
, with number of degrees of freedom $n-p =$ r length(model1$fitted_values)
$-$ r length(model1$coefficients)
= r length(model1$fitted_values) - length(model1$coefficients)
Q: What is total sum of squares (SST) for this model? Test the significance of the regression using a $\chi^2$-test.
A: SST is r toString(round(model1$SST, 0))
, with p-value of the $\chi^2$-test equal to r model1$p_chi
In order to test the significance of regression we calculate the F-statistic
$$ F = \frac{n-p}{k}\frac{R^2}{1-R^2}, $$ where $R^2 = 1-\frac{SSE}{SST}$ is the coefficient of determination (see next point). Since this has a Fisher distribution with $k$ and $n-p$ degrees of freedom, we know that $k\cdot F$ has a chi squared distribution with $k$ degrees of freedom. This is what we use to calculate the p-value.
Q: What is the relationship between the (\chi^2)- and (z)-statistic in simple linear regression? Find the critical value(s) for both tests.
A: In the simple linear regression case we have that the (\chi^2)-statistic is the squared of the (z)-statistic. This we can also see from the summary
of model1
:
model1$chistatistic model1$zvalues[2]^2
The coefficient of determination $R^2$ gives the proportion of variance in the data explained by the model. We calculate it as mentioned above,
$$ R^2 = 1-\frac{SSE}{SST}, $$
model1$Rsq
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.