MoE_mahala | R Documentation |
Computes the Mahalanobis distance between the response variable(s) and the fitted values of linear regression models with multivariate or univariate responses.
MoE_mahala(fit, resids, squared = FALSE, identity = NULL)
fit |
A fitted |
resids |
The residuals. Can be residuals for observations included in the model, or residuals arising from predictions on unseen data. Must be coercible to a matrix with the number of columns being the number of response variables. Missing values are not allowed. |
squared |
A logical. By default ( |
identity |
A logical indicating whether the identity matrix is used in place of the precision matrix in the Mahalanobis distance calculation. Defaults to |
A vector giving the Mahalanobis distance (or squared Mahalanobis distance) between response(s) and fitted values for each observation.
Keefe Murphy - <keefe.murphy@mu.ie>
data(ais) hema <- as.matrix(ais[,3:7]) mod <- lm(hema ~ sex + BMI, data=ais) res <- hema - predict(mod) MoE_mahala(mod, res, squared=TRUE) data(CO2data) CO2 <- CO2data$CO2 GNP <- CO2data$GNP mod2 <- lm(CO2 ~ GNP, data=CO2data) pred <- predict(mod2) res2 <- CO2 - pred maha <- MoE_mahala(mod2, res2) # Highlight outlying observations plot(GNP, CO2, type="n", ylab=expression('CO'[2])) lines(GNP, pred, col="red") points(GNP, CO2, cex=maha, lwd=2) text(GNP, CO2, col="blue", labels=replace(as.character(CO2data$country), maha < 1, "")) # Replicate initialisation strategy using 2 randomly chosen components # Repeat the random initialisation if necessary # (until 'crit' at convergence is minimised) G <- 3L z <- sample(seq_len(G), nrow(CO2data), replace=TRUE) old <- Inf crit <- .Machine$double.xmax while(crit < old) { Sys.sleep(1) old <- crit maha <- NULL plot(GNP, CO2, type="n", ylab=expression('CO'[2])) for(g in seq_len(G)) { ind <- which(z == g) mod <- lm(CO2 ~ GNP, data=CO2data, sub=ind) pred <- predict(mod, newdata=CO2data[,"CO2", drop=FALSE]) maha <- cbind(maha, MoE_mahala(mod, CO2 - pred)) lines(GNP, pred, col=g + 1L) } min.M <- rowMins(maha) crit <- sum(min.M) z <- max.col(maha == min.M) points(GNP, CO2, cex=min.M, lwd=2, col=z + 1L) text(GNP, CO2, col=z + 1L, labels=replace(as.character(CO2data$country), which(min.M <= 1), "")) } crit
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.