eda_lm | R Documentation |
eda_lm
generates a scatter plot with a fitted regression
line. A loess line can also be added to the plot for model comparison. The
axes are scaled such that their respective standard deviations match axes
unit length.
eda_lm(
dat,
x,
y,
xlab = NULL,
ylab = NULL,
px = 1,
py = 1,
tukey = FALSE,
show.par = TRUE,
reg = TRUE,
w = NULL,
sd = TRUE,
mean.l = TRUE,
grey = 0.6,
pch = 21,
p.col = "grey50",
p.fill = "grey80",
size = 0.8,
alpha = 0.8,
q = FALSE,
q.val = c(0.16, 0.84),
q.type = 5,
loe = FALSE,
lm.col = rgb(1, 0.5, 0.5, 0.8),
loe.col = rgb(0.3, 0.3, 1, 1),
stats = FALSE,
stat.size = 0.8,
loess.d = list(family = "symmetric", span = 0.7, degree = 1),
...
)
dat |
Data frame. |
x |
Column assigned to the x axis. |
y |
Column assigned to the y axis. |
xlab |
X label for output plot. |
ylab |
Y label for output plot. |
px |
Power transformation to apply to the x-variable. |
py |
Power transformation to apply to the y-variable. |
tukey |
Boolean determining if a Tukey transformation should be adopted (FALSE adopts a Box-Cox transformation). |
show.par |
Boolean determining if power transformation should be displayed in the plot. |
reg |
Boolean indicating whether a least squares regression line should be plotted. |
w |
Weight to pass to regression model. |
sd |
Boolean determining if standard deviation lines should be plotted. |
mean.l |
Boolean determining if the x and y mean lines should be added to the plot. |
grey |
Grey level to apply to plot elements (0 to 1 with 1 = black). |
pch |
Point symbol type. |
p.col |
Color for point symbol. |
p.fill |
Point fill color passed to |
size |
Point size (0-1). |
alpha |
Point transparency (0 = transparent, 1 = opaque). Only
applicable if |
q |
Boolean determining if grey quantile boxes should be plotted. |
q.val |
F-values to use to define the quantile box parameters. Defaults to mid 68 are used to generate the box. |
q.type |
Quantile type. Defaults to 5 (Cleveland's f-quantile definition). |
loe |
Boolean indicating if a loess curve should be fitted. |
lm.col |
Regression line color. |
loe.col |
LOESS curve color. |
stats |
Boolean indicating if regression summary statistics should be displayed. |
stat.size |
Text size of stats output in plot. |
loess.d |
A list of parameters passed to the |
... |
Not used. |
The function will plot an OLS regression line and, if requested, a
loess fit. The plot will also display the +/- 1 standard deviations as
dashed lines. In theory, if both x and y values follow a perfectly Normal
distribution, roughly 68 percent of the points should fall in between these
lines.
The true 68 percent of values can be displayed as grey rectangles by setting
q=TRUE
. This uses the quantile
function to compute the upper
and lower bounds defining the inner 68 percent of values. If the data
follow a Normal distribution, the grey rectangle edges should coincide
with the +/- 1SD dashed lines.
If you wish to show the interquartlie ranges (IQR) instead of the inner
68 percent of values, simply set q.val = c(0.25,0.75)
.
The plot has the option to re-express the values via the px
and
py
arguments. But note that if the re-expression produces NaN
values (such as if a negative value is logged) those points will be removed
from the plot. This will result in fewer observations being plotted. If
observations are removed as result of a re-expression a warning message
will be displayed in the console.
The re-expression powers are shown in the upper right side of the plot.
To suppress the display of the re-expressions set show.par = FALSE
.
Returns residuals, intercept and slope from an OLS fit.
residuals
: Regression model residuals
a
: Intercept
b
: Slope
plot
and loess.smooth
functions
# Add a regular (OLS) regression model and loess smooth to the data
eda_lm(mtcars, wt, mpg, loe = TRUE)
# Add the inner 68% quantile to compare the true 68% of data to the SD
eda_lm(mtcars, wt, mpg, loe = TRUE, q = TRUE)
# Show the IQR box
eda_lm(mtcars, wt, mpg, loe = TRUE, q = TRUE, sd = FALSE, q.val = c(0.25,0.75))
# Fit an OLS to the Income for Female vs Male
df2 <- read.csv("https://mgimond.github.io/ES218/Data/Income_education.csv")
eda_lm(df2, x=B20004013, y = B20004007, xlab = "Female", ylab = "Male",
loe = TRUE)
# Add the inner 68% quantile to compare the true 68% of data to the SD
eda_lm(df2, x = B20004013, y = B20004007, xlab = "Female", ylab = "Male",
q = TRUE)
# Apply a transformation to x and y axes: x -> 1/3 and y -> log
eda_lm(df2, x = B20004013, y = B20004007, xlab = "Female", ylab = "Male",
px = 1/3, py = 0, q = TRUE, loe = TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.