ez.lms | R Documentation |
lm(y~x+covar), for many y and/or many x
ez.lms(
df,
y,
x,
covar = NULL,
by = NULL,
report = T,
model = c("lm", "lmrob", "lmRob", "rlm"),
view = F,
plot = F,
pmethods = c("bonferroni", "fdr"),
cols = 3,
point.size = 10,
point.shape = 16,
lab.size = 18,
text.size = 16,
error = T,
pe = F,
...
)
ez.regressions(
df,
y,
x,
covar = NULL,
by = NULL,
report = T,
model = c("lm", "lmrob", "lmRob", "rlm"),
view = F,
plot = F,
pmethods = c("bonferroni", "fdr"),
cols = 3,
point.size = 10,
point.shape = 16,
lab.size = 18,
text.size = 16,
error = T,
pe = F,
...
)
df |
a data frame. Internally go through dropna –> ez.2value –> scale
|
y |
compatible with |
x |
compatible with |
covar |
NULL=no covar, compatible with |
by |
NULL. if specified with a factor col, do the model on all subjects, and groups in that col separately. In this case returns a list of result data frame |
report |
print results (in APA format) |
model |
vector c('lm', 'lmrob', 'lmRob', 'rlm'), robustbase::lmrob–MM-type Estimators; robust::lmRob–automatically chooses an appropriate algorithm. one or more, 'lm' will always be included internally, even if not specified |
view |
call View(result) |
plot |
T/F, the black dash line is bonferroni p = 0.05 (again for tests only with a non-NA p values), the grey black dash is uncorrected p = 0.05 |
pmethods |
c('bonferroni','fdr'), type p.adjust.methods for all methods. This correction is based on the total number of tests that successfully generate p values |
cols |
number of columns for multiplot. NULL=auto calculate |
error |
whether show error message when error occurs (also result will have an empty row when error occurs) |
... |
additional parameters to the specified model. if more than one model specified, ... may not be OK for all models, because diff models have diff parameters |
an invisible data frame or list of data frame (if many y and many x)
beta: standardized beta coefficients (simple or multiple regression) are the estimates resulting from a regression analysis that have been standardized
so that the variances of dependent and independent variables are 1.
Therefore, standardized coefficients refer to how many standard deviations a dependent variable will change,
per standard deviation increase in the predictor variable.
For simple regression (1 y ~ 1 x), the value of the standardized coefficient (beta) equals the correlation coefficient (r) (beta=r).
For multiple regression (with covar), the value of the standardized coefficient (beta) is close to semi-partial correlation
According to jerry testing, scale() or not for x,y or covar, does not change p values for predictors, although intercept would differ
dof: from F-statistic
residualization: say, y ~ x + a + b, first x ~ a + b is residualized and then y ~ x (ie, semi-partial). If no covar, y ~ x, although labelled residualized in result data frame, actually non-residualized, because nothing to residualize
as far as semi-partial concerned, y ~ x + a + b, x ~ y + a + b are two different models. ppcor::spcor.test(iris[,1],iris[,2],iris[,c(3,4)]) != ppcor::spcor.test(iris[,2],iris[,1],iris[,c(3,4)])
but for partial correlation, partial(y,x) is the same as partial(x,y), ppcor::pcor.test(iris[,1],iris[,2],iris[,c(3,4)]) = ppcor::pcor.test(iris[,2],iris[,1],iris[,c(3,4)])
On a related note, y ~ x + a + b, generates the same results as y ~ b + a + x (the order does not matter).
For coding purpose, I put x as x, and (a,b) as cov. stdbeta, p returned refer to x in MLR (multiple linear regression), values referring to a,b were discarded during the process.
++++++++++!!!important note!!!++++++++++
"n" | "n.lmrob" | "n.lmRob" | "n.rlm"
"stdbeta" | "stdbeta.lmrob" | "stdbeta.lmRob" | "stdbeta.rlm"
"p" | "p.lmrob" | "p.lmRob" | "p.rlm"
"r2" | "r2.lmrob" | "r2.lmRob" | "r2.rlm" (NA)
"r.spartial" | "r.spartial.lmrob" | "r.spartial.lmRob" | "r.spartial.rlm" (NA)
"p.spartial" | "p.spartial.lmrob" | "p.spartial.lmRob" | "p.spartial.rlm"
"r.partial" | "r.partial.lmrob" | "r.partial.lmRob" | "r.partial.rlm" (NA)
"p.partial" | "p.partial.lmrob" | "p.partial.lmRob" | "p.partial.rlm"
Summary: The absolute value of the semipartial correlation of X with Y is always less than or equal to that of the partial correlation of X with Y.
MLR p = ppcor::pcor.test ~=(>?) p.partial = (y.residual, x.residual)
ppcor::spcor.test ~=(>?) p.spartial = (y, x.residual)
? might be larger, because after residualized, no covar, so dof is larger ?
the stdbeta, p(.lm), p.lmrob etc in result data frame refer to stdbeta, p value for x in MLR, which are plotted when plot=T. the bestp is also selected based on this p value
According to my own demo (see examples iris), MLR p is the same as p values from ppcor::pcor.test and SPSS Analyze->Correlate->Partial! also very close to the p value from manual calculation with correlation(y.residual, x.residual). partial r the same in all cases.
the r.spartial, p.spartial refers to semi-partial correlation (r.spartial is the same as ppcor::spcor.test result, p.spartial very close to ppcor::spcor.test result.
no column named r, r(.lm), r.lmrob etc in the result data frame
r2.rlm or r.spartial.rlm are NA, but p values are available, because I do not know how to get them from rlm yet
To keep consistent with other R functions (eg, lm which converts numeric/non-numeric factor to values starting from 0), set start.at=0 in ez.2value(), then factor(1:2)->c(0,1), factor(c('girl','boy'))->c(1,0) # the level order is boy,girl
in lm() the coding (0,1) vs.(1,2) does not affect slope, but changes intercept (but a coding from 1,2->1,3 would change slope–interval difference matters)
y=ez.zresidize(iris,'Sepal.Length','Petal.Width',scale=F)
x=ez.zresidize(iris,'Sepal.Width','Petal.Width',scale=F)
z=data.frame(Sepal.Length=y$Sepal.Length,Sepal.Width=x$Sepal.Width)
ez.lms(z,'Sepal.Length','Sepal.Width')
ppcor::pcor.test(iris$Sepal.Length,iris$Sepal.Width,iris$Petal.Width)
ppcor::spcor.test(iris$Sepal.Length,iris$Sepal.Width,iris$Petal.Width)
y = c(1,2,3,4,5,6)
x = c(2,4,15,20,25,36)
z=factor(c('m','m','m','f','f','f'))
lm(y~x+z) %>% summary
lm(y~x+ez.2value(z)) %>% summary
lm(y~x+ez.2value(z,start.at=2)) %>% summary
# same beta, p but diff Intercept
y = c(1,2,3,4,5,6)
x = c(2,4,15,20,25,36)
f=factor(c('m','m','u','u','f','f'))
lm(y~x+f) %>% summary
lm(y~x+ez.2value(f)) %>% summary
lm(y~x+ez.2value(f,start.at=2)) %>% summary
# the first is diff from the rest, changes p value
y = c(1,2,3,4,5,6)
x = c(2,4,15,20,25,36)
z=c(0,0,0,1,1,1)
zz=c(-1,-1,-1,1,1,1)
zzz=c(-3,-3,-3,1,1,1)
lm(y~x+z) %>% summary
lm(y~x+scale(z)) %>% summary
lm(y~x+zz) %>% summary
lm(y~x+zzz) %>% summary()
# again, scale and different coding strategy only change intercept and beta(beta?, that is the purpose!), but not p
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.