Package moonBook
is avaiable on CRAN and github. Package moonBook2
is available only on github. Please install moonBook2 package using the following R code.
install.packages("devtools") devtools::install_github("cardiomoon/moonBook") devtools::install_github("cardiomoon/moonBook2")
In this vignette, I will show you how to visualize the interaction between two predictor variables.
In ANCOVA model, there are one categorical variable with one continuous variable(covariate). With mtcars
data, we can make a ANCOVA model in which the mpg(mile per gallon) is a response variable and the wt(weight) and the number of cylinders(cyl) are explanatory variables. The number of cylinders are recorded as number. To make a ANCOVA model, we make a new categorical variable cyl1.
mtcars$engine=factor(mtcars$cyl)
You can easily calculate the mean and SD of mpg by engine.
require(moonBook) require(moonBook2) require(ggplot2) summarySE(mtcars,"mpg","engine")
The summarySE() function is made by slight modification of summarySE() function in "R graphics cookbook" by Winston Chang. You can make plot comparing the average of three groups.
str(mtcars) ggCatepillar(mpg~engine,data=mtcars,interactive=TRUE)
fit=lm(mpg~wt+engine,data=mtcars) summary(fit)
In summary, we can obtain regression equation for engine 4,6 and 8.
engine4 : mpg = -3.2056*wt+33.9908 engine6 : mpg = -3.2056*wt+33.9908-4.2556 engine8 : mpg = -3.2056*wt+33.9908-6.0709
We can easly visualize this model with ggAncova() function.
ggAncova(fit,interactive=TRUE)
ggAncova(mpg~wt+engine,data=mtcars,interactive=TRUE)
If you are not familiar with the formula, You can enter data and variables in order. If you enter two continuous variables as explanatory variables, the last one is converted into categorical variable and make the ANCOVA plot.
ggAncova(mtcars,mpg,wt,cyl,interactive=TRUE)
In radial data in moonBook package, the degree of atherosclerosis are measured as the normalized total atheroma volume(NTAV). With this data we can make an ANCOVA model in which age and sex are used as explanatory variables.
ggAncova(NTAV~age+sex,data=radial,interactive=TRUE)
As you can see, the NTAV is increasing with age in both sex, but men have larger NTAV thae women.
One of the most interesting research findings are those involving interactions among predictor variables. If you are interested in th impact of horse power and automoble weight on mileage, you can fit a regression model with the following formula.
fit2=lm(mpg~hp*wt,data=mtcars) summary(fit2)
In formula, A*B is interpreted as A + B + A:B. A:B is the interaction of A and B. The relationship between mileage(mpg) and horse power(hp) varies by car weight(wt). The quantile of weight are as follows.
summary(mtcars$wt)
The minimal value of weight is 1.513 and th maimal value of weight is 5.424. The regression equation between mpg and wt varies with hp. If the coefficients were rounded to the decimal point two digits, the regression equation is as follows
mpg=49.81 - 0.12 * hp- 8.22 * wt + 0.03 * hp * wt
If the weight is 3.2(mean), the regression equation will be:
mpg=49.81-0.12 * hp - 8.22 * 3.2 + 0.03 * hp * 3.2 mpg=23.51 - 0.03 * hp
You can calculate the regression slope and y intercept with the mean of weight(3.2) and one standard deviation below and above the maen(2.2,and 4.2 respectively) as follows:
A<-c(2.2,3.2,4.2) coef<-fit2$coef labels=as.character(A) intercept=coef[1]+coef[2]*A slope=coef[3]+coef[4]*A intercept slope
With ggEffect() function, you can visualize this interaction easily.
ggEffect(mpg~hp*wt,data=mtcars,interactive=TRUE)
By default, three regression lines are displayed at c(0.10,0.5,0.9) percentiles. You can get other regression lines by changing the probs parameter.
ggEffect(mpg~hp*wt,data=mtcars,probs=c(0.2,0.4,0.6,0.8),interactive=TRUE)
You can get regression lines with the desired value. If you wanted to get regression lines with weight 2.2,3.2 and 4.2, try this :
ggEffect(mpg~hp*wt,data=mtcars,xvalue=c(2.2,3.2,4.2),interactive=TRUE)
You can use the ggEffect() function with categorical variables. You can see the impact of intweraction between aging and smoking status on the progression of atherosclerosis, try this:
ggEffect(NTAV~age*smoking,data=radial,interactive=TRUE) ggEffect(NTAV~age*DM,data=radial,interactive=TRUE)
You can clearly see that the atheroscerosis progresses rapidly among smokers.
Another interesting examples comes from acs data(moonBook package). In acs data, 857 patients with acute coronary syndrome(acs) were enrolled. If you are interested the age of disease onset and diabetes and smoking status, you can summarize the onset of age grouped by smoking and DM status as follows.
summarySE(acs,"age",c("DM","smoking"))
You can visualize this as follows:
ggEffect(age~DM*smoking,data=acs,interactive=TRUE) ggEffect(age~DM*smoking,data=acs,no=2,interactive=TRUE)
You can perform two-way ANOVA with interaction.
fit=aov(age~DM*smoking,data=acs) summary(fit)
In the two-way ANOVA, the impact of smoking status is significant, whereas the impacts of DM status and the interaction between DM and smoking are insignificant. You can perform multiple comparison by computinig Tukey Honest Significant Differences.
result=TukeyHSD(fit) result
You can visualize the result with ggHSD() function. Because the result of HSD test in thsi case is a list of length 3, you can select the second list.
ggHSD(result,2,interactive=TRUE)
Followings are othe examples of ggEffect()
ggEffect(age~Dx*sex,data=acs,interactive=TRUE)
As you can see, male patient with acs are younger than female patient with acs
This plot is identical with the following plot
ggCatepillar(age~Dx+sex,data=acs,interactive=TRUE)
The static plots(default: interactive=FALSE) are standard ggplots. You can save the plots with ggsave() function. You can make a static plot interactive by ggiraph() function. Interactive plots are htmlwidgets. You can save the plot with htmlwidgets::saveWidget() function.
p<-ggCatepillar(age~Dx+sex,data=acs) ggsave("ggCatepillar.pdf",p) p1<-ggiraph(code=print(p)) require(htmlwidgets) saveWidget(p1,"ggCatepillar.html")
Alternatively, you can make a interactive plot by setting the parameter interactive=TRUE. In this case, you can save as html file with saveWidget() function.
p2<-ggEffect(NTAV~age*smoking,data=radial,interactive=TRUE) saveWidget(p2,"ggEffect.html")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.