To reproduce this document, install the R package ggiraphExtra from GitHub.
install.packages("devtools")
devtools::install_github("cardiomoon/ggiraphExtra")
In a univariate regression model, you can use a scatter plot to visualize the model. For example, you can fit a simple linear regression model to the radial data included in the moonBook package. The radial data contain demographic and laboratory data of 115 patients who underwent IVUS (intravascular ultrasound) examination of a radial artery after transradial coronary angiography. The NTAV (normalized total atheroma volume measured by IVUS, in cubic mm) is a quantitative measurement of atherosclerosis. Suppose you want to predict the amount of atherosclerosis (NTAV) from age.
require(moonBook) # for use of data radial
fit=lm(NTAV~age,data=radial)
summary(fit)
You can get the regression equation from the summary of the regression model:
y=`r round(coef(fit)[2],2)`*x+`r round(coef(fit)[1],2)`
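The inline expressions above read the slope and the intercept straight out of coef(fit). As a minimal sketch of the same idea in base R, the block below builds the equation string from a fitted model. The noise-free data frame `d` and the helper `lm_equation()` are illustrative assumptions, not part of moonBook or ggiraphExtra.

```r
# Hypothetical, noise-free data standing in for radial: y = 2*x + 3
d <- data.frame(x = 1:10)
d$y <- 2 * d$x + 3

fit_demo <- lm(y ~ x, data = d)

# Build "y=slope*x+intercept" from a simple linear model
lm_equation <- function(model, digits = 2) {
  b <- unname(coef(model))   # b[1] is the intercept, b[2] is the slope
  paste0("y=", round(b[2], digits), "*x+", round(b[1], digits))
}

eq <- lm_equation(fit_demo)
print(eq)
```

With real data the rounded coefficients will of course differ; the point is only that the equation is assembled from coef().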
You can visualize this model easily with the ggplot2 package.
require(ggplot2)
ggplot(radial,aes(y=NTAV,x=age))+geom_point()+geom_smooth(method="lm")
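The shaded band drawn by geom_smooth(method="lm") corresponds to a pointwise confidence interval around the fitted line. A rough base R sketch of how such a band can be computed with predict() follows. The simulated data frame `d` and the grid of `x` values are assumptions made for illustration only.

```r
# Simulated data (hypothetical); the seed makes the example reproducible
set.seed(1)
d <- data.frame(x = seq(20, 80, length.out = 50))
d$y <- 0.5 * d$x + 10 + rnorm(50, sd = 5)

fit_demo <- lm(y ~ x, data = d)

# Evaluate the fitted line and its 95% confidence band on a small grid
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 5))
band <- predict(fit_demo, newdata = grid, interval = "confidence", level = 0.95)

# Columns: fit (fitted value), lwr and upr (band limits)
print(cbind(grid, band))
```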
You can easily make an interactive plot with the ggPredict() function included in the ggiraphExtra package.
require(ggiraph)
require(ggiraphExtra)
require(plyr)
ggPredict(fit,se=TRUE,interactive=TRUE)
With this plot, you can identify the points and see the regression equation by hovering your mouse over them.
You can make a regression model with two predictor variables. Now you can use age and sex as predictor variables.
fit1=lm(NTAV~age+sex,data=radial)
summary(fit1)
From the result of the regression analysis, you can get the regression equations for female and male patients:
For female patients, y=`r round(coef(fit1)[2],2)`*x+`r round(coef(fit1)[1],2)`
For male patients, y=`r round(coef(fit1)[2],2)`*x+`r round(coef(fit1)[1]+coef(fit1)[3],2)`
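In a model without interaction, the two groups share one slope and differ only by a shift in the intercept. The sketch below illustrates this on hypothetical noise-free data. The data frame `d`, the factor `g` and the resulting table `lines_by_group` are assumptions for the example, not objects from the packages used here.

```r
# Hypothetical data: common slope 2, intercept 3 for "F" and 3 + 5 for "M"
d <- expand.grid(x = 1:10, g = c("F", "M"))   # g becomes a factor with levels F, M
d$y <- 2 * d$x + 3 + ifelse(d$g == "M", 5, 0)

fit_add <- lm(y ~ x + g, data = d)
b <- unname(coef(fit_add))   # order: (Intercept), x, gM

# One row per group: the reference level uses b[1], the other adds b[3]
lines_by_group <- data.frame(
  group     = levels(d$g),
  intercept = c(b[1], b[1] + b[3]),
  slope     = c(b[2], b[2])
)
print(lines_by_group)
```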
You can visualize this model with the ggplot2 package.
# regression line for the reference level of sex
equation1=function(x){coef(fit1)[2]*x+coef(fit1)[1]}
# regression line for the other level: same slope, shifted intercept
equation2=function(x){coef(fit1)[2]*x+coef(fit1)[1]+coef(fit1)[3]}
ggplot(radial,aes(y=NTAV,x=age,color=sex))+geom_point()+
  stat_function(fun=equation1,geom="line",color=scales::hue_pal()(2)[1])+
  stat_function(fun=equation2,geom="line",color=scales::hue_pal()(2)[2])
You can easily make an interactive plot with the ggPredict() function included in the ggiraphExtra package.
ggPredict(fit1,se=TRUE,interactive=TRUE)
You can make a regression model with two predictor variables and their interaction. Now you can use age, DM (diabetes mellitus) and the interaction between age and DM as predictor variables.
fit2=lm(NTAV~age*DM,data=radial)
summary(fit2)
The regression equations in this model are as follows: for patients without DM (DM=0), the intercept is 49.65 and the slope is 0.29. For patients with DM (DM=1), the intercept is 49.65-20.86 and the slope is 0.29+0.35.
For patients without DM (DM=0), y=`r round(coef(fit2)[2],2)`*x+`r round(coef(fit2)[1],2)`
For patients with DM (DM=1), y=`r round((coef(fit2)[2]+coef(fit2)[4]),2)`*x+`r round((coef(fit2)[1]+coef(fit2)[3]),2)`
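With an interaction term, the group indicator changes both the intercept and the slope. A small sketch on hypothetical noise-free data shows how the four coefficients combine, and that the result matches a separate fit on the second group. The variable names `x`, `g` and `y` are assumptions for this example.

```r
# Hypothetical data: g = 0 follows y = 3 + 2*x, g = 1 follows y = 1 + 4*x
d <- expand.grid(x = 1:10, g = c(0, 1))
d$y <- ifelse(d$g == 1, 1 + 4 * d$x, 3 + 2 * d$x)

fit_int <- lm(y ~ x * g, data = d)
b <- unname(coef(fit_int))   # order: (Intercept), x, g, x:g

line_g0 <- c(intercept = b[1],        slope = b[2])
line_g1 <- c(intercept = b[1] + b[3], slope = b[2] + b[4])

# Fitting the g = 1 subset alone gives the same line
sub_g1 <- unname(coef(lm(y ~ x, data = d[d$g == 1, ])))

print(rbind(line_g0, line_g1))
```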
You can visualize this model with ggplot2.
ggplot(radial,aes(y=NTAV,x=age,color=factor(DM)))+geom_point()+stat_smooth(method="lm",se=FALSE)
You can easily make an interactive plot with the ggPredict() function included in the ggiraphExtra package.
ggPredict(fit2,colorAsFactor = TRUE,interactive=TRUE)
You can make a regression model with two continuous predictor variables. Now you can use age and weight (body weight in kilograms) as predictor variables.
fit3=lm(NTAV~age*weight,data=radial)
summary(fit3)
From the analysis, you can get the regression equation for any given body weight. For a patient with body weight 40kg, the intercept is 37.61+(-0.10416)*40 and the slope is -0.33+0.01468*40.
For body weight 40kg, y=`r round(coef(fit3)[2]+coef(fit3)[4]*40,2)`*x+`r round(coef(fit3)[1]+coef(fit3)[3]*40,2)`
For body weight 50kg, y=`r round(coef(fit3)[2]+coef(fit3)[4]*50,2)`*x+`r round(coef(fit3)[1]+coef(fit3)[3]*50,2)`
For body weight 90kg, y=`r round(coef(fit3)[2]+coef(fit3)[4]*90,2)`*x+`r round(coef(fit3)[1]+coef(fit3)[3]*90,2)`
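The three equations above repeat one calculation: fix the second predictor at a value, then fold its main effect into the intercept and its interaction into the slope. A minimal sketch of that calculation as a reusable function is given below. The helper name `line_at()` and the noise-free data are hypothetical.

```r
# Hypothetical data: y = 5 + 0.3*x - 0.1*w + 0.02*x*w
d <- expand.grid(x = 1:10, w = c(40, 50, 60, 70))
d$y <- 5 + 0.3 * d$x - 0.1 * d$w + 0.02 * d$x * d$w

fit_cc <- lm(y ~ x * w, data = d)

# Intercept and slope of the line in x when w is held at a fixed value
line_at <- function(model, w) {
  b <- unname(coef(model))   # order: (Intercept), x, w, x:w
  c(intercept = b[1] + b[3] * w, slope = b[2] + b[4] * w)
}

print(line_at(fit_cc, 40))
print(line_at(fit_cc, 50))
```

Because the slope depends on the value of `w`, there is one regression line per chosen value rather than a single line for the whole model.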
If you try to visualize this model with a simple ggplot command, it shows only one regression line.
ggplot(radial,aes(y=NTAV,x=age,color=weight))+geom_point()+stat_smooth(method="lm",se=FALSE)
You can easily show this model with the ggPredict() function.
ggPredict(fit3,interactive=TRUE)
You can make a regression model with three predictor variables. Now you can use age, weight (body weight in kilograms) and HBP (hypertension) as predictor variables.
fit4=lm(NTAV~age*weight*HBP,data=radial)
summary(fit4)
From the analysis result, you can get the regression equation for a patient without hypertension (HBP=0) and body weight 60kg: the intercept is 64.12+(-0.39685*60) and the slope is -0.67650+(0.01686*60). The equation for a patient with hypertension (HBP=1) and the same body weight: the intercept is 64.12+(-0.39685*60-101.94) plus 60 times the weight:HBP coefficient reported by summary(fit4), and the slope is -0.67650+(0.01686*60)+1.27972+(-0.01666*60).
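With three interacting predictors there are eight coefficients, and it is easy to drop a term when combining them by hand. The sketch below looks the coefficients up by name instead of by position. The data are hypothetical and noise-free, and `line_for()` is an illustrative helper, not a ggiraphExtra function.

```r
# Hypothetical data generated from known coefficients
d <- expand.grid(x = 1:10, w = c(50, 60, 70, 80), h = c(0, 1))
d$y <- with(d, 4 + 0.5 * x - 0.2 * w + 3 * h +
               0.01 * x * w + 0.25 * x * h - 0.1 * w * h + 0.002 * x * w * h)

fit_3way <- lm(y ~ x * w * h, data = d)

# Line in x for fixed w and h: terms without x form the intercept,
# terms containing x form the slope
line_for <- function(model, w, h) {
  b <- coef(model)
  intercept <- b[["(Intercept)"]] + b[["w"]] * w + b[["h"]] * h + b[["w:h"]] * w * h
  slope     <- b[["x"]] + b[["x:w"]] * w + b[["x:h"]] * h + b[["x:w:h"]] * w * h
  c(intercept = intercept, slope = slope)
}

print(line_for(fit_3way, w = 60, h = 0))
print(line_for(fit_3way, w = 60, h = 1))
```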
To visualize this model, you can make a faceted plot with the ggPredict() function. You can see the regression equation of each subset by hovering your mouse over the regression lines.
ggPredict(fit4,interactive = TRUE)
You can use the glm() function to make a logistic regression model. The GBSG2 data in the TH.data package contain data from the German Breast Cancer Study Group 2. Suppose you want to predict survival from the number of positive nodes and hormonal therapy.
require(TH.data) # for use data GBSG2
fit5=glm(cens~pnodes*horTh,data=GBSG2,family=binomial)
summary(fit5)
You can easily visualize this model with the ggPredict() function.
ggPredict(fit5,se=TRUE,interactive=TRUE,digits=3)
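In a logistic model the coefficients are on the log-odds scale, so a curve of predicted probabilities is obtained by passing the linear predictor through the inverse logit. A minimal base R sketch on simulated data follows. The data frame `d`, its columns and the helper `prob_by_hand()` are assumptions for illustration and are unrelated to GBSG2.

```r
# Simulated binary outcome (hypothetical); the seed makes it reproducible
set.seed(42)
d <- data.frame(x = rpois(200, 4), g = rep(c(0, 1), each = 100))
d$event <- rbinom(200, 1, plogis(-1 + 0.3 * d$x - 0.5 * d$g))

fit_logit <- glm(event ~ x * g, data = d, family = binomial)
b <- unname(coef(fit_logit))   # order: (Intercept), x, g, x:g

# Predicted probability: inverse logit of the linear predictor
prob_by_hand <- function(x, g) {
  plogis(b[1] + b[2] * x + b[3] * g + b[4] * x * g)
}

new <- data.frame(x = c(0, 5, 10), g = 1)
p_manual  <- prob_by_hand(new$x, new$g)
p_predict <- unname(predict(fit_logit, newdata = new, type = "response"))

print(cbind(new, p_manual, p_predict))
```

The two probability columns agree, which confirms that predict(type = "response") applies the same transformation.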
You can make a multiple logistic regression model with no interaction between the predictor variables.
fit6=glm(cens~pnodes+horTh,data=GBSG2,family=binomial)
summary(fit6)
ggPredict(fit6,se=TRUE,interactive=TRUE,digits=3)
You can make a multiple logistic regression model with two continuous variables and their interaction.
fit7=glm(cens~pnodes*age,data=GBSG2,family=binomial)
summary(fit7)
ggPredict(fit7,interactive=TRUE)
You can adjust the number of regression lines with the colorn parameter. For example, you can draw 100 regression lines with the following R command.
ggPredict(fit7,interactive=TRUE,colorn=100,jitter=FALSE)
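Drawing several lines for a continuous second predictor amounts to fixing that predictor at a number of values and computing one predicted curve per value. The sketch below shows one plausible way to do this in base R with evenly spaced values. It is an assumption for illustration and does not claim to reproduce how ggPredict() chooses its values internally. The data and the name `n_lines` are hypothetical.

```r
# Simulated data (hypothetical); the seed makes it reproducible
set.seed(7)
d <- data.frame(x = rpois(300, 5), age = round(runif(300, 25, 80)))
d$event <- rbinom(300, 1, plogis(-2 + 0.2 * d$x + 0.02 * d$age))

fit_demo <- glm(event ~ x * age, data = d, family = binomial)

# Number of curves to compute, one per fixed value of age
n_lines <- 5
age_values <- seq(min(d$age), max(d$age), length.out = n_lines)

# Predicted probability over a grid of x for each fixed age
grid <- expand.grid(x = seq(min(d$x), max(d$x), length.out = 20),
                    age = age_values)
grid$prob <- predict(fit_demo, newdata = grid, type = "response")

# One data frame per curve
curves <- split(grid, grid$age)
print(length(curves))
```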
You can make a multiple logistic regression model with three predictor variables.
fit8=glm(cens~pnodes*age*horTh,data=GBSG2,family=binomial)
summary(fit8)
ggPredict(fit8,interactive=TRUE,colorn=100,jitter=FALSE)