knitr::opts_chunk$set(fig.height=3)
## Do not delete this! ## It loads the s20x library for you. If you delete it ## your document may not compile library(s20x)
A study was carried out into the mortality of people who have sepsis. Sepsis is a life-threatening condition that arises when the body's response to infection causes injury to its own tissues and organs. We wish to see if their mortality increases depending on how healthy they are when they are admitted to an intensive care unit (ICU). In order to measure their health we will use a diagnostic measurement called APACHE II.
APACHE II ("Acute Physiology And Chronic Health Evaluation II") is a severity-of-disease classification system (Knaus et al., 1985), one of several ICU scoring systems. It is applied within 24 hours of admission of a patient to an ICU. An integer score from 0 to 71 is computed based on several measurements; higher scores correspond to more severe disease and a higher risk of death.
A sample of septic patients admitted to ICU’s was analysed.
The resulting grouped data is stored in Apache.csv which contains the following variables:
Variable | Description -------------------|-------------------------------------------------------------------------- apache_score | The APACHE II score for patients in the group. number_of_patients | The number of patients in the group. deaths | The number of patients in the group who died within 30 days of being admitted to ICU.
It was of interest to see how increasing APACHE II scores related to the mortality rate. In particular, to assess the effect of a 1 point or 5 point difference in APACHE II scores and to estimate mortality rates for APACHE II scores of 10, 20, 30 and 40.
Instructions:
It was of interest to see how increasing APACHE II scores related to the mortality rate. In particular, to assess the effect of a 1 point or 5 point difference in APACHE II scores and to estimate mortality rates for APACHE II scores of 10, 20, 30 and 40.
load(system.file("extdata", "Apache.df.rda", package = "s20x"))
Apache.df=read.csv("Apache.csv",header=TRUE) plot(deaths/number_of_patients~apache_score, ylab="prob. of death", xlab="APACHE II score",data=Apache.df) head(Apache.df) tail(Apache.df,10)
plot(deaths/number_of_patients~apache_score, ylab="prob. of death", xlab="APACHE II score",data=Apache.df) head(Apache.df) tail(Apache.df,10)
The scatter plot clearly shows that the probability of death increase as the APACHE II score increases. The patient with the highest APACHE II score (41) survived, while all the patients with scores between 31 and 40 died.
fit=glm(I(deaths/number_of_patients)~apache_score,weights=number_of_patients,family=binomial,data=Apache.df) plot(fit,which=1) p.value = 1 - pchisq(summary(fit)$deviance, summary(fit)$df.residual) p.value summary(fit) confint(fit) exp(confint(fit)) exp(confint(fit)[2,]) 100*(exp(confint(fit)[2,])-1) exp(confint(fit)[2,]*5) 100*(exp(confint(fit)[2,]*5)-1)
pred.df=data.frame(apache_score = c(10,20,30,40)) predictGLM(fit, newdata=pred.df, type="response")
conf1 = data.frame(predictGLM(fit, newdata=pred.df, type="response")) resultStr1 = paste0(sprintf("%.2f", abs(conf1$lwr)), " and ", sprintf("%.2f", abs(conf1$upr))) conf2 = t(exp(confint(fit)[2,])) conf2 = data.frame(conf2) names(conf2) = c("lower", "upper") resultStr2 = paste0(sprintf("%.2f", abs(conf2$lower)), " and ", sprintf("%.2f", abs(conf2$upper))) conf3 = t(100*(exp(confint(fit)[2,])-1)) conf3 = data.frame(conf3) names(conf3) = c("lower", "upper") resultStr3 = paste0(sprintf("%.0f%%", abs(conf3$lower)), " and ", sprintf("%.0f%%", abs(conf3$upper))) conf4 = t(exp(confint(fit)[2,]*5)) conf4 = data.frame(conf4) names(conf4) = c("lower", "upper") resultStr4 = paste0(sprintf("%.2f", abs(conf4$lower)), " and ", sprintf("%.2f", abs(conf4$upper))) conf5 = t((100*(exp(confint(fit)[2,]*5)-1))) conf5 = data.frame(conf5) names(conf5) = c("lower", "upper") resultStr5 = paste0(sprintf("%.0f%%", abs(conf5$lower)), " and ", sprintf("%.0f%%", abs(conf5$upper)))
We fitted a logistic regression to the APACHE II data as we had a binary response variable (whether or not the patient died). We had one explanatory variable, the APACHE II score, which was significant.
There was no evidence of overdispersion and the residual plot didn't reveal any major problems. (Observation 38 is unusual, but not too extreme.)
The fitted model is approximately $\log(odds_i) = \beta_0 + \beta_1 \times apache_score_i$
where the odds are the odds of a person in the i'th group with $apache_score_i$ dying in the 30 days after being admitted to ICU.
We estimate that the probability that a patient who was admitted to ICU with sepsis dying within 30 days is:
between r resultStr1[1] if their APACHE II score was 10.
between r resultStr1[2] if their APACHE II score was 20.
between r resultStr1[3] if their APACHE II score was 30.
between r resultStr1[4] if their APACHE II score was 40.
We estimate an increase in a typical patients APACHE II score by 1 point is associated with increasing the odds of the patient dying in the next 30 days by between r resultStr3[1].
We estimate an increase in a typical patients APACHE II score by 5 points is associated with increasing the odds of the patient dying in the next 30 days by between r resultStr5[1].
Alternatively:
We estimate an increase in a typical patients APACHE II score by 1 point is associated with multiplying the odds of the patient dying in the next 30 days by between r resultStr2[1].
We estimate an increase in a typical patients APACHE II score by 5 points is associated with multiplying the odds of the patient dying in the next 30 days by between r resultStr4[1].
# Create predictions for before over range 0-42. pred2.df = data.frame(apache_score = seq(0,42)) Preds=predictGLM(fit, newdata=pred2.df, type="response") # Redo plot from above and add curve to the data. plot(deaths/number_of_patients~apache_score, ylab="prob. of death", xlab="APACHE II score", main="Plot with model superimposed",data=Apache.df) lines(seq(0,42),Preds[,1]) lines(seq(0,42),Preds[,2],col="Red",lty=2 ) lines(seq(0,42),Preds[,3],col="Red",lty=2 )
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.