case1803: Smoking and Lung Cancer

Description

In a retrospective case-control study, researchers identified 86 lung cancer patients and 86 controls (without lung cancer), and categorized them according to whether they were smokers or non-smokers. The goal is to see whether the odds of lung cancer are greater for smokers than for non-smokers.
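As a rough illustration of the quantity of interest, the sample odds ratio can be computed directly from the four cell counts. The counts below are copied from the example output on this page; the variable names (`cancer`, `control`, `odds`, `odds_ratio`) are illustrative and not part of the package.

```r
# Cell counts copied from the example output on this page
cancer  <- c(Smokers = 83, NonSmokers = 3)
control <- c(Smokers = 72, NonSmokers = 14)

# Odds of being a case (rather than a control) within each smoking group
odds <- cancer / control

# Sample odds ratio: smokers relative to non-smokers
odds_ratio <- unname(odds["Smokers"] / odds["NonSmokers"])
odds_ratio
```

The result should agree with the logistic-regression estimate reported in the examples (about 5.38). The odds ratio is the same whichever variable is treated as the response, which is why it can be estimated from a case-control sample in which the numbers of cases and controls were fixed by design.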

Usage

case1803

Format

A data frame with 2 observations on the following 3 variables.

Smoking

a factor with levels "NonSmokers" and "Smokers"

Cancer

the number who were lung cancer patients

Control

the number who were controls
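For readers without the package installed, an equivalent data frame can be sketched by hand from the values shown in the example output. The object name `case1803_manual` is hypothetical, and the factor level order is assumed to match the `str()` output on this page.

```r
# Hand-built stand-in for the packaged data set (values taken from the
# str() output on this page; the object name is hypothetical)
case1803_manual <- data.frame(
  Smoking = factor(c("Smokers", "NonSmokers"),
                   levels = c("NonSmokers", "Smokers")),
  Cancer  = c(83L, 3L),   # lung cancer patients in each smoking group
  Control = c(72L, 14L)   # controls in each smoking group
)

str(case1803_manual)
colSums(case1803_manual[, c("Cancer", "Control")])  # 86 cases, 86 controls
```

The column totals give a quick check against the study description: 86 patients and 86 controls.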

Source

Ramsey, F.L. and Schafer, D.W. (2013). The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed.), Cengage Learning.

References

Anderson, T.W., Reid, D.B.W. and Beaton, G.H. (1972). Vitamin C and the Common Cold, Canadian Medical Association Journal 107: 503–508.

Examples

str(case1803)
attach(case1803)
   
## INFERENCE
myTable   <- cbind(Cancer,Control)   # Make a 2-by-2 table of counts 
row.names(myTable)  <- Smoking   # Assign the levels of Smoking as row names  
myTable   

fisher.test(myTable,  alternative="greater")  # Alternative: that odds of Cancer 
  # in first row are greater.
fisher.test(myTable) # 2-sided alternative to get CI for odds ratio
myGlm1  <- glm(myTable ~ Smoking, family=binomial) # logistic reg (Ch 21)
summary(myGlm1)
exp(coef(myGlm1)[2]) # 5.37963 : Estimated odds ratio
exp(confint(myGlm1)[2,]) #  1.675169 24.009510:  Approximate confidence interval
# Interpretation: The odds of cancer are 5.4 times as large for smokers as for 
# non-smokers (95% confidence interval: 1.7 to 24.0 times as large).

detach(case1803)

Example output

'data.frame':	2 obs. of  3 variables:
 $ Smoking: Factor w/ 2 levels "NonSmokers","Smokers": 2 1
 $ Cancer : int  83 3
 $ Control: int  72 14
           Cancer Control
Smokers        83      72
NonSmokers      3      14

	Fisher's Exact Test for Count Data

data:  myTable
p-value = 0.004411
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 1.666128      Inf
sample estimates:
odds ratio 
  5.333256 


	Fisher's Exact Test for Count Data

data:  myTable
p-value = 0.008823
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.409691 30.094245
sample estimates:
odds ratio 
  5.333256 


Call:
glm(formula = myTable ~ Smoking, family = binomial)

Deviance Residuals: 
[1]  0  0

Coefficients:
               Estimate Std. Error z value Pr(>|z|)  
(Intercept)     -1.5404     0.6362  -2.421   0.0155 *
SmokingSmokers   1.6826     0.6563   2.564   0.0104 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance:  8.5043e+00  on 1  degrees of freedom
Residual deviance: -1.3323e-15  on 0  degrees of freedom
AIC: 12.293

Number of Fisher Scoring iterations: 3

SmokingSmokers 
       5.37963 
Waiting for profiling to be done...
    2.5 %    97.5 % 
 1.675169 24.009510 
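The standard error reported for `SmokingSmokers` can be reproduced by hand, which may help when reading the output above. For a 2-by-2 table, the large-sample standard error of the log odds ratio is the square root of the sum of the reciprocal cell counts. The sketch below uses the counts from this page; the names `counts`, `log_or`, `se`, and `wald_ci` are illustrative.

```r
# The four cell counts from the 2-by-2 table on this page
counts <- c(83, 72, 3, 14)

# Log of the sample odds ratio (smokers relative to non-smokers)
log_or <- log((83 * 14) / (3 * 72))

# Large-sample standard error of the log odds ratio
se <- sqrt(sum(1 / counts))

# Wald-type 95% interval, back-transformed to the odds-ratio scale
wald_ci <- exp(log_or + c(-1, 1) * qnorm(0.975) * se)

round(c(log_or = log_or, se = se), 4)
wald_ci
```

Here `log_or` and `se` should match the `Estimate` and `Std. Error` columns for `SmokingSmokers`. The Wald interval will not match the `confint()` result exactly, because that interval is based on the profile likelihood rather than on the normal approximation. Likewise, `fisher.test()` reports a conditional maximum likelihood estimate of the odds ratio, which is why its 5.33 differs slightly from the sample odds ratio of 5.38.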

Sleuth3 documentation built on May 2, 2019, 6:41 a.m.