Hypothesis test for the difference between proportions

knitr::opts_chunk$set(echo = TRUE,comment=NA,fig.width=7,fig.height=5)
library(interpretCI)
library(glue)
x=propCI(n1=150,n2=100,p1=0.71,p2=0.63,P=0,alternative="greater")

# Flags for the direction of the alternative hypothesis
two.sided <- x$result$alternative=="two.sided"
less <- x$result$alternative=="less"
greater <- x$result$alternative=="greater"

twoS="The null hypothesis will be rejected if the proportion from population 1 is too big or if it is too small."
lessS="The null hypothesis will be rejected if the proportion from population 1 is too small."
greaterS="The null hypothesis will be rejected if the proportion from population 1 is too big."

This document is prepared automatically using the following R command.

call=paste0(deparse(x$call),collapse="")
x1=paste0("library(interpretCI)\nx=",call,"\ninterpret(x)")
textBox(x1,italic=TRUE,bg="grey95",lcolor="grey50")

Problem

string=glue("Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is equally effective for men and women. To test this claim, they choose a simple random sample of {x$result$n1} women and {x$result$n2} men from a population of {(x$result$n1+x$result$n2)*50} volunteers.

At the end of the study, {x$result$p1*100}% of the women caught a cold; and {x$result$p2*100}% of the men caught a cold. Based on these findings, can we reject the company's claim that the drug is {ifelse(two.sided,'equally',ifelse(less,'more','less'))} effective for men {ifelse(two.sided,'and','compared to')} women? Use a {x$result$alpha} level of significance.")

textBox(string)

Solution

This lesson explains how to conduct a hypothesis test to determine whether the difference between two proportions is significant.

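The test procedure, called the two-proportion z-test, is appropriate when the following conditions are met: the sampling method for each population is simple random sampling, the samples are independent, each sample includes at least 10 successes and 10 failures, and each population is at least 20 times as big as its sample.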

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

Since the above requirements are satisfied, we can use this four-step approach to conduct the hypothesis test.
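The sample-size requirement can be checked directly from the numbers in the problem. The sketch below assumes the common rule of thumb of at least 10 successes and 10 failures in each sample; the helper name `check_conditions` and its `min_count` argument are hypothetical and not part of interpretCI.

```r
# Hypothetical helper (not part of interpretCI): checks the rule of thumb
# of at least `min_count` successes and failures in each sample.
check_conditions <- function(n1, n2, p1, p2, min_count = 10) {
  counts <- c(
    success1 = n1 * p1, failure1 = n1 * (1 - p1),
    success2 = n2 * p2, failure2 = n2 * (1 - p2)
  )
  list(counts = counts, ok = all(counts >= min_count))
}

# Values from the propCI() call used in this document
res <- check_conditions(n1 = 150, n2 = 100, p1 = 0.71, p2 = 0.63)
res$counts  # expected successes and failures in each sample
res$ok      # TRUE when every count is at least min_count
```

Random sampling and independence of the samples cannot be verified from the summary numbers; they depend on how the study was designed.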

1. State the hypotheses

The first step is to state the null hypothesis and an alternative hypothesis.

$$Null\ hypothesis(H_0): P_1 `r ifelse(two.sided,"=",ifelse(less,"\\geqq","\\leqq"))` P_2$$
$$Alternative\ hypothesis(H_1): P_1 `r ifelse(two.sided, "\\neq" ,ifelse(less,"<",">"))` P_2$$

Note that these hypotheses constitute a `r ifelse(two.sided,"two","one")`-tailed test. `r ifelse(two.sided,twoS,ifelse(less,lessS,greaterS))`

2. Formulate an analysis plan

For this analysis, the significance level is `r x$result$alpha`. The test method, shown in the next section, is a two-proportion z-test.

3. Analyze sample data

Using sample data, we calculate the pooled sample proportion (p) and the standard error (SE). Using those measures, we compute the z-score test statistic (z).

$$p=\frac{p_1 \times n_1+ p_2 \times n_2}{n_1+n_2}$$
$$p=\frac{`r x$result$p1` \times `r x$result$n1`+ `r x$result$p2` \times `r x$result$n2`}{`r x$result$n1`+`r x$result$n2`}$$

$$p=\frac{`r x$result$p1*x$result$n1+x$result$p2*x$result$n2`}{`r x$result$n1+x$result$n2`}=`r round(x$result$ppooled,3)`$$

$$SE=\sqrt{p\times(1-p)\times[1/n_1+1/n_2]}$$

$$SE=\sqrt{`r round(x$result$ppooled,3)`\times `r round(1-x$result$ppooled,3)`\times[1/`r x$result$n1`+1/`r x$result$n2`]}=`r round(x$result$se,3)`$$

$$z=\frac{p_1-p_2}{SE}=\frac{`r x$result$p1`-`r x$result$p2`}{`r round(x$result$se,3)`}=`r round(x$result$z,2)`$$

where $p_1$ is the sample proportion in sample 1, $p_2$ is the sample proportion in sample 2, $n_1$ is the size of sample 1, and $n_2$ is the size of sample 2.
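The three formulas above can be reproduced in base R without interpretCI. This is a minimal sketch using the inputs from the `propCI()` call in this document; the variable names `p_pooled`, `se` and `z` are chosen here for illustration, and the hypothesized difference is assumed to be 0.

```r
# Inputs taken from the propCI() call at the top of this document
n1 <- 150; n2 <- 100
p1 <- 0.71; p2 <- 0.63

# Pooled sample proportion
p_pooled <- (p1 * n1 + p2 * n2) / (n1 + n2)

# Standard error of the difference, using the pooled proportion
se <- sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# z-score test statistic (hypothesized difference assumed to be 0)
z <- (p1 - p2) / se

round(c(p = p_pooled, SE = se, z = z), 3)
```

The rounded values should agree with the ones that `propCI()` stores in `x$result`.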

Since we have a `r ifelse(two.sided,"two","one")`-tailed test, the P-value is the probability that the z statistic is `r if(two.sided) paste("less than",round(-abs(x$result$z),2),"or greater than",round(abs(x$result$z),2)) else if(less) paste("less than",round(x$result$z,2)) else paste("greater than",round(x$result$z,2))`.

We can use the following R code to find the P-value.

# Build the pnorm() expression that matches the alternative hypothesis
if(two.sided){
    string=glue("pnorm(-abs({round(x$result$z,2)}))\\times2")
} else if(greater){
    string=glue("pnorm({round(x$result$z,2)},lower.tail=FALSE)")
} else{
    string=glue("pnorm({round(x$result$z,2)})")
}

$$p=`r string`=`r round(x$result$pvalue,3)`$$

Alternatively, we can use the normal distribution curve to find the P-value.

draw_n(z=x$result$z,alternative=x$result$alternative)
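The P-value calculation for each form of the alternative hypothesis can be collected into one small function. This is a sketch built only on base R's `pnorm()`; the helper name `z_pvalue` is hypothetical and not part of interpretCI, and the z value of 1.33 is the rounded statistic implied by this document's inputs.

```r
# Hypothetical helper (not part of interpretCI): P-value of a z statistic
# for each form of the alternative hypothesis.
z_pvalue <- function(z, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),            # both tails
    less      = pnorm(z),                      # lower tail
    greater   = pnorm(z, lower.tail = FALSE)   # upper tail
  )
}

z_pvalue(1.33, "greater")    # one-tailed test used in this document
z_pvalue(1.33, "two.sided")  # twice the one-tailed value for a positive z
```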

4. Interpret results

Since the P-value (`r round(x$result$pvalue,3)`) is `r ifelse(x$result$pvalue>x$result$alpha,"greater","less")` than the significance level (`r x$result$alpha`), we `r ifelse(x$result$pvalue>x$result$alpha,"cannot reject","reject")` the null hypothesis.
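The decision rule itself is a single comparison. The sketch below assumes a significance level of 0.05 and uses the rounded z value of 1.33 implied by this document's inputs; the helper name `decide` is hypothetical and not part of interpretCI.

```r
# Hypothetical helper (not part of interpretCI): compares a P-value with
# the significance level and returns the decision about H0.
decide <- function(p_value, alpha = 0.05) {
  if (p_value > alpha) "cannot reject H0" else "reject H0"
}

# One-tailed ("greater") P-value for a z statistic of 1.33
p_value <- pnorm(1.33, lower.tail = FALSE)
decide(p_value)        # alpha assumed to be 0.05
decide(p_value, 0.10)  # the same data judged at a looser level
```

Failing to reject the null hypothesis does not show that it is true; it only means the sample does not provide enough evidence against it at the chosen level.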

Result of propCI()

print(x)

Reference

The contents of this document are modified from StatTrek.com. Berman H.B., "AP Statistics Tutorial", [online] Available at: https://stattrek.com/hypothesis-test/difference-in-proportions.aspx?tutorial=AP [Accessed Date: 1/23/2022].




interpretCI documentation built on Jan. 28, 2022, 9:07 a.m.