TEST.diffprop: Tests on the difference between proportions

View source: R/UBStats_Main_Visible_ALL_202406.R

TEST.diffprop    R Documentation

Description

TEST.diffprop() tests hypotheses on the difference between the proportion of successes in two independent populations.
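As a rough illustration of the kind of quantity such a test relies on, the base R sketch below computes a textbook large-sample z statistic for the difference between two sample proportions. The helper name z_diffprop, the choice of a pooled standard error when pdiff0 is 0, and the unpooled one otherwise are assumptions made for illustration; this page does not state which formula TEST.diffprop() uses internally.

```r
# Hypothetical helper (not part of UBStats): textbook large-sample z statistic
# for H0: p_x - p_y = pdiff0, given logical vectors flagging successes.
z_diffprop <- function(x, y, pdiff0 = 0) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  n_x <- length(x)
  n_y <- length(y)
  p_x <- mean(x)
  p_y <- mean(y)
  if (pdiff0 == 0) {
    # Assumption: under equal proportions a pooled estimate is commonly used
    p_pool <- (sum(x) + sum(y)) / (n_x + n_y)
    se <- sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
  } else {
    # Assumption: otherwise the two sample proportions are used separately
    se <- sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
  }
  (p_x - p_y - pdiff0) / se
}

# Toy data: 7 successes out of 10 in x, 4 out of 10 in y
x <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
y <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
z <- z_diffprop(x, y)
z
```

The function works on logical vectors so that the sample proportion is simply the mean; results from TEST.diffprop() may differ if the package uses another standard error.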

Usage

TEST.diffprop(
  x,
  y,
  success.x = NULL,
  success.y = NULL,
  pdiff0 = 0,
  alternative = "two.sided",
  by,
  digits = 2,
  force.digits = FALSE,
  use.scientific = FALSE,
  data,
  ...
)

Arguments

x, y

Unquoted strings identifying the variables of interest. x and y can be the names of vectors or factors in the workspace or the names of columns in the data frame specified in the data argument. It is possible to use a mixed specification (e.g., one vector and one column in data).

success.x, success.y

If x, y are factors, character vectors, or numeric non-binary vectors, the success arguments must be used to indicate the category/value corresponding to success in the populations. These arguments can be omitted (NULL, default) if x, y are binary numeric vectors (taking values 0 or 1 only; in this case success is assumed to correspond to 1) or logical vectors (in this case success is assumed to correspond to TRUE).
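The coding rules above can be sketched in base R as a conversion of any input vector into a logical vector of success flags. The helper as_success is hypothetical and only mirrors the rules as described here; it is not the package's own code.

```r
# Hypothetical helper (not part of UBStats): turn a vector into success flags
as_success <- function(v, success = NULL) {
  # Logical input: TRUE counts as success
  if (is.logical(v)) return(v)
  if (is.null(success)) {
    # Binary numeric input (0/1 only): 1 counts as success
    if (is.numeric(v) && all(v[!is.na(v)] %in% c(0, 1))) return(v == 1)
    stop("'success' must be given for non-binary, non-logical input")
  }
  # Factors, characters, non-binary numerics: compare with the stated value
  v == success
}

answers <- factor(c("Yes", "No", "Yes", "Yes"))
mean(as_success(answers, success = "Yes"))      # sample proportion of "Yes"
mean(as_success(c(0, 1, 1, 0)))                 # binary numeric, success = 1
mean(as_success(c(TRUE, FALSE, FALSE, FALSE)))  # logical, success = TRUE
```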

pdiff0

Numeric value specifying the difference between the two populations' proportions under the null hypothesis (default is 0).

alternative

A length-one character vector specifying the direction of the alternative hypothesis. Allowed values are "two.sided" (difference between populations' proportions differs from pdiff0; default), or "less" (difference between populations' proportions is lower than pdiff0), or "greater" (difference between populations' proportions is higher than pdiff0).
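To make the three directions concrete, the sketch below shows how a p-value would be obtained from a z statistic under each alternative, assuming a standard normal reference distribution. Both the helper name p_value_z and the normal approximation are assumptions made for illustration; the page does not describe how TEST.diffprop() computes its p-values.

```r
# Hypothetical helper (not part of UBStats): p-value of a z statistic
# under each alternative, assuming a standard normal reference distribution.
p_value_z <- function(z, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),              # both tails
    less      = pnorm(z),                        # lower tail
    greater   = pnorm(z, lower.tail = FALSE)     # upper tail
  )
}

z_obs <- 1.5
p_value_z(z_obs, "two.sided")
p_value_z(z_obs, "less")
p_value_z(z_obs, "greater")
```

For a given statistic, the "less" and "greater" p-values sum to 1, and the two-sided p-value is twice the smaller of the two.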

by

Optional unquoted string identifying a variable (of any type), defined in the same way as x, that takes only two values and is used to split x into two independent samples. Given the two ordered values taken by by (alphabetical or numerical order, or order of the levels for factors), say by1 and by2, hypotheses are tested on the difference between the populations' proportions in the by1-group and in the by2-group. Note that only one of y and by can be specified.
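The ordering rule for by can be sketched in base R as follows. The helper split_by_two is hypothetical; in particular, dropping unused factor levels is an assumption, not something this page documents.

```r
# Hypothetical helper (not part of UBStats): split x into two samples
# following the ordered values of 'by'.
split_by_two <- function(x, by) {
  groups <- if (is.factor(by)) {
    levels(droplevels(by))   # order of the levels (assumption: unused dropped)
  } else {
    sort(unique(by))         # alphabetical or numerical order
  }
  stopifnot(length(groups) == 2)
  list(first  = x[which(by == groups[1])],
       second = x[which(by == groups[2])])
}

gender <- c("M", "F", "F", "M", "F")
bought <- c(1, 0, 1, 1, 0)

# Character 'by': alphabetical order, so the first group is "F"
s1 <- split_by_two(bought, gender)

# Factor 'by' with releveled groups: the first group is now "M"
s2 <- split_by_two(bought, factor(gender, levels = c("M", "F")))
```

Reordering the levels of a factor is therefore the way to swap which group comes first in the tested difference.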

digits

Integer value specifying the number of decimals used to round statistics; defaults to 2. If the chosen rounding formats some non-zero values as zero, the number of decimals is increased so that all values have at least one significant digit, unless the argument force.digits is set to TRUE.

force.digits

Logical value indicating whether reported values should be forcibly rounded to the number of decimals specified in digits even if non-zero values are rounded to zero (defaults to FALSE).
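The interplay between digits and force.digits can be illustrated with a small base R sketch. The helper round_adaptive and its max.digits safeguard are hypothetical; the code only mimics the behaviour described above and is not taken from the package.

```r
# Hypothetical helper (not part of UBStats): round to 'digits' decimals,
# adding decimals while some non-zero value would be shown as zero.
round_adaptive <- function(values, digits = 2, force.digits = FALSE,
                           max.digits = 15) {
  if (!force.digits) {
    while (digits < max.digits &&
           any(values != 0 & round(values, digits) == 0, na.rm = TRUE)) {
      digits <- digits + 1
    }
  }
  round(values, digits)
}

stats <- c(0.4567, 0.0004)
r_default <- round_adaptive(stats)                       # decimals raised to 4
r_forced  <- round_adaptive(stats, force.digits = TRUE)  # exactly 2 decimals
```

With the default setting the small value 0.0004 stays visible; with force.digits = TRUE it is rounded to zero.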

use.scientific

Logical value indicating whether numbers in tables should be displayed using scientific notation (TRUE); defaults to FALSE.

data

An optional data frame containing x and/or y or by. If not found in data, the variables are taken from the environment from which TEST.diffprop() is called.
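This lookup order (first data, then the calling environment) matches a common base R idiom for unquoted arguments. The sketch below uses a hypothetical helper, get_var, to show the idea; it is an assumption about how such a lookup can be written, not the package's implementation.

```r
# Hypothetical helper (not part of UBStats): resolve an unquoted name,
# looking in 'data' first and then in the calling environment.
get_var <- function(x, data = NULL) {
  eval(substitute(x), envir = data, enclos = parent.frame())
}

df <- data.frame(score = c(1, 0, 1))
other <- c(0, 0, 1)

v1 <- get_var(score, data = df)  # found as a column of df
v2 <- get_var(other, data = df)  # not in df: taken from the calling environment
v3 <- get_var(other)             # no data: taken from the calling environment
```

This is what allows a mixed specification, with one variable stored in the data frame and the other in the workspace.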

...

Additional arguments to be passed to low-level functions.

Value

A table reporting the results of the test on the difference between the proportions of successes in two independent populations.

Author(s)

Raffaella Piccarreta raffaella.piccarreta@unibocconi.it

See Also

CI.diffprop() to build confidence intervals for the difference between two populations' proportions of successes.

Examples

data(MktDATA, package = "UBStats")

# Proportions of success defined on non-binary and 
#  non-logical vectors; 'success' coded same way
#  for both vectors
#  - Using x,y: build vectors with data on the two groups
WouldSuggest_F <- MktDATA$WouldSuggest[MktDATA$Gender == "F"]
WouldSuggest_M <- MktDATA$WouldSuggest[MktDATA$Gender == "M"]
TEST.diffprop(x = WouldSuggest_M, y = WouldSuggest_F, 
              success.x = "Yes", pdiff0 = 0.1, alternative = "less")

PastCampaigns_F <- MktDATA$PastCampaigns[MktDATA$Gender == "F"]
PastCampaigns_M <- MktDATA$PastCampaigns[MktDATA$Gender == "M"]
TEST.diffprop(x = PastCampaigns_M, y = PastCampaigns_F,
              success.x = 0, pdiff0 = 0.2)

#  - Using x,by: groups identified by ordered levels of by
TEST.diffprop(x = PastCampaigns, by = Gender,
              success.x=0, pdiff0 = 0.2, data = MktDATA)
#    Since order is F, M, test is on prop(F) - prop(M)
#    To test prop(M) - prop(F), reorder the levels of Gender
Gender.R <- factor(MktDATA$Gender, levels = c("M", "F"))
TEST.diffprop(x = PastCampaigns, by = Gender.R,
              success.x=0, pdiff0 = 0.2, data = MktDATA)

# Proportions of success defined based on 
#  binary or logical vectors; 'success'
#  coded same way for both vectors
#  - Binary variable (success=1): based on x,y
LastCampaign_F<-MktDATA$LastCampaign[MktDATA$Gender=="F"]
LastCampaign_M<-MktDATA$LastCampaign[MktDATA$Gender=="M"]
TEST.diffprop(x = LastCampaign_M, y = LastCampaign_F)
#  - Binary variable (success=1): based on x,by
#    see above for recoding of levels of Gender
TEST.diffprop(x = LastCampaign, by = Gender, data = MktDATA)
Gender.R <- factor(MktDATA$Gender, levels = c("M", "F"))
TEST.diffprop(x = LastCampaign, by = Gender.R, data = MktDATA)
#  - Logical variable (success=TRUE): based on x,y
Deals_w_child <- MktDATA$Deals.ge50[MktDATA$Children>0]
Deals_no_child <- MktDATA$Deals.ge50[MktDATA$Children==0]
TEST.diffprop(x = Deals_w_child, y = Deals_no_child, 
              pdiff0 = 0.2, alternative = "less")
# Proportions defined on 
#  non-binary and non-logical vectors, with 'success'
#  coded differently (only specification x,y is reasonable here)
WouldSuggest_Other<-c(rep("OK",310),rep("KO",650-310))
TEST.diffprop(x = WouldSuggest, y = WouldSuggest_Other, 
              success.x = "Yes", success.y = "OK",
              pdiff0 = 0.1, alternative = "greater",
              data = MktDATA)

# Proportions based on combined conditions
# - Build logical vector/s indicating whether a condition 
#   is satisfied
IsTop<-MktDATA$AOV>80
IsTop_OK<-IsTop[MktDATA$WouldSuggest == "Yes"]
IsTop_KO<-IsTop[MktDATA$WouldSuggest == "No"]
TEST.diffprop(x = IsTop_OK, y = IsTop_KO, pdiff0 = 0.05,
              alternative = "greater")

Deals<-MktDATA$NDeals>=5
Deals_Married <- Deals[MktDATA$Marital_Status=="Married" & 
                         MktDATA$Children==0] 
Deals_Single <- Deals[MktDATA$Marital_Status=="Single"] 
TEST.diffprop(x = Deals_Married, y = Deals_Single,
              alternative = "less")

# Store the results in an object
out.test_diffP<-TEST.diffprop(x = Deals_Married, y = Deals_Single,
                              alternative = "less")

# Arguments force.digits and use.scientific
#  An input variable with a very low proportion of successes
HighAOV <- MktDATA$AOV>150
# - Default: manages possible excess of rounding
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"])
#  - Force to the exact number of digits (default, 2)
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"],
              force.digits = TRUE)
#  - Allow scientific notation
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"],
              use.scientific = TRUE)


UBStats documentation built on Sept. 11, 2024, 6:52 p.m.