TEST.diffprop: Tests on the difference between proportions
In UBStats: Basic Statistics

View source: R/UBStats_Main_Visible_ALL_202406.R

Description

TEST.diffprop() tests hypotheses on the difference between the proportion of successes in two independent populations.

Usage

TEST.diffprop(
  x,
  y,
  success.x = NULL,
  success.y = NULL,
  pdiff0 = 0,
  alternative = "two.sided",
  by,
  digits = 2,
  force.digits = FALSE,
  use.scientific = FALSE,
  data,
  ...
)

Arguments

`x`, `y`	Unquoted strings identifying the variables of interest. `x` and `y` can be the names of vectors or factors in the workspace or the names of columns in the data frame specified in the `data` argument. It is possible to use a mixed specification (e.g., one vector and one column in `data`).
`success.x`, `success.y`	If `x`, `y` are factors, character vectors, or numeric non-binary vectors, these arguments must be used to indicate the category/value corresponding to success in the populations. They can be omitted (`NULL`, default) if `x`, `y` are binary numeric vectors (taking values 0 or 1 only; in this case success is assumed to correspond to 1) or logical vectors (in this case success is assumed to correspond to `TRUE`).
`pdiff0`	Numeric value specifying the difference between the populations' proportions under the null hypothesis (default is 0).
`alternative`	A length-one character vector specifying the direction of the alternative hypothesis. Allowed values are `"two.sided"` (difference between populations' proportions differs from `pdiff0`; default), or `"less"` (difference between populations' proportions is lower than `pdiff0`), or `"greater"` (difference between populations' proportions is higher than `pdiff0`).
`by`	Optional unquoted string identifying a variable (of any type), defined in the same way as `x`, taking only two values and used to split `x` into two independent samples. Given the two ordered values taken by `by` (alphabetical or numerical order, or order of the levels for factors), say by1 and by2, hypotheses are tested on the difference between the populations' proportions in the by1-group and in the by2-group. Note that only one of `y` and `by` can be specified.
`digits`	Integer value specifying the number of decimals used to round statistics; defaults to 2. If the chosen rounding formats some non-zero values as zero, the number of decimals is increased so that all values have at least one significant digit, unless the argument `force.digits` is set to `TRUE`.
`force.digits`	Logical value indicating whether reported values should be rounded to exactly the number of decimals specified in `digits`, even if non-zero values are then rounded to zero (defaults to `FALSE`).
`use.scientific`	Logical value indicating whether numbers in tables should be displayed using scientific notation (`TRUE`); defaults to `FALSE`.
`data`	An optional data frame containing `x` and/or `y` or `by`. If not found in `data`, the variables are taken from the environment from which `TEST.diffprop()` is called.
`...`	Additional arguments to be passed to low level functions.
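
The roles of `success.x` and of the ordering of `by` described above can be illustrated with base R alone. The following is a minimal sketch on hypothetical toy vectors (`answer`, `group` and every other name are invented for illustration and are not part of `MktDATA` or of UBStats); it only mimics how a proportion of successes and the group order might be derived, and does not call `TEST.diffprop()`.

```r
# Hypothetical toy data (not from MktDATA)
answer <- c("Yes", "No", "Yes", "Yes", "No", "No")
group  <- c("M",   "F",  "F",   "M",   "M",  "F")

# Proportion of successes in each group when success is coded as "Yes"
is_success    <- answer == "Yes"
prop_by_group <- tapply(is_success, group, mean)

# A character variable is ordered alphabetically, so "F" precedes "M"
default_order <- levels(factor(group))

# Setting the levels explicitly puts "M" first
group_r        <- factor(group, levels = c("M", "F"))
reversed_order <- levels(group_r)

# The difference changes sign when the group order is reversed
diff_default  <- unname(prop_by_group["F"] - prop_by_group["M"])
diff_reversed <- unname(prop_by_group["M"] - prop_by_group["F"])
```

Reordering the levels of a factor passed as `by` is therefore one way to decide which group comes first in the tested difference.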

Value

A table reporting the results of the test on the difference between the proportions of successes in two independent populations.
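
To give a feel for the kind of quantities such a table may contain, here is a minimal base-R sketch of a large-sample z test of the null hypothesis that the difference between two proportions equals `pdiff0`. It is an assumption-laden illustration: this page does not state which standard error or reference distribution `TEST.diffprop()` uses, so the unpooled standard error and the normal approximation below are choices of this sketch, and the toy vectors are hypothetical.

```r
# Hypothetical toy data (not from MktDATA): 1 = success, 0 = failure
x <- c(rep(1, 45), rep(0, 55))   # 100 observations, 45 successes
y <- c(rep(1, 30), rep(0, 70))   # 100 observations, 30 successes
pdiff0 <- 0.1                    # difference p_x - p_y under the null

p_x <- mean(x == 1)              # sample proportion of successes in x
p_y <- mean(y == 1)              # sample proportion of successes in y

# Unpooled standard error of p_x - p_y (an assumption of this sketch)
se <- sqrt(p_x * (1 - p_x) / length(x) + p_y * (1 - p_y) / length(y))

# Standardized distance between the observed difference and pdiff0
z <- (p_x - p_y - pdiff0) / se

# p-values matching the three allowed values of `alternative`
p_less      <- pnorm(z)
p_greater   <- pnorm(z, lower.tail = FALSE)
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)
```

The values reported by `TEST.diffprop()` may differ from this sketch if the function relies on a different standard error.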

Author(s)

Raffaella Piccarreta raffaella.piccarreta@unibocconi.it

Examples

data(MktDATA, package = "UBStats")

# Proportions of success defined on non-binary and 
#  non-logical vectors; 'success' coded same way
#  for both vectors
#  - Using x,y: build vectors with data on the two groups
WouldSuggest_F <- MktDATA$WouldSuggest[MktDATA$Gender == "F"]
WouldSuggest_M <- MktDATA$WouldSuggest[MktDATA$Gender == "M"]
TEST.diffprop(x = WouldSuggest_M, y = WouldSuggest_F, 
              success.x = "Yes", pdiff0 = 0.1, alternative = "less")

PastCampaigns_F <- MktDATA$PastCampaigns[MktDATA$Gender == "F"]
PastCampaigns_M <- MktDATA$PastCampaigns[MktDATA$Gender == "M"]
TEST.diffprop(x = PastCampaigns_M, y = PastCampaigns_F,
              success.x = 0, pdiff0 = 0.2)

#  - Using x,by: groups identified by ordered levels of by
TEST.diffprop(x = PastCampaigns, by = Gender,
              success.x = 0, pdiff0 = 0.2, data = MktDATA)
#    Since order is F, M, test is on prop(F) - prop(M)
#    To test prop(M) - prop(F), reorder the levels of by
Gender.R <- factor(MktDATA$Gender, levels = c("M", "F"))
TEST.diffprop(x = PastCampaigns, by = Gender.R,
              success.x = 0, pdiff0 = 0.2, data = MktDATA)

# Proportions of success defined based on 
#  binary or logical vectors; 'success'
#  coded same way for both vectors
#  - Binary variable (success=1): based on x,y
LastCampaign_F <- MktDATA$LastCampaign[MktDATA$Gender == "F"]
LastCampaign_M <- MktDATA$LastCampaign[MktDATA$Gender == "M"]
TEST.diffprop(x = LastCampaign_M, y = LastCampaign_F)
#  - Binary variable (success=1): based on x,by
#    see above for recoding of levels of Gender
TEST.diffprop(x = LastCampaign, by = Gender, data = MktDATA)
Gender.R <- factor(MktDATA$Gender, levels = c("M", "F"))
TEST.diffprop(x = LastCampaign, by = Gender.R, data = MktDATA)
#  - Logical variable (success=TRUE): based on x,y
Deals_w_child <- MktDATA$Deals.ge50[MktDATA$Children > 0]
Deals_no_child <- MktDATA$Deals.ge50[MktDATA$Children == 0]
TEST.diffprop(x = Deals_w_child, y = Deals_no_child, 
              pdiff0 = 0.2, alternative = "less")

# Proportions defined on 
#  non-binary and non-logical vectors, with 'success'
#  coded differently (only specification x,y is reasonable here)
WouldSuggest_Other <- c(rep("OK", 310), rep("KO", 650 - 310))
TEST.diffprop(x = WouldSuggest, y = WouldSuggest_Other, 
              success.x = "Yes", success.y = "OK",
              pdiff0 = 0.1, alternative = "greater",
              data = MktDATA)

# Proportions based on combined conditions
# - Build logical vector/s indicating whether a condition 
#   is satisfied
IsTop <- MktDATA$AOV > 80
IsTop_OK <- IsTop[MktDATA$WouldSuggest == "Yes"]
IsTop_KO <- IsTop[MktDATA$WouldSuggest == "No"]
TEST.diffprop(x = IsTop_OK, y = IsTop_KO, pdiff0 = 0.05,
              alternative = "greater")

Deals <- MktDATA$NDeals >= 5
Deals_Married <- Deals[MktDATA$Marital_Status == "Married" & 
                         MktDATA$Children == 0]
Deals_Single <- Deals[MktDATA$Marital_Status == "Single"]
TEST.diffprop(x = Deals_Married, y = Deals_Single,
              alternative = "less")

# Store the results in an object
out.test_diffP <- TEST.diffprop(x = Deals_Married, y = Deals_Single,
                                alternative = "less")

# Arguments force.digits and use.scientific
#  An input for which some reported values are very small
HighAOV <- MktDATA$AOV > 150
#  - Default: manages possible excess of rounding
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"])
#  - Force to the exact number of digits (default, 2)
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"],
              force.digits = TRUE)
#  - Allow scientific notation
TEST.diffprop(x = HighAOV[MktDATA$Gender=="M"], 
              y = HighAOV[MktDATA$Gender=="F"],
              use.scientific = TRUE)

UBStats documentation built on Sept. 11, 2024, 6:52 p.m.