#' Calculate sample size for desired statistical power in proportion test
#'
#' Calculates the sample size needed in an A/B test for a given baseline, improvement,
#' significance level and statistical power. Calculations are based on the Central Limit
#' Theorem (normal approximation), and the sample size is found by binary search over
#' the interval from lower_n to upper_n.
#'
#' @param baseline Base rate of success for the control group, between 0 and 1
#' @param improvement Relative increase over baseline, or absolute difference, that we
#' want the test to detect with the given statistical power
#' @param significance Significance level, the probability of rejecting H0 when
#' it is actually true
#' @param wanted_power Statistical power we want our test to have, probability of
#' rejecting H0 when it is false and H1 is given by baseline and improvement
#' @param improvement_type 'relative' or 'absolute'. Whether the given improvement is
#' relative to baseline or an absolute difference. Defaults to 'relative'.
#' @param tolerance How close the computed power must get to wanted_power for the
#' binary search to stop. Defaults to 0.0001 = 0.01\%.
#' @param upper_n Upper limit for sample size when searching using binary search.
#' If solution is above upper_n we will get a message to increase it. Default
#' value is 1e7 = 10 million.
#' @param lower_n Lower limit for sample size when searching using binary search.
#' Default value is 1.
#'
#' @return Integer, estimated sample size per group needed to achieve the wanted
#' statistical power.
#'
#' @examples
#' calculate_sample_size(
#'   baseline = 0.3,
#'   improvement = 0.02,
#'   significance = 0.05,
#'   wanted_power = 0.90,
#'   improvement_type = "absolute")
#'
#' # [1] 11232
#'
#' @importFrom stats pnorm qnorm
#' @export
#' @author Elio Bartoš
calculate_sample_size = function(baseline,
                                 improvement,
                                 significance,
                                 wanted_power,
                                 improvement_type = "relative",
                                 tolerance = 0.0001,
                                 upper_n = 1e7,
                                 lower_n = 1) {
  power = 0
  starting_upper_n = upper_n
  # Success rate of the treatment group under H1
  if (improvement_type == "relative") {
    baseline2 = baseline * (1 + improvement)
  } else if (improvement_type == "absolute") {
    baseline2 = baseline + improvement
  } else {
    print("Wrong improvement_type! Possible values: 'absolute', 'relative'")
    return(NA)
  }
  # Per-observation Bernoulli variances of the two groups
  var_a = baseline * (1 - baseline)
  var_b = baseline2 * (1 - baseline2)
  # Binary search for the n whose power is within tolerance of wanted_power
  while (lower_n < upper_n) {
    n = as.integer((upper_n + lower_n) / 2)
    # Difference in rates divided by its standard error for n observations per group
    effect = (baseline2 - baseline) / sqrt((var_a + var_b) / n)
    limit1 = qnorm(significance / 2) + effect
    limit2 = qnorm(1 - significance / 2) + effect
    # Two-sided power: probability of landing in either rejection region under H1
    power = pnorm(limit1) + (1 - pnorm(limit2))
    if (abs(power - wanted_power) < tolerance) {
      break
    } else if (power - wanted_power > 0) { # too much power
      upper_n = n - 1
    } else { # too little power
      lower_n = n + 1
    }
  }
  # The search ended at the upper bound, so the true answer may lie above it
  if (n >= starting_upper_n - 1) {
    print("Need larger space to search n. Increase upper_n limit!")
  }
  return(n)
}
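
# A minimal cross-check sketch, not part of the original function. It uses the
# standard closed-form normal approximation, which ignores the small
# opposite-tail term that the binary search above includes, so the two results
# should be close but need not match exactly.
# Assumption: approx_sample_size and its baseline2 argument (the success rate
# of the treatment group) are hypothetical names introduced for illustration.
approx_sample_size = function(baseline, baseline2, significance, wanted_power) {
  # Critical value of the two-sided test and quantile for the wanted power
  z_alpha = stats::qnorm(1 - significance / 2)
  z_power = stats::qnorm(wanted_power)
  # Sum of the per-observation Bernoulli variances of both groups
  var_sum = baseline * (1 - baseline) + baseline2 * (1 - baseline2)
  # Smallest whole per-group n that satisfies the approximation
  ceiling((z_alpha + z_power)^2 * var_sum / (baseline2 - baseline)^2)
}

# Usage, with the same inputs as the roxygen example above:
# approx_sample_size(baseline = 0.3, baseline2 = 0.32,
#                    significance = 0.05, wanted_power = 0.90)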