calculate2GBias: calculate2GBias

View source: R/NPSimulation.R

calculate2GBias    R Documentation

calculate2GBias

Description

The function simulates two-group experiments and estimates the power, the individual estimate error, and the small sample bias obtained from the set of simulated experiments. The set of simulations is repeated for three different values of the difference between the treatment and control groups, specified by the parameter "diff". The power is estimated as the percentage of experiments for which the mean of the experiment was significantly different from zero. The experiment data may be one of four different types: Normal, Log-normal, Gamma or Laplace. The output is a table of values identifying, for each set of simulated experiments, the observed values of three effect sizes (Cliff's d, PHat and StdMD), their estimate error, their related small sample bias, and their power. This function supports the production of the values reported in data tables in the paper "Recommendations for Analyzing Small Sample Size Software Engineering Experiments" and its Supplementary Material.
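As a rough illustration of the procedure described above, the following base R sketch simulates a set of two-group experiments with Normal data and summarises the three effect sizes and a power estimate. It is not the package's implementation: the name simulate_two_groups is hypothetical, only the Normal case is covered, and the use of a one-sided Welch t-test for the power estimate is an assumption.

```r
# Illustrative sketch only; NOT the reproducer package's implementation.
# simulate_two_groups is a hypothetical helper name.
simulate_two_groups <- function(N, reps, diff, mean = 0, sd = 1, seed = 223) {
  set.seed(seed)
  one_experiment <- function() {
    control   <- rnorm(N, mean, sd)
    treatment <- rnorm(N, mean + diff, sd)
    # PHat: proportion of (treatment, control) pairs in which the treatment
    # value is larger, counting ties as one half
    cmp  <- outer(treatment, control, "-")
    phat <- mean((cmp > 0) + 0.5 * (cmp == 0))
    # StdMD: mean difference divided by the pooled standard deviation
    pooled_sd <- sqrt((var(control) + var(treatment)) / 2)
    c(PHat   = phat,
      Cliffd = 2 * phat - 1,
      StdMD  = (mean(treatment) - mean(control)) / pooled_sd,
      # Assumption: significance judged with a one-sided Welch t-test
      p      = t.test(treatment, control, alternative = "greater")$p.value)
  }
  res <- t(replicate(reps, one_experiment()))
  c(ObsPHat   = mean(res[, "PHat"]),
    ObsCliffd = mean(res[, "Cliffd"]),
    ObsStdES  = mean(res[, "StdMD"]),
    # Proportion (not percentage) of experiments significant at alpha = 0.05
    Power     = mean(res[, "p"] < 0.05))
}

summary_medium <- simulate_two_groups(N = 10, reps = 50, diff = 0.5)
print(summary_medium)
```

Because Cliff's d is computed here as 2 * PHat - 1 within each experiment, the averaged values satisfy the same relation.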

Usage

calculate2GBias(
  mean = 0,
  sd = 1,
  N,
  reps,
  diff = c(0.2, 0.5, 0.8),
  Expected.StdMD = c(0.2, 0.5, 0.8),
  Expected.PHat = c(0.556, 0.638, 0.714),
  type = "n",
  seed = 223,
  StdAdj = 0
)

Arguments

mean

This is the mean value of the control and treatment group(s) used in the simulations of each experiment for simulations of a specified sample size and mean difference (default 0).

sd

This is the standard deviation value of the control group(s) and treatment group(s) used in the simulations of each experiment of each family for simulations of a specified sample size (default 1).

N

This specifies the sample size per group that will be used in each set of simulations.

reps

The number of experiments simulated for each mean difference.

diff

This specifies the mean difference between the control and treatment that will be used in each set of simulations. It must always have three values representing small, medium and large differences (default c(0.2, 0.5, 0.8)).

Expected.StdMD

This defines the theoretical value of the average StdMD obtained from the simulations for each mean difference (default c(0.2, 0.5, 0.8)).

Expected.PHat

This defines the expected population value of the average PHat obtained from the simulations for each mean difference (default c(0.556, 0.638, 0.714)).

type

This specifies the distribution of the data samples that will be simulated. Options are "n" for Normal, "l" for Log-normal, "g" for Gamma, "lap" for Laplace (default "n").

seed

A seed for the simulations (default 223).

StdAdj

Used to introduce variance heterogeneity for Laplace and Normal samples (default 0).
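For Normal data, the values passed as Expected.StdMD and Expected.PHat can be derived from diff, sd and StdAdj. The sketch below is a hedged illustration, not package code: expected_effects is a hypothetical helper, and it assumes that StdAdj is added to the treatment group's standard deviation. That assumption is not stated on this page, but it reproduces both the default values in Usage and the expected values used in the first (commented) example under Examples.

```r
# Hypothetical helper; assumes Normal data and that the treatment group's
# standard deviation is sd + StdAdj (an assumption, not confirmed here).
expected_effects <- function(diff, sd = 1, StdAdj = 0) {
  sd_treat <- sd + StdAdj
  # Standardised mean difference using the root mean of the two variances
  std_md <- diff / sqrt((sd^2 + sd_treat^2) / 2)
  # P(treatment > control) for two independent Normal variables
  phat <- pnorm(diff / sqrt(sd^2 + sd_treat^2))
  data.frame(diff = diff, StdMD = round(std_md, 3), PHat = round(phat, 3))
}

equal_var <- expected_effects(c(0.2, 0.5, 0.8))                # defaults in Usage
het_var   <- expected_effects(c(0.2, 0.5, 0.8), StdAdj = 0.5)  # first example
print(equal_var)
print(het_var)
```

With StdAdj = 0 this gives StdMD values of 0.2, 0.5, 0.8 and PHat values of 0.556, 0.638, 0.714. With StdAdj = 0.5 it gives 0.157, 0.392, 0.628 and 0.544, 0.609, 0.671. The formulas do not apply to the Log-normal, Gamma or Laplace options.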

Value

Design. Specifies the type of experiment, the sample distribution (n,l,g,lap), and whether variance heterogeneity was added (het)

GrpSize. Specifies the size of each group in the simulated experiments.

Diff. The size of the difference between the control and treatment converted to an ordinal scale (Small, Medium, Large)

NPBias. The relative difference between the average of the observed values of either Cliff's d or centralised PHat and the population value.

StdMDBias. The relative difference between the average of the observed values of StdMD and the theoretical value.

NPMdMRE. The median of the absolute relative difference between the observed values of either Cliff's d or centralised PHat and the theoretical value for each experiment.

StdMDMdMRE. The median of the relative difference between the observed values of StdMD and the population value for each experiment.

ObsPHat. The average of the PHat values found in the set of simulations.

ObsCliffd. The average of the Cliff's d values found in the set of simulations.

ObsStdES. The average of StdMD values found in the set of simulations.

PHatPower. The percentage of the simulations, for a specific mean difference, for which the PHat estimate was significantly different from zero at the 0.05 alpha level based on one-sided tests.

CliffdPower. The percentage of the simulations, for a specific mean difference, for which the Cliff's d estimate was significantly different from zero at the 0.05 alpha level based on one-sided tests.

StdMDPower. The percentage of the simulations, for a specific mean difference, for which the StdMD estimate was significantly different from zero at the 0.05 alpha level based on one-sided tests.
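The bias and MdMRE columns described above can be read as the following two summary statistics. This is a minimal sketch with hypothetical helper names (relative_bias, mdmre), not the package's code. It takes "centralised PHat" to mean PHat - 0.5, which is consistent with the Medium row of the first example under Examples (ObsPHat 0.6128, expected PHat 0.609, NPBias 3.486239e-02).

```r
# Hypothetical helpers illustrating the definitions in the Value section.

# Relative difference between the average observed value and the expected value
relative_bias <- function(observed, expected) {
  (mean(observed) - expected) / expected
}

# Median magnitude of the per-experiment relative error
mdmre <- function(observed, expected) {
  median(abs((observed - expected) / expected))
}

# Check against the Medium row of the first example, using centralised PHat
np_bias_medium <- relative_bias(0.6128 - 0.5, 0.609 - 0.5)
print(np_bias_medium)

# Made-up per-experiment StdMD estimates, for illustration only
toy_std_md <- c(0.1, 0.3, 0.5)
print(mdmre(toy_std_md, expected = 0.2))
```

Both statistics are relative to the expected value, so they become unstable when the expected effect size is close to zero.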

Author(s)

Barbara Kitchenham and Lech Madeyski

Examples

# as.data.frame(calculate2GBias(mean=0,sd=1,diff=c(0.2,0.5,0.8),Expected.StdMD=c(0.157,0.392,0.628),
#  Expected.PHat=c(0.544,0.609,0.671), N=5,reps=50, type="n", seed=523, StdAdj =0.5 ))
# Results for reps=100 (due to NOTE "Examples with CPU (user + system) or elapsed time > 5s"):
#    Design GrpSize   Diff        NPBias  StdMDBias  NPMdMRE StdMDMdMRE ObsPHat ObsCliffd  ObsSt..
# 1 2G_n_het       5  Small -6.308085e-16 0.07088601 3.272727  3.2700082  0.5440    0.0880 0.168..
# 2 2G_n_het       5 Medium  3.486239e-02 0.09914637 1.385321  1.3502057  0.6128    0.2256 0.430..
# 3 2G_n_het       5  Large  2.222222e-02 0.10446123 0.754386  0.8626523  0.6748    0.3496 0.693..
as.data.frame(calculate2GBias(mean=0,sd=1,diff=c(0.283,0.707104,1.131374),
 Expected.StdMD=c(0.157,0.392,0.628),Expected.PHat=c(0.556,0.636,0.705),N=10, reps=20,
 type="lap",seed=1423,StdAdj=0.5 ))
 #Parameter reps changed due to NOTE "Examples with CPU (user + system) or elapsed time > 5s"
 #Results for reps=100:
#      Design GrpSize   Diff      NPBias    StdMDBias   NPMdMRE StdMDMdMRE ObsPHat ObsCliffd  Ob..
#1 2G_lap_het      10  Small -0.11071429 -0.080855612 1.8928571  2.1256888  0.5498    0.0996 0.1..
#2 2G_lap_het      10 Medium -0.07426471  0.003940804 0.6323529  0.8170856  0.6259    0.2518 0.3..
#3 2G_lap_het      10  Large -0.05756098  0.023696619 0.4146341  0.5447941  0.6932    0.3864 0.6..

reproducer documentation built on Oct. 18, 2023, 5:10 p.m.