est.mean.2g: Estimating Sample Means of Two Groups using Quantiles

est.mean.2g    R Documentation

Description

This function estimates the sample means of a two-group study that reports quantile summary measures together with the sample size (n) of each group. The quantile summaries of each group can fall into one of the following categories:

  • S_1: { minimum, median, maximum }

  • S_2: { first quartile, median, third quartile }

  • S_3: { minimum, first quartile, median, third quartile, maximum }

The est.mean.2g function uses a novel quantile-based distribution method for estimating the sample means of two groups, such as 'Treatment' and 'Control' (De Livera et al., 2024). The method is based on the following quantile-based distributions:

  • Generalized Lambda Distribution (GLD) for estimating sample means using 5-number summaries (S_3).

  • Skew Logistic Distribution (SLD) for estimating sample means using 3-number summaries (S_1 and S_2).
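The mapping from the supplied quantiles to a summary category and a distribution can be sketched as follows. This is only an illustration: classify_summary and its argument names are hypothetical and are not part of metaquant, and the package's own input checks may differ.

```r
# Hypothetical helper (not part of metaquant): infer the summary category
# of one group from which quantile arguments are non-NULL.
classify_summary <- function(minimum = NULL, q1 = NULL, med = NULL,
                             q3 = NULL, maximum = NULL) {
  has <- c(minimum = !is.null(minimum), q1 = !is.null(q1),
           med = !is.null(med), q3 = !is.null(q3),
           maximum = !is.null(maximum))
  if (all(has)) {
    "S_3"  # five-number summary, handled with the GLD
  } else if (all(has[c("minimum", "med", "maximum")]) &&
             !any(has[c("q1", "q3")])) {
    "S_1"  # minimum, median, maximum, handled with the SLD
  } else if (all(has[c("q1", "med", "q3")]) &&
             !any(has[c("minimum", "maximum")])) {
    "S_2"  # quartiles and median, handled with the SLD
  } else {
    stop("quantile summary does not match S_1, S_2 or S_3")
  }
}

classify_summary(minimum = 1, med = 5, maximum = 12)
classify_summary(q1 = 3, med = 5, q3 = 8)
classify_summary(1, 3, 5, 8, 12)
```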

Usage

est.mean.2g(
   min.g1 = NULL, 
   q1.g1 = NULL, 
   med.g1 = NULL, 
   q3.g1 = NULL, 
   max.g1 = NULL,
   min.g2 = NULL, 
   q1.g2 = NULL, 
   med.g2 = NULL, 
   q3.g2 = NULL, 
   max.g2 = NULL,
   n.g1, 
   n.g2, 
   opt = TRUE
)

Arguments

min.g1

numeric value representing the sample minimum of group 1.

q1.g1

numeric value representing the first quartile of group 1.

med.g1

numeric value representing the median of group 1.

q3.g1

numeric value representing the third quartile of group 1.

max.g1

numeric value representing the sample maximum of group 1.

min.g2

numeric value representing the sample minimum of group 2.

q1.g2

numeric value representing the first quartile of group 2.

med.g2

numeric value representing the median of group 2.

q3.g2

numeric value representing the third quartile of group 2.

max.g2

numeric value representing the sample maximum of group 2.

n.g1

numeric value specifying the sample size of group 1.

n.g2

numeric value specifying the sample size of group 2.

opt

logical value indicating whether to apply the optimization step in estimating the parameters of GLD or SLD. Default is TRUE.

Details

The est.mean.2g function implements the methods proposed by De Livera et al. (2024) for the two-group case, incorporating information shared across the two groups to improve the accuracy of the estimates.

The generalised lambda distribution (GLD) is a four-parameter family of distributions defined by its quantile function under the FKML parameterisation (Freimer et al., 1988). De Livera et al. propose that the GLD quantile function can be used to approximate a sample's distribution using 5-point summaries (S_3). The four parameters of the GLD quantile function are a location parameter (\lambda_1), an inverse scale parameter (\lambda_2>0), and two shape parameters (\lambda_3 and \lambda_4). The est.mean.2g function considers the case where the underlying distributions of the two groups have the same shape (i.e., common \lambda_3 and \lambda_4) and differ only in location and scale. Weights are used in the optimisation step that estimates \lambda_3 and \lambda_4, to put more emphasis on the group with the larger sample size.
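As a rough illustration of the FKML form, the sketch below writes the GLD quantile function in base R and obtains the mean both by numerical integration over (0, 1) and from the closed form that holds when \lambda_3 and \lambda_4 exceed -1. The function names gld_q and gld_mean and all parameter values are assumptions chosen for the example; they are not metaquant functions or fitted estimates.

```r
# FKML quantile function of the GLD (Freimer et al., 1988).
# lambda2 > 0 is the inverse scale; this sketch assumes lambda3 and
# lambda4 are non-zero (the limiting cases are not handled here).
gld_q <- function(u, lambda1, lambda2, lambda3, lambda4) {
  lambda1 + ((u^lambda3 - 1) / lambda3 -
               ((1 - u)^lambda4 - 1) / lambda4) / lambda2
}

# Mean as the integral of Q(u) over (0, 1); closed form valid for
# lambda3 > -1 and lambda4 > -1.
gld_mean <- function(lambda1, lambda2, lambda3, lambda4) {
  lambda1 + (1 / (1 + lambda4) - 1 / (1 + lambda3)) / lambda2
}

# Illustrative values only: two groups that share the shape parameters
# (lambda3, lambda4) and differ in location and inverse scale.
shape <- c(lambda3 = 0.2, lambda4 = 0.05)
g1 <- c(lambda1 = 50, lambda2 = 0.10)
g2 <- c(lambda1 = 55, lambda2 = 0.08)

closed_g1 <- gld_mean(g1[["lambda1"]], g1[["lambda2"]],
                      shape[["lambda3"]], shape[["lambda4"]])
closed_g2 <- gld_mean(g2[["lambda1"]], g2[["lambda2"]],
                      shape[["lambda3"]], shape[["lambda4"]])

# Numerical check of the group 1 mean
numeric_g1 <- stats::integrate(
  function(u) gld_q(u, g1[["lambda1"]], g1[["lambda2"]],
                    shape[["lambda3"]], shape[["lambda4"]]),
  lower = 0, upper = 1
)$value

c(closed_g1 = closed_g1, numeric_g1 = numeric_g1, closed_g2 = closed_g2)
```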

The quantile-based skew logistic distribution (SLD), introduced by Gilchrist (2000) and further modified by van Staden and King (2015), is used to approximate the sample's distribution using 3-point summaries (S_1 and S_2). The SLD quantile function is defined using three parameters: a location parameter (\lambda), a scale parameter (\eta), and a skewing parameter (\delta). In est.mean.2g, a common skewing parameter (\delta) is assumed for the two groups, so a pooled estimate of \delta is computed using weights based on the sample sizes.
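The SLD quantile function and the idea of pooling \delta can be sketched in base R as below. The exact weighting scheme is not spelled out in this help page, so pool_delta uses a plain sample-size-weighted average as an assumption; sld_q and pool_delta are hypothetical names, and the group-level \delta values are made up for illustration.

```r
# SLD quantile function (van Staden and King, 2015):
# Q(u) = lambda + eta * ((1 - delta) * log(u) - delta * log(1 - u)),
# with eta > 0 and 0 <= delta <= 1.
sld_q <- function(u, lambda, eta, delta) {
  lambda + eta * ((1 - delta) * log(u) - delta * log(1 - u))
}

# Assumed pooling rule (not confirmed by the package documentation):
# average the group-level skewing parameters with weights proportional
# to the sample sizes.
pool_delta <- function(delta_g1, delta_g2, n_g1, n_g2) {
  w1 <- n_g1 / (n_g1 + n_g2)
  w1 * delta_g1 + (1 - w1) * delta_g2
}

# Made-up group-level estimates of delta
delta_pooled <- pool_delta(0.62, 0.70, n_g1 = 1000, n_g2 = 1500)

# delta = 0.5 gives a symmetric (logistic) shape; delta > 0.5 skews right
sld_q(c(0.25, 0.5, 0.75), lambda = 50, eta = 5, delta = delta_pooled)
```

With this rule the larger group pulls the pooled value towards its own estimate, which matches the stated intent of weighting by sample size.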

Under each scenario, the parameters of the respective distribution are estimated by formulating and solving a series of simultaneous equations that relate the estimated quantiles to their population counterparts. The estimated mean is then obtained via integration of functions of the estimated quantile function.
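For a single group with an S_2 summary, the quantile-matching step can be sketched in closed form: setting Q(0.25), Q(0.5) and Q(0.75) of the SLD equal to the reported quartiles and median gives three equations in \lambda, \eta and \delta. The code below is a minimal one-group illustration under that assumption; it omits the pooling of \delta across groups and the optional optimisation step, so it is not the procedure est.mean.2g runs. The names sld_q and fit_sld_s2 and the input values are hypothetical.

```r
# SLD quantile function (van Staden and King, 2015)
sld_q <- function(u, lambda, eta, delta) {
  lambda + eta * ((1 - delta) * log(u) - delta * log(1 - u))
}

# Solve Q(0.25) = q1, Q(0.5) = med, Q(0.75) = q3 for one group:
#   q3 - q1           = eta * log(3)
#   q1 + q3 - 2 * med = eta * (1 - 2 * delta) * log(3 / 4)
#   med               = lambda - eta * log(2) * (1 - 2 * delta)
fit_sld_s2 <- function(q1, med, q3) {
  eta <- (q3 - q1) / log(3)
  one_minus_2delta <- (q1 + q3 - 2 * med) / (eta * log(0.75))
  delta <- (1 - one_minus_2delta) / 2
  lambda <- med + eta * log(2) * one_minus_2delta
  list(lambda = lambda, eta = eta, delta = delta)
}

# Made-up right-skewed summary
fit <- fit_sld_s2(q1 = 45, med = 54, q3 = 66)

# Mean by numerical integration of the fitted quantile function
mean_numeric <- stats::integrate(
  function(u) sld_q(u, fit$lambda, fit$eta, fit$delta),
  lower = 0, upper = 1
)$value

# Closed form of the same integral: lambda + eta * (2 * delta - 1)
mean_closed <- fit$lambda + fit$eta * (2 * fit$delta - 1)

c(mean_numeric = mean_numeric, mean_closed = mean_closed)
```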

Value

A list containing the estimated sample means for the two groups:

  • mean.g1: numeric value representing the estimated mean of group 1.

  • mean.g2: numeric value representing the estimated mean of group 2.

References

Alysha De Livera, Luke Prendergast, and Udara Kumaranathunga. A novel density-based approach for estimating unknown means, distribution visualisations, and meta-analyses of quantiles. Submitted for Review, 2024, pre-print available here: https://arxiv.org/abs/2411.10971

Marshall Freimer, Georgia Kollia, Govind S. Mudholkar, and C. Thomas Lin. A study of the generalized Tukey lambda family. Communications in Statistics - Theory and Methods, 17(10):3547–3567, 1988.

Warren Gilchrist. Statistical modelling with quantile functions. Chapman and Hall/CRC, 2000.

P. J. van Staden and R. A. R. King. The quantile-based skew logistic distribution. Statistics & Probability Letters, 96:109–116, 2015.

See Also

est.mean for estimating means from one-group quantile data.

Examples

#Generate 5-point summary data for two groups
set.seed(123)
n_t <- 1000
n_c <- 1500
x_t <- stats::rlnorm(n_t, 4, 0.3)
x_c <- 1.1*(stats::rlnorm(n_c, 4, 0.3))
q_t <- c(min(x_t), stats::quantile(x_t, probs = c(0.25, 0.5, 0.75)), max(x_t))
q_c <- c(min(x_c), stats::quantile(x_c, probs = c(0.25, 0.5, 0.75)), max(x_c))
obs_mean_t <- mean(x_t)
obs_mean_c <- mean(x_c)

#Estimate sample mean using s3 (5 number summary)
est_means_s3 <- est.mean.2g(q_t[1],q_t[2],q_t[3],q_t[4],q_t[5],
                            q_c[1],q_c[2],q_c[3],q_c[4],q_c[5],
                            n.g1 = n_t,
                            n.g2 = n_c)
est_means_s3

#Estimate sample mean using s1 (min, med, max)
est_means_s1 <- est.mean.2g(min.g1=q_t[1], med.g1=q_t[3], max.g1=q_t[5],
                            min.g2=q_c[1], med.g2=q_c[3], max.g2=q_c[5],
                            n.g1 = n_t,
                            n.g2 = n_c)
est_means_s1

#Estimate sample mean using s2 (q1, med, q3)
est_means_s2 <- est.mean.2g(q1.g1=q_t[2], med.g1=q_t[3], q3.g1=q_t[4],
                            q1.g2=q_c[2], med.g2=q_c[3], q3.g2=q_c[4],
                            n.g1 = n_t,
                            n.g2 = n_c)
est_means_s2


metaquant documentation built on April 3, 2025, 10:34 p.m.