Estimate Sample Sizes based on a cgPairedDifferenceFit object


Estimate the sample size that would be required to detect a specified difference in a paired difference data study. The estimate is based on the variability that was observed in a previous paired difference data study. A cgPairedDifferenceSampleSizeTable class object is created.


## S4 method for signature 'cgPairedDifferenceFit'
samplesizeTable(fit, direction, mmdvec,
 power = 0.80, alpha = 0.05, nmax = 1000, display = "print", ...)



A cgPairedDifferenceFit object from a previous paired difference data study.


A character value indicating whether the sample size should be estimated to detect an "increase" or a "decrease". This only affects the sample size estimates if the previous study in fit was analyzed on the log scale, in which case the differences in mmdvec are relative differences instead of absolute differences. For detecting relative changes, the sample size required to detect a relative increase of 25% is not the same as the sample size required to detect a relative decrease of 25%, for example. But for detecting absolute changes, the sample size required to detect an absolute increase of 25 is the same as the sample size required to detect an absolute decrease of 25.
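The asymmetry for relative changes can be seen by moving to the log scale, where a 25% increase and a 25% decrease correspond to differences of different magnitude. The sketch below uses base R's power.t.test rather than the package's own computation, and sd_logdiff (the standard deviation of the log-scale paired differences) is an assumed illustrative value, not one taken from any fitted object.

```r
# Illustrative only: sd_logdiff is an assumed value, not from a cg fit
sd_logdiff <- 0.30

# A relative change of 25% maps to these log-scale differences
delta_increase <- log(1 + 0.25)        # log(1.25)
delta_decrease <- abs(log(1 - 0.25))   # |log(0.75)|, larger in magnitude

# Number of pairs from stats::power.t.test for a paired design
n_for <- function(delta) {
  res <- power.t.test(delta = delta, sd = sd_logdiff,
                      sig.level = 0.05, power = 0.80, type = "paired")
  ceiling(res$n)
}

n_increase <- n_for(delta_increase)
n_decrease <- n_for(delta_decrease)

# The decrease is the larger log-scale shift, so it needs fewer pairs
c(increase = n_increase, decrease = n_decrease)
```

On the original (absolute) scale the two directions give the same difference in magnitude, which is why direction has no effect there.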


A numeric vector specifying the minimum meaningful differences to be detected in the future study. If the previous study in fit was analyzed on the log scale, then the values in mmdvec are assumed to be relative percentage increases or decreases, depending on the value of direction. If the previous study in fit was not analyzed on the log scale, then the values in mmdvec are assumed to be absolute increases or decreases, depending on the value of direction. Each value in mmdvec needs to be positive.


The power for the future study, set by default to be 0.80. This is equivalent to 1 - β, where β is the probability of committing a Type II error: accepting the null hypothesis of no difference when a difference truly exists.


The significance level or alpha for the future study, set by default as 0.05.


The maximum number of subjects per group. If more subjects are estimated to be required, then the exact number required is not reported, only the fact that more than the maximum number would be required. This is in place to prevent long and likely unnecessary calculations.


One of three valid values:


The default value. It calls a print method for the created
cgPairedDifferenceSampleSizeTable object, which produces formatted text output of the table(s).


Suppresses any printing. Useful, for example, when just assignment of the resulting object is desired.


Calls the default showDefault method, which will just print out the cgPairedDifferenceSampleSizeTable components.


Additional arguments. Only one is currently valid:


Other than the NULL value, only the "df" value can be specified. The "df" value provides a degrees of freedom correction for using variance estimates based on different degrees of freedom. See the details below.


Here, the estimated sample size actually refers to the number of experimental units. Hence the number of observations will always be twice the number of experimental units, due to the paired structure.

This sample size method only works for the classical least squares fitted model, since there is no analogous decomposition of total variance into between-experimental unit and within-experimental unit variance components. Sample sizes are estimated for detecting a minimum difference with the classical least squares t-test / F-test.

The correction = "df" argument specifies a method that Fleiss (1986, pages 129-130) attributes to Cochran and Cox (1957) and Fisher. The correction decreases the relative efficiency that is calculated from accounting for correlated paired observations, relative to the unpaired two group design. The adjustment accounts for the different degrees of freedom used for the variance components in the paired design (between-experimental unit, within-experimental unit, total variability).

Since the correction reduces the relative efficiency, the noncentrality parameter is also reduced. The correction is a multiplicative factor that is bounded below by 0.833 and approaches 1 as the number of experimental units increases from the minimum of n=2. The reduction in the noncentrality parameter increases the computed sample size.
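The behavior of such a factor can be sketched in base R. The form below is an assumption, not taken from the package source: it is a ratio of Fisher-style (df + 1)/(df + 3) terms, using n - 1 degrees of freedom for the paired design and 2(n - 1) for the unpaired two group design. It is shown because it reproduces the stated lower bound, evaluating to 10/12 (about 0.833) at n=2, and tends to 1 as n grows.

```r
# Assumed Fisher-style form of the correction factor; not confirmed
# against the cg source code.
# df_paired   = n - 1        (paired design with n experimental units)
# df_unpaired = 2 * (n - 1)  (two independent groups of size n)
df_correction <- function(n) {
  df_paired <- n - 1
  df_unpaired <- 2 * (n - 1)
  ((df_paired + 1) * (df_unpaired + 3)) /
    ((df_paired + 3) * (df_unpaired + 1))
}

# Factor for a range of experimental unit counts
n <- c(2, 3, 5, 10, 50, 1000)
round(df_correction(n), 3)
```

Because the factor is below 1 for every finite n, multiplying the relative efficiency by it can only lower the noncentrality parameter, and so can only raise the computed sample size.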


Creates an object of class cgPairedDifferenceSampleSizeTable, with the following slots:


A matrix with the estimated experimental unit sample sizes based on the classical model variance estimates. The matrix has 3 columns and one row for each element of the mmdvec vector. The first column specifies the minimum meaningful difference ("mmd"). The second column gives the number of experimental units ("n") required, possibly truncated at nmax. The third column gives the total number of observations ("N"), also possibly truncated at nmax. Since this is for the paired groups design, N = n * 2 will always hold.
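The layout of such a table can be sketched with base R alone. The helper sketch_table below is hypothetical and not part of the cg package; it uses stats::power.t.test on the absolute scale instead of the package's own calculation, takes an assumed standard deviation of the paired differences (sd_diff), and applies a simple cap at nmax instead of the package's way of reporting truncation.

```r
# Hypothetical helper, not part of cg: builds an mmd / n / N table.
# sd_diff (SD of the paired differences) is an assumed input.
sketch_table <- function(mmdvec, sd_diff, power = 0.80, alpha = 0.05,
                         nmax = 1000) {
  n <- vapply(mmdvec, function(mmd) {
    res <- power.t.test(delta = mmd, sd = sd_diff, sig.level = alpha,
                        power = power, type = "paired")
    ceiling(res$n)            # round up to whole experimental units
  }, numeric(1))
  n <- pmin(n, nmax)          # simple cap at nmax
  cbind(mmd = mmdvec, n = n, N = 2 * n)  # two observations per unit
}

tab <- sketch_table(mmdvec = c(5, 10, 15, 20), sd_diff = 7)
tab
```

Each row corresponds to one element of mmdvec, and larger minimum meaningful differences never require more experimental units.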


A list of properties mostly carried as-is from the data argument object of class cgPairedDifferenceData, with the following additional members:


A list with 1 member, ols, containing the estimated spread (sigma, standard deviation) values from the classical model of fit. This list component is a vector of length 3, providing the within-experimental unit, between-experimental unit, and total variability estimates.


A character describing the study or purpose of the sample size analysis. Taken from the settings$analysisname of the fit object.


A saved copy of the direction argument.


A saved copy of the alpha argument.


A saved copy of the power argument.


A saved copy of the nmax argument.


Contact for bug reports, questions, concerns, and comments.


Bill Pikounis, John Oleynick, and Eva Ye


Fleiss, J. L. (1986). The Design and Analysis of Clinical Experiments, pages 129 - 130. New York: Wiley.

Cochran, W. G. and Cox, G. M. (1957). Experimental Designs, second edition. New York: Wiley.


library(cg)

## Load the data, prepare it on the log scale, and fit the model
data(anorexiaFT)
anorexiaFT.data <- prepareCGPairedDifferenceData(anorexiaFT, format="groupcolumns",
                                                 analysisname="Anorexia FT",
                                                 logscale=TRUE)
anorexiaFT.fit <- fit(anorexiaFT.data)

## Recall the interest is in increased weight for the anorexia FT
## (family treatment) group of patients
samplesizeTable(anorexiaFT.fit, direction="increasing",
                mmdvec=c(5, 10, 15, 20))

## and with the adjustment on the noncentrality parameter
samplesizeTable(anorexiaFT.fit, direction="increasing",
                mmdvec=c(5, 10, 15, 20), correction="df")
