optimal_design: Optimal design for growth reference centile studies

optimal_design    R Documentation

Description

Two functions for estimating optimal sample size and sample composition when constructing growth reference centiles.

Usage

optimal_design(z = -2, lambda = NA, N = NA, SEz = NA, age = 10)

n_agegp(
  z = -2,
  lambda = NA,
  N = NA,
  SEz = NA,
  minage = 0,
  maxage = 20,
  n_groups = 20
)

Arguments

z

z-score on which to base the design, with default -2 which equates to the 2nd centile. If NA, optimal z is calculated from lambda.

lambda

power of age that defines the sample composition. The default NA means calculate optimal lambda from z.

N

total sample size per sex. The default NA means calculate from z or lambda, and SEz if provided.

SEz

target z-score standard error. The default NA means calculate from z or lambda, and N if provided.

age

age at which to calculate SEz. The default 10 returns the mean SEz; if z or lambda is optimal, SEz is independent of age.

minage

youngest age (default 0).

maxage

oldest age (default 20).

n_groups

number of age groups (default 20).

Details

Studies to construct growth reference centiles using GAMLSS need to be of optimal size. Cole (SMMR, 2020) has shown that the sample composition, i.e. the age distribution of the measurements, needs to be optimised as well as the sample size. Sample composition is defined in terms of the age power lambda which determines the degree of infant oversampling.
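To make the role of lambda concrete, the following base R sketch shows how an age power can skew a sample towards infancy. It assumes, purely for illustration, that measurements are spread uniformly on the age^lambda scale; the helper `age_group_shares()` is hypothetical and is not part of sitar, and the package's own calculation in n_agegp may differ. Only the argument names minage, maxage and n_groups are borrowed from n_agegp.

```r
# Illustrative sketch only: assumes sampling is uniform on the age^lambda
# scale. age_group_shares() is a hypothetical helper, not a sitar function.
age_group_shares <- function(lambda, minage = 0, maxage = 20, n_groups = 20) {
  stopifnot(lambda > 0, minage >= 0, maxage > minage)
  # boundaries of the equal-width age groups
  breaks <- seq(minage, maxage, length.out = n_groups + 1)
  # each group's share is its increment on the age^lambda scale
  shares <- diff(breaks^lambda)
  shares / sum(shares)
}

uniform      <- age_group_shares(lambda = 1)    # every group gets the same share
infant_heavy <- age_group_shares(lambda = 0.5)  # youngest group gets the largest share
round(infant_heavy, 3)
```

Under this assumption lambda = 1 gives a uniform age distribution, and values below 1 put progressively more of the sample into the youngest groups.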

There are two criteria that determine the optimal sample size and sample composition: the centile of interest (as z-score z) and the required level of precision for that centile (as the z-score standard error SEz).
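The link between the two criteria can be sketched in base R. The example below converts a z-score to its centile and turns a z-score standard error into an approximate 95% interval on the centile scale. It assumes a normal approximation (z plus or minus about 1.96 SEz, back-transformed with pnorm) and expresses centiles as percentages; `centile_precision()` is a hypothetical helper, and the exact multiplier and scaling that optimal_design uses for p, plo and phi are not confirmed here.

```r
# Hypothetical helper, not part of sitar: assumes a normal approximation,
# with the interval z +/- qnorm(0.975) * SEz back-transformed to centiles.
centile_precision <- function(z, SEz) {
  crit <- qnorm(0.975)  # two-sided 95% critical value, about 1.96
  data.frame(
    z   = z,
    SEz = SEz,
    p   = 100 * pnorm(z),               # centile corresponding to z
    plo = 100 * pnorm(z - crit * SEz),  # lower limit on the centile scale
    phi = 100 * pnorm(z + crit * SEz)   # upper limit on the centile scale
  )
}

res <- centile_precision(z = -2, SEz = 0.1)
res
```

A smaller SEz narrows the interval around the centile of interest, which is why the target precision drives the required sample size.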

Value

For optimal_design, a tibble with columns:

z

as above.

lambda

as above.

N

as above.

SEz

as above.

age

as above.

p

the centile corresponding to z.

plo

lower 95% confidence limit for p.

phi

upper 95% confidence limit for p.

For n_agegp, a tibble giving the numbers of measurements to be collected per equal width age group, with columns:

n_varying

numbers for equal width age groups.

age

mean ages for equal width age groups.

n

number for each unequal width age group (only for longitudinal studies).

age_varying

target ages for unequal width age groups (only for longitudinal studies).

Author(s)

Tim Cole <tim.cole@ucl.ac.uk>

See Also

gamlss to fit the centiles with the BCCG, BCT or BCPE family.

Examples

## estimate optimal sample composition lambda and precision SEz for 9 centiles
## spaced 2/3 of a z-score apart, based on a sample of 10,000 children

optimal_design(z = -4:4*2/3, N = 10000)
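The expression `-4:4*2/3` in the call above relies on R's operator precedence, which is easy to misread. The base R check below shows that it yields nine z-scores spaced 2/3 apart, and converts them to centiles with pnorm; the variable names are illustrative only.

```r
# The colon operator binds more tightly than * and /, so this is (-4:4) * 2 / 3
z <- -4:4 * 2 / 3

# corresponding centiles as percentages, under a standard normal distribution
centile <- 100 * pnorm(z)

data.frame(z = round(z, 2), centile = round(centile, 1))
```

These z-scores correspond roughly to the 0.4th, 2nd, 9th, 25th, 50th, 75th, 91st, 98th and 99.6th centiles.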

## calculate age group sizes optimised for centiles from the 50th to the 99.6th
## (or equivalently from the 50th to the 0.4th)
## with a sample of 10,000 children from 0 to 20 years in one-year groups

purrr::map_dfc(0:4*2/3, ~{
  n_agegp(z = .x, N = 10000) %>%
    dplyr::select(!!z2cent(.x) := n_varying)
}) %>%
  dplyr::bind_cols(tibble::tibble(age = paste(0:19, 1:20, sep='-')), .)

sitar documentation built on July 9, 2023, 6:51 p.m.