Home

/

GitHub

/

BMC: A case study of smoothing Bone Mineral Content longitudinal...

BMC: A case study of smoothing Bone Mineral Content longitudinal...
In ZheyuanLi/gps.mgcv: General P-Splines for Package 'mgcv'

BMC	R Documentation

A case study of smoothing Bone Mineral Content longitudinal data

Description

The synthetic dataset contains BMC measured at growing ages for white ethnic subjects, with 932 data from 112 males and 1113 data from 127 females. An additive mixed model based on general P-splines is ideal for analyzing this dataset, able to produce smooth estimations of population and subject BMC trajectories.

Usage

synBMC(gender)

FitBMC(gender, p, gps = TRUE, subset = NULL)

cvBMC(gender, gps = TRUE, n.sim = 100, nc = 1)

Arguments

`gender`	the gender group ("male" or "female") to study.
`p`	number of B-splines to represent `f(x)` and `f_j(x)` (see Details).
`gps`	TRUE to construct `f(x)` and `f_j(x)` as general P-splines; FALSE to construct them as standard P-splines.
`subset`	an optional vector of integers that specifies a data subset to be used for model fitting (for example, it is used when `cvBMC` calls `FitBMC` on a training set during cross-validation).
`n.sim`	number of simulations in cross-validation.
`nc`	number of CPU cores to use for parallel simulations.

Details

Data Description

For either gender group, the synthetic BMC dataset is a dataframe with four variables:

id, subject's ID ("M001" to "M112" for males and "F001" to "F127" for females);
age, subject's age when a measurement was taken;
bmc, BMC measurements (in grams);
log.bmc, log-transformed BMC measurements.

It is a synthetic dataset that highly resembles the real one from The Saskatchewan Pediatric Bone Mineral Accrual Study (PBMAS).

Additive Mixed Model

Denote age, BMC and log(BMC) by x, y and z, respectively. The model for either gender group is:

z_{j,x} = f(x) + f_j(x) + e_{j,x},

where

f(x) (a fixed-effect spline) is the population log(BMC) trajectory;
f_j(x) (a random-effect spline) is subject j's log(BMC) deviation from f(x);
e_{j, x} is an i.i.d. Gaussian measurement error.

FitBMC sets up f(x) and f_j(x) as standard or general P-splines then fits this model with gamm.

How Synthetic Data are Simulated

Let n be the number of subjects in either gender group. Let vectors \bm{x_j}, \bm{y_j} and \bm{z_j} be the data for subject j. Let \delta = \min\{\Delta\bm{x_1}, \Delta\bm{x_2}, \ldots, \Delta\bm{x_n}\} be the minimum step size in age values. The synthetic data are simulated as follows.

Fit the additive mixed model above to the real data (\bm{z_j}, \bm{x_j}, j)_{1}^{n};
Randomize the association between \bm{x_j} and j to obtain (\bm{x_j}, r_j)_{1}^{n}, where r_1, r_2, \ldots, r_{n} is a random shuffle of 1, 2, \ldots, n;
Add a small amount of noise ("jitter") to obtain \bm{\tilde{x}_j} = \bm{x_j} + \bm{\delta_j}, where \bm{\delta_j} is vector of samples from uniform distribution on [-\delta/5, -\delta/10] \cup [\delta/10, \delta/5];
Predict \bm{\hat{z}_j} on (\bm{\tilde{x}_j}, r_j)_{1}^{n} using the fitted model and draw \bm{\tilde{z}_j} \sim \textrm{N}(\bm{\hat{z}_j}, \bm{I}\hat{\sigma}_e^2), where \hat{\sigma}_e^2 is the estimated variance of measurement error and \bm{I} is an identity matrix;
Back transform \bm{\tilde{y}_j} = \exp(\bm{\tilde{z}_j}) and compose the synthetic dataset as (\bm{\tilde{y}_j}, \bm{\tilde{x}_j}, r_j)_{1}^{n}.

The synthetic dataset preserves two aspects of the real dataset.

Distribution of age values is almost unchanged;
The number of data per subject is unchanged.

The fitted model has an R-squared as high as 99.97% for males and 99.92% for females. Thus, statistical modeling on two datasets does not end up with much difference.

Value

synBMC returns a data frame of four variables.

FitBMC returns a fitted "gamm" model, which is a large list with an "lme" component and a (much more useful) "gam" component.

cvBMC returns a matrix of 3 columns. Column 1 gives the mean cross-validation error (averaged over n.sim simulations) for p = 5, 6, \ldots, 20. Columns 2 and 3 give two-standard-error lower and upper bounds for this mean.

Note

We do not hold authentic BMC data. The synthetic version was generated by Ahmed Elhakeem (using our idea and code, though), who holds a copy of the authentic version for his research (see References) collaborated with PBMAS's program director, Professor Adam DG Baxter-Jones. The synthetic dataset is also shared by Elhakeem at https://raw.githubusercontent.com/aelhak/nltmr/main/synth_cohort.csv. Researches interested in the authentic BMC data should contact Professor Baxter-Jones (baxter.jones@usask.ca) in the first place.

Author(s)

Zheyuan Li zheyuan.li@bath.edu

References

Bailey, D. A. (1997). The Saskatchewan Pediatric Bone Mineral Accrual Study: Bone mineral acquisition during the growing years. International Journal of Sports Medicine, 18:s191-s194.

Baxter-Jones, A. D., Faulkner, R. A., Forwood, M. R., Mirwald, R. L., and Bailey, D. A. (2011). Bone mineral accrual from 8 to 30 years of age: An estimation of peak bone mass. Journal of Bone and Mineral Research, 26(8):1729-1739.

Elhakeem, A., Hughes, R., Tilling, K., Cousminer, D., Jackowski, S., Cole, T., Kwong, A., Li, Z., Grant, S., Baxter-Jones, A., Zemel, B., and Lawlor, D. (2022). Using linear and natural cubic splines, sitar, and latent trajectory models to characterise nonlinear longitudinal growth trajectories in cohort studies. BMC Medical Research Methodology, 22(68).

Li, Z. and Cao, J. (2022). General P-Splines for Non-Uniform B-Splines.

Examples

require(mgcv)
require(gps.mgcv)

## get synthetic BMC data
MBMC <- synBMC(gender = "male")
FBMC <- synBMC(gender = "female")

## number of subjects
NM <- nlevels(MBMC$id)
NF <- nlevels(FBMC$id)

## split data by subject
MBMC.id <- split(MBMC[-1], MBMC$id)
FBMC.id <- split(FBMC[-1], FBMC$id)

## visualize data
op <- par(mfrow = c(2, 3), mar = c(2, 2, 1.5, 0.5))
## male: bmc ~ age
with(MBMC, plot(age, bmc, type = "n", ann = FALSE))
for (i in 1:NM) with(MBMC.id[[i]], lines(age, bmc, col = i))
title("male: bmc ~ age")
## male: log(bmc) ~ age
with(MBMC, plot(age, log.bmc, type = "n", ann = FALSE))
for (i in 1:NM) with(MBMC.id[[i]], lines(age, log.bmc, col = i))
title("male: log(bmc) ~ age")
## male: hist(age)
hist(MBMC$age, breaks = 15, ann = FALSE)
title("male: age distribution")
## female: bmc ~ age
with(FBMC, plot(age, bmc, type = "n", ann = FALSE))
for (i in 1:NF) with(FBMC.id[[i]], lines(age, bmc, col = i))
title("female: bmc ~ age")
## female: log(bmc) ~ age
with(FBMC, plot(age, log.bmc, type = "n", ann = FALSE))
for (i in 1:NF) with(FBMC.id[[i]], lines(age, log.bmc, col = i))
title("female: log(bmc) ~ age")
## female: hist(age)
hist(FBMC$age, breaks = 15, ann = FALSE)
title("female: age distribution")
par(op)

## Not run: 

## fit models for male group
gps.10 <- FitBMC(gender = "male", p = 10, gps = TRUE)$gam
sps.10 <- FitBMC(gender = "male", p = 10, gps = FALSE)$gam

## newdata
age <- seq.int(min(MBMC$age), max(MBMC$age), length.out = 100)
newdat1 <- data.frame(age = rep.int(age, NM),
                      id = rep(levels(MBMC$id), each = length(age)))
newdat2 <- data.frame(age = age, id = levels(MBMC$id)[1])

## predict all subject BMC trajectories
sub.gps.10 <- matrix(exp(predict(gps.10, newdat1)), nrow = length(age))
sub.sps.10 <- matrix(exp(predict(sps.10, newdat1)), nrow = length(age))

## predict population BMC trajectory
pop.gps.10 <- predict(gps.10, newdat2, type = "terms", terms = "s(age)")
pop.sps.10 <- predict(sps.10, newdat2, type = "terms", terms = "s(age)")
pop.gps.10 <- exp(as.numeric(pop.gps.10) + attr(pop.gps.10, "constant"))
pop.sps.10 <- exp(as.numeric(pop.sps.10) + attr(pop.sps.10, "constant"))

## extract knots
knots.gps.10 <- gps.10$smooth[[1]]$knots
knots.sps.10 <- sps.10$smooth[[1]]$knots

## plot data and fitted trajectories
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5))
## a common ylim for the next 2 plots
ylim <- range(sub.gps.10[, c(41, 66)], sub.sps.10[, c(41, 66)])
## standard P-spline, population trajectory and subject 41, 66
plot(age, pop.sps.10, type = "l", lwd = 2, ylim = ylim, ann = FALSE)
with(MBMC.id[[41]], points(age, bmc, pch = 19, cex = 1.5, col = 8))
with(MBMC.id[[66]], points(age, bmc, pch = 19, cex = 1.5, col = 8))
abline(v = knots.sps.10, lty = 2, col = 8)
lines(age, sub.sps.10[, 41], lwd = 2, lty = 2)
lines(age, sub.sps.10[, 66], lwd = 2, lty = 2)
title("standard P-spline")
## general P-spline, population trajectory and subject 41, 66
plot(age, pop.gps.10, type = "l", lwd = 2, ylim = ylim, ann = FALSE)
with(MBMC.id[[41]], points(age, bmc, pch = 19, cex = 1.5, col = 8))
with(MBMC.id[[66]], points(age, bmc, pch = 19, cex = 1.5, col = 8))
abline(v = knots.gps.10, lty = 2, col = 8)
lines(age, sub.gps.10[, 41], lwd = 2, lty = 2)
lines(age, sub.gps.10[, 66], lwd = 2, lty = 2)
title("general P-spline")
par(op)

## End(Not run)

ZheyuanLi/gps.mgcv documentation built on Sept. 19, 2024, 9:11 p.m.

ZheyuanLi/gps.mgcv index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

ZheyuanLi/gps.mgcv
General P-Splines for Package 'mgcv'

BMC: A case study of smoothing Bone Mineral Content longitudinal...
In ZheyuanLi/gps.mgcv: General P-Splines for Package 'mgcv'

A case study of smoothing Bone Mineral Content longitudinal data

Description

Usage

Arguments

Details

Data Description

Additive Mixed Model

How Synthetic Data are Simulated

Value

Note

Author(s)

References

Examples

Related to BMC in ZheyuanLi/gps.mgcv...

R Package Documentation

Browse R Packages

We want your feedback!

ZheyuanLi/gps.mgcv General P-Splines for Package 'mgcv'

BMC: A case study of smoothing Bone Mineral Content longitudinal... In ZheyuanLi/gps.mgcv: General P-Splines for Package 'mgcv'

A case study of smoothing Bone Mineral Content longitudinal data

Description

Usage

Arguments

Details

Data Description

Additive Mixed Model

How Synthetic Data are Simulated

Value

Note

Author(s)

References

Examples

Related to BMC in ZheyuanLi/gps.mgcv...

R Package Documentation

Browse R Packages

We want your feedback!

ZheyuanLi/gps.mgcv
General P-Splines for Package 'mgcv'

BMC: A case study of smoothing Bone Mineral Content longitudinal...
In ZheyuanLi/gps.mgcv: General P-Splines for Package 'mgcv'