knitr::opts_chunk$set(echo = TRUE, fig.retina = 2) # options(width = 80) suppressPackageStartupMessages(library(dplyr)) suppressPackageStartupMessages(library(ggplot2))
The $D$-score is a one-number summary measure of early child development. The $D$-score has a fixed unit. We may use the $D$-score to answer questions on the individual, group and population level. For more background, see the introductory booklet D-score: Turning milestones into measurement.
This vignette shows how to estimate the $D$-score and the $D$-score age-adjusted Z-score (DAZ) from child data on developmental milestones. The vignette covers some typical actions needed when estimating the $D$-score and DAZ:
dscorepackage covers your measurement instrument;
dscore package covers a subset of all possible assessment instruments. Moreover, it may have a restricted age range for a given instrument. Your first tasks are
dscorepackage can convert your measurements into $D$-scores;
keythat best suits your objectives.
The inventory by @fernald2017 identified 147 instruments for assessing the development of children aged 0-8 years. Well-known examples include the Bayley Scales for Infant and Toddler Development and the Ages & Stages Questionnaires. The $D$-score is defined by and calculated from, subsets of milestones from such instruments.
Assessment instruments connect to the $D$-score through a measurement model. We use the term key to refer to a particular instance of a measurement model. The
dscore package currently supports three keys:
dutch, a model developed for the Dutch development instrument;
gcdg, a model covering 14 instruments using direct measurements;
gsed, a model covering 20 instruments using a mix of direct and caregiver-reported measurements;
Although the differences between $D$-scores calculated under different keys are generally small, the results are not identical. Hence, we may compare only $D$-scores that are calculated under the same key. The table given below displays the number of items per instrument under each key. If the entry is blank, the key does not cover the instrument.
| Code | Instrument | Items | dutch | gcdg | gsed |
| ------ | ------------------------------------------------------- | -----:|------:|------:|------:|
aqi | Ages & Stages Questionnaires-3 | 230 | | 29 | 17 |
bar | Barrera Moncada | 22 | | 15 | 13 |
bat | Battelle Development Inventory and Screener-2 | 137 | | | |
by1 | Bayley Scales for Infant and Toddler Development-1 | 156 | | 85 | 76 |
by2 | Bayley Scales for Infant and Toddler Development-2 | 121 | | 16 | 16 |
by3 | Bayley Scales for Infant and Toddler Development-3 | 320 | | 105 | 67 |
cro | Caregiver Reported Early Development Instrument (CREDI) | 149 | | | 62 |
ddi | Dutch Development Instrument (Van Wiechenschema) | 77 | 76 | 65 | 64 |
den | Denver-2 | 111 | | 67 | 50 |
dmc | Developmental Milestones Checklist | 66 | | | 43 |
gri | Griffiths Mental Development Scales | 312 | | 104 | 93 |
iyo | Infant and Young Child Development (IYCD) | 90 | | | 55 |
kdi | Kilifi Developmental Inventory | 69 | | | |
mac | MacArthur Communicative Development Inventory | 6 | | 3 | 3 |
mds | WHO Motor Development Milestones | 6 | | | 1 |
mdt | Malawi Developmental Assessment Tool (MDAT) | 136 | | | 126 |
peg | Pegboard | 2 | | 1 | 1 |
pri | Project on Child Development Indicators (PRIDI) | 63 | | | |
sbi | Stanford Binet Intelligence Scales-4/5 | 33 | | 6 | 1 |
sgr | Griffiths for South Africa | 58 | | 19 | 19 |
tep | Test de Desarrollo Psicomotor (TEPSI) | 61 | | 33 | 31 |
vin | Vineland Social Maturity Scale | 50 | | 17 | 17 |
| | | 2275 | 76 | 565 | 807 |
| | Extensions | | | | |
rap | Global Scale of Early Development - RAPID SF | 139 | | | |
mul | Mullen Scales of Early Learning | 232 | | | 139 |
hyp | Demonstration items | 5 | | | |
| | | 2651 | 76 | 565 | 892 |
Unfortunately, it is not possible to calculate the $D$-score if your instrument is not on the list, or if all of its entries under the key headings are blank. You may wish to file an extension request to incorporate your instrument in a future version of the
dscore package. It remains an empirical question, however, whether the requested extension is possible.
For some instruments, e.g., for
cro only one choice is possible (
gri, we may choose between
"gsed". Your choice may depend on the goal of your analysis. If you want to compare to other $D$-scores calculated under key
"gcdg", or reproduce an analysis made under that key, then pick
"gcdg". If that is not the case, then
"gsed" is probably a better choice because of its broader generalizability. The default key is
The extensions for Mullen were added to the gsed key. The extension was made based on two datasets, the Provide dataset [@provide] and the Bambam dataset [@bambam]. The Mullen items were matched to existing items and two well fitting items were selected as anchors in a new model on the combined Provide and Bambam data.
library(dscore) ib <- builtin_itembank %>% filter(key == "gsed") %>% mutate(a = get_age_equivalent(items = item, pct = 50, itembank = builtin_itembank)$a * 12) %>% select(a, instrument, label) %>% na.omit() ggplot(ib, aes(x = a, y = instrument, group = instrument)) + scale_y_discrete(limits = rev(unique(ib$instrument)), name = "") + scale_x_continuous(limits = c(0, 60), breaks = seq(0, 60, 12), name = "Age (months)") + geom_point(pch = 3, size = 1, colour = "blue") + theme_light() + theme(axis.text.y = element_text(hjust = 0, family = "mono"))
The designs of the original cohorts determine the age coverage for each instrument. The figure above indicates the age range currently supported by the
"gsed" key. Some instruments contain many items for the first two years (e.g.,
dmc), whereas others cover primarily upper ages (e.g.,
mul). If you find that the ages in your sample deviate from those in the figure, you may wish to file an extension request to incorporate new ages in a future version of the
dscore() function accepts item names that follow the GSED 9-position schema. A name with a length of nine characters identifies every milestone. The following table shows the construction of names.
Position | Description | Example
1-3 | instrument |
4-5 | developmental domain |
6 | administration mode |
7-9 | item number |
by3cgd018 refers to the 18th item in the cognitive scale of the Bayley-III. The label of the item can be obtained by
You may decompose item names into components as follows:
This function returns a
data.frame with four character vectors.
dscore package can recognise
r nrow(dscore::builtin_itemtable) item names. The expression
get_itemnames() returns a (long) vector of all known item names. Let us construct a table of instruments by domains:
items <- get_itemnames() din <- decompose_itemnames(items) knitr::kable(with(din, table(instrument, domain)), format = "html") %>% kableExtra::column_spec(1, monospace = TRUE)
We obtain the first three item names and labels from the
items <- head(get_itemnames(instrument = "mdt", domain = "gm"), 3) get_labels(items)
In practice, you need to spend some time to figure out how item names in your data map to those in the
dscore package. Once you've completed this mapping, rename the items into the GSED 9-position schema. For example, suppose that your first three gross motor MDAT items are called
data <- data.frame(id = c(1, 1, 2), age = c(1, 1.6, 0.9), mot1 = c(1, NA, NA), mot2 = c(0, 1, 1), mot3 = c(NA, 0, 1)) data
Renaming is easy to do by changing the names attribute.
old_names <- names(data)[3:5] new_names <- get_itemnames(instrument = "mdt", domain = "gm")[1:3] names(data)[3:5] <- new_names data
There may be different versions and revision of the same instrument. Therefore, carefully check whether the item labels match up with the labels in version of the instrument that you administered.
dscore package assumes that response to milestones are dichotomous (1 = PASS, 0 = FAIL). If necessary, recode your data to match these response categories.
Once the data are in proper shape, calculation of the $D$-score and DAZ is easy.
milestones dataset in the
dscore package contains responses of 27 preterm children measured at various age between birth and 2.5 years on the Dutch Development Instrument (
ddi). The dataset looks like:
head(milestones[, c(1, 3, 4, 9:14)])
Each row corresponds to a visit. Most children have three or four visits. Columns starting with
ddi hold the responses on DDI-items. A
1 means a PASS, a
0 means a FAIL, and
NA means that the item was not administered.
milestones dataset has properly named columns that identify each item. Calculating the $D$-score and DAZ is then done by:
ds <- dscore(milestones) dim(ds)
ds is a
data.frame with the same number of rows as the input data. The first six rows are
The table below provides the interpretation of the output:
Name | Interpretation
---- | -------------
a | Decimal age
n | number of items used to calculate $D$-score
p | Percentage of passed milestones
d | $D$-score estimate, mean of posterior
sem | Standard error of measurement, standard deviation of the posterior
daz | $D$-score corrected for age
milestones data and the result by
md <- cbind(milestones, ds)
We may plot the 27 individual developmental curves by
library(ggplot2) library(dplyr) r <- builtin_references %>% filter(pop == "dutch") %>% select(age, SDM2, SD0, SDP2) ggplot(md, aes(x = a, y = d, group = id, color = sex)) + theme_light() + theme(legend.position = c(.85, .15)) + theme(legend.background = element_blank()) + theme(legend.key = element_blank()) + annotate("polygon", x = c(r$age, rev(r$age)), y = c(r$SDM2, rev(r$SDP2)), alpha = 0.1, fill = "green") + annotate("line", x = r$age, y = r$SDM2, lwd = 0.3, alpha = 0.2, color = "green") + annotate("line", x = r$age, y = r$SDP2, lwd = 0.3, alpha = 0.2, color = "green") + annotate("line", x = r$age, y = r$SD0, lwd = 0.5, alpha = 0.2, color = "green") + coord_cartesian(xlim = c(0, 2.5)) + ylab(expression(paste(italic(D), "-score", sep = ""))) + xlab("Age (in years)") + scale_color_brewer(palette = "Set1") + geom_line(lwd = 0.1) + geom_point(size = 1)
Note that similarity of these curves to growth curves for body height and weight.
The DAZ is an age-standardized $D$-score with a standard normal distribution with mean 0 and variance 1. We plot the individual DAZ curves relative to the Dutch references by
ggplot(md, aes(x = a, y = daz, group = id, color = factor(sex))) + theme_light() + theme(legend.position = c(.85, .15)) + theme(legend.background = element_blank()) + theme(legend.key = element_blank()) + scale_color_brewer(palette = "Set1") + annotate("rect", xmin = -Inf, xmax = Inf, ymin = -2, ymax = 2, alpha = 0.1, fill = "green") + coord_cartesian(xlim = c(0, 2.5), ylim = c(-4, 4)) + geom_line(lwd = 0.1) + geom_point(size = 1) + xlab("Age (in years)") + ylab("DAZ")
Note that the $D$-scores and DAZ are a little lower than average. The explanation here is that these all children are born preterm. We can account for prematurity by correcting for gestational age.
The distributions of DAZ for boys and girls show that a large overlap:
ggplot(md) + theme_light() + scale_fill_brewer(palette = "Set1") + geom_density(aes(x = daz, group = sex, fill = sex), alpha = 0.4, adjust = 1.5, color = "transparent") + xlab("DAZ")
Under the assumption of independence, we may test whether sex differences are constant in age by a linear regression that includes the interaction between age and sex:
summary(lm(daz ~ age * sex, data = md))
This group of very preterms starts around -2.5 SD, followed by a catch-up in child development of approximately 1.0 SD per year. The size of the catch-up is equal for boys and girls.
We may account for the clustering effect by including random intercept and age effects, and rerun as
library(lme4) lmer(daz ~ 1 + age + sex + sex * age + (1 + age | id), data = md)
This analysis yields the same substantive conclusions as before.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.