library(medrc)
library(multcomp)

Motivation

The analysis of dose-response experiments using nonlinear models allows inference about the model parameters and the prediction of new response values. A common way to summarize the effect of an increasing dose level on the response is deriving parameters from the dose-response curve, like the effective dose $ED(p)$. Also the comparison of several curves by selectivity indices, like the relative potency, is available. For risk estimation in toxicology the estimation of benchmark dose (BMD) levels is a further important topic that is based on the nonlinear modeling of dose-response curves.

The drc package [Ritz and Streibig, 2005] allows the simultaneous fitting of several non-linear regression models, providing a common parameterization for several models and searching automatically for starting values. Several functions are available for model-averaging and inference of derived parameters.

When analyzing dose-response curves, the observed data is often obtained from experiments with hierarchical designs, where the responses can be assigned to several known clusters. Instead of just assuming a single residual error in the dose-response model, the variability within and between clusters can be modeled. Either additional correlation parameters can be introduced to structure the residual error by generalized nonlinear least squares estimation or a distribution function for the cluster effects can be assumed separately to the distribution of the residuals in a mixed model framework.

The nlme package [Pinheiro and Bates, 2000] provides functions to estimate parameters and variance components in nonlinear mixed models by (restricted) maximum likelihood. Some functions to parameterize the non-linear curve are already available, but it is also possible to provide your own modeling function.

The medrc combines the automated dose-response modeling framework of the package drc with the nonlinear mixed estimation framework of the package nlme. Thereby, additional random effects can be introduced to the dose-response models with the unified parameterization of drc, with the availability of inference for derived parameters also for hierarchical models.

The medrm function

Following the notation of Davidian and Giltinan (1995, 2003), a nonlinear regression model with a single hierarchical level can be defined in two stages that parameterize the intra- and inter-curve specific variation, respectively.

Stage 1: For the ith individual ($i=1,\dots,m$), we assume the following nonlinear regression model: $$ y_{ij} = f(x_{ij}, \boldsymbol{\beta}{i}) + \epsilon{ij} $$ where ${y_{i1}, \ldots, y_{i n_i} }$ and ${x_{i1}, \ldots, x_{i n_i} }$ denote the vectors of response values and dose levels, respectively. The population mean dose-response curve is characterized by the dose-response function $f$ through curve-specific effect $\boldsymbol{\beta}{i}$ (a $q \times 1$ vector). The residual error $\epsilon_i$ is assumed to be mean-zero normally distributed with variance-covariance matrix $\boldsymbol{\Lambda}{i}$, which in practice often is assumed to be $\sigma^2 I_{n_i}$ or $diag(\sigma_1^2, \ldots, \sigma_{n_i}^2)$.

Stage 2: The inter-curve variation is captured by splitting the curve-specific effect $\boldsymbol{\beta}{i}$ into components describing the systematic and random variation between curves: $$ \boldsymbol{\beta}{i} = \boldsymbol{A}{i}\boldsymbol{\beta} + \boldsymbol{B}{i}\boldsymbol{b}{i} $$ where $\boldsymbol{A}{i}$ and $\boldsymbol{B}_{i}$ denote the fixed-effects and random-effects design matrices, respectively, $\boldsymbol{\beta}$ denotes the fixed-effects parameters (a $p \times 1$ vector with $p \le q$), and the $b_i$'s denote the curve-specific random effects. The random effects can be assumed to follow a mean-zero normal distribution with a variance-covariance matrix denoted $\boldsymbol{G}$, which usually simply is the unstructured matrix.

We restrict our model to a fixed effects parameterization of independent curves, using a dummy 0 and 1 coded design matrix $\boldsymbol{A}_{i}$, and allowing only random intercepts additively on the fixed effects parameters to represent the subject variability. This hierarchical model will cover many experimental settings in bioassay analysis and toxicology, like the comparison of several treatments in a dose-response experiment, where the responses are repeated measurements on the same experimental units or the experimental design consists of several blocks.

Function arguments

To fit a hierarchical dose-response model with medrc, the medrm function can be used with following main arguments: form A formula with the name of the response $y_{ij}$ on the left and the name of the dose variable $x_{ij}$ on the right hand side curveid A formula with the name of a factor, which divides the dataset into several clusters, specifying the structure of the fixed effects design matrix $\boldsymbol{A}{i}$. For each cluster a different fixed effect curve is assumed. On the left hand side, the names of parameters can be given, separated by a $+$ symbol, which are assumed to be different across the curves. Hence, only a subset of parameters may describe the difference between fixed effect curves. If these parameter names are omitted, or curveid is set to NULL, only a single curve is fitted. data A data.frame object with the dose-response data fct A model function of package drc specifying the dose-response function $f()$ with a set of parameters $\boldsymbol{\beta}{i}$. Some predefined functions are shown in the following section. random The definition of random effects, similar to the random argument in the function nlme, as a definition of the random effect design matrix $\boldsymbol{B}_{i}$. This can be, for example, a formula with parameter names on the left hand side, like in the curveid argument, and 1$|$name of clusters defining factor on the right hand side. The 1 can also be substituted by an additional covariate. For a more flexible definition, a list of different pdClasses objects of nlme can be given. start Starting values for the nlme function. If NULL, the initial values for the fixed effects are found automatically, using the self start functionality of package drc.

Dose-response curve parameterization

In drc a number of different dose-response models are available. Each model can be simplified, by fixing a parameter to a specific values instead of estimating it from the data.

The parameters are defined in a unified way with b steepness of the curve c lower asymptote d upper asymptote e location of the inflection point * f asymmetry parameter

5-parameter logistic L.5()

$f(x) = c + \frac{d-c}{(1+\exp(b(x - e)))^f}$

5-parameter log-logistic LL.5()

$f(x) = c + \frac{d-c}{(1+\exp(b(\log(x)-\log(e))))^f}$

4 parameter Weibull W1.4() and W2.4()

$f(x) = c + (d-c) \exp(-\exp(b(\log(x)-\log(e))))$

or

$f(x) = c + (d-c) (1 - \exp(-\exp(b(\log(x)-\log(e)))))$

4 parameter log-Normal LN.4()

$f(x) = c + (d-c)(\Phi(b(\log(x)-\log(e))))$

gompertz()

$f(x) = c + (d-c)(\exp(-\exp(b(x-e))))$

3 parameter Michaelis-Menten MM.3()

$f(x, (c, d, e)) = c + \frac{d-c}{1+(e/x)}$

Further functions

More functions are available, like fractional polynomial-logistic models fplogistic(p1, p2), or the Brain-Cousens hormesis model braincousens().

Inference for derived parameters

Instead of interpreting the model parameters directly, we can set focus on derived parameters, like the effective dose at a specific level $ED(p)$ or the ratio of two effective doses, known as relative potency. The effective dose $ED(p)$ is defined as the solution to the following inverse regression problem: $$ f(ED(p), \boldsymbol{\beta}) = \frac{p}{100} f(\infty, \boldsymbol{\beta}{i}) + \left( 1 - \frac{p}{100} \right) f(0, \boldsymbol{\beta}{i}) $$ By definition $ED(p)$ values are relative quantities, relative to the lower and upper limits $\beta_2$ and $\beta_3$, which corresponds to $f(0, \boldsymbol{\beta}{i})$ and $f(\infty, \boldsymbol{\beta}{i})$, respectively, if $\beta_1<0$ (otherwise $\beta_2$ and $\beta_3$ swap places). The function ED() of package drc can be directly used with an medrc object to calculate the effective dose at (multiple) response levels conditional on random effects being equal to zero. Inference for the $ED(p)$ parameters, like hypotheses tests or corresponding confidence intervals, are available by using the Delta-method to approximate the variance-covariance of the derived parameters.

Another important concept used for summarizing dose-response data is the benchmark dose methodology. Ritz et al. (2013) proposed an operational definition of the benchmark dose concept for continuous endpoints, allowing for the incorporation of an a priori specified background level $p_0$ and benchmark response of interest BMR. The resulting BMD is obtained by solving the following equation:

$$ \left[ \Phi^{-1}(1-p_0) - \Phi^{-1}{1-(p_0+BMR)} \right] \frac{\sigma}{f(\infty,\boldsymbol{\beta}{i}) - f(0,\boldsymbol{\beta}{i})} = \frac{f(BMD,\boldsymbol{\beta}{i}) - f(0,\boldsymbol{\beta}{i})}{f(\infty,\boldsymbol{\beta}{i}) - f(0,\boldsymbol{\beta}{i})}. $$ By ignoring the variability in the ratio $\sigma {f(\infty,\boldsymbol{\beta}{i}) - f(0,\boldsymbol{\beta}{i})}^{-1}$ this definition implies that benchmark doses may be derived in the same way as ordinary effective doses defined above by specifying the non-standard response level. The function BMD() enables the computation of the BMD and lower confidence bounds (BMDL) for a medrc object, given the benchmark risk and a background risk level.

Inference for marginalized parameters

As all derived parameters, discussed in the previous section, are functions of the nonlinear model predictions, a population average interpretation is available when these derived parameters are based on the marginal predictions. Hence, the estimates for marginal derived parameters, like the effective dose, are directly available by obtaining marginal predictions.

The marginal expectation for a model with a single fixed effect curve and a single matrix of random effects is equal to $$ E\left{f(x_{ij}, \boldsymbol{\beta}{i}) \right} = \int \cdots \int f(x{ij}, (\beta_{1} + b_{1}, \dots, \beta_{p} + b_{p})) \; \Phi(b_{1}, \dots, b_{p}, \boldsymbol{G}) \; d b_{1} \cdots db_{p} $$ where $\Phi()$ denotes a multivariate Gaussian density with mean vector $\boldsymbol{0}$ and covariance structure $\boldsymbol{G}$.

As in nonlinear regression correlated random effects are usually the case, a change of variable can be performed, transforming the dependent random effects $\boldsymbol{b}{i}=\boldsymbol{\Omega u}{i}$ into a vector of spherical random effects $\boldsymbol{u}_{i} \sim N(\boldsymbol{0}, \boldsymbol{I})$. The matrix $\boldsymbol{\Omega}$ is a left factor of the covariance matrix $\boldsymbol{G} = \boldsymbol{\Omega\Omega}'$, using a Choleski decomposition.

The multiple integral can be approximated by the weighted sum $$ E\left{ f(x_{ij}, \boldsymbol{\beta}{i}) \right} \approx \sum{n=1}^{N} w_{n} f(x_{ij}, (\beta_{1} + \xi_{rn}, \dots, \beta_{p} + \xi_{rn})), \quad \mbox{with} \quad w_{n} = \prod_{r=1}^{p} w_{rn} $$ using numerical quadrature, with a $(p \times N)$ grid of nodes $\boldsymbol{\Psi}$ and corresponding weights $w_{rn}$, based on independent standard normal distribution functions [Smyth, 1998]; the quadrature nodes are transformed by $\boldsymbol{\xi} = \boldsymbol{\Psi\widehat{\Omega}}$ to reflect the scales and covariance of the random effects. The solution to the multiple integral can be thought of as an average of the individual curves, but instead of using the specific random effect predictions, this average is based on the empirical (multivariate) normal distribution of the random effects specified by the random effect covariances. Standard errors for derived parameters are obtained by applying the delta method, using a numerically calculated gradient.

Analogously to the ED() and BMD() functions, the EDmarg() and BMDmarg() functions are available for calculating marginalized estimates for the derived parameters. (The marginalization is only available for medrc objects with a simple random effects structure, allowing only with random intercepts for possibly nested random effects.)

Model-averaged estimates

By specifying the nonlinear function $f(x_{ij}, \boldsymbol{\beta}_{i})$ the principle shape of the curve is treated as known. Without any prior knowledge about the progress of the dose-response curve it is reasonable to choose from a larger set of dose-response models instead of assuming a single fixed function. With a set of candidate models a specific dose-response relationship can be selected based on the data, or model averaging approaches can be used to incorporate the model uncertainty into the parameter inference.

The framework of package drc allows to fit several dose-response models to the same data in an automated fashion, with several, predefined dose-response curves, all of them composed of a similar set of similar defined parameters with lower and upper asymptotes, steepness, inflection points, etc. As these model functions can be directly used in medrc, the formula interfaces enable the composition of a set of models with different fixed and random effect parameterizations and different dose-response shapes.

Model-averaged inference for the effective dose is available by the function mmaED(), allowing the input of several medrc model objects. By default the model parameters are estimated by maximum likelihood; it is not adviced to use the model averaging approach with REML estimates. Furthermore, only models with changes in the fixed effect parameterization OR the random effect structure should be combined.

Case Studies

Vinclozolin example

Nellemann et al. (2003) carried out experiments to assess the in vitro effects of the fungicide vinclozolin. The data were obtained using an androgen receptor reporter gene assay, which was repeated six times (on different days). Each assay resulted in concentration-response data with nine concentrations (in $\mu$ M) of the fungicide, and the response measured was chemiluminescence (in luminescence units), so the same nine concentrations were used in all six assays. However, in one assay, only eight concentrations were used.

The dataset is available in the package drc.

data(vinclozolin)

3 parameter log-logistic model

Assuming a lower asymptote at 0 for the control, a three parameter log-logistic model can be assumed, estimating the upper asymptote, location of the inflection point, and steepness of the curve. The assay effect is treated as a normally distributed random effect, summarizing the between-assay variability by a $(3 \times 3)$ covariance matrix.

m <- medrm(effect ~ conc, data=vinclozolin,
           random=b + d + e ~ 1|exper,
           fct=LL.3(), start=c(0.5, 2000, 0.05))
print(m)

Instead of assay-specific parameters in a nonlinear mixed model, population-averaged estimates can be obtained by a generalized nonlinear least-squares fit, assuming a compound-symmetry structure for the residuals to model the between assay variability.

mg <- glsdrm(effect ~ conc, data=vinclozolin,
             fct=LL.3(), start=c(0.5, 2000, 0.1), 
             correlation=corCompSymm(form=~1|exper), 
             control=gnlsControl(tolerance=0.01, nlsTol=0.1))
print(mg)

The predicted curves can easily be visualized by the plot method for the model objects. But to compare the assay-specific, marginalized, and marginal predictions a user defined plot can be constructed, using the predict method to obtain the desired predicted values for new dose levels.

nd <- expand.grid(conc=exp(seq(log(0.01), log(3.2), length=25)))
nd1 <- nd2 <- nd3 <- nd
nd1$p <- predict(m, type="marginal", newdata=nd)
nd1$model <- "Marginalized"
nd2$p <- predict(mg, newdata=nd)
nd2$model <- "GNLS"
nd3$p <- m$fct$fct(nd$conc, rbind(coefficients(m)))
nd3$model <- "Assay-specific"
nd <- rbind(nd1, nd2, nd3)
nd4 <- expand.grid(conc=exp(seq(log(0.01), log(3.2), length=25)), 
                   exper=unique(vinclozolin$exper))
nd4$p <- predict(m, newdata=nd4)
vinclozolin2 <- vinclozolin
vinclozolin2$conc[vinclozolin2$conc == 0] <- 0.01

ggplot(vinclozolin2, aes(x=conc, y=effect, group=exper, colour=exper)) + 
  geom_point() + 
  geom_line(data=nd4, aes(y=p, colour=exper), linetype=3) +
  geom_line(data=nd, aes(y=p, linetype=model, group=NULL, colour=NULL), size=1.1) +
  coord_trans(x="log") + 
  theme_classic() +
  scale_linetype_manual("Model", values=c(2,1,4)) +
  ylab("Chemiluminescence") +
  xlab("Concentration [muM]") +
  scale_x_continuous(breaks=c(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2, 3), 
                    labels=c(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2, 3))

Effective dose estimation

The effective doses can be estimated at several response levels, conditional on random effects being equal to zero, using the function ED().

ED(m, respLev=c(10, 25, 50, 75, 90))

For this example, not an assay-specific effective dose is of interest, but a population-averaged estimate of the $ED(p)$. Hence, either the EDmarg() function can be applied, or the ED() function together with the GNLS model.

EDmarg(m, respLev=c(10, 25, 50, 75, 90))
ED(mg, respLev=c(10, 25, 50, 75, 90))

Spinach example

Streibig and Dayan (1999) investigated the inhibition of photosynthesis in response to two synthetic photosystem II inhibitors, the herbicides diuron and bentazon. In an experiment, the effect of oxygen consumption of thylakoid membranes (chloroplasts) from spinach was measured after incubation with the synthetic inhibitors in five assays, three treated with bentazon and two with diuron. For each assay six increasing herbicide concentrations were applied together with a negative control, using different dose ranges on a logarithmic scale for the two treatments based on preliminary experiments to encompass the whole dose-response range.

The dataset is available in the package drc.

data(spinach)
spinach$CURVE <- as.factor(spinach$CURVE)

4 parameter log-logistic model

For the comparison of the two herbicides, two dose-response curves are fitted under assumption of a four parameter log-logistic model with a separate set of fixed effects coefficients for each treatment. Random effects are included for each of the four parameters to model the between assay variability. Using the information of the between assay variability by the additional distributional assumptions of the random intercepts is especially advantageous, as the dose levels for the two herbicides do not cover the same dose range.

As for the Vinclozolin example, either a nonlinear mixed model or a generalized nonlinear least-squares model can be fitted.

### NLME
spm <- medrm(SLOPE ~ DOSE, 
           curveid=b + c + d + e ~ HERBICIDE, 
           data=spinach, 
           random=b + c + d + e ~ 1|CURVE,
           fct=LL.4())

print(spm)

### GNLS
spmg <- glsdrm(SLOPE ~ DOSE, 
           curveid=b + c + d + e ~ HERBICIDE, 
           data=spinach, 
           correlation=corCompSymm(form=~1|CURVE),
           fct=LL.4())

print(spmg)

Like for the Vinclozolin example a user-defined graphic can be constructed, comparing the fitted GNLS curves with the marginalized predictions.

snd <- expand.grid(DOSE=exp(seq(log(0.01), log(150), length=100)), 
                   HERBICIDE=levels(spinach$HERBICIDE))
snd1 <- snd2 <- snd
snd1$p <- predict(spm, type="marginal", newdata=snd)
snd1$model <- "Marginalized"
snd2$p <- predict(spmg, newdata=snd)
snd2$model <- "GNLS"
snd <- rbind(snd1, snd2)
snd3 <- expand.grid(DOSE=exp(seq(log(0.01), log(150), length=100)), 
                    HERBICIDE=levels(spinach$HERBICIDE), 
                    CURVE=as.factor(1:5))
snd3$p <- predict(spm, newdata=snd3)
spinach2 <- spinach
spinach2$DOSE[spinach2$DOSE == 0] <- 0.01

ggplot(spinach2, aes(x=DOSE, y=SLOPE, group=HERBICIDE, shape=HERBICIDE, colour=CURVE)) + 
  geom_point(size=2.5) + 
  geom_line(data=snd3, aes(y=p, group=CURVE:HERBICIDE), linetype=3) +
  geom_line(data=snd, aes(y=p, linetype=model, group=HERBICIDE:as.factor(snd$model), 
                          shape=NULL, colour=NULL), size=1.1) +
  coord_trans(x="log") + 
  theme_classic() +
  scale_linetype_manual("Model", values=c(2,1,4)) +
  scale_shape_discrete("Herbicide") +
  ylab("Oxygen consumption of thylakoid membranes") +
  xlab("Concentration [muM]") +
  scale_x_continuous(breaks=c(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1,2.5, 5,10,25,100), 
                     labels=c(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1,2.5, 5,10,25,100))

Inference for the relative potency

The two fixed effect curves are compared by the ratio of effective dose estimates. As in this case also the marginal estimates are of interest, the marginalization approach can be similarly applied to the ratio of two parameters by extending the use of the delta method to calculate corresponding standard errors.

edm <- EDmarg(spm, respLev=c(10,25,50,75,90))[[2]]
library(mratios)
Cnum <- cbind(diag(5),matrix(0,5,5))
Cden <- cbind(matrix(0,5,5),diag(5))
rownames(Cnum) <- rownames(Cden) <- paste("bentazon/diuron ED:", c(10,25,50,75,90))
gsci.ratio(edm$coef, edm$vcov, Cnum, Cden, degfree=0)

CellTiter - Blue Cell Viability Assay example

Neurotoxicity was tested by a cell viability assay SH-SY5Y cells for increasing concentrations of acrylamide [Frimat et al. 2010]. The experiment was carried out over three days using a total of seven well plates, resulting in seven dose-response curves, measuring fluorescence at 11 or 12 doses and seven or eight replicates (672 observations in total).

data(ctb)
ctb$day <- as.factor(ctb$day)
ctb$dayplate <- as.factor(with(ctb, paste(day, plate, sep="/")))

5 parameter log-logistic model with a fixed lower asymptote at 0

For the dose-response relationship a five-parameter log-logistic model is assumed, where the lower asymptote is fixed at zero under the premise of no cell viability at highly toxic dose levels. To fix the parameter of the lower asymptote, a fixed argument is in the LL.5() function call. The day effects and the variability of plates are modeled as hierarchical random effects, additively on the four parameters: steepness, upper asymptote, inflection point, and curve asymmetry. Additional to a large between day variability, an increase in residual variance can be observed with higher fluorescence measurements; this fact is represented in the model by including an exponential variance function, modeling the change in scale with increasing concentration levels.

## medrm fit
mv <- medrm(fluorescence  ~ conc, fct=LL.5(fixed=c(NA, 0, NA, NA, NA)), 
            data=ctb, random=b+d+e+f~1|day/plate, 
            weights=varExp(form = ~conc|plate), 
            control=nlmeControl(maxIter=200))

summary(mv)

Also for this model the predicted curves on the plate, day, and population-average level can be displayed.

cnd <- expand.grid(conc=exp(seq(log(0.001), log(500), length=100)))
cnd1 <- cnd
cnd1$p <- predict(mv$fit, level=0, newdata=cnd)
cnd2 <- expand.grid(conc=exp(seq(log(0.001), log(500), length=100)), 
                    day=as.factor(levels(ctb$day)))
cnd2$p <- as.vector(apply(coef(mv$fit, level=1), 1, 
                          function(x) mv$fct$fct(cnd$conc, rbind(x))))
cnd3 <- expand.grid(conc=exp(seq(log(0.001), log(500), length=100)), 
                    dayplate=levels(ctb$dayplate))
cnd3$p <- as.vector(apply(coef(mv$fit, level=2), 1, 
                          function(x) mv$fct$fct(cnd$conc, rbind(x))))
ctb2 <- ctb
ctb2$conc[ctb2$conc == 0] <- 0.001

ggplot(ctb2, aes(x=conc, y=fluorescence, group=dayplate, shape=day, colour=dayplate)) + 
  geom_point(size=1.5) + 
  geom_line(data=cnd3, aes(y=p, group=dayplate, shape=NULL), linetype=2) +
  geom_line(data=cnd2, aes(y=p, group=day, colour=NULL), size=1.1, linetype=2, colour="grey2") +
  geom_line(data=cnd1, aes(y=p, group=NULL, shape=NULL, colour=NULL), size=1.2) +
  coord_trans(x="log") + 
  theme_classic() +
  guides(shape="none") +
  ylab("Fluorescence") +
  xlab("Concentration [mM]") +
  scale_x_continuous(breaks=c(0.001, 0.01, 0.05, 0.25, 1, 5,25,100,500), 
                     labels=c(0.001, 0.01, 0.05, 0.25, 1, 5,25,100,500))

Benchmark dose calculation

The benchmark dose and the benchmark dose lower confidence limit can be calculated, marginalizing the estimate to obtain a population-average interpretation. Multiple benchmark risk levels are assumed and a background level of 0.05. To save some computation time, the Gauss quadrature is reduced to only two nodes for each random effect. As some BMDLs are computed near the boundary of a concentration at 0 $\mu$ M, the interval limits are calculated on a logarithmic scale. The confidence level is specified at 0.9, to maintain an error level of 0.05 for the lower confidence limit (ignoring the upper limit).

BMDmarg(mv, respLev=c(5, 10, 25, 50), nGQ=2, interval="tfls", level=0.9)

3T3 mouse fibroblasts and NRU assay

The toxicity of sodium valproate was tested, using the 3T3 mouse fibroblasts and neutral red uptake (NRU) assay. 22 different experiments were performed independently in six laboratories, using eight concentration levels, each with six replicates on a 96-well plate. In addition, twelve measurements were taken for the solvent control. See Clothier et al. (2013) for more information.

data(mdra)

3-parameter log-logistic model

A 3-parameter log-logistic model was fitted with two levels of hierarchical random effects, treating the laboratory effect and the experiment effect on the upper asymptote and on the ED50 parameter as a random effect. In this way, the hierarchical layout of the data is represented, modelling the variability between laboratories and between experiments within a laboratory. No random laboratory or experiment variation is assigned to the slope parameters in order to reduce model complexity and ensure convergence of the parameter estimation algorithm.

mdramod <- medrm(Response ~ Concentration, data=mdra, fct=LL.3(), 
           random=d + e ~ 1|LabID/ExperimentID, 
           weights=varExp(form=~Concentration),
           control=nlmeControl(tolerance=0.1, pnlsTol=1)) 
plot(mdramod, logx=TRUE, ndose=250, ranef=TRUE) + theme_classic()

Benchmark dose estimation

The benchmark dose at benchmark risk levels of 1%, 5%, and 10% are calculated together with the corresponding lower confidence limit (BMDL).

bmdra <- BMDmarg(mdramod, respLev=c(1, 5, 10), 
                 interval="tfls", rfinterval=c(0,10), nGQ=5)

Broccoli example

The effect of drought stress on Brassica oleracea is investigated, selecting drought stress resistant varieties out of a population of different DH genotypes. The screening study was carried out on 48 DH lines developed from F1 plants of a cross between the rapid cycling chinese kale (Brassica oleracea var. alboglabra (L.H. Bailey) Musil) and broccoli (Brassica oleracea var. italica Plenck). Two stress treatments (not watered and a watered control) are randomly assigned to 4 plants per genotype (2 per treatment) resulting in 192 plants in total. For the genotypes 5, 17, 31, 48, additional 12 plants (6 per treatment) are included into the completely randomized design, which results in a total of 240 plants. For each plant the length of the youngest leaf at the beginning of the experiment is measured daily for a period of 16 days. For the additional 12 plants of the 4 genotypes the leaf water potential was measured as a secondary endpoint (omitted here); due to these destructive measurements some dropouts occur.

data(broccoli)
str(broccoli)

ggplot(broccoli, aes(x=Day, y=LeafLength, group=ID, colour=Stress)) +
  geom_line() +
  facet_wrap(~ Genotype, ncol=8)

There are two individuals for genotypes 42 and 43 with growth curves, which are not representative for the population of interest, as their leaf growth has stopped in favor of early flowering. These curves are removed from the dataset.

bro <- droplevels(subset(broccoli, ID != "110" & ID != "125"))

5 parameter logistic model

Let us assume that the growth curves for each individual follow a 4 parameter log-logistic model with different parameters for each stress treatment except the lower asymptote. The model function can be conveniently defined as a predefined function of package drc. The parameterization of different fixed effect effect vectors per treatment level can be defined by a curveid argument, with the name of a fixed effect factor in the dataset on the right hand side of a formula, and the parameters to change for different curves on the left hand side. The definition of random effects follow in a similar way. The random genotype effects are parameterized by a variance component for the upper asymptote, and the location of the inflection point. To model the dependency of observations measured at the same plant over time, a second set of random effects is assumed within the genotype effects to model the variance between individuals within genotypes.

For a simplification we ignore the real aim of the study to detect genotype specific stress effects and just model an additive stress effect for each genotype. The same lower asymptote (parameter c) is assumed for both stress treatment levels, as at the beginning of the experiment, all plants are watered at the same level.

m5pl <- medrm(LeafLength ~ Day, data=bro, 
              fct=L.5(), 
              curveid=b + d + e + f ~ Stress, 
              random=d + e ~ 1|Genotype/ID)

print(m5pl)

With the summary method, the summary output of the nlme component can be accessed. There are also some methods, like VarCorr, to access the specific nlme slots. For any special information about the nlme estimation, the nlme object is stored in a list slot named fit.

VarCorr(m5pl)
# same as
VarCorr(m5pl$fit)

The predicted random effects can be directly accessed by the function ranef().

re <- ranef(m5pl)[[1]]
head(re)

panellab <- function(x, y, ...){
  abline(h=0, lty=2)
  abline(v=0, lty=2)
  text(x, y, rownames(re))
}
pairs(re, panel = panellab)

The growth curves based just on the fixed effects of the model can also be plotted. This ggplot object can be extended by some further ggplot2 functions. A plot of curves, conditional on the random effects, is not yet implemented when several fixed effects curves are present.

plot(m5pl) +
  geom_line(data=bro, aes(group=ID), linetype=2, alpha=0.2) +
  theme_bw()  

Other diagnostic graphics, like residuals vs. fitted values, are also easily available.

plot(residuals(m5pl) ~ fitted(m5pl))
abline(h=0, lty=2, col="red3")

Sets of dose-response models

A set of different log-logistic and Weibull models can be fitted for the broccoli data to capture the uncertainty of the growth curve shape.

# 3 parameter logistic with lower asymptote fixed at 0
mod1 <- medrm(LeafLength ~ Day, data=bro, 
              fct=L.3(), 
              curveid=b + d + e ~ Stress, 
              random=d + e ~ 1|Genotype/ID)
# 4 parameter logistic 
# with the same lower asymptote for both stress treatments
mod2 <- medrm(LeafLength ~ Day, data=bro, 
              fct=L.4(fixed=c(NA, 5, NA, NA)), 
              curveid=b + d + e ~ Stress, 
              random=d + e ~ 1|Genotype/ID)
# 4 parameter Weibull model
mod3 <- medrm(LeafLength ~ Day, data=bro, 
              fct=W1.4(), 
              curveid=b + d + e ~ Stress, 
              random=d + e ~ 1|Genotype/ID)
# 2nd parameterization of 4 parameter Weibull model
mod4 <- medrm(LeafLength ~ Day, data=bro, 
              fct=W2.4(), 
              curveid=b + d + e ~ Stress, 
              random=d + e ~ 1|Genotype/ID)
# even a 4 parameter logistic with same parameters
# for both stress treatments is available with a onesided curveid formula
mod5 <- medrm(LeafLength ~ Day, data=bro, 
              fct=L.4(), 
              curveid= ~ Stress, 
              random=d + e ~ 1|Genotype/ID)
# or a 4p-log-logistic model with a lower asymptote fixed at 5
mod6 <- medrm(LeafLength ~ Day, data=bro, 
              fct=L.4(fixed=c(NA, 5, NA, NA)), 
              curveid=d + e ~ Stress, 
              random=d + e ~ 1|Genotype/ID)

The different curves can be displayed by the function mmplot().

mmplot(mod1, mod2, mod3, mod4, mod5, mod6, ndose=50)

Parameter inference

As the medrc class provides a coef and vcov method, simultaneous confidence intervals and multiple tests for the fixed effects are directly provided by the add-on packages multcomp. By multiple contrast tests, the comparison of drought stress to the control can be performed for each fixed effect parameter (except the lower asymptote).

library(multcomp)
K <- rbind("drought-control | b"=c(-1, 1,  0, 0,  0, 0,  0, 0,  0, 0),
           "drought-control | d"=c( 0, 0,  0, 0, -1, 1,  0, 0,  0, 0),
           "drought-control | e"=c( 0, 0,  0, 0,  0, 0, -1, 1,  0, 0),
           "drought-control | f"=c( 0, 0,  0, 0,  0, 0,  0, 0, -1, 1))
gg <- glht(m5pl, linfct=K)
summary(gg)

Inference for derived parameters

The growth curve of the broccoli leaf lengths can be summarized by the effective dose, or in our case by the estimated day, at which 50\% or any other percentage of the full length of a leaf is reached. Inference based on the fixed effects can be directly obtained by using the function ED in the drc package.

ED(m5pl, respLev=c(25, 50, 75), interval="delta")

Comparisons of these ED parameters can be made with the drc function EDcomp.

EDcomp(m5pl, percVec=c(25, 25), interval="delta")

When a set of models are available, the model-averaged ED estimates can be estimated by the medrc function mmaED(). The model specific estimates are weighted by the corresponding information criteria for each model, e.g. using Akaike weights. It is recommended to use maximum likelihood estimation instead of REML, and not mixing simultaneous changes of fixed and random effect structures between the models.

mmaED(mod1, mod2, mod3, mod4, mod5, mod6, respLev=c(25, 50, 75), interval="kang")

Marginal effective dose estimation

We might want to marginalize the ED of the broccoli model conditional on the estimated variance components by numerical integration methods. The medrc package provides functions EDmarg, but assuming only simple random effect structures.

EDmarg(m5pl, respLev=c(25, 50, 75), interval="delta", nGQ=3)

In this case there are only small changes compared to the fixed effects estimates; a difference can be observed, when the variance component for the steepness, the location of the inflection point increases, or a variance component for the asymmetry parameter is introduced into the model.

References



daniel-gerhard/medrc documentation built on May 14, 2019, 3:38 p.m.