In walmes/RDASC: Reproducible Data Analysis of Cientific Cooperations

Session Definition

# https://github.com/walmes/wzRfun
# devtools::install_github("walmes/wzRfun")

library(lattice)
library(latticeExtra)
library(plyr)
library(lme4)
library(lmerTest)
library(doBy)
library(multcomp)
library(wzRfun)

library(RDASC)

source("config/setup.R")
tbn_ <- captioner(prefix = "Table")
fgn_ <- captioner(prefix = "Figure")
tbl_ <- function(label) tbn_(label, display = "cite")
fgl_ <- function(label) fgn_(label, display = "cite")

Exploratory Analysis

data(ake_b)
str(ake_b$sensitivity)

# Short object names are handy.
sen <- ake_b$sensitivity

# Divide diameters by 100 to convert to milimeters. Calculate mean
# diameter (dm).
sen <- within(sen, {
    # 500 is the plug size.
    d1 <- (d1 - 500)/100
    d2 <- (d2 - 500)/100
    dm <- (d1 + d2)/2
    ar <- pi * (dm/2)^2
})

# Population along years.
xtabs(~pop + yr, data = sen)

# Number of observations per factor combination.
ftable(xtabs(~yr + hed + tra + fun, data = sen))

# Number of isolates per factor combination.
with(sen,
     ftable(tapply(iso,
                   INDEX = list(yr, hed, tra, fun),
                   FUN = function(x) {
                       length(unique(x))
                   })))

# Isolates x fungicide missing cells.
xt <- xtabs(complete.cases(cbind(dm, dos)) ~ iso + fun,
            data = sen)
i <- xt == 0
sum(i)
xt[rowSums(i) > 0, ]

xyplot(d2 ~ d1,
       data = sen,
       aspect = "iso",
       xlab = "First diameter (mm)",
       ylab = "Second diameter (mm)") +
    layer(panel.abline(a = 0, b = 1))

xyplot(d2 ~ d1 | iso,
       data = sen,
       aspect = "iso",
       groups = fun,
       strip = FALSE,
       as.table = TRUE,
       xlab = "First diameter (mm)",
       ylab = "Second diameter (mm)") +
    layer(panel.abline(a = 0, b = 1))

cap <-
"Scatter plots of mean diamenter as function of
 dose grouped by *in vitro* fungicide in natual scale
 (top) and log-log scale (bottom)."
cap <- fgn_("dm-x-dos", cap)

# Natural scales.
xyplot(dm ~ dos | iso,
       data = sen,
       groups = fun,
       type = c("p", "a"),
       strip = FALSE,
       as.table = TRUE,
       ylab = "Mean diameter (mm)",
       xlab = "Fungicide dose")

# Log-log scales.
xyplot(dm ~ dos | iso,
       scales = list(log = TRUE),
       data = sen,
       groups = fun,
       type = c("p", "a"),
       strip = FALSE,
       as.table = TRUE,
       ylab = "log Mean diameter (mm)",
       xlab = "log Fungicide dose")

cap <-
"Scatter plot of mean diamenter as function of
 dose power 1/5 grouped by *in vitro* fungicide."
cap <- fgn_("dm-x-dos0.2", cap)

# 5th root of dose.
xyplot(dm ~ dos^0.2 | iso,
       data = sen,
       groups = fun,
       type = c("p", "a"),
       strip = FALSE,
       as.table = TRUE,
       ylab = "Mean diameter (mm)",
       xlab = "5th root fungicide dose")

r fgl_("dm-x-dos") (top) shows that doses are very skewed. r fgl_("dm-x-dos") (bottom) shows that in the log-log scale there isn't a linear relation between mean diameter and fungicide dose.

r fgl_("dm-x-dos0.2") shows that, under the transformed 5th-root scale, doses levels are close to equally spaced. In fact, 0.2 was found by minimization of the variance of distance between doses ($\sigma^2$) a power transformation ($p$) of dose rescaled to a unit interval as described by the steps below $$ \begin{align} z_i &= x_i^p\ u_i &= \frac{z_i - \min(z)}{\max(z) -\min(z)}, \text{then } u_i \in [0, 1] \ d_i &= u_{i+1} - u_{i}\ \bar{d} &= \sum_{i=1}^{k-1} d_i/k \ \sigma^2 &= \sum_{i=1}^{k-1} \frac{(d_i - \bar{d})^2}{k-2} \end{align} $$ where $x$ are doses in natural scale, $z$ are doses power transformed, $u$ are scaled to a unit interval, $d$ are diferences between doses, $\bar{d}$ is the mean difference and $\sigma^2$ is the variance of differences.

# Unique fungicide dose levels.
x <- sort(unique(sen$dos))
x

# Variance of distance between doses scaled to a unit interval.
esp <- function(p) {
    u <- x^p
    u <- (u - min(u))
    u <- u/max(u)
    var(diff(u))
}

# Optimise de power parameter to the most equally spaced set.
op <- optimize(f = esp, interval = c(0, 1))
op$minimum

p <- seq(0, 1, by = 0.01)
v <- sapply(p, esp)
plot(log(v) ~ p, type = "o")
abline(v = op$minimum)

So $x^{0.2}$ is the most equally spaced set obtained with a power transformation. Equally spaced levels are beneficial beacause reduce problems related to leverage.

Half Effective Concentration (EC~50~) Estimation

A cubic spline is function constructed of piecewise third-order polynomials which pass through a set of $m + 1$ knots. These knots spans the observed domain of the continous factor $x$, so the set of knots is $$ K = {\xi_0, \xi_1, \ldots, \xi_m}. $$

A function $s(x)$ is a cubic spline if it is made of cubic polynomials $s_{i}(x)$ in each interval $[x_{i-1}, x_{m}]$, $i = 1, \ldots, m$.

Those adjacent cubic pylinomials pieces must bind and be smooth at the internal knots , so additional constrais are made to result in a composite continuous smooth function. Requering continous derivatives, we ensure that the resulting function is as smooth as possible.

For natural splines, two aditional boundary conditions are made $$ s^{''}{1}(x) = 0, \quad s^{''}{m}(x) = 0, $$ that is, the pieces at borders aren't cubic but instead linear.

Natural cubic splines were used to estimate the half effective concentration (EC~50~). A non linear model is usually applied in this context but wasn't found a non linear model flexible enough to give a good fit either a satisfactory convergence rate. So, despite splines haven't a model equation, they are vey flexible and numerical root-finding algorithms can e used to compute EG~50~ based on a linear interpolated function on a predicted grid. Also, area under the sensibility curve (AUSC) were computed by numerical integration under studied dose domain.

sen$iso <- factor(sen$iso, levels = sort(unique(sen$iso)))
sen$ue <- with(sen, interaction(iso, fun, drop = TRUE))
sen$doz <- sen$dos^0.2

# A data frame without dose.
senu <- unique(subset(sen,
                      select = c(ue, iso, fun, tra,
                                 hed, pop, yr, plot)))

# Splines.
library(splines)

# Function for fitting splines.
fitting <- function(x) {
    x <- na.omit(x)
    if (nrow(x) > 4) {
        n0 <- lm(dm ~ ns(doz, df = 3), data = x)
        yfit <- predict(n0, newdata = pred)
        # n0 <- glm(dm + 0.01 ~ ns(doz, df = 3), data = x,
        #           family = gaussian(link = log))
        # yfit <- predict(n0, newdata = pred, type = "response")
        # yr <- range(x$dm)
        return(cbind(xfit = pred$doz, yfit = yfit))
    } else {
        NULL
    }
}

# Values of dose to predict diameter.
pred <- data.frame(doz = seq(0, max(sen$doz), length.out = 30))

# Appling to all experimental unit.
res <- ddply(sen, .(ue), .fun = fitting)

# Merge to pair `iso` and `fun`.
res <- merge(res, senu, by = "ue")

# Estimatinf EC50 and area under curve.
get_ec50 <- function(x, y, interval = c(0, max(sen$doz))) {
    fa <- approxfun(x = x, y = y)
    int <- integrate(fa,
                     lower = 0,
                     upper = max(sen$doz))$value
    mxm <- optimize(fa, interval = interval, maximum = TRUE)
    ymid <- mxm[[2]]/2
    u <- try(uniroot(f = function(x) fa(x) - ymid,
                     interval = interval),
             silent = TRUE)
    if (class(u) == "try-error") {
        return(c(ec50 = NA, ev50 = NA, auc = int))
    }
    else {
        return(c(ec50 = u$root, ev50 = ymid, auc = int))
    }
}

# EC50 and AUC for each experimental unit.
ec <- ddply(res, .(ue), .fun = function(x) get_ec50(x$xfit, x$yfit))
str(ec)

# Proportion of not estimated EC50 and AUC.
cbind(AUC = sum(is.na(ec$auc)),
      EC50 = sum(is.na(ec$ev50)))/nrow(ec)

# Scatter plot matrix of estimated values.
splom(ec[, -1], type = c("p", "smooth"), col.line = 2)

cap <-
"Area under the fitted curve as a function of the EC~50~ for those isolates where EC~50~ were estimated."
cap <- fgn_("aux-vs-ec50", cap)

xyplot(auc ~ ec50,
       data = ec,
       type = c("p", "smooth"),
       ylab = "Area under the fitted curve",
       xlab = expression("Estimated" ~ EC[50]))

# Correlation among variables.
cor(ec[, -1], use = "complete")

# Merge to pair `iso` and `fun`.
ec <- merge(ec, senu, by = "ue", all = TRUE)
str(ec)

cap <-
"Mean diameter as function of fungicide dose 5th root for each isolate.  Solid line is the natural cubic spline fitted for each fungicide in each isolate. Gray straight line indicates the EC~50~ on each curve."
cap <- fgn_("splines-fit", cap)

# View the results.
L <- split(ec, ec$iso)

xyplot(dm ~ dos^0.2 | iso,
       data = sen,
       groups = fun,
       cex = 0.4,
       strip = FALSE,
       as.table = TRUE,
       auto.key = list(columns = 3,
                       title = "In vitro fungicide",
                       cex.title = 1.1),
       xlab = "In vitro fungicide dose 5th root",
       ylab = "Mean diameter (mm)") +
    as.layer(xyplot(yfit ~ xfit | iso,
                    groups = fun,
                    data = res,
                    type = "l")) +
    layer({
        with(L[[which.packet()]], {
            cl <- trellis.par.get()$superpose.symbol$col[as.integer(fun)]
            panel.segments(x0 = ec50,
                           y0 = ev50,
                           x1 = ec50,
                           y1 = 0,
                           col = "gray50")
            panel.segments(x0 = ec50,
                           y0 = ev50,
                           x1 = 0,
                           y1 = ev50,
                           col = "gray50")
            panel.points(x = ec50,
                         y = ev50,
                         pch = 19,
                         cex = 0.6,
                         col = cl)
        })
    })

cap <-
"Mean diameter as function of fungicide dose 5th root for a random sample of isolates.  Solid line is the natural cubic spline fitted for each fungicide in each isolate. Gray straight line indicates the EC~50~ on each curve."
cap <- fgn_("splines-fit-sample", cap)

# Subseting.
set.seed(123)
i <- sample(levels(sen$iso), size = 40)
i <- as.character(sort(as.integer(i)))
L <- L[i]

xyplot(dm ~ dos^0.2 | iso,
       data = sen,
       subset = iso %in% i,
       groups = fun,
       cex = 0.4,
       strip = FALSE,
       as.table = TRUE,
       auto.key = list(columns = 3,
                       title = "In vitro fungicide",
                       cex.title = 1.1),
       xlab = "In vitro fungicide dose 5th root",
       ylab = "Mean diameter (mm)") +
    as.layer(xyplot(yfit ~ xfit | iso,
                    groups = fun,
                    data = res,
                    subset = iso %in% i,
                    type = "l")) +
    layer({
        with(L[[which.packet()]], {
            cl <- trellis.par.get()$superpose.symbol$col[as.integer(fun)]
            panel.segments(x0 = ec50,
                           y0 = ev50,
                           x1 = ec50,
                           y1 = 0,
                           col = "gray50")
            panel.segments(x0 = ec50,
                           y0 = ev50,
                           x1 = 0,
                           y1 = ev50,
                           col = "gray50")
            panel.points(x = ec50,
                         y = ev50,
                         pch = 19,
                         cex = 0.6,
                         col = cl)
        })
    })

Analysis of Area Under Sensitivity Curve

p <- xyplot(ec50 ~ fun | pop + hed,
            groups = tra,
            data = ec[!is.na(ec$ec50), ],
            type = c("p", "a"))
useOuterStrips(p)

p <- xyplot(auc ~ fun | pop + hed,
            groups = tra,
            data = ec[!is.na(ec$auc), ],
            type = c("p", "a"))
useOuterStrips(p)

p <- xyplot(auc ~ tra | pop + hed,
            groups = fun,
            data = ec[!is.na(ec$auc), ],
            type = c("p", "a"))
useOuterStrips(p)

p <- xyplot(auc ~ pop | tra + fun,
            groups = hed,
            data = ec[!is.na(ec$auc), ],
            type = c("p", "a"))
useOuterStrips(p)

#-----------------------------------------------------------------------
# Creates block and treatment cell factors.

ec$blk <- factor(as.integer(as.integer(substr(ec$plot, 0, 1)) > 2))
ec$cell <- with(ec, interaction(yr, blk, hed, tra, drop = TRUE))

# Number of isolates per cell combination.
ftable(xtabs(~pop + hed + tra, data = ec))/3

# ddply(ec,
#       ~yr + pop + tra + hed,
#       function(x) {
#           nlevels(droplevels(x$iso))
#       })

ec <- arrange(df = ec, yr, blk, hed, tra, iso, fun)
str(ec)

2015

#-----------------------------------------------------------------------

ec15 <- subset(ec, yr == 2015)

# Mixed effects model.
m0 <- lmer(auc ~ blk + (1 | iso) + (pop + tra + hed + fun)^2,
           data = ec15,
           REML = FALSE)

# r <- residuals(m0)
# f <- fitted(m0)
# useOuterStrips(qqmath(~r | pop + tra, data = ec15))
# useOuterStrips(xyplot(r ~ f| pop + tra, data = ec15))

# Wald tests for the fixed effects.
anova(m0)

# A simpler model.
m1 <- update(m0, auc ~ blk + (1 | iso) + (pop + tra + fun))

# LRT between nested models.
anova(m1, m0)

# Parameter estimates.
summary(m1)

# Least squares means.
i <- c("pop", "tra", "fun")
L <- lapply(i,
       FUN = function(term){
           L <- LSmeans(m1, effect = term)
           rownames(L$L) <- L$grid[, 1]
           a <- apmc(L$L, m1, focus = term)
           names(a)[1] <- "level"
           a <- cbind(term = term, a)
           return(a)
       })
res <- ldply(L)
# str(res)

i <- c("Population",
       expression("Fungicide" ~ italic("in vivo")),
       expression("Fungicide" ~ italic("in vitro")))

cap <-
"Area under isolate sensitivity curve for levels of population, *in vivo* fungicide and *in vitro* fungicide. Pairs of means in a factor followed by the same letter are not statistically different at 5% significance level."
cap <- fgn_("auc-2015", cap)

resizePanels(
    segplot(level ~ lwr + upr | term,
            centers = fit,
            data = res,
            as.table = TRUE,
            draw = FALSE,
            layout = c(1, NA),
            scales = list(y = list(relation = "free")),
            xlab = "Area under isolate sensitivity curve",
            ylab = "Levels of each factor",
            strip = strip.custom(factor.levels = i),
            cld = res$cld) +
    layer(panel.text(x = centers,
                     y = z,
                     labels = sprintf("%0.1f %s", centers, cld),
                     pos = 3)),
    h = sapply(L, nrow)
)

2016

#-----------------------------------------------------------------------

ec16 <- subset(ec, yr == 2016)

# Mixed effects model.
m0 <- lmer(auc ~ blk + (1 | iso) + (pop + tra + hed + fun)^2,
           data = ec16,
           REML = FALSE)

# r <- residuals(m0)
# f <- fitted(m0)
# useOuterStrips(qqmath(~r | pop + tra, data = ec15))
# useOuterStrips(xyplot(r ~ f| pop + tra, data = ec15))

# Wald tests for the fixed effects.
anova(m0)

# A simpler model.
m1 <- update(m0, auc ~ blk + (1 | iso) + pop * (tra + fun))

# LRT between nested models.
anova(m1, m0)

# Parameter estimates.
summary(m1)

# Least squares means.
res <- vector(mode = "list", length = 2)

L <- LSmeans(m1, effect = c("pop", "tra"))
g <- L$grid
L <- by(L$L, L$grid$tra, as.matrix)
L <- lapply(L, "rownames<-", unique(g$pop))
L <- lapply(L, apmc, model = m1, focus = "pop")
res[[1]] <- ldply(L, .id = "tra")

L <- LSmeans(m1, effect = c("pop", "fun"))
g <- L$grid
L <- by(L$L, L$grid$fun, as.matrix)
L <- lapply(L, "rownames<-", unique(g$pop))
L <- lapply(L, apmc, model = m1, focus = "pop")
res[[2]] <- ldply(L, .id = "fun")

L <- lapply(res,
            FUN = function(x) {
                x$by <- names(x)[1]
                names(x)[1] <- "term"
                return(x)
            })
res <- ldply(L)
res <- arrange(res, by, term, pop)

i <- c(expression("Fungicide" ~ italic("in vitro")),
       expression("Fungicide" ~ italic("in vivo")))
p <- c(1, 2)

cap <-
"Area under isolate sensitivity curve for combination between population and *in vitro* fungicide (top) and population and *in vivo* fungicide (bottom). Pairs of means comparing populations in each row of the plot (each factor level at the y-axis) followed by the same letter are not statistically different at 5% significance level."
cap <- fgn_("auc-2016", cap)

resizePanels(
    segplot(term ~ lwr + upr | by,
            centers = fit,
            groups = pop,
            data = res,
            draw = FALSE,
            layout = c(1, NA),
            scales = list(y = list(relation = "free")),
            xlab = "Area under isolate sensitivity curve",
            ylab = "Levels of each factor",
            strip = strip.custom(factor.levels = i),
            key = list(title = "Population",
                       cex.title = 1.1,
                       columns = 2,
                       type = "b",
                       divide = 1,
                       lines = list(pch = p, lty = 1),
                       text = list(levels(res$pop))),
            cld = res$cld,
            panel = panel.groups.segplot,
            pch = p[as.integer(res$pop)],
            gap = 0.05) +
    layer(panel.text(x = centers[subscripts],
                     y = as.integer(z[subscripts]) +
                         centfac(groups[subscripts],
                                 space = gap),
                     labels = sprintf("%0.1f %s",
                                      centers[subscripts],
                                      cld[subscripts]),
                     pos = 3)),
    h = c(6, 8)
)

Session information

cat(format(Sys.time(), format = "%A, %d de %B de %Y, %H:%M"),
    "----------------------------------------", sep = "\n")
sessionInfo()

walmes/RDASC documentation built on Jan. 10, 2021, 8:02 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

walmes/RDASC
Reproducible Data Analysis of Cientific Cooperations

In walmes/RDASC: Reproducible Data Analysis of Cientific Cooperations

Session Definition

Exploratory Analysis

Half Effective Concentration (EC~50~) Estimation

Analysis of Area Under Sensitivity Curve

2015

2016

Session information

R Package Documentation

Browse R Packages

We want your feedback!

walmes/RDASC Reproducible Data Analysis of Cientific Cooperations

In walmes/RDASC: Reproducible Data Analysis of Cientific Cooperations

Session Definition

Exploratory Analysis

Half Effective Concentration (EC~50~) Estimation

Analysis of Area Under Sensitivity Curve

2015

2016

Session information

R Package Documentation

Browse R Packages

We want your feedback!

walmes/RDASC
Reproducible Data Analysis of Cientific Cooperations