#' Chemical Variable Importance for Floating Percentile Model Benchmarks
#'
#' Generate statistics describing the relative importance of chemicals among benchmarks generated by \code{FPM}
#'
#' @param data data.frame containing, at a minimum, chemical concentrations as columns and a logical \code{Hit} column classifying toxicity
#' @param paramList character vector naming columns of \code{data} containing concentrations
#' @param ... additional arguments passed to \code{chemSig}, \code{chemSigSelect}, and \code{FPM}
#' @details The purpose of \code{chemVI} is to inform the user about the relative influence of each chemical over the sediment quality benchmarks generated by \code{FPM}.
#' Five statistics are generated: \code{chemDensity}, \code{MADP}, \code{dOR}, \code{dFM}, and \code{dMCC}. The \code{chemDensity} statistic (which is also generated by \code{FPM})
#' describes how little a particular chemical's value increased within the floating percentile model algorithm.
#' Low \code{chemDensity} (close to 0) means that the value was able to increase substantially within the algorithm without triggering one or more of the criteria for
#' stopping the algorithm (see \code{?FPM}), whereas high \code{chemDensity} (close to 1) indicates the final benchmark for that chemical did not float (increase)
#' much before being locked in. In other words, low \code{chemDensity} might be interpreted as relatively low importance. We caution against using this
#' metric in isolation, as it is the most difficult to interpret of the five.
#' The \code{MADP} statistic (or mean absolute difference percent) is calculated by sequentially dropping each chemical from consideration, recalculating the benchmarks
#' for the remaining chemicals, and then determining how much each benchmark changed (as a percent of the original value). Thus, the \code{MADP}
#' is a measure of a chemical's influence over other benchmarks. The \code{dOR} statistic is the difference between the overall reliability
#' of benchmarks with all chemicals versus without each chemical. \code{dFM} and \code{dMCC} are similar to the \code{dOR} statistic, but for the Fowlkes-Mallows Index
#' and Matthews Correlation Coefficient, respectively. In each case, larger positive values indicate a greater impact of a chemical
#' on the overall predictive performance of floating percentile model benchmarks. Small values (close to 0) indicate low influence. Larger negative values indicate that
#' the chemical actually adversely impacts toxicity predictions. If there are chemicals with negative values, consider reevaluating the data without the associated chemical
#' or using \code{optimFPM} or \code{cvFPM} to optimize the overall reliability prior to running \code{FPM} and \code{chemVI}.
#'
#' @seealso chemSig, chemSigSelect, optimFPM, cvFPM, FPM
#' @return matrix with one row per chemical and 5 columns (\code{chemDensity}, \code{MADP}, \code{dOR}, \code{dFM}, and \code{dMCC})
#' @examples
#' paramList = c("Cd", "Cu", "Fe", "Mn", "Ni", "Pb", "Zn")
#' chemVI(h.tristate, paramList, testType = "np")
#' chemVI(h.tristate, paramList, testType = "p")
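#'
#' # A minimal sketch of the MADP arithmetic described in Details, using
#' # made-up benchmark values (hypothetical numbers, not output from FPM).
#' # 'full' is assumed to hold benchmarks fit with all chemicals; 'dropCd'
#' # is assumed to hold benchmarks recalculated after dropping Cd.
#' full <- c(Cd = 2, Cu = 50, Zn = 200)
#' dropCd <- c(Cu = 55, Zn = 180)
#' # absolute change in each remaining benchmark, as a percent of its original value
#' pctChange <- 100 * abs(dropCd - full[names(dropCd)]) / full[names(dropCd)]
#' # the mean of these percentages is the MADP attributed to Cd in this sketch
#' mean(pctChange)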
#' @export
chemVI <- function(data,
                   paramList,
                   ...){
    # baseline benchmarks and classification statistics using all chemicals
    fpm <- FPM(data, paramList, densInfo = TRUE, ...)
    # all but the last 12 columns of the FPM output are treated as chemical benchmarks
    pL <- names(fpm[["FPM"]])[1:(length(fpm[["FPM"]]) - 12)]
    fpm.SQB <- fpm[["FPM"]][pL]
    fpm.STAT <- fpm[["FPM"]][c("sens", "spec", "OR", "FM", "MCC")]

    tmp <- list()
    tmp.SQB <- list()
    tmp.STAT <- list()
    # rerun the model with each chemical dropped in turn
    for (i in seq_along(pL)){
        pL.i <- pL[-i]
        tmp[[i]] <- FPM(data, paramList = pL.i, paramOverride = TRUE, ...)[["FPM"]]
        tmp.SQB[[i]] <- tmp[[i]][pL.i]
        tmp.STAT[[i]] <- tmp[[i]][c("sens", "spec", "OR", "FM", "MCC")]
    }

    # percent changes relative to the baseline, one entry per dropped chemical
    tmp2 <- list(); tmp3 <- list(); tmp4 <- list(); tmp5 <- list()
    for (i in seq_along(tmp)){
        tmp2[[i]] <- 100 * mean(as.numeric(abs(tmp.SQB[[i]] - fpm.SQB[-i])/fpm.SQB[-i]))
        tmp3[[i]] <- 100 * (tmp.STAT[[i]]$OR - fpm.STAT$OR)/fpm.STAT$OR
        tmp4[[i]] <- 100 * (tmp.STAT[[i]]$FM - fpm.STAT$FM)/fpm.STAT$FM
        tmp5[[i]] <- 100 * (tmp.STAT[[i]]$MCC - fpm.STAT$MCC)/fpm.STAT$MCC
    }
    tmp2 <- data.frame(tmp2); names(tmp2) <- pL
    tmp3 <- data.frame(tmp3); names(tmp3) <- pL
    tmp4 <- data.frame(tmp4); names(tmp4) <- pL
    tmp5 <- data.frame(tmp5); names(tmp5) <- pL
    x <- do.call(rbind, list(round(100 * fpm[["chemDensity"]], 3), round(tmp2, 3), round(tmp3, 3), round(tmp4, 3), round(tmp5, 3)))
    row.names(x) <- c("chemDensity", "MADP", "dOR", "dFM", "dMCC")
    return(t(x))
}