R/nv.R


#' @rdname nv
#' @title Normative values generation and management
#' @description Functions to generate and handle normative values. Check section
#' \code{Structure of normative values} below for details about how to generate
#' functioning normative values
#' @section Structure of normative values:
#' This is one of the most complex structures in visualFields. It is required to
#' run statistical analyses of visual fields obtained with perimetry, and data
#' from healthy eyes are needed to generate it. The normative values are only as
#' good as the data they are generated from. Two common ways to generate full
#' normative values from a dataset of healthy eyes are provided in the package,
#' depending on the \code{method} selected. The first one,
#' \code{method="pointwise"}, generates normative values directly from pointwise
#' statistics. The second one, \code{method="smooth"}, uses 2D quadratic
#' functions to smooth out those pointwise statistics. Variations or improvements
#' can be generated by copying the code in those functions and editing it.
#' \itemize{
#'   \item\code{info} information regarding normative values. Info is not necessary
#'     to carry out statistics, but is useful for the generation of reports. The
#'     fields need not be the same as the ones listed here, although these are used
#'     in the reports in \code{\link{vfsfa}} for single field analysis and
#'     \code{\link{vfspa}} for series progression analysis.
#'     \itemize{
#'       \item\code{name} name of the normative values
#'       \item\code{perimetry} perimetry device for which normative values are intended
#'       \item\code{strategy} psychophysical strategy
#'       \item\code{size} stimulus size, e.g. Goldmann size III, size V
#'     }
#'   \item\code{agem} The normative values' age model. The default methods generate
#'     linear age models with coefficients for each location in \code{locmap} in
#'     \code{coeff} and the function defining the model in \code{model}
#'   \item\code{sd} standard deviations of the sensitivities, \code{vf}, total
#'   deviation (TD) values, \code{td}, and pattern deviation (PD) values, \code{pd}
#'   \item\code{luts} Lookup tables to obtain probability levels for TD and PD values.
#'     \itemize{
#'       \item\code{probs} probability levels
#'       \item\code{td}, \code{pd} lookup tables for TD and PD values at each location
#'         in \code{locmaps}
#'       \item\code{global} lookup table for the following global visual field indices
#'       \itemize{
#'         \item\code{ms} mean sensitivity (MS) calculated as the unweighted average
#'           over locations' values
#'         \item\code{ss} standard deviation of sensitivity calculated as the
#'           unweighted standard deviation over locations' values
#'         \item\code{md} mean deviation (MD) calculated as the weighted average
#'           over locations' values. Weights are the inverse of the standard
#'           deviation in \code{sd} for TD at each location.
#'         \item\code{sd} standard deviation of total deviation calculated as the
#'           weighted standard deviation over locations' values. Weights are the
#'           inverse of the standard deviation in \code{sd} for TD at each location.
#'         \item\code{pmd} pattern mean deviation calculated as the weighted average
#'           over locations' values. Weights are the inverse of the standard
#'           deviation in \code{sd} for PD at each location.
#'         \item\code{psd} pattern standard deviation calculated as the weighted
#'           standard deviation over locations' values. Weights are the inverse
#'           of the standard deviation in \code{sd} for PD at each location.
#'         \item\code{gh} general height. This is defined traditionally for the 24-2
#'           and the 30-2 as approximately the 85th percentile of TD values
#'         \item\code{vfi} the oddly defined visual field index
#'       }
#'     }
#'   \item\code{tdfun} a function defining how to obtain the TD values. Typically, it
#'     is a function of age and sensitivity values, computed as the sensitivity
#'     values minus the age-corrected mean normal obtained from \code{agem}.
#'     Thus, TD values are negative if visual field sensitivity values are below
#'     mean normal and positive if they are above mean normal
#'   \item\code{ghfun} a function defining how to obtain the general height
#'   \item\code{pdfun} a function defining how to obtain the PD values. Typically, they
#'     are obtained as the TD values minus the general height
#'   \item\code{glfun} a function defining how to obtain different global indices
#'   \item\code{tdpfun}, \code{pdpfun}, \code{glpfun} mapping functions to get
#'     the probability levels corresponding to TD, PD and global indices values and
#'     based on the lookup tables defined in \code{luts}
#' }
#' @examples
#' # generate normative values from SUNY-IU dataset of healthy eyes
#' # pointwise
#' sunyiu_24d2_pw <- nvgenerate(vfctrSunyiu24d2, method = "pointwise",
#'                              name      = "SUNY-IU pointwise NVs",
#'                              perimetry = "static automated perimetry",
#'                              strategy  = "SITA standard",
#'                              size      = "Size III")
#' # smooth
#' sunyiu_24d2 <- nvgenerate(vfctrSunyiu24d2, method = "smooth",
#'                           name      = "SUNY-IU smoothed NVs",
#'                           perimetry = "static automated perimetry",
#'                           strategy  = "SITA standard",
#'                           size      = "Size III")
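#' # illustrative check of the generated structure: the components follow the
#' # section \code{Structure of normative values} above
#' names(sunyiu_24d2)
#' head(sunyiu_24d2$agem$coeff)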
#' @param vf visual fields data
#' @param method method to generate normative values, pointwise (\code{pointwise})
#'   or smoothed with 2-dimensional quadratic functions (\code{smooth})
#' @param probs numeric vector of probabilities with values in \code{[0,1]}.
#'   The values 0 and 1 must be included
#' @param name name for the normative values, e.g., "SUNY-IU pointwise NVs".
#'   Default is blank
#' @param perimetry perimetry used to obtain normative data, e.g.,
#' "static automated perimetry" (default)
#' @param strategy psychophysical strategy used to obtain threshold values, e.g.,
#'   "SITA standard". Default is blank
#' @param size stimulus size, if the same size was used for all visual field
#'   locations or empty (default)
#' @return \code{nvgenerate} returns a list with normative values
#' @export
nvgenerate <- function(vf, method = "pointwise",
                       probs     = c(0, 0.005, 0.01, 0.02, 0.05, 0.95, 0.98, 0.99, 0.995, 1),
                       name      = "",
                       perimetry = "static automated perimetry",
                       strategy  = "",
                       size      = "") {
  if(method != "pointwise" && method != "smooth")
    stop("wrong method for generating normative values")
  if(any(probs < 0) || any(probs > 1))
    stop("probability values must be between 0 and 1")
  if(!any(probs == 0) || !any(probs == 1))
    stop("probability values must include values 0 and 1")
  probs <- sort(probs)
  info <- list(name = name, perimetry = perimetry, strategy = strategy, size = size)
  # construct the default age linear model
  agem <- agelm(vf)
  if(method == "smooth") { # smooth out intercepts and slopes
    agem$coeff$intercept <- vfsmooth(agem$coeff$intercept)
    agem$coeff$slope     <- vfsmooth(agem$coeff$slope)
    environment(agem$model)$coeff <- t(as.matrix(agem$coeff))
  }
  # define the functions for the default methods to compute TD, GH, and PD
  tdfun <- tddef(agem)
  ghfun <- ghdef(0.85)
  pdfun <- pddef(ghfun)
  # get TD and PD values, ...
  td <- tdfun(vf)
  pd <- pdfun(td)
  # define mapping functions to get probability levels
  luts <- list(probs = probs, td = NA, pd = NA, g = NA) # prepare list to hold the lookup tables
  tdplutfun <- lutdef(td, probs, "quantile")
  pdplutfun <- lutdef(pd, probs, "quantile")
  luts$td   <- tdplutfun$lut
  tdpfun    <- tdplutfun$fun
  luts$pd   <- pdplutfun$lut
  pdpfun    <- pdplutfun$fun
  # ... and return probability levels
  tdp <- tdpfun(td)
  pdp <- pdpfun(pd)
  # compute weighted standard deviations. Those for td and pd will be used to
  # estimate weighted averages and standard deviations
  sdev <- vfsd(vf, td, pd)
  if(method == "smooth") { # smooth out standard deviations
    sdev$vf <- vfsmooth(sdev$vf)
    sdev$td <- vfsmooth(sdev$td)
    sdev$pd <- vfsmooth(sdev$pd)
  }
  # define global indices
  glfun <- gdef(agem, sdev$td, sdev$pd)
  # ... and return them
  gl <- glfun(vf, td, pd, tdp, pdp, ghfun(td))
  # define mapping functions to get probability levels for global indices
  gplutfun <- lutgdef(gl, probs, "quantile")
  luts$g   <- gplutfun$lut
  glpfun   <- gplutfun$fun
  # return the list defining the normative values
  return(list(info = info, agem = agem, sd = sdev, luts = luts,
              tdfun = tdfun, tdpfun = tdpfun, pdfun = pdfun, pdpfun = pdpfun,
              ghfun = ghfun, glfun = glfun, glpfun = glpfun))
}
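
# Illustrative sketch (kept as comments so it does not run at package load):
# the functions stored in the returned normative values can be chained in the
# same order nvgenerate uses internally, e.g. with the SUNY-IU control dataset
# nv  <- nvgenerate(vfctrSunyiu24d2, method = "smooth")
# td  <- nv$tdfun(vfctrSunyiu24d2)  # total deviation values
# pd  <- nv$pdfun(td)               # pattern deviation values
# tdp <- nv$tdpfun(td)              # TD probability levels
# pdp <- nv$pdpfun(pd)              # PD probability levels
# gl  <- nv$glfun(vfctrSunyiu24d2, td, pd, tdp, pdp, nv$ghfun(td))
# glp <- nv$glpfun(gl)              # probability levels for global indices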

#' @rdname nv
#' @param vf visual field data with sensitivity values
#' @return \code{agelm} returns a list with coefficients and a function defining
#' a linear age model
#' @export
agelm <- function(vf) {
  # get weights so that each subject is counted the same as the rest
  counts <- table(vf$id)
  w <- as.numeric(1 / rep(counts, counts)) / length(unique(vf$id))
  # get age and sensitivity values
  age <- vf$age                    # get age
  vf <- vf[,.vfenv$locini:ncol(vf)] # get sensitivities
  coeff <- sapply(as.list(vf), function(y) lm(y ~ age, weights = w)$coefficients)
  if(length(getlocmap()$bs) > 0) # for locations near the blind spot
    coeff[,getlocmap()$bs] <- NA # slope and intercept must be undefined
  model <- as.function(alist(age = , cbind(rep(1, length(age)), age) %*% coeff))
  agem <- list(coeff = as.data.frame(t(coeff)), model = model)
  names(agem$coeff) <- c("intercept", "slope")
  return(agem)
}
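
# Illustrative sketch (comments only): the age model returned by agelm,
# using the SUNY-IU control dataset from the examples above
# agem <- agelm(vfctrSunyiu24d2)
# head(agem$coeff)       # per-location intercepts and slopes
# agem$model(c(50, 70))  # age-corrected mean normal sensitivities at ages 50 and 70,
#                        # one column per location in the location map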

#' @rdname nv
#' @param agem age model to construct the function to obtain TD values
#' @return \code{tddef} returns a function for the computation of TD values
#' @export
tddef <- function(agem)
  return(as.function(alist(vf = , {
    vf[,getvfcols()] <- vf[,getvfcols()] - agem$model(vf$age)
    return(vf)
  })))
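
# Illustrative sketch (comments only): TD values are the sensitivity values
# minus the age-corrected mean normal given by the age model
# tdfun <- tddef(agelm(vfctrSunyiu24d2))
# td    <- tdfun(vfctrSunyiu24d2)  # same table as vf, sensitivities replaced by TD values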

#' @rdname nv
#' @param perc the percentile to obtain the ranked TD value as
#' reference for the general height (GH) of the visual field.
#' Default is the 85th percentile, thus \code{0.85}
#' @return \code{ghdef} returns a function for the computation of the general height
#' @export
ghdef <- function(perc = 0.85)
  as.function(alist(td = , {
    # get the TD values and remove all other information columns
    td <- td[,getvfcols()]
    # columns with all NA values are removed as they must be those
    # too close to the blind spot for analysis
    td <- td[,apply(!is.na(td), 2, all)]
    pos <- floor((1 - perc) * ncol(td)) # which sorted location to return
    ### CARE: This works for the 24-2, where pos is 7, and for the 10-2, where
    ### pos is 10, but for the 30-2 it returns location 11, not 10. This can be
    ### fixed with perc = 0.86
    return(as.numeric(apply(td, 1, sort, decreasing = TRUE)[pos,]))
  }))
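
# Illustrative sketch (comments only), reusing td from the sketch above: the
# general height is one value per visual field (row), taken from the ranked TD values
# ghfun <- ghdef(0.85)
# gh    <- ghfun(td)  # numeric vector with one GH estimate per field in td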

#' @rdname nv
#' @param ghfun function used for determination of the GH and PD values
#' @return \code{pddef} returns a function for the computation of PD values
#' @export
pddef <- function(ghfun = ghdef(0.85))
  return(as.function(alist(td = , {
    td[,getvfcols()] <- td[,getvfcols()] - ghfun(td)
    return(td)
  })))
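
# Illustrative sketch (comments only), reusing td from the sketches above:
# PD values are the TD values minus the general height of each field
# pdfun <- pddef(ghdef(0.85))
# pd    <- pdfun(td)  # same table as td, TD values replaced by PD values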

#' @rdname nv
#' @param type type of estimation for the weighted quantile values. See
#'   \code{\link{wtd.quantile}} for details. Default is \code{quantile}
#' @param ... arguments to be passed to or from methods
#' @return \code{lutdef} returns a look up table and a function for the
#' computation of the probability values for TD and PD
#' @export
lutdef <- function(vf, probs, type = "quantile", ...) {
  counts <- table(vf$id)
  w <- as.numeric(1 / rep(counts, counts))
  vf <- vf[,getvfcols()] # remove info fields
  nacols <- which(apply(is.na(vf), 2, all)) # awkward patch because wtd.quantile
  vf[,nacols] <- 0                          # does not tolerate columns with NA
  # probability level look up table
  lut <- apply(vf, 2, wtd.quantile, type = type, na.rm = TRUE,
               weights = w, normwt = FALSE, probs = probs, ...)
  row.names(lut) <- as.character(probs)
  lut[,nacols] <- as.numeric(NA) # put back NAs in blind spot locations
  lut[1,] <- -Inf # theoretical min for quantile 0 is minus infinity
  lut[nrow(lut),] <- +Inf # theoretical max for quantile 1 is plus infinity
  fun <- as.function(alist(vf = , {
    lev <- probs # levels
    vals  <- vf[,getvfcols()]
    valsp <- as.data.frame(matrix(as.numeric(NA), nrow(vals), ncol(vals)))
    names(valsp) <- names(vals)
    for(i in nrow(lut):2) {
      co <- matrix(rep(lut[i,], nrow(vals)), nrow(vals), ncol(vals), byrow = TRUE)
      valsp[vals < co] <- lev[i]
    }
    vf[,getvfcols()] <- valsp
    return(vf)
  }))
  return(list(lut = as.data.frame(lut), fun = fun))
}
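
# Illustrative sketch (comments only), reusing td and pd from the sketches above:
# lutdef returns the lookup table and the mapping function, as used inside nvgenerate
# probs <- c(0, 0.005, 0.01, 0.02, 0.05, 0.95, 0.98, 0.99, 0.995, 1)
# tdlut <- lutdef(td, probs, "quantile")
# tdlut$lut              # weighted quantiles of TD values at each location
# tdp   <- tdlut$fun(td) # probability level assigned to each TD value
# pdp   <- lutdef(pd, probs, "quantile")$fun(pd) # same for PD values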

#' @rdname nv
#' @param sdtd standard deviations obtained for TD values
#' @param sdpd standard deviations obtained for PD values
#' @return \code{gdef} returns a function to compute global indices
#' @export
gdef <- function(agem, sdtd, sdpd) {
  coord <- getlocmap()$coord
  bs    <- getlocmap()$bs
  if(length(bs) > 0) { # remove blind spot
    sdtd  <- sdtd[-bs]
    sdpd  <- sdpd[-bs]
    coord <- coord[-bs,]
  }
  wtd <- 1 / sdtd
  wpd <- 1 / sdpd
  fun <- as.function(alist(vf = , td = , pd = , tdp = , pdp = , gh = , {
    vfinfo <- vf[,1:(getlocini() - 1)] # keep key to add to the g table later on
    vf     <- vf[,getvfcols()] # remove info columns
    td     <- td[,getvfcols()]
    pd     <- pd[,getvfcols()]
    tdp    <- tdp[,getvfcols()]
    pdp    <- pdp[,getvfcols()]
    mnsens <- agem$model(vfinfo$age)
    gh     <- gh # silly patch so that R CMD check does not complain about gh
    if(length(bs) > 0) { # remove blind spot
      vf     <- vf[,-bs]
      td     <- td[,-bs]
      pd     <- pd[,-bs]
      tdp    <- tdp[,-bs]
      pdp    <- pdp[,-bs]
      mnsens <- mnsens[,-bs]
    }
    msens <- apply(vf, 1, mean)
    ssens <- apply(vf, 1, sd)
    tmd   <- apply(td, 1, wtd.mean, weights = wtd / sum(wtd))
    tsd   <- sqrt(apply(td, 1, wtd.var, weights = wtd))
    pmd   <- apply(pd, 1, wtd.mean, weights = wpd)
    psd   <- sqrt(apply(pd, 1, wtd.var, weights = wpd))
    vfi   <- vfcomputevfi(coord, tmd, mnsens, vf, td, pd, tdp, pdp)
    g <- data.frame(msens = msens, ssens = ssens, tmd = tmd,
                    tsd = tsd, pmd = pmd, psd = psd, gh = gh, vfi = vfi)
    return(cbind(vfinfo, g))
  }))
  return(fun)
}
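
# Illustrative sketch (comments only), reusing td, pd, tdp and pdp from the
# sketches above: gdef builds the function that computes the table of global
# indices, which nvgenerate then passes on to lutgdef
# sdev  <- vfsd(vfctrSunyiu24d2, td, pd)
# glfun <- gdef(agelm(vfctrSunyiu24d2), sdev$td, sdev$pd)
# gl    <- glfun(vfctrSunyiu24d2, td, pd, tdp, pdp, ghdef(0.85)(td))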

#' @rdname nv
#' @param g a table with global indices
#' @return \code{lutgdef} returns a look up table and a function for the
#' computation of the probability values for global indices
#' @export
lutgdef <- function(g, probs, type = "quantile", ...) {
  # CARE: the g in the parent and in the child functions are different. The g
  # at the parent's level is used to compute the empirical quantiles. The
  # function returned takes another argument g, which can be a table of global
  # indices computed for any data of the same type (perimeter, location map,
  # etc).
  # Get weights so that each subject is counted the same as the rest
  counts <- table(g$id)
  w <- as.numeric(1 / rep(counts, counts))
  g <- g[,getlocini():ncol(g)] # remove key fields
  nacols <- which(apply(is.na(g), 2, all)) # awkward patch because wtd.quantile
  g[,nacols] <- 0                          # does not tolerate columns with NA
  # probability level look up table
  lut <- matrix(NA, length(probs), ncol(g))
  colnames(lut) <- names(g)
  idxm <- c(1, 3, 5, 7, 8)
  idxs <- c(2, 4, 6)
  lut[,idxm] <- apply(g[,idxm], 2, wtd.quantile,
                      type = type, na.rm = TRUE, weights = w,
                      normwt = FALSE, probs = probs, ...)
  lut[,idxs] <- apply(g[,idxs], 2, wtd.quantile,
                      type = type, na.rm = TRUE, weights = w,
                      normwt = FALSE, probs = 1 - probs, ...)
  rownames(lut) <- paste0(probs, "%")
  lut[,nacols] <- NA  # put back NAs in blind spot locations
  lut[1,idxm] <- -Inf # theoretical min for quantile 0 is minus infinity
  lut[nrow(lut),idxm] <- +Inf # theoretical max for quantile 1 is plus infinity
  lut[1,idxs] <- +Inf # quantiles for SD-like indices were computed with 1 - probs,
  lut[nrow(lut),idxs] <- -Inf # so their theoretical limits swap sign
  fun <- as.function(alist(g = , {
    lev <- probs # levels
    vfinfo <- g[,1:(getlocini() - 1)]
    g <- g[,getlocini():ncol(g)]
    gp <- as.data.frame(matrix(NA, nrow(g), ncol(g)))
    names(gp) <- names(g)
    # analysis for means (greater is better) is different than for SDs (greater is worse)
    gm   <- g[,idxm]
    lutm <- lut[,idxm]
    gpm  <- gp[,idxm]
    gs   <- g[,idxs]
    luts <- lut[,idxs]
    gps  <- gp[,idxs]
    for(i in nrow(lut):2) {
      com <- matrix(rep(lutm[i,], nrow(gm)), nrow(gm), ncol(gm), byrow = TRUE)
      cos <- matrix(rep(luts[i,], nrow(gs)), nrow(gs), ncol(gs), byrow = TRUE)
      gpm[gm < com] <- lev[i]
      gps[gs > cos] <- lev[i]
    }
    gp[,idxm] <- gpm
    gp[,idxs] <- gps
    return(cbind(vfinfo, gp))
  }))
  return(list(lut = as.data.frame(lut), fun = fun))
}
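
# Illustrative sketch (comments only), reusing gl and probs from the sketches
# above: mapping global indices to probability levels
# gplut <- lutgdef(gl, probs, "quantile")
# gplut$lut              # quantiles per global index (reversed for SD-like indices)
# glp   <- gplut$fun(gl) # probability level for each global index value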

###################################################################################
# INTERNAL FUNCTIONS: routines to ease load on code generating normative values
###################################################################################
#' @noRd
vfsmooth <- function(vals) {
  x <- getlocmap()$coord$x
  y <- getlocmap()$coord$y
  vals[!is.na(vals)] <- predict(lm(vals ~ x + y + I(x^2) + I(y^2)))
  return(vals)
}
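
# Note: the smoothing above fits the 2D quadratic surface
#   vals ~ b0 + b1 * x + b2 * y + b3 * x^2 + b4 * y^2
# over the location map coordinates; locations with NA values (e.g. near the
# blind spot) are excluded from the fit and left as NA.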

#' @noRd
vfsd <- function(vf, td, pd) {
  counts <- table(vf$id)
  w <- as.numeric(1 / rep(counts, counts))
  vf <- vf[,getvfcols()] # remove info fields
  td <- td[,getvfcols()]
  pd <- pd[,getvfcols()]
  # get blind spot
  bs <- getlocmap()$bs
  # blindspot columns to zero, so wtd.var does not crash
  if(length(bs) > 0) vf[, bs] <- td[, bs] <- pd[, bs] <- 0
  vfsd <- sqrt(apply(vf, 2, wtd.var, weights = w, normwt = FALSE))
  tdsd <- sqrt(apply(td, 2, wtd.var, weights = w, normwt = FALSE))
  pdsd <- sqrt(apply(pd, 2, wtd.var, weights = w, normwt = FALSE))
  # blindspot columns must not have standard deviations
  if(length(bs) > 0) vfsd[bs] <- tdsd[bs] <- pdsd[bs] <- NA
  return(data.frame(vf = vfsd, td = tdsd, pd = pdsd))
}
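
# Illustrative sketch (comments only), reusing td and pd from the sketches above:
# vfsd returns one weighted standard deviation per location for sensitivity,
# TD, and PD values
# sdev <- vfsd(vfctrSunyiu24d2, td, pd)
# head(sdev)  # data.frame with columns vf, td, and pd; blind spot locations are NA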

#' @noRd
vfcomputevfi <- function(coord, tmd, mnsens, vf, td, pd, tdp, pdp) {
  d <- sqrt(apply(coord^2, 1, sum))
  w <- 1 / (0.08 * (d + 0.8)) # parameters used for observer J.M. in Figure 14,
  # which seems most consistent with Levi et al, VR 1985 (see page 975). It is
  # also possible that other parameters were used, but I cannot figure it out.
  # Here is the logic that led me to the current formulation to calculate the
  # weights for the VFI.
  #
  # w0 <- NA
  # w0[d < 6] <- 3.29
  # w0[d > 6 & d < 12] <- 1.28
  # w0[d > 12 & d < 18] <- 0.79
  # w0[d > 18 & d < 24] <- 0.57
  # w0[d > 24 & d < 30] <- 0.45
  # plot(coord$x,coord$y, typ = "n")
  # text(coord$x,coord$y, w0)
  # lm((1 / w0) ~ d) # gives intercept 0.02 and slope 0.08
  # 0.02 / 0.08 # it returns 0.25, which is not the parameter for
  #             # cortical processing X = 2.5 proposed by Levi et al
  #             # The other parameter X = 0.8 for retinal processing
  #             # is closer and is the parameter they should have
  #             # chosen for the experiments which results are shown
  #             # in their Figure 14
  # plot(d, 1 / w0, xlim = c(0, 30), ylim = c(0, 2.5))
  # dlm <- c(0,30)
  # lines(dlm, 0.08 * (dlm + 0.8), col = "red")
  # 
  # References:
  # D. M. Levi, S. A. Klein, and A. P. Aitsebaomo. Vernier acuity,
  # crowding and cortical magnification. Vision Research,
  # 25(7):963–977, 1985
  # 
  # B. Bengtsson and A. Heijl. A visual field index for calculation
  # of glaucoma rate of progression. American Journal of Ophthalmology,
  # 145(2):343–353, 2008
  
  # Now that we have our weights, even if different from Bengtsson and Heijl,
  # we can compute the VFI
  vfiloc <- 100 * (1 - abs(td) / mnsens)
  # if within normal limits, then score is 100
  # if MD is greater than -20 dB, look at PD probability levels
  usePD <- matrix(rep(tmd >= -20, ncol(tdp)), nrow(tdp), ncol(tdp))
  vfiloc[usePD & pdp > 0.05] <- 100
  # else, look at TD probability levels
  vfiloc[!usePD & tdp > 0.05] <- 100
  # non-seen locations have a score of 0
  vfiloc[vf < 0] <- 0
  return(apply(vfiloc, 1, wtd.mean, weights = w)) # return weighted mean of scores
}
