R/ranki.R
In rYWAASB: Simultaneous Selection by Trait and WAASB Index

#' @name ranki
#' @title The values and ranks of genotypes
#' @author {
#' Ali Arminian <abeyran@gmail.com>
#' }
#' @description
#' `r lifecycle::badge("stable")`
#'
#' `ranki()` ranks genotypes (or entries) with a new index that
#' combines the given trait and the "WAASB" index in order to
#' simultaneously select the top-ranked ones. The result can be
#' compared with the WAASBY index of Olivoto et al. (2019). Users
#' should handle missing data in the inputs before running the
#' analysis, because the ranking code does not implement a
#' general-purpose algorithm for this task; it only replaces
#' missing numeric values with the column mean.
#'
#' WAASB (Weighted Average of Absolute Scores; Olivoto et al.,
#' 2019) quantifies the stability of *g* genotypes evaluated in
#' *e* environments using linear mixed-effect models.
#'
#' @details
#' According to Olivoto et al. (2019), WAASB (the weighted average of
#' absolute scores) is computed from all Interaction Principal
#' Component Axes (IPCA) of the Singular Value Decomposition (SVD)
#' of the matrix of genotype-environment interaction (GEI) effects
#' generated by a linear mixed-effect model, as follows:
#'
#' \loadmathjax
#' \mjsdeqn{
#' WAASB_i = \sum_{k = 1}^{p} |IPCA_{ik} \times EP_k|/
#' \sum_{k = 1}^{p}EP_k}
#'
#' where \mjseqn{WAASB_i} is the weighted average of absolute scores
#' of the *i*th genotype; \mjseqn{IPCA_{ik}} is the score of the *i*th
#' genotype in the *k*th Interaction Principal Component Axis (IPCA);
#' and \mjseqn{EP_k} is the explained variance of the *k*th IPCA for
#' \mjseqn{k = 1, 2, \ldots, p}, with \mjseqn{p = \min(g - 1, e - 1)}.
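#'
#' As a minimal sketch of this formula, WAASB can be computed in base R
#' as shown below. The IPCA scores and explained variances are made-up
#' values for three hypothetical genotypes, not the output of a fitted
#' mixed model.
#'
#' ```r
#' # Hypothetical IPCA scores: 3 genotypes (rows) by 2 axes (columns).
#' ipca <- matrix(c( 0.5, -0.2,
#'                  -1.0,  0.4,
#'                   0.5, -0.2),
#'                nrow = 3, byrow = TRUE,
#'                dimnames = list(c("G1", "G2", "G3"), c("IPCA1", "IPCA2")))
#' # Hypothetical explained variance of each axis.
#' ep <- c(80, 20)
#'
#' # Multiply each column k by EP_k, take absolute values, sum over the
#' # axes, and divide by the total explained variance.
#' waasb <- rowSums(abs(sweep(ipca, 2, ep, "*"))) / sum(ep)
#' waasb
#' ```
#'
#' Because `ranki()` ranks WAASB in ascending order, the genotypes with
#' the smallest value in this sketch would receive the best stability rank.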
#'
#' Further, \mjseqn{WAASBY_i} is a superiority or simultaneous
#' selection index that allows weighting between mean performance
#' and stability:
#'
#' \mjsdeqn{
#' WAASBY_i=\frac{\left({rY}_i\times\theta_Y\right)+
#' \left({rW}_i\times\theta_s\right)}{\theta_Y+\theta_s}
#' }
#'
#' where \mjseqn{WAASBY_i} is the superiority index for genotype
#' *i* that weights between mean performance and stability;
#' \mjseqn{\theta_Y} and \mjseqn{\theta_s} are the weights for
#' mean performance and stability, respectively; and \mjseqn{{rY}_i} and
#' \mjseqn{{rW}_i} are the rescaled values of mean performance
#' \mjseqn{\bar{Y}_i} and stability \mjseqn{W_i}, respectively, of
#' genotype *i*. For details of the calculations, rescaling,
#' and mathematical notation, see Olivoto et al. (2019).
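#'
#' The weighting itself can be illustrated with a short sketch. The
#' rescaled values (assumed here to lie on a 0-100 scale) and the 65/35
#' weights are hypothetical and chosen only for illustration; the
#' rescaling step is not reproduced.
#'
#' ```r
#' # Hypothetical rescaled values for three genotypes.
#' rY <- c(G1 = 100, G2 = 40, G3 = 0)   # mean performance
#' rW <- c(G1 = 0, G2 = 100, G3 = 50)   # stability
#' # Hypothetical weights for mean performance and stability.
#' theta_Y <- 65
#' theta_s <- 35
#'
#' # Weighted average of the two rescaled values.
#' waasby <- (rY * theta_Y + rW * theta_s) / (theta_Y + theta_s)
#' waasby
#' ```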
#'
#' Finally, the \mjseqn{rYWAASB_i} index is based on the sum of the
#' ranks of the trait (\mjseqn{rY_i}) and of the WAASB index
#' (\mjseqn{rWAASB_i}) for each individual; here the prefix *r*
#' denotes a rank rather than a rescaled value:
#'
#' \mjsdeqn{
#' rY_i + rWAASB_i
#' }
#'
#' The reported index is the rank of this sum:
#'
#' \mjsdeqn{
#' rYWAASB_i = \mathrm{rank}\left(rY_i + rWAASB_i\right)
#' }
#'
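#' As a sketch of the rank-sum step, the lines below use base R `rank()`
#' on made-up values for four hypothetical genotypes, assuming that
#' higher trait values are preferred. The object names are illustrative
#' and are not the column names returned by `ranki()`.
#'
#' ```r
#' y     <- c(A = 10, B = 8, C = 9, D = 7)          # hypothetical trait means
#' waasb <- c(A = 0.3, B = 0.2, C = 0.9, D = 0.5)   # hypothetical WAASB values
#'
#' r_y      <- rank(-y, ties.method = "average")    # rank 1 = highest trait value
#' r_waasb  <- rank(waasb, ties.method = "average") # rank 1 = lowest WAASB
#' r_sum    <- r_y + r_waasb                        # sum of the two ranks
#' r_ywaasb <- rank(r_sum, ties.method = "average") # rank of the rank sum
#' r_ywaasb
#' ```
#'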
#' The input data table should be free of missing values (NA) and
#' formatted as follows (shown here for the *maize* data):
#'
#' \tabular{lrrr}{
#' **GEN** \tab **Y** \tab **WAASB** \tab **WAASBY**\cr
#'  Dracma \tab 262.22 \tab 0.81 \tab 81.6\cr
#'  DKC6630 \tab 284.04 \tab 2.20 \tab 88.5\cr
#'  NS770 \tab 243.48 \tab 0.33 \tab 71.4\cr
#'  ...
#' }
#'
#' @param datap A data frame with the columns `GEN` (genotype), `Y`
#' (trait), `WAASB`, and `WAASBY`, in that order.
#' @param lowt Logical. If `TRUE`, lower values of the trait are
#' preferred (e.g. plant height); if `FALSE` (the default), higher
#' values are preferred (e.g. grain yield).
#' @references
#' Olivoto, T., Lúcio, A.D.C., da Silva, J.A.G., Sari, B.G.,
#' and Diel, M. 2019. Mean performance and stability in
#' multi-environment trials II: Selection based on multiple
#' traits. Agronomy Journal, 111(6):2961-2969.
#'
#' Olivoto, T., and Lúcio, A.D.C. 2020. metan: An R package for
#' multi-environment trial analysis. Methods in Ecology and
#' Evolution, 11(6):783-789.
#'
#' Kang, M.S. 1988. “A Rank-Sum Method for Selecting High-Yielding,
#' Stable Corn Genotypes.” Cereal Research Communications 16: 113–15.
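#' @return A data frame containing the input columns plus the ranks
#' `rY`, `rWAASB`, and `rWAASBY`, the rank sum `rY+rWAASB`, and the
#' final rank `rYWAASB`.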
#' @usage ranki(datap, lowt = FALSE)
#' @examples
#' # Case 1: Higher trait values are preferred. For instance, higher grain
#' # yield is preferred in cereals, so genotypes are ranked from the highest
#' # to the lowest value (1st, 2nd, 3rd, etc.) in the maize dataset.
#' \donttest{
#' data(maize)
#' ranki(maize) # or: ranki(maize, lowt = FALSE)
#' }
#' @examples
#' # Case 2: Lower values of the given trait are preferred. For instance,
#' # lower values of days to maturity (dm) and plant height are preferred.
#' \donttest{
#' data(dm)
#' ranki(dm, lowt = TRUE)
#' }
#' @export

ranki <- function(datap, lowt = FALSE)
{
  datap <- data.frame(datap)

  # Replace missing values in numeric columns with the column mean.
  for (i in seq_along(datap)) {
    if (is.numeric(datap[[i]])) {
      datap[[i]][is.na(datap[[i]])] <- mean(datap[[i]], na.rm = TRUE)
      datap[[i]] <- as.numeric(datap[[i]])
    }
  }

  # Keep the input order of the genotypes as the factor levels.
  datap$GEN <- factor(datap$GEN, levels = datap$GEN)
  if (lowt) {
    datap$rY <- rank(datap$Y, na.last = NA, ties.method = "average")
    datap$rWAASBY <- rank(datap$WAASBY, na.last = NA, ties.method = "average")
  } else {
    datap$rY <- rank(-datap$Y, na.last = NA, ties.method = "average")
    datap$rWAASBY <- rank(-datap$WAASBY, na.last = NA, ties.method = "average")
  }
  datap$rWAASB <- rank(datap$WAASB, na.last = NA, ties.method = "average")
  datap$"rY+rWAASB" <- datap$rY +  datap$rWAASB
  datap$rYWAASB <- rank(datap$"rY+rWAASB", na.last = NA, ties.method = "average")

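  # The rank columns are created in the order rY, rWAASBY, rWAASB, so select
  # them by name before relabelling; assigning names by position would swap
  # the rWAASB and rWAASBY labels.
  datap <- datap[, c("GEN", "Y", "WAASB", "WAASBY", "rY", "rWAASB", "rWAASBY", "rY+rWAASB", "rYWAASB")]
  colnames(datap)[2] <- "Y=Trait"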

  # `datap` is already a data frame, so no class assignment is needed.
  return(datap)
}