R/import_agvgdweb.R

#' Read in AGVGD Web Results
#'
#' This function imports into R the results generated by the AGVGD Web
#' application <http://agvgd.hci.utah.edu/>.
#'
#' @param file A file path to the results output by the AGVGD Web app.
#' @param alignment A character matrix or an alignment object obtained with
#'   [read_alignment()]. Rows are expected to be sequences of single characters
#'   (protein residues), and columns the alignment positions. The first row must
#'   be the reference sequence, i.e. the sequence against which substitutions
#'   are evaluated. This parameter can be left `NULL`, in which case the column
#'   `poi` in the output is `NA`; if supplied, `poi` is filled in.
#'
#' @return A [tibble][tibble::tibble-package] of seven columns:
#' \describe{
#'   \item{res}{Position of the amino acid residue in the reference protein
#'   (first sequence in the alignment). This position corresponds to `poi` minus
#'   the gaps in the alignment.}
#'   \item{poi}{Position of interest, i.e. the alignment position at which the
#'   amino acid substitution is being assessed. Because this information is not
#'   provided by the AGVGD Web app, this column is `NA` unless `alignment` is
#'   supplied; the column is kept for coherence with the output of `agvgd()`.}
#'   \item{ref}{Reference amino acid, i.e. the amino acid in the first sequence
#'   of the alignment, at the position of interest.}
#'   \item{sub}{Amino acid substitution being assessed.}
#'   \item{gv}{Grantham variation score.}
#'   \item{gd}{Grantham deviation score.}
#'   \item{prediction}{Predicted effect of the amino acid substitution. This is
#'   classed as C0, C15, C25, C35, C45, C55, or C65, with C65 most likely to
#'   interfere with function and C0 least likely.}
#' }
#'
#' @md
#' @importFrom rlang .data
#' @keywords internal
#' @export
read_agvgdweb_results <- function(file = stop('`file` is missing'), alignment = NULL) {

  df <- utils::read.delim(file)
  tbl <- tibble::as_tibble(df)

  tbl <-
    tbl %>%
    dplyr::mutate(parse_substitutions(.data$Substitution)) %>%
    dplyr::select(-'Substitution') %>%
    # Column names are given as strings: `.data` is deprecated in tidyselect
    # contexts such as rename() and relocate().
    dplyr::rename(gv = 'GV',
                  gd = 'GD',
                  prediction = 'Prediction') %>%
    dplyr::relocate('res',
                    'poi',
                    'ref',
                    'sub',
                    'gv',
                    'gd',
                    'prediction') %>%
    dplyr::mutate(prediction = stringr::str_remove(.data$prediction, '^Class '))

  # If an alignment is supplied then it is possible to determine `poi` from the
  # residue position `res`:
  if(!is.null(alignment)) {
    tbl$poi <- res_to_poi(alignment = alignment, res = tbl$res)
  }

  tbl
}
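
# Illustrative sketch only, not part of the package API. The documentation
# above states that `res` corresponds to `poi` minus the gaps in the
# alignment; the helper below shows one way the reverse mapping (from `res` to
# `poi`) could be computed. Assumptions: the alignment is a character matrix
# whose first row is the reference sequence, and gaps are encoded as '-'. The
# name `res_to_poi_sketch()` is hypothetical; the package's own `res_to_poi()`
# is defined elsewhere and may work differently.
res_to_poi_sketch <- function(alignment, res, gap = '-') {
  # Alignment columns at which the reference sequence holds a residue.
  non_gap_cols <- which(alignment[1, ] != gap)
  # The n-th residue of the reference sits at the n-th non-gap column;
  # residue positions beyond the reference length yield NA.
  non_gap_cols[res]
}
# Usage with a made-up two-sequence alignment:
#   aln <- matrix(c('M', '-', 'K', 'T',
#                   'M', 'A', 'K', 'S'), nrow = 2, byrow = TRUE)
#   res_to_poi_sketch(aln, res = c(1, 2, 3))
#   # returns 1 3 4 (the gap at alignment position 2 is skipped)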
maialab/agvgd documentation built on Jan. 10, 2024, 6:08 p.m.