R/psea2mass.R

#' Translate PSEA results for Mass Spectrometry searching tools
#'
#' @description This function translates protein set enrichment analysis results
#'   and extracts the required information for mass spectrometry searching tools.
#'   The subset of protein modifications is from
#'   <https://raw.githubusercontent.com/HUPO-PSI/psi-mod-CV/master/PSI-MOD.obo>.
#'
#' @param x A list of PSEA results generated by the `runPSEA` function.
#' @param sig.level The significance level used to filter PTMs (applied to the
#'   adjusted p-value). The default value is 0.05.
#' @param number.rep Only consider PTM terms that occurred at least a given
#'   number of times in UniProt, as set by `number.rep`. The default value is
#'   NULL, which skips this filter.
#'
#' @return A data frame of the subset of protein modifications:
#' - MOD_ID: the unique PSI-MOD identifier of each protein modification.
#' - name: the name of the modification.
#' - def: the PSI-MOD definition of the modification.
#' - FreqinSample: the frequency of the corresponding PTM in the sample.
#'
#' @export
#'
#' @import stringr
#' @import dplyr
#'
#' @examples
#' # We recommend at least nperm = 1000.
#' # The number of permutations was reduced to 10
#' # to accommodate CRAN policy on examples (run time <= 5 seconds).
#' psea_res <- runPSEA(protein = exmplData2, os.name = 'Rattus norvegicus (Rat)', nperm = 10)
#' MS <- psea2mass(x = psea_res, sig.level = 0.05)
psea2mass <- function(x, sig.level = 0.05, number.rep = NULL) {

  temp    <- x[[1]] %>% filter( (nMoreExtreme / x[[6]]) <= sig.level )

  if( !is.null(number.rep) ){
    temp <- temp %>% filter(FreqinSample >= number.rep)
  }

  # Stop early if nothing survives the sig.level and number.rep filters
  if (nrow(temp) == 0) {
    stop('No PTMs pass the sig.level and number.rep filters.')
  }

  pathway <- data.frame( PTM = as.character(temp$PTM), FreqinSample = temp$FreqinSample )

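  # For each retained PTM, collect the matching rows of `ptmlist`
  res <- list()
  for (i in seq_len(nrow(pathway))) {

    # Escape regex metacharacters so the PTM name is matched literally
    escaped_pattern <- str_escape(pathway[i, "PTM"])
    indx <- which(str_detect(string = ptmlist$ID, pattern = escaped_pattern))

    if (length(indx) != 0) {
      res[[i]] <- cbind(ptmlist[indx, ], pathway[i, "FreqinSample"])
    }

  }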


  res <- do.call(rbind.data.frame, res)
  colnames(res) <- c('ID', 'AC', 'KW', 'FT', 'MOD_ID', 'FreqinSample')
  res <- res %>% arrange(desc(FreqinSample))

  MOD_ID      <- res$MOD_ID
  pseaMS      <- mod_ont %>% filter(id %in% MOD_ID)
  pseaMS$def  <- str_replace(string = pseaMS$def, pattern = "A protein modification that effectively ", replacement = "")
  colnames(pseaMS)[1] <- 'MOD_ID'

  pseaMS <- pseaMS %>%
    left_join(res, 'MOD_ID') %>%
    select(c('MOD_ID', 'name', 'def', 'FreqinSample')) %>%
    arrange(desc(FreqinSample))

  return(pseaMS)

  # Source: https://raw.githubusercontent.com/HUPO-PSI/psi-mod-CV/master/PSI-MOD.obo

}

PEIMAN2 documentation built on April 11, 2025, 6:12 p.m.