sequenceCheck: Sequence Check Function

View source: R/sequenceCheck.R

sequenceCheck R Documentation

Sequence Check Function

Description

This function is used to validate a sequence of amino acids. It can additionally be used to load an amino acid sequence from a file or to coerce a sequence into a specific format.

Usage

sequenceCheck(
  sequence,
  method = "stop",
  outputType = "string",
  nonstandardResidues = NA,
  suppressAAWarning = FALSE,
  suppressOutputMessage = FALSE
)

Arguments

sequence

Amino acid sequence as a single character string, a vector of single characters, or an AAString object. A single character string that specifies the path to a .fasta or .fa file is also supported.
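For orientation, a single-record FASTA file holds a header line beginning with ">" followed by one or more lines of residues. The base R sketch below writes and reads such a file. It is not idpr's loading code; the record contents and the names fastaPath and aaFromFile are made up for illustration. In practice, the path itself would be passed as sequence.

```r
# Base R sketch only; idpr's own file handling may differ.
# Write a small, made-up FASTA record to a temporary file.
fastaPath <- tempfile(fileext = ".fasta")
writeLines(c(">example_protein (made-up record)",
             "ACDEFGHIKL",
             "MNPQRSTVWY"),
           fastaPath)

# Read the file, drop header lines (those starting with ">"),
# and join the remaining residue lines into a single string.
fastaLines <- readLines(fastaPath)
residueLines <- fastaLines[!startsWith(fastaLines, ">")]
aaFromFile <- paste(residueLines, collapse = "")
aaFromFile
```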

method

Required setting. method = c("stop", "warn"); "stop" by default. "stop" reports invalid residues as an error and prevents the function from continuing. "warn" reports invalid residues through a warning. With either setting, any invalid sequences will be reported.
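The difference between the two settings comes down to R's error and warning conditions. The sketch below shows this with a hypothetical helper, reportInvalid, written in base R. It is not idpr's implementation, and the message text is invented; it only illustrates how an error halts a call while a warning lets it finish.

```r
# Hypothetical helper, not part of idpr: signals invalid residues either
# as an error ("stop") or as a warning ("warn").
reportInvalid <- function(sequence, method = c("stop", "warn")) {
  method <- match.arg(method)
  residues <- strsplit(sequence, "")[[1]]
  standard <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
  invalid <- unique(residues[!residues %in% standard])
  if (length(invalid) > 0) {
    msg <- paste("Invalid residues:", paste(invalid, collapse = ", "))
    if (method == "stop") stop(msg, call. = FALSE)
    warning(msg, call. = FALSE)
  }
  invisible(sequence)
}

# "stop": the error interrupts the call unless it is caught.
stopResult <- tryCatch(reportInvalid("ACDB", method = "stop"),
                       error = function(e) conditionMessage(e))

# "warn": the warning is raised (muffled here), but the call still completes
# and its value is available afterwards.
warnResult <- withCallingHandlers(
  reportInvalid("ACDB", method = "warn"),
  warning = function(w) invokeRestart("muffleWarning")
)
```

Wrapping a call in tryCatch, as above, is one way to keep a batch of sequences running when method = "stop" is in effect.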

outputType

Required setting. outputType = c("string", "vector", "none"); "string" by default. "string" returns the sequence as a single string of amino acids. "vector" returns the sequence as a vector of individual characters. "none" prevents the function from returning a sequence.
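To make the two output formats concrete, the base R sketch below converts between a single string and a vector of single characters. It only illustrates the shape of each format; how sequenceCheck performs the coercion internally is not shown here, and the variable names are illustrative.

```r
# A sequence in "string" form: one character string.
aaString <- "ACDEFGHIKL"

# The same sequence in "vector" form: one element per residue.
aaVector <- strsplit(aaString, "")[[1]]

# Collapsing the vector gives back the "string" form.
backToString <- paste(aaVector, collapse = "")

length(aaVector)
nchar(backToString)
```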

nonstandardResidues

Optional setting. Expands the amino acid alphabet. NA or a character vector is required. The default alphabet is "ACDEFGHIKLMNPQRSTVWY"; additional letters are added here. For example, nonstandardResidues = c("O", "U") allows pyrrolysine (O) and selenocysteine (U).
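A minimal sketch of what expanding the alphabet means, in base R. The helper allowedAlphabet is hypothetical and is not taken from idpr; it assumes that NA means "standard residues only" and that each extra residue is supplied as its own one-letter element.

```r
# The 20 standard residues, one letter per element.
standardResidues <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Hypothetical helper, not part of idpr: build the accepted alphabet.
allowedAlphabet <- function(nonstandardResidues = NA) {
  if (all(is.na(nonstandardResidues))) {
    return(standardResidues)
  }
  union(standardResidues, nonstandardResidues)
}

testSequence <- strsplit("ACDOU", "")[[1]]

# With the default alphabet, O and U are flagged as invalid.
invalidDefault <- setdiff(testSequence, allowedAlphabet())

# With O and U added, nothing is flagged.
invalidExpanded <- setdiff(testSequence, allowedAlphabet(c("O", "U")))
```

Note that c("O", "U") is a vector of two one-letter elements, whereas c("O,U") would be a single three-character element and would not match either residue in a comparison like this one.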

suppressAAWarning

If nonstandardResidues is used, a warning will be issued. Set suppressAAWarning = T to confirm the addition of non-standard residues and suppress this warning.

suppressOutputMessage

Set suppressOutputMessage = T to prevent the sequence validity message from being printed.

Value

A message and a sequence are returned. If suppressOutputMessage = T, the message is not returned. If outputType = "none", the sequence is not returned. Otherwise, outputType determines the format of the returned sequence. If the sequence contains an error, it is reported based on the value of method. The sequence will be assigned to the value "Sequence" if sequenceName is not specified. Otherwise, the sequence is assigned to the value of sequenceName. This allows the sequences to be called by the user.

Examples

#Amino acid sequences can be character strings
aaString <- "ACDEFGHIKLMNPQRSTVWY"
#Amino acid sequences can also be character vectors
aaVector <- c("A", "C", "D", "E", "F",
           "G", "H", "I", "K", "L",
           "M", "N", "P", "Q", "R",
           "S", "T", "V", "W", "Y")
#Alternatively, .fasta files can be used by providing
#the path to the file as a character string
## Not run: 
sequenceCheck(aaString)
sequenceCheck(aaVector)


#To allow O and U
sequenceCheck(aaString,
           nonstandardResidues = c("O", "U"),
           suppressAAWarning = TRUE)

#To turn off output message
sequenceCheck(aaString,
           suppressOutputMessage = TRUE)

#To change string to be a vector
sequenceCheck(aaString,
           outputType = "vector")

#To not return a sequence but check the input
sequenceCheck(aaVector,
            outputType = "none")

## End(Not run)


wmm27/idpr documentation built on Jan. 12, 2023, 8:45 a.m.