signalp: predict signal peptides
In gogleva/SecretSanta: flexible pipelines for secretome prediction


This function calls the command line tool signalp to predict the presence and location of signal peptide cleavage sites in amino acid sequences.

Large input files (>500 sequences) are automatically split into smaller chunks so that signalp prediction can be run as an embarrassingly parallel process on the specified number of cores.
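The chunk-and-parallelise idea described above can be sketched in base R. This is only an illustration under stated assumptions, not SecretSanta's internal code: `split_into_chunks()` and `process_chunk()` are hypothetical helpers, and the chunk size of 500 simply mirrors the threshold mentioned above (the actual chunk size used by the package is not stated here).

```r
# Hypothetical helper (not part of SecretSanta): split a vector into
# consecutive chunks of at most `chunk_size` elements.
split_into_chunks <- function(x, chunk_size = 500) {
  if (length(x) == 0) return(list())
  groups <- ceiling(seq_along(x) / chunk_size)
  unname(split(x, groups))
}

# Stand-in for a per-chunk prediction call; here it only reports
# how many sequences the chunk holds.
process_chunk <- function(chunk) length(chunk)

seq_ids <- paste0("seq", 1:1200)
chunks <- split_into_chunks(seq_ids, chunk_size = 500)

# Each chunk is independent, so chunks can be processed in parallel.
# mclapply() forks on Unix-alikes; on Windows mc.cores must stay at 1.
n_cores <- 1
chunk_sizes <- unlist(parallel::mclapply(chunks, process_chunk,
                                         mc.cores = n_cores))
```

Because no chunk depends on another, the per-chunk results can be combined in order once all workers finish.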


signalp(input_obj, version, organism = c("euk", "gram+", "gram-"),
  run_mode = c("starter", "piper"), paths = NULL, truncate = NULL,
  cores = 1, sensitive = FALSE, legacy_method)
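The usage above lists the accepted values of `organism` and `run_mode` as character vectors. In base R this idiom is commonly paired with `match.arg()`, which picks the first element when the argument is omitted and rejects values outside the list. Whether `signalp()` resolves its arguments this way internally is an assumption; `pick_organism()` below is a hypothetical function used only to show the idiom.

```r
# Hypothetical function illustrating the match.arg() idiom;
# it is not part of SecretSanta.
pick_organism <- function(organism = c("euk", "gram+", "gram-")) {
  match.arg(organism)
}

# Omitted argument: the first listed choice is used.
default_choice <- pick_organism()

# Explicit, valid choice is returned unchanged.
explicit_choice <- pick_organism("gram-")

# A value outside the list raises an error, caught here for illustration.
invalid_choice <- tryCatch(pick_organism("archaea"),
                           error = function(e) "rejected")
```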

`input_obj`	an instance of class `CBSResult` containing protein sequences as one of the attributes
`version`	version of `signalp`, supported versions include: `signalp2`, `signalp3`, `signalp4`.
`organism`	a character string with the following options: `organism = "euk"` - for eukaryotes; `organism = "gram+"` - for gram-positive bacteria; `organism = "gram-"` - for gram-negative bacteria
`run_mode`	a character string with the following options: `run_mode = "starter"` - if it is the first step in the pipeline; `run_mode = "piper"` - if you run this function on the output of other CBS tools
`paths`	if the required version of `signalp` is not accessible globally, a file containing the full path to its executable should be provided; for details please check the SecretSanta vignette.
`truncate`	a logical indicating: `truncate = TRUE` - sequences longer than 2000 residues will be truncated to this length limit and renamed; `truncate = FALSE` - long sequences will be excluded from the analysis. Default is `truncate = TRUE`.
`cores`	number of cores for multicore processing. Default is `cores = 1`.
`sensitive`	optional argument, a logical indicating: `sensitive = TRUE` - if SignalP version 4.1 is used, it will be run in sensitive mode; `sensitive = FALSE` - if SignalP version 4.1 is used, it will be run with the default cut-off. Default is `sensitive = FALSE`. For more details about SignalP 4.1 sensitive mode please see http://www.cbs.dtu.dk/services/SignalP/performance.php
`legacy_method`	optional argument, which prediction method to use when running SignalP 2.0 and SignalP 3.0: `legacy_method = "hmm"` - for HMM-based predictions; `legacy_method = "nn"` - for predictions based on neural networks
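The two behaviours of `truncate` can be sketched on plain character strings. This is a minimal illustration, not SecretSanta's implementation: `handle_long_sequences()` is a hypothetical helper, and the `_truncated` suffix is an assumed naming scheme, since the exact renaming rule is not given here.

```r
# Hypothetical helper (not part of SecretSanta): either truncate or
# drop sequences longer than `max_len` residues.
handle_long_sequences <- function(seqs, truncate = TRUE, max_len = 2000) {
  too_long <- nchar(seqs) > max_len
  if (truncate) {
    if (any(too_long)) {
      # Keep only the first `max_len` residues of each long sequence.
      seqs[too_long] <- substr(seqs[too_long], 1, max_len)
      # Assumed naming scheme, for illustration only.
      names(seqs)[too_long] <- paste0(names(seqs)[too_long], "_truncated")
    }
    seqs
  } else {
    # Exclude long sequences from the analysis altogether.
    seqs[!too_long]
  }
}

seqs <- c(short = strrep("M", 150), long = strrep("A", 2500))
kept <- handle_long_sequences(seqs, truncate = TRUE)
dropped <- handle_long_sequences(seqs, truncate = FALSE)
```

With `truncate = TRUE` both sequences survive, the long one shortened and renamed; with `truncate = FALSE` only the short one remains.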

an object of SignalpResult class

parse_signalp

library(SecretSanta)
library(Biostrings)  # provides readAAStringSet()

# read the fasta file into an AAStringSet object
aa <- readAAStringSet(system.file("extdata", "sample_prot_100.fasta",
                                  package = "SecretSanta"))
# assign this object to the in_fasta slot
# of an empty CBSResult object
inp <- CBSResult(in_fasta = aa[1:10])
# run signalp2 on the initial file, using HMM-based predictions:
r1 <- signalp(inp, version = 2, organism = 'euk', run_mode = "starter",
              legacy_method = 'hmm')
# run signalp4 with the default cut-off:
r4 <- signalp(inp, version = 4, organism = 'euk', run_mode = "starter")
# run signalp 4.1 in sensitive mode:
r4_sensitive <- signalp(inp, version = 4.1, organism = 'euk', run_mode = 'starter',
                        sensitive = TRUE)