signalp: predict signal peptides


View source: R/signalp_parallel.R

Description

This function calls the command line tool signalp to predict the presence and location of signal peptide cleavage sites in amino acid sequences.

Large input files (>500 sequences) are automatically split into smaller chunks so that signalp prediction can be run as an embarrassingly parallel process on a specified number of cores.
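
The chunk-and-parallelise idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `split_into_chunks()` and `process_chunk()` are hypothetical names, plain character strings stand in for protein sequences, and the chunk size of 500 is assumed from the threshold quoted above.

```r
# Hypothetical helper: split a vector of sequences into chunks of at most
# `chunk_size` elements (500 is assumed from the threshold quoted above).
split_into_chunks <- function(seqs, chunk_size = 500) {
  if (length(seqs) == 0) return(list())
  groups <- ceiling(seq_along(seqs) / chunk_size)
  unname(split(seqs, groups))
}

# Stand-in for the per-chunk prediction step; here it only counts residues.
process_chunk <- function(chunk) nchar(chunk)

seqs <- rep("MKTLLLTLVVVTIVCLDLGYT", 1200)
chunks <- split_into_chunks(seqs)

# Each chunk is independent, so the chunks can be processed in parallel.
# mc.cores = 1 runs on every platform; larger values fork workers on
# Unix-like systems.
results <- parallel::mclapply(chunks, process_chunk, mc.cores = 1)
combined <- unlist(results)
```

Because no chunk depends on another, the per-chunk results can simply be concatenated once all workers have finished.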

Usage

signalp(input_obj, version, organism = c("euk", "gram+", "gram-"),
  run_mode = c("starter", "piper"), paths = NULL, truncate = NULL,
  cores = 1, sensitive = FALSE, legacy_method)

Arguments

input_obj

an instance of class CBSResult containing protein sequences as one of the attributes

version

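version of signalp to run; supported versions include:
signalp2, signalp3, signalp4. The examples on this page pass the version as a number, e.g. version = 2, version = 4 or version = 4.1.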

organism

a character string with the following options:

  • organism = "euk" - for eukaryotes

  • organism = "gram+" - for gram-positive bacteria

  • organism = "gram-" - for gram-negative bacteria

run_mode

a character string with the following options:

  • run_mode = "starter" - if it is the first step in pipeline

  • run_mode = "piper" - if you run this function on the output of other CBS tools
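
The usage line lists organism and run_mode with vector-valued defaults. In R, such defaults are commonly resolved with `match.arg()`, which picks the first element when the argument is omitted and rejects values outside the listed options. Whether signalp does this internally is an assumption; `resolve_args()` below is a hypothetical wrapper used only to illustrate the idiom.

```r
# Hypothetical wrapper: shows how vector-valued defaults are usually resolved.
resolve_args <- function(organism = c("euk", "gram+", "gram-"),
                         run_mode = c("starter", "piper")) {
  organism <- match.arg(organism)  # first option if the argument is omitted
  run_mode <- match.arg(run_mode)
  list(organism = organism, run_mode = run_mode)
}

defaults <- resolve_args()
explicit <- resolve_args(organism = "gram-", run_mode = "piper")

# A value outside the allowed options raises an error.
invalid <- tryCatch(resolve_args(organism = "archaea"),
                    error = function(e) e)
```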

paths

if the required version of signalp is not accessible globally, a file containing the full path to its executable should be provided; for details, please check the SecretSanta vignette.

truncate

a logical indicating:

  • truncate = TRUE - sequences longer than 2000 residues will be truncated to this length limit and renamed

  • truncate = FALSE - long sequences will be excluded from the analysis

Default is truncate = TRUE.
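
The two behaviours of truncate can be illustrated with a small base R sketch. `handle_long_seqs()` is a hypothetical helper operating on a named character vector, and the "_truncated" suffix is an assumed naming scheme, not necessarily the one SecretSanta applies when it renames sequences.

```r
# Hypothetical helper illustrating the two behaviours of `truncate`.
# The "_truncated" suffix is an assumption made for this sketch.
handle_long_seqs <- function(seqs, truncate = TRUE, max_len = 2000) {
  too_long <- nchar(seqs) > max_len
  if (!truncate) {
    # truncate = FALSE: long sequences are excluded from the analysis
    return(seqs[!too_long])
  }
  if (any(too_long)) {
    # truncate = TRUE: cut long sequences to the limit and rename them
    seqs[too_long] <- substr(seqs[too_long], 1, max_len)
    names(seqs)[too_long] <- paste0(names(seqs)[too_long], "_truncated")
  }
  seqs
}

seqs <- c(short = strrep("A", 100), long = strrep("A", 2500))
kept <- handle_long_seqs(seqs, truncate = TRUE)
dropped <- handle_long_seqs(seqs, truncate = FALSE)
```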

cores

number of cores for multicore processing. Default is cores = 1.

sensitive

optional argument, a logical indicating:

  • sensitive = TRUE - if SignalP version 4.1 is used, it will be run in sensitive mode.

  • sensitive = FALSE - if SignalP version 4.1 is used, it will be run with the default cut-off.

Default is sensitive = FALSE. For more details about SignalP 4.1 sensitive mode please see http://www.cbs.dtu.dk/services/SignalP/performance.php

legacy_method

optional argument specifying which prediction method to use when running SignalP 2.0 and SignalP 3.0:

  • legacy_method = "hmm" - for HMM-based predictions

  • legacy_method = "nn" - for prediction based on neural networks

Value

an object of SignalpResult class

See Also

parse_signalp

Examples

library(Biostrings)   # provides readAAStringSet()
library(SecretSanta)

# read the bundled fasta file into an AAStringSet object
aa <- readAAStringSet(system.file("extdata", "sample_prot_100.fasta",
                                  package = "SecretSanta"))
# assign the first 10 sequences to the in_fasta slot
# of an empty CBSResult object
inp <- CBSResult(in_fasta = aa[1:10])
# run signalp 2 with HMM-based prediction on the initial input
r1 <- signalp(inp, version = 2, organism = 'euk', run_mode = "starter",
              legacy_method = 'hmm')
# run signalp 4 with the default cut-off
r4 <- signalp(inp, version = 4, organism = 'euk', run_mode = "starter")
# run signalp 4.1 in sensitive mode
r4_sensitive <- signalp(inp, version = 4.1, organism = 'euk',
                        run_mode = 'starter', sensitive = TRUE)

gogleva/SecretSanta documentation built on May 30, 2019, 8:02 a.m.