get_targetp: Query TargetP web server.


Description

TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). For sequences predicted to contain an N-terminal presequence, a potential cleavage site is also predicted. TargetP uses ChloroP and SignalP to predict cleavage sites for cTP and SP, respectively.

Usage

get_targetp(data, ...)

## S3 method for class 'character'
get_targetp(
  data,
  org_type = c("non_plant", "plant"),
  cutoffs = c("winner_takes_all", "spec95", "spec90", "custom"),
  tcut = NULL,
  pcut = NULL,
  scut = NULL,
  ocut = NULL,
  splitter = 1000,
  attempts = 2,
  progress = FALSE,
  ...
)

## S3 method for class 'data.frame'
get_targetp(data, sequence, id, ...)

## S3 method for class 'list'
get_targetp(data, ...)

## Default S3 method:
get_targetp(data = NULL, sequence, id, ...)

## S3 method for class 'AAStringSet'
get_targetp(data, ...)

Arguments

data

A data frame with protein amino acid sequences as strings in one column and corresponding ids in another. Alternatively, a path to a .fasta file with protein sequences, a list with elements of class SeqFastaAA resulting from a read.fasta call, or an AAStringSet object. Should be left blank if vectors are provided to the sequence and id arguments.

...

Currently no additional arguments are accepted apart from the ones documented below.

org_type

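One of c("non_plant", "plant"). Which models should be used for prediction. The default is documented as "plant", but the usage signature lists "non_plant" first, so it is safest to set this argument explicitly.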

cutoffs

One of c("winner_takes_all", "spec95", "spec90", "custom"), defaults to "winner_takes_all". If "winner_takes_all", no cutoffs are applied. If "spec95" or "spec90", a predefined set of cutoffs that yielded >0.95 or >0.9 specificity, respectively, on the TargetP test sets is used. If "custom", user-defined cutoffs should be specified in "tcut", "pcut", "scut" and "ocut".

tcut

A numeric value in the range 0 to 1, defaults to 0 (cutoffs = "winner_takes_all"). User-specified cutoff for mTP.

pcut

A numeric value in the range 0 to 1, defaults to 0 (cutoffs = "winner_takes_all"). User-specified cutoff for cTP.

scut

A numeric value in the range 0 to 1, defaults to 0 (cutoffs = "winner_takes_all"). User-specified cutoff for SP.

ocut

A numeric value in the range 0 to 1, defaults to 0 (cutoffs = "winner_takes_all"). User-specified cutoff for "other" (no mTP, cTP or SP).

splitter

An integer indicating the number of sequences to be in each .fasta file that is to be sent to the server. Defaults to 1000. Change only in case of a server side error. Accepted values are in range of 1 to 2000.

attempts

Integer, number of attempts if the server is unresponsive, set to 2 by default.

progress

Boolean, whether to show the progress bar, set to FALSE by default.

sequence

A vector of strings representing protein amino acid sequences, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.

id

A vector of strings representing protein identifiers, or the appropriate column name if a data.frame is supplied to the data argument. If a .fasta file path or a list with elements of class "SeqFastaAA" is provided to data, this should be left blank.
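
To illustrate what the splitter argument controls, the following minimal sketch divides a set of sequences into batches of at most splitter sequences each. The helper chunk_indices is hypothetical and is not part of ragp; the batching that get_targetp performs internally may differ.

```r
# Hypothetical helper (not part of ragp): group sequence indices into
# batches of at most `splitter` elements, mirroring the idea that each
# batch would be written to its own .fasta file.
chunk_indices <- function(n, splitter = 1000) {
  # the documented range for splitter is 1 to 2000
  stopifnot(n >= 1, splitter >= 1, splitter <= 2000)
  idx <- seq_len(n)
  # ceiling(idx / splitter) assigns batch number 1, 2, ... to each index
  unname(split(idx, ceiling(idx / splitter)))
}

# 2500 sequences with the default splitter give three batches
batches <- chunk_indices(2500, splitter = 1000)
lengths(batches)
```

Lowering splitter produces more, smaller batches, which is the adjustment suggested above in case of a server side error.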

Value

A data frame with columns:

Name

Character, sequence name truncated to 20 characters

Len

Integer, length of analyzed sequence

cTP

Numeric, final neural network (NN) score for the sequence containing a chloroplast transit peptide (cTP). ChloroP is used to predict cleavage sites for cTP.

mTP

Numeric, final NN score for the sequence containing a mitochondrial targeting peptide (mTP).

SP

Numeric, final NN score for the sequence containing a secretory pathway signal peptide (SP). SignalP is used to predict cleavage sites for SP.

other

Numeric, final NN score for the sequence not containing mTP, SP or cTP.

Loc

Character, one of C (chloroplast), M (mitochondrion), S (secretory pathway), - (any other location) or * ("don't know"; indicates that cutoff restrictions were set and the winning network output score was below the requested cutoff for that category).

TPlen

Integer, predicted presequence length

RC

Integer, reliability class, from 1 to 5, where 1 indicates the strongest prediction. RC is a measure of the size of the difference ('diff') between the highest (winning) and the second highest output scores. There are 5 reliability classes, defined as follows: 1 : diff > 0.800, 2 : 0.800 > diff > 0.600, 3 : 0.600 > diff > 0.400, 4 : 0.400 > diff > 0.200, 5 : 0.200 > diff. Thus, the lower the value of RC the safer the prediction.

is.targetp

Logical, did TargetP predict the presence of a signal peptide
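
The relationship between the score columns, Loc and RC described above can be sketched locally as follows. The helper classify_scores is hypothetical and is not part of ragp or TargetP; it only restates the documented rules, and its handling of differences that fall exactly on a class boundary (for example 0.800) is an assumption, since the definition above leaves those cases open.

```r
# Hypothetical helper (not part of ragp): derive Loc and RC from the
# network scores of one sequence. Cutoffs of 0 correspond to
# "winner_takes_all".
classify_scores <- function(scores,
                            cutoffs = c(cTP = 0, mTP = 0, SP = 0, other = 0)) {
  labels <- c(cTP = "C", mTP = "M", SP = "S", other = "-")
  ord <- order(scores, decreasing = TRUE)
  winner <- names(scores)[ord[1]]
  # difference between the winning and the second highest score
  diff <- scores[[ord[1]]] - scores[[ord[2]]]
  # "*" marks a winning score below the requested cutoff for its category
  loc <- if (scores[[winner]] < cutoffs[[winner]]) "*" else labels[[winner]]
  # findInterval counts how many thresholds diff has reached (0 to 4),
  # so 5 minus that count runs from RC 5 (weakest) to RC 1 (strongest);
  # a diff exactly on a threshold goes to the stronger class (assumption)
  rc <- 5L - findInterval(diff, c(0.2, 0.4, 0.6, 0.8))
  list(Loc = loc, RC = rc)
}

# clear mitochondrial prediction without cutoffs
strong <- classify_scores(c(cTP = 0.02, mTP = 0.95, SP = 0.03, other = 0.05))

# narrow win for SP that falls below a custom SP cutoff of 0.5
weak <- classify_scores(
  c(cTP = 0.10, mTP = 0.20, SP = 0.45, other = 0.40),
  cutoffs = c(cTP = 0, mTP = 0, SP = 0.5, other = 0)
)
```

Here strong corresponds to a confident M assignment, while weak shows how a custom cutoff turns a narrow win into the "don't know" category.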

Note

This function creates temporary files in the working directory. Protein ids should be shorter than 20 characters due to server side truncation.
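
Because truncated ids can no longer be matched reliably to the input, it may help to check them before submitting a query. The sketch below uses a hypothetical helper, check_ids, that is not part of ragp. It assumes the server keeps the first 20 characters of each id, as the description of the Name column suggests; the limit can be changed through max_chars.

```r
# Hypothetical helper (not part of ragp): flag ids that would be altered
# by truncation to `max_chars` characters, or that would collide with
# another id after truncation.
check_ids <- function(id, max_chars = 20) {
  truncated <- substr(id, 1, max_chars)
  data.frame(
    id = id,
    too_long = nchar(id) > max_chars,
    # an id is ambiguous if its truncated form is shared with another id
    ambiguous = duplicated(truncated) | duplicated(truncated, fromLast = TRUE),
    stringsAsFactors = FALSE
  )
}

ids <- c("AT1G01010.1",
         "a_very_long_protein_identifier_1",
         "a_very_long_protein_identifier_2")
id_report <- check_ids(ids)
id_report
```

Ids flagged as too_long or ambiguous could be replaced with short unique keys before the query and mapped back afterwards.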

Source

https://services.healthtech.dtu.dk/service.php?TargetP-1.1

References

Emanuelsson O, Nielsen H, Brunak S, von Heijne G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300: 1005-1016.

See Also

get_signalp get_signalp5

Examples

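library(ragp)

# load the at_nsp example data
data(at_nsp)

# query TargetP for the first 20 sequences; `sequence` and `Transcript.id`
# are the columns of at_nsp holding the sequences and their ids
targetp_pred <- get_targetp(at_nsp[1:20, ],
                            sequence,
                            Transcript.id)
targetp_pred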

missuse/ragp documentation built on Jan. 4, 2022, 10:49 a.m.