View source: R/annotate.protein_id.R
This function assigns protein identifiers to a list of tandem mass spectra that have a peptide sequence assigned.
data | list of records containing mZ and peptide sequences.
file | file name of a FASTA file.
fasta | a fasta object as returned by the
digestPattern | a regex pattern which can be used by the
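To make the role of a digest pattern more concrete, here is a minimal base R sketch. It assumes the pattern is simply prepended to the peptide before searching the protein sequence; the pattern value, the sequence, and the peptides are illustrative assumptions, not taken from the package source.

```r
# Hypothetical protein sequence and peptides, for illustration only.
protein <- "LGGNEQVTRAAAAKGAGSSEPVTGLDAKAAAAKVEATFGVDESNAK"
peptides <- c("LGGNEQVTR", "GAGSSEPVTGLDAK", "GGNEQVTR")

# Assumed digest pattern: the peptide must follow R or K, or sit at the
# start of the protein (optionally after an initial methionine).
digestPattern <- "(([RK])|(^)|(^M))"

# Prepend the pattern to each peptide and search the protein sequence.
is_tryptic <- vapply(peptides, function(p) {
  grepl(paste0(digestPattern, p), protein)
}, logical(1))

print(is_tryptic)
```

In this sketch the third peptide occurs in the sequence but is preceded by L, so it is rejected, while the first two are accepted.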
The protein sequences are read by the read.fasta function of the seqinr package. The protein identifier is written to the proteinInformation variable.
If the function is called on a multi-core architecture, it uses mclapply.
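The lookup described above can be sketched with mclapply from the parallel package. This is not the package's implementation: the record field peptideSequence, the protein names, and the plain substring match are assumptions chosen to keep the example self-contained.

```r
library(parallel)

# Hypothetical inputs: named protein sequences and a list of records.
proteins <- c(P1 = "LGGNEQVTRAAAAKGAGSSEPVTGLDAK", P2 = "MKTAYIAKQR")
records <- list(
  list(peptideSequence = "GAGSSEPVTGLDAK"),
  list(peptideSequence = "TAYIAK"),
  list(peptideSequence = "NOTPRESENT")
)

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# For each record, collect the names of all proteins containing the peptide
# and store them in a proteinInformation field.
annotated <- mclapply(records, function(rec) {
  hit <- vapply(proteins, function(s) {
    grepl(rec$peptideSequence, s, fixed = TRUE)
  }, logical(1))
  rec$proteinInformation <- paste(names(proteins)[hit], collapse = ",")
  rec
}, mc.cores = n_cores)

protein_ids <- vapply(annotated, function(r) r$proteinInformation, character(1))
print(protein_ids)
```

Records without a match end up with an empty string, which is why a check such as nchar(x$proteinInformation) > 0 is a convenient way to find the annotated ones.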
It is recommended to load the FASTA file prior to running annotate.protein_id using
myFASTA <- read.fasta(file = file,
as.string = TRUE,
seqtype = "AA")
instead of providing the FASTA file name to the function.
The function returns a list object.
Jonas Grossmann and Christian Panse, 2014
?read.fasta of the seqinr package.
http://www.uniprot.org/help/fasta-headers
# annotate.protein_id
library(specL)   # provides annotate.protein_id and the peptideStd data
library(seqinr)  # provides read.fasta

# our FASTA sequence
irtFASTAseq <- paste(">zz|ZZ_FGCZCont0260|",
    "iRT_Protein_with_AAAAK_spacers concatenated Biognosys\n",
    "LGGNEQVTRAAAAKGAGSSEPVTGLDAKAAAAKVEATFGVDESNAKAAAAKYILAGVENS",
    "KAAAAKTPVISGGPYEYRAAAAKTPVITGAPYEYRAAAAKDGLDAASYYAPVRAAAAKAD",
    "VTPADFSEWSKAAAAKGTFIIDPGGVIRAAAAKGTFIIDPAAVIRAAAAKLFLQFGAQGS",
    "PFLK\n")

# be realistic, do it from file
Tfile <- file()
cat(irtFASTAseq, file = Tfile)

# use read.fasta from seqinr
fasta.irtFASTAseq <- read.fasta(Tfile, as.string = TRUE, seqtype = "AA")
close(Tfile)

# annotate with proteinID
# -> here we find all PSMs from the one proteinID above
peptideStd <- specL::annotate.protein_id(peptideStd,
    fasta = fasta.irtFASTAseq)

# show indices for all PSMs where we have a proteinInformation
which(unlist(lapply(peptideStd,
    function(x) { nchar(x$proteinInformation) > 0 })))