Outputproseq: output FASTA format file contains proteins that have...

Description

Write a FASTA file of the proteins that pass the RPKM cutoff. The FASTA ID line contains the protein ID, gene ID, HGNC symbol and description.

Usage

Outputproseq(rpkm, cutoff = "30%", proteinseq, outfile, ids, ...)

Arguments

rpkm

a numeric vector containing RPKM for each protein

cutoff

RPKM cutoff. Two formats are accepted: a percentage string or an absolute RPKM value. The default is "30%"; an absolute value such as 1 can be given instead. "30%" means the 30% of proteins with the lowest RPKM are dropped and the top 70% are kept.

proteinseq

a data frame containing protein IDs and protein sequences.

outfile

output file name.

ids

a data frame containing gene/transcript/protein ID mapping information.

...

additional arguments

Details

Taking the RPKM values as input, the function writes out the sequences of the proteins that pass the cutoff.
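The two cutoff formats can be illustrated with a minimal sketch. It is not the package's implementation: the helper `resolve_cutoff` and the RPKM values are hypothetical, and it assumes that a percentage cutoff is read as a quantile of the RPKM vector and that "pass" means strictly above the threshold.

```r
# Hypothetical helper (not part of customProDB): turn either cutoff format
# into a single numeric RPKM threshold.
resolve_cutoff <- function(rpkm, cutoff = "30%") {
  if (is.character(cutoff) && grepl("%$", cutoff)) {
    # Percentage format: "30%" -> the 0.30 quantile of the RPKM values
    frac <- as.numeric(sub("%$", "", cutoff)) / 100
    unname(quantile(rpkm, probs = frac, na.rm = TRUE))
  } else {
    # Absolute format: use the value as given
    as.numeric(cutoff)
  }
}

# Made-up RPKM values for ten proteins, named by placeholder IDs
rpkm <- c(P1 = 0.2, P2 = 1.5, P3 = 4.0, P4 = 12.0, P5 = 40.0,
          P6 = 0.0, P7 = 2.5, P8 = 7.0, P9 = 0.8, P10 = 3.1)

thr_pct  <- resolve_cutoff(rpkm, "30%")   # threshold from the percentage
kept_pct <- names(rpkm)[rpkm > thr_pct]   # top 70% of the proteins

thr_abs  <- resolve_cutoff(rpkm, 1)       # absolute RPKM threshold
kept_abs <- names(rpkm)[rpkm > thr_abs]
```

With ten proteins, the "30%" cutoff keeps seven of them, which matches the "top 70%" reading given under Arguments.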

Value

A FASTA file containing the proteins whose RPKM is above the cutoff.
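As a rough picture of one record in such a file, the sketch below assembles an ID line from the four fields named in the Description. The helper `format_fasta_record`, the "|" separator and every ID are assumptions made for illustration; the exact layout written by `Outputproseq` may differ.

```r
# Hypothetical helper (not part of customProDB): build one FASTA record.
# The "|" separator between header fields is an assumption.
format_fasta_record <- function(pro_id, gene_id, symbol, description, peptide) {
  header <- paste0(">", paste(pro_id, gene_id, symbol, description, sep = "|"))
  c(header, peptide)
}

# Made-up identifiers and sequence, for illustration only
rec <- format_fasta_record("PRO_0001", "GENE_0001", "SYM1",
                           "example protein", "MKTAYIAKQR")

# Write the record to a temporary file and read it back
sketch_file <- file.path(tempdir(), "sketch.fasta")
writeLines(rec, sketch_file)
lines_back <- readLines(sketch_file)
```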

Author(s)

Xiaojing Wang

Examples

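# Load the bundled RefSeq annotation: exon table, protein sequences, ID mapping
load(system.file("extdata/refseq", "exon_anno.RData", package="customProDB"))
load(system.file("extdata/refseq", "proseq.RData", package="customProDB"))
load(system.file("extdata/refseq", "ids.RData", package="customProDB"))
# Example BAM file shipped with the package
bamFile <- system.file("extdata/bams", "test1_sort.bam",
    package="customProDB")
# Compute RPKM values from the BAM file
RPKM <- calculateRPKM(bamFile, exon, proteincodingonly=TRUE, ids)
# Write the proteins that pass an absolute RPKM cutoff of 1 to a temporary FASTA file
outf1 <- file.path(tempdir(), 'test_rpkm.fasta')
Outputproseq(RPKM, 1, proteinseq, outf1, ids)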

Outputproseq(NULL, 1, proteinseq, outf1, ids)

chambm/customProDB documentation built on May 31, 2019, 12:08 p.m.