asebProteins: prediction of all lysine sites on a specific protein that can...


Description

This function predicts all lysine sites on a specific protein that can be acetylated by a specific KAT family.

Usage

asebProteins(backgroundSites, prodefinedSites, testProteins, outputFile=NULL, permutationTimes=10000)
## S4 method for signature 'character,character,character'
asebProteins(backgroundSites, prodefinedSites, testProteins, outputFile=NULL, permutationTimes=10000)
## S4 method for signature 'SequenceInfo,SequenceInfo,SequenceInfo'
asebProteins(backgroundSites, prodefinedSites, testProteins, outputFile=NULL, permutationTimes=10000)

Arguments

backgroundSites

SequenceInfo object or file name (character(1)) for the background peptide set.

prodefinedSites

SequenceInfo object or file name (character(1)) for the KAT-specific peptide set.

testProteins

SequenceInfo object or file name (character(1)) for the query protein set.

outputFile

file name for output (character(1)).

permutationTimes

Number of permutations (integer(1)); default and recommended value: 10000.

Details

This function predicts lysine sites that can be acetylated by a specific KAT family. The overall process is similar to the GSEA method (permuting gene sets). Please see the references for details.
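To make the GSEA analogy concrete, the following is a minimal sketch of a generic, unweighted GSEA-style enrichment score with a permutation P-value. It is not the ASEB implementation: the helper names enrichment_score and permutation_p are hypothetical, and the sketch assumes the items have already been ranked and flagged as hits (members of the set of interest) or misses.

```r
# Hypothetical helpers for illustration only; not part of ASEB.

# Unweighted (Kolmogorov-Smirnov style) running-sum enrichment score.
# `is_hit` is a logical vector over a ranked list: TRUE where the item
# belongs to the set of interest.
enrichment_score <- function(is_hit) {
  n_hit <- sum(is_hit)
  n_miss <- length(is_hit) - n_hit
  # Step up at each hit and down at each miss so the walk ends at zero.
  steps <- ifelse(is_hit, 1 / n_hit, -1 / n_miss)
  running <- cumsum(steps)
  # The score is the running-sum value with the largest absolute deviation.
  running[which.max(abs(running))]
}

# One-sided permutation P-value: how often does a random ordering of the
# hit labels give a score at least as large as the observed one?
permutation_p <- function(is_hit, n_perm = 1000L) {
  observed <- enrichment_score(is_hit)
  null_scores <- replicate(n_perm, enrichment_score(sample(is_hit)))
  # Add-one correction so the estimate is never exactly zero.
  (sum(null_scores >= observed) + 1) / (n_perm + 1)
}

set.seed(1)
is_hit <- c(TRUE, TRUE, FALSE, TRUE, rep(FALSE, 16))
es <- enrichment_score(is_hit)
p_value <- permutation_p(is_hit, n_perm = 200L)
```

Because the P-value is estimated from random permutations, its resolution is limited by the number of permutations, which is one reason a large permutationTimes is recommended.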

The first three arguments of asebProteins can be SequenceInfo objects or file names. If they are SequenceInfo objects, the method returns a list to the user in addition to writing an output file. Otherwise, the method processes the FASTA-format files directly and writes all results to a file. In this case, it can process a huge number of sites at a time without loading any sequences into the R workspace.
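As a sketch of the file-based call described above, the example below passes the FASTA files bundled with the package as file names instead of SequenceInfo objects. The output file name asebProteins_output.txt is an assumption chosen for illustration, and the call is guarded so that it only runs when ASEB is installed.

```r
# Paths to the example FASTA files shipped with ASEB
# (system.file() returns "" if the package is not installed).
bg_file <- system.file("extdata", "background_sites.fa", package = "ASEB")
pre_file <- system.file("extdata", "predefined_sites.fa", package = "ASEB")
test_file <- system.file("extdata", "proteins_to_test.fa", package = "ASEB")

# Assumed output location, for illustration only.
out_file <- file.path(tempdir(), "asebProteins_output.txt")

if (requireNamespace("ASEB", quietly = TRUE)) {
  library(ASEB)
  # All three inputs are file names, so the character method is dispatched
  # and results are written to `out_file`.
  asebProteins(bg_file, pre_file, test_file,
               outputFile = out_file, permutationTimes = 100)
}
```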

Value

The output file contains enrichment scores and P-values for each query site. The asebProteins,SequenceInfo,SequenceInfo,SequenceInfo-method also returns a list containing two data.frame objects: results and curveInfo.

results

contains enrichment scores and P-values for each query site.

curveInfo

contains information for enrichment score curves.

Note

Each acetylated lysine site and its surrounding amino acids (8 on each side) are treated as an acetylated peptide.
Example peptide sequence: "KEHDDIFDKLKEAVKEE".
All input files should follow the FASTA format.
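The 17-residue window described above can be illustrated with a short base R sketch. The helper lysine_windows is hypothetical and not part of ASEB. It assumes that lysines closer than 8 residues to either end of the sequence are skipped, and the package itself may handle such sites differently.

```r
# Hypothetical helper for illustration only; not part of ASEB.
# Returns the window of `flank` residues on each side of every lysine (K)
# that has a full flank available.
lysine_windows <- function(protein, flank = 8L) {
  residues <- strsplit(protein, "", fixed = TRUE)[[1]]
  n <- length(residues)
  k_pos <- which(residues == "K")
  # Assumption: drop lysines too close to either terminus.
  k_pos <- k_pos[k_pos > flank & k_pos + flank <= n]
  peptides <- vapply(
    k_pos,
    function(i) paste(residues[(i - flank):(i + flank)], collapse = ""),
    character(1)
  )
  data.frame(position = k_pos, peptide = peptides, stringsAsFactors = FALSE)
}

windows <- lysine_windows("KEHDDIFDKLKEAVKEE")
```

For the example peptide, only the lysine at position 9 has 8 residues on both sides, so the single window returned is the peptide itself.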

References

Subramanian, A. et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A, 102, 15545-15550.

Mootha, V.K. et al. (2003) PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet, 34, 267-273.

Guttman, M. et al. (2009) Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature, 458, 223-227.

Li, T.T. et al. Characterization and prediction of lysine (K)-acetyl-transferase (KAT) specific acetylation sites. Mol Cell Proteomics, in press.

See Also

SequenceInfo, readSequence, asebSites, drawStat, drawEScurve.

Examples

    library(ASEB)
    # Load the example FASTA files shipped with the package
    backgroundSites <- readSequence(system.file("extdata", "background_sites.fa", package="ASEB"))
    prodefinedSites <- readSequence(system.file("extdata", "predefined_sites.fa", package="ASEB"))
    testProteins <- readSequence(system.file("extdata", "proteins_to_test.fa", package="ASEB"))
    # Use 100 permutations to keep the example fast (default: 10000)
    resultList <- asebProteins(backgroundSites, prodefinedSites, testProteins, permutationTimes=100)
    # Show the first two rows of the results
    resultList$results[1:2,]

ASEB documentation built on Nov. 8, 2020, 5:07 p.m.