Description Usage Arguments Details Value Author(s) Examples
parseDB() parses a .fasta proteome database into a data table with each protein accession ID in one column and the corresponding protein sequence in another. It can also filter out reverse and contaminant sequences.
parseDB(database, db_source, filt)
database: a .fasta proteome database.
db_source: a string denoting the source of the input database. Key: "UP" - Uniprot; "RS" - RefSeq; "HM" - Homemade.
filt: a boolean variable (TRUE/FALSE) specifying whether or not to filter out reverse and contaminant sequences.
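The db_source key matters because FASTA header lines are laid out differently depending on where the database came from. The sketch below illustrates that difference using the usual UniProt ("db|accession|entry name") and RefSeq (accession as the first token) header conventions. extract_accession() is a hypothetical helper written for this illustration, not a PTMphinder function. How parseDB() actually treats each key is not documented on this page, and the handling of "HM" here is an assumption.

```r
# Hypothetical helper (not part of PTMphinder): pull an accession ID out of
# FASTA header lines, depending on the database source.
extract_accession <- function(header, db_source = c("UP", "RS", "HM")) {
  db_source <- match.arg(db_source)
  header <- sub("^>", "", header)  # drop the leading ">"
  if (db_source == "UP") {
    # UniProt headers look like "sp|P69905|HBA_HUMAN Hemoglobin subunit alpha";
    # the accession is the second pipe-delimited field
    vapply(strsplit(header, "|", fixed = TRUE), function(p) p[2], character(1))
  } else {
    # RefSeq headers start with the accession, e.g. "NP_000549.1 hemoglobin ...".
    # ASSUMPTION: homemade ("HM") databases also put the ID in the first token.
    sub("\\s.*$", "", header)
  }
}

up_id <- extract_accession(">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha", "UP")
rs_id <- extract_accession(">NP_000549.1 hemoglobin subunit alpha [Homo sapiens]", "RS")
```

If a header does not match the layout implied by db_source (for instance, a RefSeq header passed with "UP"), this sketch returns NA for that entry, which is one reason to choose the key that matches the database.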
The file extension of the database should be changed from ".fasta" to ".txt" before import into R. This function was built to organize .fasta databases for easier manipulation in R (for example, before input into the phindPTMs() function) or in other analysis software.
A data table with two columns: protein accession ID and protein sequence.
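To make the shape of this return value concrete, here is a minimal base-R sketch of turning FASTA lines (as returned by readLines()) into a two-column table. It is an illustration only and not the PTMphinder implementation: fasta_to_table() and its column names are hypothetical, the result is a plain data.frame, the first header token is used as the accession, and the "REV__" and "CON__" prefixes used for filtering are an assumed marker convention that this page does not specify.

```r
# Hypothetical sketch (not the PTMphinder implementation): collapse FASTA
# lines into one row per protein with accession and sequence columns.
fasta_to_table <- function(lines, filt = FALSE) {
  lines <- lines[nzchar(lines)]            # drop empty lines
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)              # record index for every line
  headers <- sub("^>", "", lines[is_header])
  seq_lines <- lines[!is_header]
  seq_record <- record[!is_header]
  # Join the (possibly wrapped) sequence lines belonging to each record
  sequences <- vapply(
    seq_along(headers),
    function(i) paste(seq_lines[seq_record == i], collapse = ""),
    character(1)
  )
  out <- data.frame(
    accession = sub("\\s.*$", "", headers),  # first token of the header
    sequence = sequences,
    stringsAsFactors = FALSE
  )
  if (filt) {
    # ASSUMPTION: reverse and contaminant entries carry these prefixes
    out <- out[!grepl("^(REV__|CON__)", out$accession), , drop = FALSE]
  }
  out
}

example_lines <- c(">P1 first protein", "MKT", "AYI",
                   ">REV__P1 reversed entry", "IYA", "TKM")
unfiltered <- fasta_to_table(example_lines)
filtered <- fasta_to_table(example_lines, filt = TRUE)
```

In this toy input the sequence for P1 is wrapped over two lines, so the sketch joins them into a single string, and the filtered table keeps only the forward entry.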
Jacob M. Wozniak (jakewozniak@gmail.com)
# Locate example files
examples.path = system.file("/extdata", package = "PTMphinder")
uniprot_ref.path = paste(examples.path, "/Human_Uniprot_Example.txt", sep="")
# Read in proteome database example (or your own database)
uniprot_ref = readLines(uniprot_ref.path)
# Parse proteome database into 2 columns: protein accession and protein sequence (may take a while depending on database size)
parseDB_Example <- parseDB(uniprot_ref, "UP", FALSE)
# Create file name for parsed database
filename1 <- paste(examples.path, "/Human_Uniprot_Parsed_", Sys.Date(), ".txt", sep="")
# Write parsed database to new file (will be found in the PTMphinder/extdata/ directory within R.framework)
write.table(parseDB_Example, filename1, row.names=FALSE, col.names=FALSE, sep="\t")