A set of functions for extracting features from biological sequences and encoding them as numeric vectors.
featurePSSM(seq, start.pos, stop.pos, psiblast.path, database.path)
seq
a character vector of protein, DNA, or RNA sequences.
start.pos
an integer vector giving the start position of the fragment window for each sequence.
stop.pos
an integer vector giving the stop position of the fragment window for each sequence.
psiblast.path
a string giving the path to the blastpgp program. blastpgp is used to run PSI-BLAST and obtain the Position-Specific Scoring Matrix.
database.path
a string giving the path to a formatted reference database. The database can be formatted with the "formatdb" program.
featurePSSM
returns a matrix with 20*N+N columns. Each row
represents the features of one sequence, encoded as a (20*N+N)-dimensional numeric
vector generated by PSI-BLAST. It contains two kinds of features: the normalized
position-specific scores of the PSSM (Position-Specific Scoring Matrix) and the Shannon
entropy of the WOP (weighted observed percentages) at each position. The
PSI-BLAST program and a formatted NCBI non-redundant protein database are required.
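The layout of one feature row can be illustrated with a small sketch. It is not the package's implementation: the matrices below are random stand-ins for real PSI-BLAST output, N is assumed to be the length of the fragment window, the logistic normalization is an assumption chosen for illustration, and the helper name shannon is hypothetical.

```r
# Toy illustration only: random numbers stand in for real PSI-BLAST output.
set.seed(1)
N  <- 10  # assumed window length (stop.pos - start.pos + 1)
aa <- strsplit("ARNDCQEGHILKMFPSTWYV", "")[[1]]

# Hypothetical PSSM scores: N positions x 20 amino acids
pssm <- matrix(sample(-5:8, N * 20, replace = TRUE), nrow = N,
               dimnames = list(NULL, aa))

# Hypothetical weighted observed percentages; each row sums to 100
wop <- matrix(runif(N * 20), nrow = N, dimnames = list(NULL, aa))
wop <- 100 * wop / rowSums(wop)

# Assumed normalization: logistic squashing of each score into (0, 1)
norm_score <- 1 / (1 + exp(-pssm))

# Shannon entropy in bits for one position; zero entries contribute nothing
shannon <- function(pct) {
  total <- sum(pct)
  if (total == 0) return(0)
  p <- pct[pct > 0] / total
  -sum(p * log2(p))
}
entropy <- apply(wop, 1, shannon)

# One feature row: 20*N normalized scores followed by N entropies
feature <- c(as.vector(t(norm_score)), entropy)
length(feature)  # 20*N + N = 210 for N = 10
```

Each entropy value lies between 0 (only one residue observed at that position) and log2(20) (all 20 residues equally frequent), so it summarizes how conserved a position is.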
Hong Li
if(interactive()){
library(BioSeqClass)
library(Biostrings)  # provides readAAStringSet()
file = file.path(path.package("BioSeqClass"), "example", "acetylation_K.fasta")
tmp = readAAStringSet(file)
proteinSeq = as.character(tmp)
## Requires the "blastpgp" program and a formatted database. The database can be formatted with the "formatdb" program.
PSSM1 = featurePSSM(proteinSeq[1:2], start.pos=rep(1,2), stop.pos=rep(10,2), psiblast.path="blastpgp", database.path="./result1.fasta")
}
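Because the example depends on an external program and a formatted database, it can help to check both before calling featurePSSM. The following is a minimal sketch under stated assumptions: check_psiblast_setup is a hypothetical helper that is not part of BioSeqClass, and it assumes that formatdb wrote single-volume protein index files with the extensions .phr, .pin, and .psq next to the FASTA file.

```r
# Hypothetical helper, not part of BioSeqClass.
check_psiblast_setup <- function(psiblast.path, database.path) {
  # Sys.which() returns "" when the program cannot be found on the PATH
  on_path <- unname(nzchar(Sys.which(psiblast.path)))
  program_found <- on_path || file.exists(psiblast.path)
  # Assumption: single-volume protein database index files from formatdb
  index_files <- paste0(database.path, c(".phr", ".pin", ".psq"))
  c(program = program_found, database = all(file.exists(index_files)))
}

status <- check_psiblast_setup("blastpgp", "./result1.fasta")
if (!all(status)) {
  message("Missing: ", paste(names(status)[!status], collapse = ", "))
}
```

If the database entry is reported as missing, the FASTA file has probably not been formatted yet. With legacy BLAST this is typically done with a command such as formatdb -i result1.fasta -p T, where -p T marks the input as protein.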