parSeqSim: Parallellized Protein/DVA Sequence Similarity Calculation...

Description Usage Arguments Details Value Author(s) See Also Examples

Description

Parallellized Protein/DNA Sequence Similarity Calculation based on Sequence Alignment

Usage

1
parSeqSim(protlist, type = "local", submat = "BLOSUM62")

Arguments

protlist

A length n list containing n protein sequences, each component of the list is a character string, storing one protein sequence. Unknown sequences should be represented as ''.

type

Type of alignment, default is 'local', could be 'global' or 'local', where 'global' represents Needleman-Wunsch global alignment; 'local' represents Smith-Waterman local alignment.

submat

Substitution matrix, default is 'BLOSUM62', could be one of 'BLOSUM45', 'BLOSUM50', 'BLOSUM62', 'BLOSUM80', 'BLOSUM100', 'PAM30', 'PAM40', 'PAM70', 'PAM120', 'PAM250'.

Details

This function implemented the parallellized version for calculating protein/DNA sequence similarity based on sequence alignment.

Value

A n x n similarity matrix.

Author(s)

Min-feng Zhu <wind2zhu@163.com>, Nan Xiao <http://nanx.me>

See Also

See twoSeqSim for protein sequence alignment for two protein/DNA sequences. See parGOSim for protein/DNA similarity calculation based on Gene Ontology (GO) semantic similarity.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
# Be careful when testing this since it involves parallelisation
# and might produce unpredictable results in some environments

require(Biostrings)

s1 = readFASTA(system.file('protseq/P00750.fasta', package = 'BioMedR'))[[1]]
s2 = readFASTA(system.file('protseq/P08218.fasta', package = 'BioMedR'))[[1]]
s3 = readFASTA(system.file('protseq/P10323.fasta', package = 'BioMedR'))[[1]]
s4 = readFASTA(system.file('protseq/P20160.fasta', package = 'BioMedR'))[[1]]
s5 = readFASTA(system.file('protseq/Q9NZP8.fasta', package = 'BioMedR'))[[1]]
plist = list(s1, s2, s3, s4, s5)
psimmat = parSeqSim(plist, type = 'local', submat = 'BLOSUM62')
print(psimmat)
s11 = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))[[1]]
s21 = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))[[2]]
s31 = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))[[3]]
s41 = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))[[4]]
s51 = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))[[5]]
plist1 = list(s11, s21, s31, s41, s51)
psimmat1 = parSeqSim(plist1, type = 'local', submat = 'BLOSUM62')
print(psimmat1)

BioMedR documentation built on July 5, 2019, 9:03 a.m.