OnePassProfilingMat: Provides oligotyping based on a one-pass entropy profiling,...


Description

The function selects only the alignment positions whose entropy is larger than a chosen cutoff. The bases at these positions are then concatenated to give each sequence an oligotype identity. The function works on an alignment of sequences provided as a matrix object (see example).
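The selection and concatenation steps can be illustrated on a toy alignment. This is a minimal sketch, not the package's implementation: it assumes Shannon entropy computed with the natural logarithm (the package may use a different base or estimator), and the names `aln`, `column_entropy`, `cutoff`, `high` and `oligotypes` are hypothetical.

```r
# Toy alignment: 6 sequences (rows) by 5 positions (columns).
# The values are made up for illustration only.
aln <- rbind(
  c("A", "C", "G", "U", "A"),
  c("A", "C", "G", "U", "G"),
  c("A", "U", "G", "U", "A"),
  c("A", "U", "G", "U", "G"),
  c("A", "C", "G", "U", "A"),
  c("A", "U", "G", "U", "G")
)

# Shannon entropy of one alignment column.
# Assumption: natural log; the package may define entropy differently.
column_entropy <- function(x) {
  p <- table(x) / length(x)
  -sum(p * log(p))
}

# Entropy profile across all positions.
entropy <- apply(aln, 2, column_entropy)

# Keep only positions above the cutoff (same role as entropymin).
cutoff <- 0.6
high <- which(entropy > cutoff)

# Concatenate the bases at the retained positions, one string per sequence.
oligotypes <- apply(aln[, high, drop = FALSE], 1, paste, collapse = "")
```

Here only positions 2 and 5 vary, so each toy sequence is reduced to a two-character oligotype built from those two columns.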

Usage

OnePassProfilingMat(AlignedSequences = Sequences, minseq = 21, entropymin = 0.6, Plot = TRUE)

Arguments

AlignedSequences

matrix. SequenceId-by-position matrix as produced by e.g. ImportFastaAlignment(). This is the main difference from MED(), which works on files and not on objects in the current workspace.

minseq

numeric. minimum number of sequences before the procedure stops for a specific subalignment.

entropymin

numeric. minimum entropy level before the procedure stops for a specific subalignment.

Plot

logical. If TRUE, plots the entropy profile, the base composition for the identified high-entropy positions, and the histogram of relative abundance of concatenated oligotypes.

Value

A list with three slots:

OT.seq.concat

A vector of concatenated positions for each sequence.

OT.count

A summary table of overall abundance for each oligotype.

OT.freq

A summary table of overall frequency for each oligotype.
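The relationship between the three slots can be sketched as follows. This assumes, based on the str() output in the Examples, that OT.count is a one-dimensional table of OT.seq.concat and that OT.freq is that table divided by the total number of sequences; the package's internals may differ. The input vector is made up for illustration.

```r
# Hypothetical oligotype assignments for six sequences, named by sequence index.
OT.seq.concat <- c("GAU-C", "GAU-C", "AUG-U", "AUG-U", "AUG-U", "AAU-C")
names(OT.seq.concat) <- seq_along(OT.seq.concat)

# Abundance of each oligotype (assumed equivalent of the OT.count slot).
OT.count <- table(OT.seq.concat)

# Relative frequency of each oligotype (assumed equivalent of the OT.freq slot).
OT.freq <- OT.count / sum(OT.count)
```

Under this assumption the frequencies sum to 1, and an oligotype seen once among 1175 sequences has a frequency of 1/1175, which matches the 0.000851 entries in the example output.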

Author(s)

Alban Ramette

Examples

File <- "HGB_0013_GXJPMPL01A3OQX.fasta"  # path to FASTA file
Aln.list <- ImportFastaAlignment(File)
Names <- Aln.list[[1]]
Sequences <- toupper(Aln.list[[2]])  # do not trim trailing dots at 5' and 3' ends
OnePass <- OnePassProfilingMat(
  AlignedSequences = Sequences,
  minseq = 21,
  entropymin = 0.6,  # arbitrary cutoff; the whole call can be wrapped in a function with a parameter specifying this cutoff
  Plot = TRUE
)
#Position: 185 
#          A      G
#Nber 193.00 982.00
#Prop   0.17   0.87
#
#Position: 241 
#         A C G      U
#Nber 568.0 3 5 599.00
#Prop   0.5 0 0   0.53
#
#Position: 242 
#          - A C      G     U
#Nber 355.00 4 4 247.00 565.0
#Prop   0.31 0 0   0.22   0.5
#
#Position: 271 
#          - A      G
#Nber 818.00 2 355.00
#Prop   0.72 0   0.31
#
#Position: 272 
#          C G      U
#Nber 981.00 1 193.00
#Prop   0.87 0   0.17
#

str(OnePass)
#List of 3
# $ OT.seq.concat: Named chr [1:1175] "GAU-C" "GAU-C" "AUG-U" "AUG-U" ...
#  ..- attr(*, "names")= chr [1:1175] "1" "2" "3" "4" ...
# $ OT.count     : 'table' int [1:17(1d)] 1 1 1 1 1 188 4 562 1 2 ...
#  ..- attr(*, "dimnames")=List of 1
#  .. ..$ OT.seq.concat: chr [1:17] "AAU-C" "ACG-U" "AU--G" "AU-GC" ...
# $ OT.freq      : table [1:17(1d)] 0.000851 0.000851 0.000851 0.000851 0.000851 ...
#  ..- attr(*, "dimnames")=List of 1
#  .. ..$ OT.seq.concat: chr [1:17] "AAU-C" "ACG-U" "AU--G" "AU-GC" ...
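
A returned list of this shape can be explored with base R alone. The sketch below uses a small mock object, `OnePassMock`, which is hypothetical and only mimics the structure shown by str() above; with a real result, `OnePass` would take its place.

```r
# Mock result mimicking the structure of the list shown by str(OnePass).
# The values are made up for illustration only.
OnePassMock <- list(
  OT.seq.concat = c("1" = "GAU-C", "2" = "GAU-C", "3" = "AUG-U",
                    "4" = "AUG-U", "5" = "AUG-U")
)
OnePassMock$OT.count <- table(OnePassMock$OT.seq.concat)

# Rank oligotypes from most to least abundant.
ranked <- sort(OnePassMock$OT.count, decreasing = TRUE)
top <- names(ranked)[1]

# Group the sequence indices by the oligotype they were assigned to.
members <- split(names(OnePassMock$OT.seq.concat), OnePassMock$OT.seq.concat)
members[[top]]
```

If the names of OT.seq.concat are row indices of the alignment, as the example output suggests, they could be used to look up the original identifiers in `Names`; that correspondence is an assumption and is worth checking on real data.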
 

aramette/otu2ot documentation built on May 10, 2019, 12:46 p.m.