align_p_val: align_p_val

Description Usage Arguments Value Author(s) Examples

View source: R/align_p_val.R

Description

This function computes a p-value for the alignment of two sequences. Length(sequence1) must be <= length(sequence2) in the arguments.

Usage

1
align_p_val(seq_1, seq_2, scoringMat, gapOpen, gapExtend, B, type)

Arguments

seq_1

A sequence of DNA or amino acids. Ex: "ASEDLTI" or "GACT"

seq_2

A second sequence of DNA or amino acids. Ex: "AEDFGI" or "GACCT"

scoringMat

A scoring matrix for the alignment. See ?pairwiseAlignment in library Biostrings for details.

gapOpen

A numeric value for the penalty for opening a gap. See ?pairwiseAlignment in library Biostrings for details.

gapExtend

A numeric value for continuing a gap. See ?pairwiseAlignment in library Biostrings for details.

B

The number of bootstrap samples used in the p-value bootstrap calculation.

type

The text 'global' or 'local'. See ?pairwiseAlignment in library Biostrings for details.

Value

tt_0 The t-statistic for the original alignment.

tt_B A vector of t-statistics for B randomly generated alignments.

p_val The p-value, ie the proportion of randomization-based alignment scores that are great than or equal to the original observed alignment score. mean(tt_B >= tt_0).

Author(s)

Jennifer Starling

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
##EXAMPLE 1: AMINO ACID SEQUENCE ALIGNMENT
##Set up two sequences:
seq_1 <- "ASEDLTI"
seq_2 <- "AEEDFGI"
##Set up gap parameters:
gapOpen <- 0
gapExtend <- -2
##Set up scoring matrix: (Biostrings contains protein scoring matrices.  
##Can also manually set up your own scoring matrix; be sure to name rows and columns.)
source("http://bioconductor.org/biocLite.R")
biocLite("Biostrings")
library(Biostrings) 
data(PAM30) ##load PAM30 scoring matrix
myScoringMat <- "PAM30"
##Perform alignment and obtain p-value:
myAlignment <- pairwiseAlignment(seq_1, seq_2, substitutionMatrix = myScoringMat, 
 	gapOpening = gapOpen, gapExtension = gapExtend, type = "global", scoreOnly = FALSE)
pval_output <- align_p_val(seq_1, seq_2, myScoringMat, gapOpen, gapExtend, B = 1000, type='global')


##EXAMPLE 2: DNA SEQUENCE ALIGNMENT:
seq_1 <- "ACT"
seq_2 <- "GCAT"
myScoringMat <- matrix(c(3,-2,-2,-1,-2,2,0,-1,-2,0,2,-2,-1,-1,-2,1),nrow=4,byrow=T,
dimnames=list(c('A','C','T','G'),c('A','C','T','G')))
gapOpen <- 0
gapExtend <- -1
##Perform global alignment:
myAlignment <- pairwiseAlignment(seq_1, seq_2, substitutionMatrix = myScoringMat, 
 	gapOpening = gapOpen, gapExtension = gapExtend, type = "global", scoreOnly = FALSE)
pval_output <- align_p_val(seq_1, seq_2, myScoringMat, gapOpen, gapExtend, B = 1000, type='global')

jstarling1/starlib documentation built on May 20, 2019, 2:12 a.m.