seqToPSSM: Build a position-specific scoring matrix (PSSM) from a set of...

Description

Given a vector of sequences, build a position-specific scoring matrix (PSSM) with different derived statistics (counts, frequencies, probabilities, weights, information content).

Usage

seqToPSSM(sequences, prior = NULL, pseudo.count = 2, IC.log.base = 2,
  case.sensitive = FALSE)

Arguments

sequences

vector of strings corresponding to biological sequences (DNA, RNA, proteins)

prior=NULL

vector of residue prior probabilities (names must correspond to residues)

pseudo.count=2

pseudo-count applied to the observed residue counts

IC.log.base=2

Logarithmic base for the information content

case.sensitive=FALSE

by default, residues are treated as case-insensitive and converted to uppercase.
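
The requirement that the names of prior correspond to residues can be checked before calling the function. The lines below are a minimal sketch in base R, assuming DNA sequences and case-insensitive residues; the variable names (sites, observed) are illustrative, and whether seqToPSSM performs such checks itself is not stated in this page.

```r
## Illustrative input: a few aligned sites (hypothetical data)
sites <- c("s1" = "aaACTG", "s2" = "caACTG", "s3" = "aaATTG")
prior <- c("A" = 0.32, "C" = 0.18, "G" = 0.18, "T" = 0.32)

## Residues actually observed, after conversion to uppercase
observed <- unique(unlist(strsplit(toupper(sites), "")))

## Every observed residue must have a named prior probability
stopifnot(all(observed %in% names(prior)))

## Prior probabilities must sum to 1
stopifnot(isTRUE(all.equal(sum(prior), 1)))
```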

Details

First version: 2016-12-23 Last modification: 2016-12
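
The statistics named in the description (counts, frequencies, weights, information content) can be illustrated with a short base-R sketch. This is not the stats4bioinfo implementation: the helper sketch_pssm is hypothetical, and it assumes one common convention in which the pseudo-count is distributed over residues in proportion to the priors, weights are log-ratios of corrected frequency to prior, and the information content of a cell is the frequency times its weight.

```r
## Hypothetical helper, NOT the code of seqToPSSM.
## Assumes aligned sequences of equal length and pseudo.count > 0
## (with pseudo.count = 0, absent residues would give log(0)).
sketch_pssm <- function(sequences, prior, pseudo.count = 2, log.base = 2) {
  ## Case-insensitive handling: convert residues to uppercase
  seqs <- toupper(sequences)
  stopifnot(length(unique(nchar(seqs))) == 1)
  residues <- names(prior)

  ## Character matrix: one row per sequence, one column per position
  chars <- do.call(rbind, strsplit(seqs, ""))

  ## Count matrix: one row per residue, one column per position
  counts <- vapply(seq_len(ncol(chars)), function(j) {
    as.numeric(table(factor(chars[, j], levels = residues)))
  }, numeric(length(residues)))
  rownames(counts) <- residues

  ## Corrected frequencies (assumed convention: pseudo-count
  ## shared among residues according to their priors)
  n <- length(seqs)
  frequencies <- (counts + pseudo.count * prior) / (n + pseudo.count)

  ## Log-ratio weights and per-cell information content
  weights <- log(frequencies / prior, base = log.base)
  information <- frequencies * weights

  list(counts = counts, frequencies = frequencies,
       weights = weights, information = information)
}

## Usage on a toy set of four aligned sites
toy <- sketch_pssm(c("AAAC", "AAAC", "CAAC", "aaat"),
                   prior = c("A" = 0.25, "C" = 0.25, "G" = 0.25, "T" = 0.25))
print(toy$counts)
signif(toy$weights, digits = 2)
```

Under this convention each column of the frequency matrix sums to 1, and each column of the count matrix sums to the number of sequences.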

Author(s)

Jacques van Helden (Jacques.van-Helden@univ-amu.fr)

Examples

## Define the sequences of yeast Met31p binding sites
sequences <- c(
  "MET28"="cgcccAAAACTGTGGtgttag",
  "MET3"="gttgtAAAACTGTGGCTTTGT",
  "MUP3"="cggaaAAAACTGTGGcgtcgc",
  "SAM1"="acaggAAAACTGTGGtggcgc",
  "SAM2"="gcttgAAAACTGTGGcgtttt",
  "MET6"="gtcgcAAAACTGTGGtagtca",
  "MET30"="ccgcgCAAACTGTGGcttccc",
  "ZWF1"="ataagCAAACTGTGGgttcat",
  "MET14"="cctcaAAAAATGTGGcaatgg",
  "MET17"="tcatgAAAACTGTGTaacata",
  "MET2"="tgcaaAAAATTGTGGatgcac",
  "MET8"="ggaaaAAAAATGTGAaaatcg",
  "MET1"="cataaTAAACTGTGAacggac")

## Choose priors based on yeast non-coding sequences
prior <- c("A"=0.32, "C"=0.18, "G"=0.18, "T"=0.32)

## Build the PSSM
pssm <- seqToPSSM(sequences = sequences, prior = prior)

## Print count table
print(pssm$counts)

## Print weight matrix
signif(pssm$weights, digits=2)

## Plot a heatmap of the counts
heatmap.simple(pssm$counts, auto.margins=FALSE, xlab="Position",
     ylab="Residues", main="Yeast Met31p count matrix", las=1)

jvanheld/stats4bioinfo documentation built on May 20, 2019, 5:16 a.m.