make.pwm: Construct position-weight matrix for subsets of reads from...



Description

This function takes a data frame as input, which should be in the format returned by read.bam and clip.bam. It filters the reads by mapped length (keeping those between minlen and maxlen), and then counts the occurrences of each base at each position along the read(s). The counts are then optionally scaled.
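
The filter-then-count step described above can be illustrated with a minimal base-R sketch. This is not the viRome implementation: `count_bases` is a hypothetical helper, and a plain character vector of toy reads stands in for the sequence column of the input data frame, whose column names are not shown on this page. How the real function treats reads of differing lengths is also an assumption here.

```r
# Toy reads standing in for the sequences held in a read.bam/clip.bam data frame
seqs <- c("TGAC", "TGCA", "AGAC", "TG", "TGACGT")

# Hypothetical helper (not part of viRome): count each base at each read position
count_bases <- function(seqs, minlen = 1, maxlen = 37,
                        bases = c("A", "G", "C", "T")) {
  # Keep only reads whose length lies between minlen and maxlen (inclusive)
  keep <- seqs[nchar(seqs) >= minlen & nchar(seqs) <= maxlen]
  npos <- if (length(keep) > 0) max(nchar(keep)) else 0
  # One row per base, one column per read position
  counts <- matrix(0, nrow = length(bases), ncol = npos,
                   dimnames = list(bases, NULL))
  for (s in keep) {
    chars <- strsplit(s, "")[[1]]
    for (i in seq_along(chars)) {
      # Bases outside the alphabet (for example N) are skipped
      if (chars[i] %in% bases) {
        counts[chars[i], i] <- counts[chars[i], i] + 1
      }
    }
  }
  counts
}

# Only the three 4-base reads pass the length filter
counts <- count_bases(seqs, minlen = 4, maxlen = 4)
print(counts)
```

Each column of the resulting matrix sums to the number of reads that cover that position, which is the total that the scaling step divides by.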

Usage

make.pwm(vdf = NULL, minlen = 1, maxlen = 37, scaled = TRUE, strand = "pos", revcom = FALSE, ttou = FALSE)

Arguments

vdf

Data frame; should be the output of clip.bam

minlen

The minimum length of mapped read to include

maxlen

The maximum length of mapped read to include

scaled

Whether or not to scale each column of base counts to the total number of bases in that column. Default: TRUE

strand

The strand to calculate the PWM on: either "pos" or "neg"

revcom

Whether or not to reverse-complement the sequence. We recommend leaving this at the default (FALSE). The default works in most cases; an exception is when, for example, a negative-strand virus has been published as a positive strand and you have aligned your data to that positive strand.

ttou

Whether or not to convert T to U (uracil). Currently not recommended. In all likelihood, you measured cDNA anyway, not RNA, and therefore you should report a T :)
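
As a rough illustration of what the scaled, revcom and ttou options describe, the base-R sketch below applies each transformation to toy data. It assumes that scaling means dividing each column by its column total (giving proportions); whether the package reports proportions or some other scale is not stated here. `revcom_seq` is a hypothetical helper, not a viRome function.

```r
# Toy count matrix: rows are bases, columns are read positions (illustrative values)
counts <- matrix(c(1, 0, 0, 2,
                   0, 3, 0, 0,
                   2, 0, 1, 0),
                 nrow = 4,
                 dimnames = list(c("A", "G", "C", "T"), NULL))

# scaled: divide every column by its total so that each column sums to 1
scaled <- sweep(counts, 2, colSums(counts), "/")

# revcom: complement each base, then reverse the order of the bases
revcom_seq <- function(s) {
  comp <- chartr("ACGT", "TGCA", s)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}
rc <- revcom_seq("TGAC")

# ttou: replace T with U
rna <- chartr("T", "U", "TGAC")

print(scaled)
print(rc)
print(rna)
```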

Value

A matrix with the bases c("A","G","C","T") as rows, the read positions as columns, and the (optionally scaled) counts as values
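
To show how a matrix of this shape can be read, the sketch below builds a toy matrix with the same row layout and extracts one value plus the most frequent base per position. The values are invented for illustration, and the absence of column names is an assumption; the actual return value may label its columns.

```r
# Toy matrix with bases as rows and read positions as columns (made-up proportions)
pwm <- matrix(c(0.10, 0.10, 0.10, 0.70,
                0.20, 0.60, 0.10, 0.10,
                0.80, 0.10, 0.05, 0.05),
              nrow = 4,
              dimnames = list(c("A", "G", "C", "T"), NULL))

# Proportion of T at the first read position
t_first <- pwm["T", 1]

# Most frequent base at each position, joined into a consensus string
consensus <- paste(rownames(pwm)[apply(pwm, 2, which.max)], collapse = "")

print(t_first)
print(consensus)
```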

Author(s)

Mick Watson

Examples

## Not run: infile <- system.file("examples/SRR389184_vs_SINV_sorted.bam", package="viRome")
## Not run: bam <- read.bam(bamfile=infile, chr="SINV", minlen=1, maxlen=11703, removeN=TRUE)
## Not run: bamc <- clip.bam(bam)
## Not run: make.pwm(bamc, minlen=25, maxlen=37)

mw55309/viRome_legacy documentation built on Dec. 21, 2021, 11:05 p.m.