This vignette describes an analysis of Repeat Induced Point (RIP)
mutations in R using the ripr package. ripr contains functionality
for parsing RepeatMasker output, calculating RIP scores, and plotting
scores along chromosomes in Manhattan-like plots. Before we start with
the example analysis, we describe how ripr represents RepeatMasker
output.
library(knitr)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE,
                      fig.width = 8, fig.height = 6, autodep = TRUE,
                      cache = FALSE, include = TRUE, eval = TRUE,
                      tidy = FALSE, dev = c('png'))
# Print inline numbers with a space as the thousands separator
knitr::knit_hooks$set(inline = function(x) {
  prettyNum(x, big.mark = " ")
})

library(ripr)
library(ggplot2)
library(plyr)
library(dplyr)
library(cowplot)
library(Biostrings)
library(GenomeInfoDb)
library(GenomicRanges)

library(viridis)
library(RColorBrewer)
# Black-and-white theme with rotated x-axis labels
bw <- theme_bw(base_size = 18) %+replace%
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))
theme_set(bw)
color.pal.4 <- brewer.pal(name = "Paired", n = 4)
RepeatMasker screens sequences for repeats and low-complexity regions.
ripr contains functions to parse two of RepeatMasker's output files,
namely the annotation and alignment results. These files pair
coordinates in the input sequence that is scanned for repeats (here,
the genome sequence) with coordinates in the repeat sequence that is
defined in the RepeatMasker library. The input sequence will
henceforth be referred to as the query and the repeat sequence as the
subject.
ripr stores the results from RepeatMasker output as an AlignmentPairs
object, which is a subclass of the Bioconductor class
S4Vectors::Pairs. A Pairs object aligns two vectors of equal length,
held in the slots first and second, and the AlignmentPairs object adds
extra slots related to RepeatMasker output. In the following sections we
parse RepeatMasker output and explain the additional slot names.
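To make the Pairs layout concrete, here is a minimal sketch that builds a plain S4Vectors::Pairs object by hand. It does not use ripr or the AlignmentPairs class; the coordinates, sequence names and the score column are invented for illustration, and the objects that ripr actually returns may carry different metadata.

```r
library(S4Vectors)
library(GenomicRanges)

# Hypothetical query coordinates (positions in the genome sequence)
query <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges = IRanges::IRanges(start = c(101, 5001), end = c(400, 5250))
)

# Hypothetical subject coordinates (positions in the library repeats)
subject <- GRanges(
  seqnames = c("repeatA", "repeatB"),
  ranges = IRanges::IRanges(start = c(1, 20), end = c(300, 270))
)

# Element i of `first` is paired with element i of `second`;
# extra named arguments become metadata columns of the pairs
pairs <- S4Vectors::Pairs(query, subject, score = c(1306, 842))

S4Vectors::first(pairs)   # the query ranges
S4Vectors::second(pairs)  # the subject ranges
mcols(pairs)$score        # per-pair metadata
```

The accessors are called with the S4Vectors:: prefix because dplyr, which this vignette also loads, exports a function named first as well.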
For more information about RepeatMasker and its output formats, see [@repeatmasker_docs].
The RepeatMasker annotation output is a simple tabular format where each line consists of a query-subject pair and additional statistics such as score and percentage divergence: