suppressPackageStartupMessages({
  library("BiocStyle")
  library("RICdata")
  library(QFeatures)
  library(magrittr)
  library(tidyverse)
})
RICdata is a data package containing the data needed to analyse RNA interaction capture experiments from the manuscript "Global analysis of RNA-binding protein dynamics by comparative and enhanced RNA interactome capture", published in Nature Protocols 16, 27–60 (2021).
Peptide information was extracted from PRIDE dataset PXD009789. A total of 9 oligo(dT) capture and total cell lysate samples originate from quantitative SILAC mass spectrometry proteomics experiments [@Garcia-Moreno:2019].
# Path to tabular data
WCLpeptidesfilepath <- system.file("extdata", "WCL_peptides.txt", package = "RICdata")
RICpeptidesfilepath <- system.file("extdata", "RIC_peptides.txt", package = "RICdata")
# Load the raw peptide tables shipped with the package and check their dimensions
data("WCLpeptides.raw")
dim(WCLpeptides.raw)
data("RICpeptides.raw")
dim(RICpeptides.raw)
The indices of the columns to be used as expression values are as follows:
# WCL table: columns whose names match any of the three intensity patterns
j <- str_which(
  colnames(WCLpeptides.raw),
  str_c(c("Intensity.((\\D)).18_M_4",
          "Intensity.((\\D)).4_18_M",
          "Intensity.((\\D)).M_4_18"),
        collapse = "|")
)
colnames(WCLpeptides.raw)[j]
# RIC table: [HML] matches exactly one label character (H, M or L)
i <- str_which(colnames(RICpeptides.raw), "Intensity.[HML].")
colnames(RICpeptides.raw)[i]
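To see what such a pattern selects, here is a minimal sketch on made-up column names. The names in `toy_cols` are hypothetical and only imitate the layout the patterns above expect; base R `grep()` is used instead of `str_which()` so the example runs without extra packages.

```r
# Hypothetical column names, not taken from the RICdata tables
toy_cols <- c("Sequence", "Proteins",
              "Intensity.L.18_M_4", "Intensity.M.18_M_4", "Intensity.H.18_M_4",
              "Intensity.18_M_4", "Intensity")

# "Intensity", any character, one non-digit, any character, then the suffix
pattern <- "Intensity.((\\D)).18_M_4"

# Indices of the matching columns
idx <- grep(pattern, toy_cols)
toy_cols[idx]
# "Intensity.L.18_M_4" "Intensity.M.18_M_4" "Intensity.H.18_M_4"
```

The summed column `Intensity.18_M_4` is not selected, because the non-digit `\\D` cannot match the `1` that follows the first dot.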
We can convert tabular data into a QFeatures object:
QWCLpeptides <- readQFeatures(WCLpeptidesfilepath,
                              ecol = j,
                              sep = "\t",
                              name = "peptides",
                              fnames = "Sequence")
QRICpeptides <- readQFeatures(RICpeptidesfilepath,
                              ecol = i,
                              sep = "\t",
                              name = "peptides",
                              fnames = "Sequence")
We can annotate our QFeatures objects with sample metadata. This is important because it defines the order and the sample names of the experiments.
sample_names <- c('hour18', 'hour4', 'mock')
# Annotate the whole cell lysate samples
QWCLpeptides$group <- paste(sample_names, rep(1:3, each = 3), sep = '_')
QWCLpeptides$sample <- rep(1:3, each = 3)
colData(QWCLpeptides)
# Annotate the RIC samples in the same way
QRICpeptides$group <- paste(sample_names, rep(1:3, each = 3), sep = '_')
QRICpeptides$sample <- rep(1:3, each = 3)
colData(QRICpeptides)
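Because the annotation is assigned by position, it helps to inspect the labels that `paste()` builds before attaching them. The sketch below only reproduces the vector arithmetic used above and needs no package data; it assumes nine quantitative columns, as in the code above.

```r
sample_names <- c('hour18', 'hour4', 'mock')

# sample_names (length 3) is recycled against the 9 replicate numbers
group <- paste(sample_names, rep(1:3, each = 3), sep = '_')
sample <- rep(1:3, each = 3)

data.frame(group = group, sample = sample)
# group runs hour18_1, hour4_1, mock_1, hour18_2, ..., mock_3
```

The first label is attached to the first selected intensity column, the second to the second, and so on, so the column order returned by the index vectors must follow the same condition order.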
We filter out contaminant proteins and decoy database hits, which are indicated by "+" in the columns "Potential.contaminant" and "Reverse" respectively, using the QFeatures filtering functions. Only features for which both columns are empty are kept.
QWCLpeptidesfiltered <- QWCLpeptides %>%
  filterFeatures(~ Reverse == "") %>%
  filterFeatures(~ Potential.contaminant == "")
QRICpeptidesfiltered <- QRICpeptides %>%
  filterFeatures(~ Reverse == "") %>%
  filterFeatures(~ Potential.contaminant == "")
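The effect of the two filters can be illustrated on a plain data frame. This is a sketch with invented peptides, assuming the flag columns hold either an empty string or "+"; it mimics the row selection with base R `subset()` and is not how `filterFeatures()` is implemented.

```r
# Hypothetical feature annotation, not taken from the RICdata tables
row_data <- data.frame(
  Sequence = c("AAAGK", "LLSEK", "TTPVR", "GGDFK"),
  Reverse = c("", "+", "", ""),
  Potential.contaminant = c("", "", "+", ""),
  stringsAsFactors = FALSE
)

# Keep rows that are neither decoy hits nor potential contaminants
kept <- subset(row_data, Reverse == "" & Potential.contaminant == "")
kept$Sequence
# "AAAGK" "GGDFK"
```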
We can retain only the rowData variables of interest. To do this we can use the QFeatures::selectRowData function.
# Number of rowData variables before selection
rowDataNames(QWCLpeptidesfiltered)[["peptides"]] %>% length()
rowDataNames(QRICpeptidesfiltered)[["peptides"]] %>% length()
rowvars <- c("Sequence", "Proteins", "Leading.razor.protein")
QWCLpeptidesfiltered_clean <- selectRowData(QWCLpeptidesfiltered, rowvars)
QRICpeptidesfiltered_clean <- selectRowData(QRICpeptidesfiltered, rowvars)
# Number of rowData variables after selection
rowDataNames(QWCLpeptidesfiltered_clean)[["peptides"]] %>% length()
rowDataNames(QRICpeptidesfiltered_clean)[["peptides"]] %>% length()
The RICdata package also contains a reduced version of the data in ProtFeatures [@Castello:2016]. This object is called miniProtFeatures and contains protein sequence information.
miniProtFeatures is a list with the following objects:
data(miniProtFeatures)
head(miniProtFeatures$ProtSeq)
head(miniProtFeatures$GeneName)
head(miniProtFeatures$Symbol)
The GO annotation provided in mRNAinteractome is included and called ENSG2category.
data(ENSG2category)
head(ENSG2category)
An index mapping every amino acid 4-mer to the proteins that contain it is provided as the Index object. It is used by the function mapPeptides, included in the RIC package, to map peptides back onto a protein sequence database.
data(Index)
head(Index$AAAA)
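The idea behind a 4-mer index can be sketched on toy sequences. Everything below is hypothetical: the sequences, `kmers()`, `toy_index` and `map_peptide()` are invented for illustration, and neither the internal layout of Index nor the actual behaviour of mapPeptides is assumed to match.

```r
# Toy protein sequences (hypothetical, not from RICdata)
prot_seq <- c(P1 = "MKAAAAGT", P2 = "GTAAAAKL", P3 = "MKLLGT")

# All overlapping k-mers of a single sequence
kmers <- function(s, k = 4) {
  n <- nchar(s) - k + 1
  if (n < 1) return(character(0))
  substring(s, seq_len(n), seq_len(n) + k - 1)
}

# Invert the relation: for each 4-mer, list the proteins containing it
pairs <- stack(lapply(prot_seq, function(s) unique(kmers(s))))
toy_index <- split(as.character(pairs$ind), pairs$values)
toy_index$AAAA
# "P1" "P2"

# Use the first 4-mer of a peptide to shortlist proteins, then confirm
# the full peptide by exact substring matching
map_peptide <- function(pep, index, seqs) {
  cand <- index[[substr(pep, 1, 4)]]
  if (is.null(cand)) return(character(0))
  cand[vapply(cand, function(p) grepl(pep, seqs[[p]], fixed = TRUE), logical(1))]
}
map_peptide("AAAAGT", toy_index, prot_seq)
# "P1"
```

Looking up one 4-mer narrows the search to a handful of candidate proteins, so the full substring check runs on a short list rather than on the whole database.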
All these data are required to run the functions in the RIC package in order to analyse RNA interaction capture data.
sessionInfo()