View source: R/class_RegspliceData.R
RegspliceData-class | R Documentation |
RegspliceData
objects contain data in the format required by functions in the
regsplice
analysis pipeline.
RegspliceData(counts, gene_IDs = NULL, n_exons = NULL, condition = NULL)
## S4 method for signature 'RegspliceData'
assays(x, withDimnames, ..., value)
countsData(x)
## S4 method for signature 'RegspliceData'
countsData(x)
weightsData(x)
## S4 method for signature 'RegspliceData'
weightsData(x)
## S4 method for signature 'RegspliceData'
rowData(x)
## S4 method for signature 'RegspliceData'
colData(x, ..., value)
## S4 method for signature 'RegspliceData,ANY,ANY,ANY'
x[i, j]
counts |
RNA-seq read counts or exon microarray intensities (matrix or data
frame). Rows are exons, and columns are biological samples. Alternatively,
|
gene_IDs |
Vector of gene IDs (character vector). Length is equal to the number of genes. |
n_exons |
Vector of exon lengths (numeric vector of integers), i.e. the number of exon bins per gene. Length is equal to the number of genes. |
condition |
Experimental condition for each biological sample (character or numeric vector, or factor). |
x |
|
withDimnames |
See |
... |
Additional arguments for replacement with |
value |
Value for replacement with |
i |
Gene names (character vector) or row numbers (numeric vector) for subsetting genes or exons. Note that when subsetting whole genes, gene names (character vector) should be provided instead of row numbers, to avoid possible errors due to selecting incorrect row numbers. Row numbers may be provided to subset individual exons. |
j |
Column numbers (numeric vector) for subsetting biological samples. |
The RegspliceData
format is based on the
SummarizedExperiment
container. Initially, objects
contain raw data along with meta-data for rows (genes and exons) and columns
(biological samples). During subsequent steps in the regsplice
analysis
pipeline, the data values are modified, and additional data and meta-data are added to
the object. Final results are stored in a RegspliceResults
object.
RegspliceData
objects are created with the constructor function
RegspliceData()
.
Required inputs for the constructor function are counts
(matrix or data frame
of RNA-seq read counts or exon microarray intensities), gene_IDs
(vector of
gene IDs), n_exons
(vector of exon lengths, i.e. number of exon bins per gene),
and condition
(vector of experimental conditions for each biological sample).
Alternatively, the inputs can be provided as a SummarizedExperiment
object,
which will be parsed to extract each of these components. This may be useful when
running regsplice
as part of a pipeline together with other packages.
See the vignette for an example showing how to construct gene_IDs
and
n_exons
from a column of gene:exon IDs.
Exon microarray intensities should be log2-transformed, which is usually done during
pre-processing of microarray data. (RNA-seq counts will be transformed automatically
during the regsplice
analysis pipeline; see runVoom
.)
After creating a RegspliceData
object, the wrapper function
regsplice
can be used to run the analysis pipeline with a single
command. Alternatively, you can run the individual functions for each step in the
pipeline, beginning with filterZeros
(see vignette for details).
Returns a RegspliceData
object.
counts
Matrix of RNA-seq read counts or exon microarray intensities. Rows are exons, and columns are biological samples.
weights
(Optional) Matrix of observation-level weights. Rows are exons, and
columns are biological samples. Created by the runVoom
function.
rowData
DataFrame
of row meta-data. This should contain two columns:
gene_IDs
and exon_IDs
, which are created by the RegspliceData
constructor function.
colData
DataFrame
of column meta-data. This contains the experimental
condition and (optionally) normalization factors for each biological sample.
Normalization factors are created by the runVoom
function.
countsData()
: Accesses the counts
data matrix.
weightsData()
: Accesses the (optional) weights
data matrix.
rowData()
: Accesses the DataFrame
of row meta-data. This should
contain two columns: gene_IDs
and exon_IDs
.
colData()
: Accesses the DataFrame
of column meta-data. This
contains the experimental condition and (optionally) normalization factors for each
biological sample.
Subsetting of RegspliceData
objects is performed with square brackets,
x[i, j]
, where x
is the name of the object. The subsetting operations
are designed to keep data and meta-data in sync.
For subsetting by rows, there are two possibilities:
Subsetting genes: To subset whole genes, provide a character vector of gene
names to the argument i
. The returned object will contain all rows
corresponding to these genes. Row numbers should not be used when subsetting whole
genes, since this risks potential errors due to selecting incorrect rows.
Subsetting exons: To subset individual exons, provide the corresponding row
numbers to the argument i
.
For subsetting by columns (biological samples), provide the corresponding column
numbers to the argument j
.
regsplice
filterZeros
# ---------
# Example 1
# ---------
counts <- matrix(sample(100:200, 14 * 6, replace = TRUE), nrow = 14, ncol = 6)
gene_IDs <- paste0("gene", 1:5)
n_exons <- c(3, 2, 3, 1, 5)
condition <- rep(c(0, 1), each = 3)
rs_data <- RegspliceData(counts, gene_IDs, n_exons, condition)
rs_data
countsData(rs_data)
rowData(rs_data)
colData(rs_data)
rs_data[1, ]
rs_data[1, 1:3]
rs_data["gene1", ]
rs_data["gene1", 1:3]
# --------------------
# Example 2 (Vignette)
# --------------------
file_counts <- system.file("extdata/vignette_counts.txt", package = "regsplice")
data <- read.table(file_counts, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
head(data)
counts <- data[, 2:7]
tbl_exons <- table(sapply(strsplit(data$exon, ":"), function(s) s[[1]]))
gene_IDs <- names(tbl_exons)
n_exons <- unname(tbl_exons)
condition <- rep(c("untreated", "treated"), each = 3)
rs_data <- RegspliceData(counts, gene_IDs, n_exons, condition)
rs_data
head(countsData(rs_data))
rowData(rs_data)
colData(rs_data)
rs_data[1, ]
rs_data[1, 1:3]
rs_data["ENSG00000000003", ]
rs_data["ENSG00000000003", 1:3]
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.