aggr_rep | R Documentation |
This function aggregates "sample__condition" (see data
argument) replicates by mean or median. The input data is either a data.frame
or SummarizedExperiment
.
aggr_rep(data, assay.na = NULL, sam.factor, con.factor = NULL, aggr = "mean")
data |
|
assay.na |
The name of target assay to use when |
sam.factor |
The column name corresponding to spatial features in |
con.factor |
The column name corresponding to experimental variables in |
aggr |
Aggregate "sample__condition" replicates by "mean" or "median". The default is "mean". If the |
The returned value is the same class with the input data, a data.frame
or SummarizedExperiment
. In either case, the column names of the data matrix follows the "sample__condition" scheme.
Jianhai Zhang jzhan067@ucr.edu
Dr. Thomas Girke thomas.girke@ucr.edu
Morgan M, Obenchain V, Hester J, Pagès H (2022). SummarizedExperiment: SummarizedExperiment container. R package version 1.28.0, <https://bioconductor.org/packages/SummarizedExperiment>. R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/ Keays, Maria. 2019. ExpressionAtlas: Download Datasets from EMBL-EBI Expression Atlas Love, Michael I., Wolfgang Huber, and Simon Anders. 2014. "Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2." Genome Biology 15 (12): 550. doi:10.1186/s13059-014-0550-8 McCarthy, Davis J., Chen, Yunshun, Smyth, and Gordon K. 2012. "Differential Expression Analysis of Multifactor RNA-Seq Experiments with Respect to Biological Variation." Nucleic Acids Research 40 (10): 4288–97 Cardoso-Moreira, Margarida, Jean Halbert, Delphine Valloton, Britta Velten, Chunyan Chen, Yi Shao, Angélica Liechti, et al. 2019. “Gene Expression Across Mammalian Organ Development.” Nature 571 (7766): 505–9 Amezquita R, Lun A, Becht E, Carey V, Carpp L, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pages H, Smith M, Huber W, Morgan M, Gottardo R, Hicks S (2020). “Orchestrating single-cell analysis with Bioconductor.” Nature Methods, 17, 137–145. https://www.nature.com/articles/s41592-019-0654-x
## Two example data sets are showcased for the data formats of "data.frame" and
## "SummarizedExperiment" respectively. Both come from an RNA-seq analysis on
## For conveninece, they are included in this package. The complete raw count data are
## downloaded using the R package ExpressionAtlas (Keays 2019) with the accession
## number "E-MTAB-6769".
# Access example data 1.
df.chk <- read.table(system.file('extdata/shinyApp/data/count_chicken_simple.txt',
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t', check.names=FALSE)
# Column names follow the naming scheme
# "spatialFeature__variable".
df.chk[1:3, ]
# A column of gene description can be optionally appended.
ann <- paste0('ann', seq_len(nrow(df.chk))); ann[1:3]
df.chk <- cbind(df.chk, ann=ann)
df.chk[1:3, ]
# Access example data 2.
count.chk <- read.table(system.file('extdata/shinyApp/data/count_chicken.txt',
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t')
count.chk[1:3, 1:5]
# A targets file describing spatial features and variables is required for example
# data 2, which should be made based on the experiment design.
# Access the targets file.
target.chk <- read.table(system.file('extdata/shinyApp/data/target_chicken.txt',
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t')
# Every column in example data 2 corresponds with a row in the targets file.
target.chk[1:5, ]
# Store example data 2 in "SummarizedExperiment".
library(SummarizedExperiment)
se.chk <- SummarizedExperiment(assay=count.chk, colData=target.chk)
# The "rowData" slot can optionally store a data frame of gene annotation.
rowData(se.chk) <- DataFrame(ann=ann)
# Normalize data.
df.chk.nor <- norm_data(data=df.chk, norm.fun='CNF', log2.trans=TRUE)
se.chk.nor <- norm_data(data=se.chk, norm.fun='CNF', log2.trans=TRUE)
# Aggregate replicates of "spatialFeature_variable", where spatial features are organs
# and variables are ages.
df.chk.aggr <- aggr_rep(data=df.chk.nor, aggr='mean')
df.chk.aggr[1:3, ]
se.chk.aggr <- aggr_rep(data=se.chk.nor, sam.factor='organism_part', con.factor='age',
aggr='mean')
assay(se.chk.aggr)[1:3, 1:3]
# Genes with experssion values >= 5 in at least 1% of all samples (pOA), and coefficient
# of variance (CV) between 0.2 and 100 are retained.
df.chk.fil <- filter_data(data=df.chk.aggr, pOA=c(0.01, 5), CV=c(0.2, 100))
se.chk.fil <- filter_data(data=se.chk.aggr, sam.factor='organism_part', con.factor='age',
pOA=c(0.01, 5), CV=c(0.2, 100), file=NULL)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.