aggr_rep: Aggregate "Sample__Condition" Replicates in Data Matrix
In jianhaizhang/spatialHeatmap: spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

aggr_rep

R Documentation

Aggregate "Sample__Condition" Replicates in Data Matrix

Description

This function aggregates "sample__condition" (see data argument) replicates by mean or median. The input data is either a data.frame or SummarizedExperiment.
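As a rough illustration of what aggregating replicates means, the base-R sketch below averages the columns of a toy matrix that share the same "sample__condition" name. It is not the package's internal implementation: the helper `aggr_cols` and the toy data are hypothetical and only mirror the behavior described above.

```r
# Toy matrix: 2 genes, 4 columns; replicates share identical column names.
mat <- matrix(
  c(1, 3, 5, 7, 10, 20, 2, 4), nrow = 2,
  dimnames = list(
    c("gene1", "gene2"),
    c("brain__day1", "brain__day1", "heart__day1", "heart__day1")
  )
)

# Hypothetical helper (not part of spatialHeatmap): collapse columns with
# identical names by row-wise mean or median.
aggr_cols <- function(x, fun = c("mean", "median")) {
  fun <- match.fun(match.arg(fun))
  # Group column indices by their "sample__condition" label.
  groups <- split(seq_len(ncol(x)), colnames(x))
  res <- vapply(
    groups,
    function(idx) apply(x[, idx, drop = FALSE], 1, fun),
    numeric(nrow(x))
  )
  rownames(res) <- rownames(x)
  res
}

aggr <- aggr_cols(mat, "mean")
print(aggr)
```

The result has one column per unique "sample__condition" label, which is the shape `aggr_rep` is documented to return.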

Usage

aggr_rep(data, assay.na = NULL, sam.factor, con.factor = NULL, aggr = "mean")

Arguments

`data`	Terms. Spatial features: cells, tissues, organs, etc. Variables: experimental variables such as drug dosage, temperature, time points, etc. Biomolecules: genes, proteins, metabolites, etc. Spatial heatmap: SHM.
	`SummarizedExperiment`. The `assays` slot stores the data matrix, where rows and columns are biomolecules and spatial features respectively. Typically, the `colData` slot stores at least two columns, one for spatial features and one for variables. When plotting SHMs, only spatial features that are identical between the data and the aSVG are colored according to the expression values of the chosen biomolecules. Replicates of the same type in these two columns should be identical, e.g. "tissueA", "tissueA" rather than "tissueA1", "tissueA2". If column names in the `assays` slot follow the "spatialFeature__variable" scheme, i.e. spatial features and variables are concatenated by a double underscore, then the `colData` slot is not required at all. If the data do not have experimental variables, neither the variable column in `colData` nor the double underscore scheme is required.
	`data.frame`. Rows and columns are biomolecules and spatial features respectively. If there are experimental variables, the column names should follow the naming scheme "spatialFeature__variable". Otherwise, the column names should only include spatial features. The double underscore is a reserved string in `spatialHeatmap` and should therefore be avoided when naming spatial features or variables. A column of biomolecule descriptions can be included. It is only used in the interactive network graph (see `network`), where mousing over a node displays the corresponding description.
	`vector`. In the function `shm`, the data can be provided as a numeric `vector` for testing with a single gene. If so, the naming scheme of the vector is the same as that of the `data.frame`.
	Multiple variables. For plotting SHMs, multiple variables contained in the data can be combined into a composite one, and the composite variable will be treated as a regular single variable. See the vignette for more details by running `browseVignettes('spatialHeatmap')` in R.
`assay.na`	The name of the target assay to use when `data` is a `SummarizedExperiment`.
`sam.factor`	The column name corresponding to spatial features in the `colData` of a `SummarizedExperiment`. If the column names in the `assays` slot already follow the scheme "spatialFeature__variable", then the `colData` slot is not required and accordingly this argument can be NULL.
`con.factor`	The column name corresponding to experimental variables in the `colData` of a `SummarizedExperiment`. It can be NULL if the column names in the `assays` slot already follow the scheme "spatialFeature__variable", or if no variable is associated with the data.
`aggr`	Aggregate "sample__condition" replicates by "mean" or "median". The default is "mean". If the `data` argument is a `SummarizedExperiment`, the "sample__condition" replicates are internally formed by connecting samples and conditions in the `colData` slot with "__", and these labels subsequently replace the original column names in the `assays` slot. If no condition is specified in `con.factor`, the data are aggregated by sample replicates. If "none", no aggregation is applied.
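The labeling step described for `aggr` can be sketched in base R as follows. A plain `data.frame` stands in for the `colData` slot, and the column names `organism_part` and `age` are borrowed from the Examples; the sample rows themselves are made up for illustration, and this is an assumption about the general idea rather than the package's actual code.

```r
# Stand-in for colData: one row per column of the data matrix.
targets <- data.frame(
  organism_part = c("brain", "brain", "heart", "heart"),
  age = c("day10", "day10", "day10", "day12"),
  row.names = c("s1", "s2", "s3", "s4")
)

# With a condition column: join sample and condition by a double underscore.
with_con <- paste(targets$organism_part, targets$age, sep = "__")

# Without a condition column (con.factor = NULL): samples alone define
# the replicate groups.
no_con <- targets$organism_part

# Number of replicates per group under each labeling.
print(table(with_con))
print(table(no_con))
```

Columns that end up with the same label are the replicates that would be aggregated together.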

Value

The returned value has the same class as the input data, i.e. a data.frame or SummarizedExperiment. In either case, the column names of the data matrix follow the "sample__condition" scheme.
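Because the returned column names follow the "sample__condition" scheme, they can be split back into their two parts when needed. The sketch below uses base R only; the column names are invented for illustration and the object names `cn` and `parts` are hypothetical.

```r
# Example column names in the "sample__condition" scheme.
cn <- c("brain__day10", "heart__day10", "heart__day12")

# Split on the literal double underscore (fixed = TRUE avoids regex parsing).
parts <- do.call(rbind, strsplit(cn, "__", fixed = TRUE))
colnames(parts) <- c("sample", "condition")
print(parts)
```

This relies on the double underscore appearing exactly once per name, which is why the string is reserved and should not be used inside sample or condition names.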

Author(s)

Jianhai Zhang jzhan067@ucr.edu
Dr. Thomas Girke thomas.girke@ucr.edu

References

Morgan M, Obenchain V, Hester J, Pagès H (2022). SummarizedExperiment: SummarizedExperiment container. R package version 1.28.0, <https://bioconductor.org/packages/SummarizedExperiment>.
R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/
Keays, Maria. 2019. ExpressionAtlas: Download Datasets from EMBL-EBI Expression Atlas.
Love, Michael I., Wolfgang Huber, and Simon Anders. 2014. "Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2." Genome Biology 15 (12): 550. doi:10.1186/s13059-014-0550-8
McCarthy, Davis J., Yunshun Chen, and Gordon K. Smyth. 2012. "Differential Expression Analysis of Multifactor RNA-Seq Experiments with Respect to Biological Variation." Nucleic Acids Research 40 (10): 4288–97.
Cardoso-Moreira, Margarida, Jean Halbert, Delphine Valloton, Britta Velten, Chunyan Chen, Yi Shao, Angélica Liechti, et al. 2019. "Gene Expression Across Mammalian Organ Development." Nature 571 (7766): 505–9.
Amezquita R, Lun A, Becht E, Carey V, Carpp L, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pagès H, Smith M, Huber W, Morgan M, Gottardo R, Hicks S (2020). "Orchestrating single-cell analysis with Bioconductor." Nature Methods, 17, 137–145. https://www.nature.com/articles/s41592-019-0654-x

Examples


## Two example data sets are showcased for the data formats of "data.frame" and 
## "SummarizedExperiment" respectively. Both come from an RNA-seq analysis on the
## development of chicken organs (Cardoso-Moreira et al. 2019). For convenience,
## they are included in this package. The complete raw count data are downloaded
## using the R package ExpressionAtlas (Keays 2019) with the accession number
## "E-MTAB-6769". 

# Access example data 1.
df.chk <- read.table(system.file('extdata/shinyApp/data/count_chicken_simple.txt', 
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t', check.names=FALSE)

# Column names follow the naming scheme
# "spatialFeature__variable".  
df.chk[1:3, ]

# A column of gene description can be optionally appended.
ann <- paste0('ann', seq_len(nrow(df.chk))); ann[1:3]
df.chk <- cbind(df.chk, ann=ann)
df.chk[1:3, ]

# Access example data 2. 
count.chk <- read.table(system.file('extdata/shinyApp/data/count_chicken.txt', 
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t')
count.chk[1:3, 1:5]

# A targets file describing spatial features and variables is required for example  
# data 2, which should be made based on the experiment design. 

# Access the targets file. 
target.chk <- read.table(system.file('extdata/shinyApp/data/target_chicken.txt', 
package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t')
# Every column in example data 2 corresponds with a row in the targets file. 
target.chk[1:5, ]
# Store example data 2 in "SummarizedExperiment".
library(SummarizedExperiment)
se.chk <- SummarizedExperiment(assay=count.chk, colData=target.chk)
# The "rowData" slot can optionally store a data frame of gene annotation.
rowData(se.chk) <- DataFrame(ann=ann)

# Normalize data.
df.chk.nor <- norm_data(data=df.chk, norm.fun='CNF', log2.trans=TRUE)
se.chk.nor <- norm_data(data=se.chk, norm.fun='CNF', log2.trans=TRUE)

# Aggregate replicates of "spatialFeature__variable", where spatial features are organs
# and variables are ages.
df.chk.aggr <- aggr_rep(data=df.chk.nor, aggr='mean')
df.chk.aggr[1:3, ]

se.chk.aggr <- aggr_rep(data=se.chk.nor, sam.factor='organism_part', con.factor='age',
aggr='mean')
assay(se.chk.aggr)[1:3, 1:3]

# Genes with expression values >= 5 in at least 1% of all samples (pOA), and coefficient
# of variation (CV) between 0.2 and 100 are retained.
df.chk.fil <- filter_data(data=df.chk.aggr, pOA=c(0.01, 5), CV=c(0.2, 100))
se.chk.fil <- filter_data(data=se.chk.aggr, sam.factor='organism_part', con.factor='age', 
pOA=c(0.01, 5), CV=c(0.2, 100), file=NULL)

jianhaizhang/spatialHeatmap documentation built on Nov. 28, 2024, 4:44 p.m.
