SpatialEnrichment: Identifying spatially enriched or depleted biomolecules
In jianhaizhang/spatialHeatmap: spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

SpatialEnrichment

R Documentation

Identifying spatially enriched or depleted biomolecules

Description

Spatial enrichment (SpEn) detects spatially enriched or depleted biomolecules (genes, proteins, etc.) for chosen spatial features (cellular compartments, tissues, organs, etc.). It compares each feature with all other features, which serve as references. Biomolecules significantly up- or down-regulated in one feature relative to the reference features are denoted spatially enriched or depleted, respectively. The underlying differential expression analysis methods include edgeR (Robinson et al, 2010), limma (Ritchie et al, 2015), and DESeq2 (Love et al, 2014). Querying a feature of interest in the enrichment results returns its enriched or depleted biomolecules.
In addition, SpEn can identify biomolecules enriched or depleted in experiment variables in a similar manner.
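
The comparison logic described above can be pictured with a small base R sketch. It is an illustration only, not the package's implementation: the function name `classify_toy`, the toy matrices, and the rule that a biomolecule must pass the cutoffs against all references except the allowed number of outliers are assumptions made for this example.

```r
# Toy log2 fold changes of a query feature (brain) versus each reference feature.
lfc <- matrix(c( 2.1,  1.8,  1.5,
                 1.6,  0.2,  1.9,
                -1.4, -2.2, -1.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3"), c("heart", "kidney", "liver")))
# Toy adjusted p values for the same comparisons.
fdr.val <- matrix(c(0.01, 0.02, 0.03,
                    0.01, 0.60, 0.02,
                    0.04, 0.01, 0.01),
  nrow = 3, byrow = TRUE, dimnames = dimnames(lfc))

# Hypothetical helper: label each gene as "up", "down", or "other".
classify_toy <- function(lfc, fdr.val, log2.fc = 1, fdr = 0.05, outliers = 0) {
  # Number of references in which the cutoffs must be passed (assumed rule).
  need <- ncol(lfc) - outliers
  up <- rowSums(lfc >= log2.fc & fdr.val <= fdr)
  down <- rowSums(lfc <= -log2.fc & fdr.val <= fdr)
  ifelse(up >= need, "up", ifelse(down >= need, "down", "other"))
}

classify_toy(lfc, fdr.val, outliers = 0)
classify_toy(lfc, fdr.val, outliers = 1)
```

In this toy case, gene2 fails the cutoffs against kidney only, so it is labelled "other" with no outliers allowed and "up" once one outlier is allowed.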

'sf_var()' subsets data according to given spatial features and variables.

'spatial_enrich()' detects enriched or depleted biomolecules for each given spatial feature.

'query_enrich()' queries enriched or depleted biomolecules in the enrichment results returned by spatial_enrich for a chosen spatial feature.

'ovl_enrich()' plots overlaps of enrichment results across spatial features in the form of an UpSet plot, overlap matrix, or Venn diagram.

'graph_line()' plots expression values of chosen biomolecules in a line graph.

Usage

sf_var(
  data,
  feature,
  ft.sel = NULL,
  variable = NULL,
  var.sel = NULL,
  com.by = "ft"
)

spatial_enrich(
  data,
  method = c("edgeR"),
  norm = "TMM",
  m.array = FALSE,
  pairwise = FALSE,
  log2.fc = 1,
  p.adjust = "BH",
  fdr = 0.05,
  outliers = 0,
  aggr = "mean",
  log2.trans = TRUE,
  verbose = TRUE
)

query_enrich(res, query, other = FALSE, data.rep = FALSE)

ovl_enrich(
  res,
  type = "up",
  plot = "matrix",
  order.by = "freq",
  nintersects = 40,
  point.size = 3,
  line.size = 1,
  mb.ratio = c(0.6, 0.4),
  text.scale = 1.5,
  upset.arg = list(),
  show.plot = TRUE,
  venn.arg = list(),
  axis.agl = 45,
  font.size = 5,
  cols = c("lightcyan3", "darkorange")
)

graph_line(
  data,
  scale = "none",
  x.title = "Samples",
  y.title = "Assay values",
  linewidth = 1,
  text.size = 15,
  text.angle = 60,
  lgd.pos = "right",
  lgd.guide = guides(color = guide_legend(nrow = 1, byrow = TRUE, title = NULL))
)

Arguments

`data`	`sf_var` A `SummarizedExperiment` object. The `colData` slot is required to contain at least two columns of spatial features and experiment variables respectively. `spatial_enrich` A `SummarizedExperiment` object returned by `sf_var`. `graph_line` A `data.frame`, where rows are biomolecules and columns are spatial features.
`feature`	The column name in the `colData` slot of `SummarizedExperiment` that contains spatial features.
`ft.sel`	A vector of spatial features to choose.
`variable`	The column name in the `colData` slot of `SummarizedExperiment` that contains experiment variables.
`var.sel`	A vector of variables to choose.
`com.by`	One of `ft`, `var`, or `ft.var`. If `ft`, the enrichment is performed for each spatial feature and the variables are treated as replicates. If `var`, the enrichment is performed for each variable and the spatial features are treated as replicates. If `ft.var`, spatial features (tissue1, tissue2) and variables (var1, var2) are combined, such as tissue1__var1, tissue1__var2, tissue2__var1, tissue2__var2, and the enrichment is performed for each combination.
`method`	One of `edgeR`, `limma`, and `DESeq2`.
`norm`	The normalization method (`TMM`, `RLE`, `upperquartile`, `none`) in edgeR. The default is `TMM`. Details: https://www.rdocumentation.org/packages/edgeR/versions/3.14.0/topics/calcNormFactors.
`m.array`	Logical. 'TRUE' and 'FALSE' indicate that the input is microarray and count data respectively.
`pairwise`	Logical. If 'TRUE', pairwise comparisons will be performed starting from dispersion estimation. If 'FALSE' (default), all samples are fitted into a GLM together, then pairwise comparisons are performed through contrasts.
`log2.fc`	The log2-fold change cutoff. The default is 1.
`p.adjust`	The method (`holm`, `hochberg`, `hommel`, `bonferroni`, `BH`, `BY`, `fdr`, or `none`) for adjusting p values in multiple hypothesis testing. The default is `BH`.
`fdr`	The FDR cutoff. The default is 0.05.
`outliers`	The number of outliers allowed in the references. If there are too many references, there might be no enriched/depleted biomolecules in the query feature. To avoid this, set a certain number of outliers.
`aggr`	One of `mean` (default) or `median`. The method to aggregate replicates in the assay data.
`log2.trans`	Logical. If `TRUE` (default), the aggregated data (see `aggr`) is transformed to log2-scale and will be further used for plotting SHMs.
`verbose`	Logical. If 'TRUE' (default), intermediate messages will be printed.
`res`	Enrichment results returned by `spatial_enrich`.
`query`	A spatial feature for query.
`other`	Logical (default is 'FALSE'). If 'TRUE', other biomolecules that are neither enriched nor depleted will also be returned.
`data.rep`	Logical. If 'TRUE', normalized data before aggregating replicates will be returned. If 'FALSE', normalized data after aggregating replicates will be returned.
`type`	One of `up` (default) or `down`, which refers to up- or down-regulated biomolecules.
`plot`	One of `upset`, `matrix`, or `venn`, corresponding to upset plot, overlap matrix, or Venn diagram respectively.
`order.by`	How the intersections in the matrix should be ordered. Options include frequency (entered as "freq"), degree, or both in any order.
`nintersects`	Number of intersections to plot. If set to NA, all intersections will be plotted.
`point.size`	The size of points in the matrix plot.
`line.size`	The line thickness in overlap matrix.
`mb.ratio`	The ratio between the matrix plot and the main bar plot (keep in terms of hundredths).
`text.scale`	Numeric, value to scale the text sizes, applies to all axis labels, tick labels, and numbers above bar plot. Can be a universal scale, or a vector containing individual scales in the following format: c(intersection size title, intersection size tick labels, set size title, set size tick labels, set names, numbers above bars)
`upset.arg`	A `list` of additional arguments passed to `upset`.
`show.plot`	Logical flag indicating whether the plot should be displayed. If false, simply returns the group count matrix.
`venn.arg`	A `list` of additional arguments passed to `venn`.
`axis.agl`	The angle of axis text in overlap matrix.
`font.size`	The font size of all text in overlap matrix.
`cols`	A vector of two colors indicating low and high values in the overlap matrix respectively. The default is `c("lightcyan3", "darkorange")`.
`scale`	The method to scale the data. If `none` (default), no scaling. If `row`, each row is scaled independently. If `all`, all rows are scaled as a whole.
`x.title`, `y.title`	The title of X-axis and Y-axis respectively.
`linewidth`	The line width.
`text.size`	The font size of all text.
`text.angle`	The angle of axis text.
`lgd.pos`	The position of legend. The default is `right`.
`lgd.guide`	The `guides` function in `ggplot2` for customizing legends.
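
How `com.by` changes the comparison groups can be sketched in base R. This is a hedged illustration rather than the code used by `sf_var`: the helper `group_toy` is hypothetical, and the double-underscore separator is taken from the combined names shown in the `com.by` description above.

```r
# Toy sample annotations: one entry per sample.
ft <- c("tissue1", "tissue1", "tissue2", "tissue2")
var <- c("var1", "var2", "var1", "var2")

# Hypothetical helper: return the group label of each sample for a given com.by.
group_toy <- function(ft, var, com.by = "ft") {
  switch(com.by,
    ft = ft,                               # variables act as replicates
    var = var,                             # spatial features act as replicates
    ft.var = paste(ft, var, sep = "__"),   # every combination is its own group
    stop("com.by should be one of 'ft', 'var', or 'ft.var'"))
}

# Number of samples (replicates) per comparison group under each setting.
table(group_toy(ft, var, "ft"))
table(group_toy(ft, var, "var"))
table(group_toy(ft, var, "ft.var"))
```

With `ft` or `var`, each group in this toy design has two replicates, whereas `ft.var` leaves one sample per group, so a real design would need replicates within each feature-variable combination.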

Value

'sf_var': A SummarizedExperiment object.
'spatial_enrich': A list object.
'query_enrich': A SummarizedExperiment object.
'ovl_enrich': An UpSet plot, overlap matrix plot, or Venn diagram.
'graph_line': A ggplot.

Author(s)

Jianhai Zhang jzhan067@ucr.edu
Dr. Thomas Girke thomas.girke@ucr.edu

References

Cardoso-Moreira, Margarida, Jean Halbert, Delphine Valloton, Britta Velten, Chunyan Chen, Yi Shao, Angélica Liechti, et al. 2019. “Gene Expression Across Mammalian Organ Development.” Nature 571 (7766): 505–9.

Keays, Maria. 2019. ExpressionAtlas: Download Datasets from EMBL-EBI Expression Atlas.

Martin Morgan, Valerie Obenchain, Jim Hester and Hervé Pagès (2018). SummarizedExperiment: SummarizedExperiment container. R package version 1.10.1.

Robinson MD, McCarthy DJ and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.

Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and Smyth, G.K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research 43(7), e47.

Love, M.I., Huber, W., Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15(12):550.

Nils Gehlenborg (2019). UpSetR: A More Scalable Alternative to Venn and Euler Diagrams for Visualizing Intersecting Sets. R package version 1.4.0. https://CRAN.R-project.org/package=UpSetR

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Hadley Wickham (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20. URL http://www.jstatsoft.org/v21/i12/.

Examples


## In the following examples, the toy data come from an RNA-seq analysis of the development of 7
## chicken organs at 9 time points (Cardoso-Moreira et al. 2019). For convenience, it is
## included in this package. The complete raw count data are downloaded using the R package
## ExpressionAtlas (Keays 2019) with the accession number "E-MTAB-6769".

library(SummarizedExperiment) 
# Access the count table. 
cnt.chk <- read.table(system.file('extdata/shinyApp/data/count_chicken.txt', package='spatialHeatmap'), header=TRUE, row.names=1,sep='\t')
cnt.chk[1:3, 1:5]
# A targets file describing spatial features and conditions is required for toy data. It should be made
# based on the experiment design, which is accessible through the accession number 
# "E-MTAB-6769" in the R package ExpressionAtlas. An example targets file is included in this
# package and accessed below. 

# Access the example targets file. 
tar.chk <- read.table(system.file('extdata/shinyApp/data/target_chicken.txt', package='spatialHeatmap'), header=TRUE, row.names=1, sep='\t') 
# Every column in count table corresponds with a row in targets file. 
tar.chk[1:5, ]
# Store count data and targets file in "SummarizedExperiment".
se.chk <- SummarizedExperiment(assay=cnt.chk, colData=tar.chk)
# The "rowData" slot can store a data frame of gene metadata, but not required. Only the 
# column named "metadata" will be recognized. 
# Pseudo row metadata.
metadata <- paste0('meta', seq_len(nrow(cnt.chk))); metadata[1:3]
rowData(se.chk) <- DataFrame(metadata=metadata)

# Subset the count data by features (brain, kidney, heart, liver) and variables (day10, day35).
# By setting com.by='ft', the subsequent spatial enrichment will be performed across 
# features with the variables as replicates. 
data.sub <- sf_var(data=se.chk, feature='organism_part', ft.sel=c('brain', 'kidney',
 'heart', 'liver'), variable='age', var.sel=c('day10', 'day35'), com.by='ft')

## By convention, raw sequencing count data should be normalized and filtered to
## reduce noise. Since normalization will be performed in spatial enrichment, only filtering
## is required here.

# Filter out genes with low counts and low variance. Genes with counts over 5 in
# at least 10% of samples (pOA), and coefficient of variation (CV) between 0.7 and 100 are 
# retained.
data.sub.fil <- filter_data(data=data.sub, sam.factor='organism_part', con.factor='age',
pOA=c(0.1, 5), CV=c(0.7, 100))

# Spatial enrichment for every spatial feature with 1 outlier allowed.  
enr.res <- spatial_enrich(data.sub.fil, method=c('edgeR'), norm='TMM', log2.fc=1, fdr=0.05, outliers=1)
# Overlaps of enriched genes across features.
ovl_enrich(enr.res, type='up', plot='upset')
# Query the results for brain.
en.brain <- query_enrich(enr.res, 'brain')
rowData(en.brain)[1:3, c('type', 'total', 'method')] 

# Read aSVG image into an "SVG" object.
svg.chk <- system.file("extdata/shinyApp/data", "gallus_gallus.svg", 
package="spatialHeatmap")
svg.chk <- read_svg(svg.chk)
# Plot an enrichment SHM.
dat.enrich <- SPHM(svg=svg.chk, bulk=en.brain)
shm(data=dat.enrich, ID=rownames(en.brain)[1], legend.r=1, legend.nrow=7, sub.title.size=10, ncol=2, bar.width=0.09, lay.shm='gene')
# Line graph of gene expression profile.
graph_line(assay(en.brain[1, , drop=FALSE]), lgd.pos='bottom')
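
The filtering criteria used in the example above (`pOA` and `CV`) can be pictured with a base R sketch on a toy count matrix. It is a simplified illustration under stated assumptions, not the code of `filter_data`: the helper `filter_toy` and the toy counts are hypothetical, and the CV is taken here as the standard deviation divided by the mean of each row.

```r
# Toy counts: 3 genes (rows) across 10 samples (columns).
cnt <- matrix(c( 0,   0,  1,   0,  2,   0,  0,   1,  0,   0,
                10, 200, 15, 300,  8, 250, 12, 180,  9, 220,
                50,  52, 49,  51, 50,  48, 53,  50, 49,  51),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("low", "variable", "flat"), paste0("s", 1:10)))

# Hypothetical helper: keep rows passing both the pOA and CV criteria.
filter_toy <- function(x, pOA = c(0.1, 5), CV = c(0.7, 100)) {
  # Proportion of samples with counts over pOA[2].
  prop <- rowMeans(x > pOA[2])
  # Coefficient of variation per row (assumed definition: sd / mean).
  cv <- apply(x, 1, sd) / rowMeans(x)
  x[prop >= pOA[1] & cv >= CV[1] & cv <= CV[2], , drop = FALSE]
}

rownames(filter_toy(cnt))
```

Only the gene named "variable" is retained here: "low" never exceeds a count of 5, and "flat" is well expressed but varies too little across samples to pass the lower CV bound.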

jianhaizhang/spatialHeatmap documentation built on Nov. 28, 2024, 4:44 p.m.
