coclus_opt: Optimization of co-clustering bulk and single cell data
In jianhaizhang/spatialHeatmap: spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

coclus_opt

R Documentation

Description

This function optimizes the co-clustering method, which automatically assigns bulk tissues to single cells. A vignette is provided at https://jianhaizhang.github.io/spatialHeatmap_supplement/cocluster_optimize.html.

Usage

coclus_opt(
  dat.lis,
  df.para,
  df.fil.set,
  batch.par = NULL,
  multi.core.par = NULL,
  wk.dir,
  verbose = TRUE
)

Arguments

`dat.lis`	A two-level nested `list`. Each inner `list` consists of three slots of `bulk`, `cell`, and `df.match`, corresponding to bulk data, single cell data, and ground-truth matching between bulk and cells respectively. For example, list(dataset1=list(bulk=bulk.data1, cell=cell.data1, df.match=df.match1), dataset2=list(bulk=bulk.data2, cell=cell.data2, df.match=df.match2)).
`df.para`	A `data.frame` with each row corresponding to a combination of parameter settings in co-clustering.
`df.fil.set`	A `data.frame` of filtering settings. E.g. data.frame(p=c(0.1, 0.2), A=rep(1, 2), cv1=c(0.1, 0.2), cv2=rep(50, 2), cutoff=rep(1, 2), p.in.cell=c(0.15, 0.2), p.in.gen=c(0.05, 0.1), row.names=paste0('fil', seq_len(2))).
`batch.par`	The parameters for first-level parallelization through a cluster scheduler such as SLURM, given as a `BatchtoolsParam` object. If `NULL` (default), the first-level parallelization is skipped.
`multi.core.par`	The parameters for second-level parallelization, given as a `MulticoreParam` object.
`wk.dir`	The working directory, where results will be saved.
`verbose`	If `TRUE`, intermediate messages will be printed.
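
The three inputs refer to each other by name: in the Examples, the `dataset` column of `df.para` holds names of `dat.lis`, and the `fil` column holds row names of `df.fil.set`. The sketch below is a minimal pre-flight check built on that reading. `check_coclus_inputs` is a hypothetical helper, not part of spatialHeatmap, and the assumption that `coclus_opt` looks entries up by these names is inferred from the Examples rather than confirmed by the package.

```r
# Hypothetical helper (not part of spatialHeatmap): verifies that dat.lis,
# df.para, and df.fil.set are mutually consistent before a long run.
check_coclus_inputs <- function(dat.lis, df.para, df.fil.set) {
  nms <- names(dat.lis)
  # Dataset names are used as look-up keys, so they must be unique.
  if (is.null(nms) || any(nms == "") || anyDuplicated(nms) > 0) {
    stop("'dat.lis' needs unique, non-empty dataset names.")
  }
  # Each inner list needs the three slots described in Arguments.
  for (nm in nms) {
    miss <- setdiff(c("bulk", "cell", "df.match"), names(dat.lis[[nm]]))
    if (length(miss) > 0) {
      stop("Dataset '", nm, "' lacks slot(s): ", paste(miss, collapse = ", "))
    }
  }
  # Assumed convention: df.para refers to datasets and filter sets by name.
  if (!all(df.para$dataset %in% nms)) {
    stop("'df.para$dataset' refers to datasets missing from 'dat.lis'.")
  }
  if (!all(df.para$fil %in% rownames(df.fil.set))) {
    stop("'df.para$fil' refers to rows missing from 'df.fil.set'.")
  }
  invisible(TRUE)
}

# Placeholder strings stand in for the real bulk, cell, and matching objects.
dat.lis <- list(
  dataset1 = list(bulk = "bulk.data1", cell = "cell.data1", df.match = "df.match1"),
  dataset2 = list(bulk = "bulk.data2", cell = "cell.data2", df.match = "df.match2")
)
df.fil.set <- data.frame(p = 0.1, A = 1, cv1 = 0.1, cv2 = 50, cutoff = 1,
  p.in.cell = 0.15, p.in.gen = 0.05, row.names = "fil1")
df.para <- expand.grid(dataset = names(dat.lis), fil = "fil1",
  stringsAsFactors = FALSE)

ok <- check_coclus_inputs(dat.lis, df.para, df.fil.set)
```

Running a check like this first is cheap compared with the optimization itself, and it catches slips such as two datasets sharing one name.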

Value

A data.frame.

Author(s)

Jianhai Zhang jzhan067@ucr.edu
Dr. Thomas Girke thomas.girke@ucr.edu

References

Morgan M, Wang J, Obenchain V, Lang M, Thompson R, Turaga N (2022). _BiocParallel: Bioconductor facilities for parallel evaluation_. R package version 1.30.3, <https://github.com/Bioconductor/BiocParallel>.

Li, Song, Masashi Yamada, Xinwei Han, Uwe Ohler, and Philip N Benfey. 2016. "High-Resolution Expression Map of the Arabidopsis Root Reveals Alternative Splicing and lincRNA Regulation." Dev. Cell 39 (4): 508–22.

Shahan, Rachel, Che-Wei Hsu, Trevor M Nolan, Benjamin J Cole, Isaiah W Taylor, Anna Hendrika Cornelia Vlot, Philip N Benfey, and Uwe Ohler. 2020. "A Single Cell Arabidopsis Root Atlas Reveals Developmental Trajectories in Wild Type and Cell Identity Mutants." BioRxiv.

Examples


# Optimization includes many iterative runs of co-clustering. To reduce runtime, these runs 
# are parallelized with the package BiocParallel. 
library(BiocParallel)
# To obtain reproducible results, a fixed seed is set for generating random numbers.
set.seed(10)

# Read the bulk data (Li et al. 2016) and two single cell data sets (Shahan et al. 2020),
# all of which are from Arabidopsis root.
blk <- readRDS(system.file("extdata/cocluster/data", "bulk_cocluster.rds", 
package="spatialHeatmap")) # Bulk.
sc10 <- readRDS(system.file("extdata/cocluster/data", "sc10_cocluster.rds", 
package="spatialHeatmap")) # Single cell.
sc11 <- readRDS(system.file("extdata/cocluster/data", "sc11_cocluster.rds", 
package="spatialHeatmap")) # Single cell.
blk; sc10; sc11

# The ground-truth matching between bulk tissues and single cells needs to be defined in the
# form of a table so that assignments can be classified as TRUE or FALSE.
match.pa <- system.file("extdata/cocluster/data", "true_match_arab_root_cocluster.txt", 
package="spatialHeatmap")
df.match.arab <- read.table(match.pa, header=TRUE, row.names=1, sep='\t')
df.match.arab[1:3, ]

# Place the bulk, single cell data, and matching table in a list.
dat.lis <- list(
  dataset1=list(bulk=blk, cell=sc10, df.match=df.match.arab), 
  dataset2=list(bulk=blk, cell=sc11, df.match=df.match.arab)
)

# Filtering settings. 
df.fil.set <- data.frame(p=c(0.1), A=rep(1, 1), cv1=c(0.1), cv2=rep(50, 1), cutoff=rep(1, 1),
p.in.cell=c(0.15), p.in.gen=c(0.05), row.names=paste0('fil', seq_len(1))) 
# Settings in pre-processing include the normalization method (norm) and filtering (fil). The 
# following optimization focuses on settings most relevant to co-clustering, including 
# dimension reduction methods (dimred), number of top dimensions for co-clustering (dims), 
# graph-building methods (graph), and clustering methods (cluster). Explanations of these
# settings are provided in the help file of the function "cocluster".
norm <- c('FCT'); fil <- c('fil1'); dimred <- c('UMAP')
dims <- seq(5, 10, 1); graph <- c('knn', 'snn')
cluster <- c('wt', 'fg', 'le')

df.para <- expand.grid(dataset=names(dat.lis), norm=norm, fil=fil, dimred=dimred, dims=dims, 
graph=graph, cluster=cluster, stringsAsFactors = FALSE)


# Optimization is performed by calling "coclus_opt", and results are saved to a temporary
# directory "wk.dir".
wk.dir <- normalizePath(tempdir(check=TRUE), winslash="/", mustWork=FALSE)
df.res <- coclus_opt(dat.lis, df.para, df.fil.set, multi.core.par=MulticoreParam(workers=1, 
RNGseed=50), wk.dir=wk.dir, verbose=TRUE)
df.res[1:3, ]
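
Because each row of `df.para` is one combination of settings, the size of the grid gives a rough idea of how much work the optimization involves. The sketch below rebuilds the grid from the candidate values used in the Examples and counts its rows. The dataset names are placeholders, and the idea that runtime grows roughly with the number of rows is an assumption, not a documented property of `coclus_opt`.

```r
# Candidate values copied from the Examples; dataset names are placeholders.
datasets <- c("dataset1", "dataset2")
dims <- seq(5, 10, 1)            # 6 candidate numbers of top dimensions
graph <- c("knn", "snn")         # 2 graph-building methods
cluster <- c("wt", "fg", "le")   # 3 clustering methods

# expand.grid() forms every combination of the supplied values.
df.para <- expand.grid(dataset = datasets, norm = "FCT", fil = "fil1",
  dimred = "UMAP", dims = dims, graph = graph, cluster = cluster,
  stringsAsFactors = FALSE)

# Total combinations: 2 datasets * 6 dims * 2 graphs * 3 clusterings = 72.
n.runs <- nrow(df.para)
# Combinations per dataset: 36 each.
runs.per.dataset <- table(df.para$dataset)
```

Adding one more candidate value to any setting multiplies the grid accordingly, so it can help to inspect `nrow(df.para)` before starting a run.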

jianhaizhang/spatialHeatmap documentation built on Nov. 28, 2024, 4:44 p.m.
