View source: R/reduce_dimensions.R
This function aims to both minimize batch effects and accentuate cell type differences in a single-cell experiment. It is implemented using monocle3 and takes inspiration from Granja et al. (cited below), which in turn took inspiration from the fly ATAC paper (Cusanovich et al., cited below). At its heart, the function iterates through three main steps: 1) normalizing the data using TF-IDF transformation and SVD; 2) clustering the normalized data in high-dimensional space using Leiden clustering; and 3) identifying features that are over-represented in the resulting clusters using a simple counting method. The features identified in step 3 are then used to subset the matrix that is normalized in step 1, and the process repeats. The TF-IDF transformation is supplied in this package. SVD is performed using the irlba package. Leiden clustering is performed using the monocle3 implementation, and the per-cluster counting is performed using the edgeR cpm function. The function takes a cell_data_set as input and runs one iteration per resolution value supplied, so the number of iterations is set by the length of the resolution vector. The output is suitable as input to dimensionality reduction methods such as UMAP or tSNE.
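The loop described above can be sketched in base R on a toy matrix. This is only an illustration under stated assumptions, not the package's implementation: the helper names `tf_idf` and `lsi` are hypothetical, `kmeans()` stands in for Leiden clustering, base `svd()` stands in for irlba, counts per million are computed by hand rather than with edgeR, and ranking features by their variance across clusters is an assumed reading of the "simple counting method".

```r
# Toy sketch of one pass through TF-IDF -> SVD -> clustering -> feature selection.
# All helper names are hypothetical; kmeans() is a stand-in for Leiden clustering.
set.seed(2020)

# Toy count matrix: 60 features (rows) x 40 cells (columns), two cell groups.
n_feat <- 60
n_cell <- 40
group <- rep(c(1, 2), each = n_cell / 2)
counts <- matrix(rpois(n_feat * n_cell, lambda = 1), nrow = n_feat)
counts[1:10, group == 1] <- counts[1:10, group == 1] + rpois(10 * 20, lambda = 5)
counts[11:20, group == 2] <- counts[11:20, group == 2] + rpois(10 * 20, lambda = 5)
rownames(counts) <- paste0("feature", seq_len(n_feat))
counts <- counts[rowSums(counts) > 0, , drop = FALSE]  # drop all-zero features

# Step 1a: TF-IDF (one common variant; the package's own transform may differ).
tf_idf <- function(mat, scale_to = 10000) {
  tf <- sweep(mat, 2, colSums(mat), "/")   # term frequency within each cell
  idf <- ncol(mat) / rowSums(mat > 0)      # inverse document frequency per feature
  log1p(tf * idf * scale_to)               # idf is recycled down rows (per feature)
}

# Step 1b: SVD, keeping the first num_dim components as cell embeddings.
lsi <- function(mat, num_dim) {
  s <- svd(mat, nu = num_dim, nv = num_dim)
  s$v %*% diag(s$d[seq_len(num_dim)], nrow = num_dim)  # cells x num_dim
}

emb <- lsi(tf_idf(counts), num_dim = 5)

# Step 2: cluster cells in the reduced space (stand-in for Leiden).
clusters <- kmeans(emb, centers = 2, nstart = 10)$cluster

# Step 3: sum counts per cluster, convert to log2 counts per million,
# and keep the features that vary most across clusters.
cluster_counts <- sapply(sort(unique(clusters)), function(k) {
  rowSums(counts[, clusters == k, drop = FALSE])
})
cpm <- sweep(cluster_counts, 2, colSums(cluster_counts), "/") * 1e6
feature_var <- apply(log2(cpm + 1), 1, var)
num_features <- 20
top_features <- names(sort(feature_var, decreasing = TRUE))[seq_len(num_features)]

# Next iteration: repeat step 1 on the selected features only.
emb2 <- lsi(tf_idf(counts[top_features, , drop = FALSE]), num_dim = 5)
```

In the real function, each iteration would use the next value of `resolution` and `num_features`; the sketch fixes both for brevity.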
iterative_LSI(
cds,
num_dim = 25,
starting_features = NULL,
resolution = c(1e-04, 3e-04, 5e-04),
do_tf_idf = T,
num_features = c(3000, 3000, 3000),
exclude_features = NULL,
binarize = FALSE,
scale = T,
log_transform = T,
LSI_method = 1,
partition_qval = 0.05,
seed = 2020,
scale_to = 10000,
leiden_k = 20,
leiden_weight = FALSE,
leiden_iter = 1,
verbose = F,
return_iterations = F,
...
)
cds |
the cell_data_set upon which to perform this operation. |
num_dim |
Numeric indicating the number of principal components to be used in downstream ordering. The usage above sets the default to 25; a value of NULL results in use of all PCs |
resolution |
vector of resolution values for leiden clustering |
num_features |
number of features to use for dimensionality reduction (default 3000). To use different numbers of features for different iterations, supply a vector that is the same length as the resolution vector. |
exclude_features |
character vector of features (rownames of assay(cds)) to exclude |
binarize |
boolean whether to binarize data prior to TFIDF transformation |
seed |
numeric seed |
scale_to |
numeric value to scale data |
return_iterations |
boolean whether to return iterations; if TRUE, the function will output a list containing the final cds and all SVD matrices, clusters and features used in each iteration
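Because the number of iterations follows the length of `resolution`, and `num_features` may be given per iteration, it can help to lay out the per-iteration plan before calling the function. The helper below is a hypothetical sketch, not part of the package; in particular, recycling a single `num_features` value across all iterations is an assumption based on the argument description.

```r
# Hypothetical helper (not part of the package): pair each resolution value
# with the number of features to use in that iteration.
plan_iterations <- function(resolution, num_features = 3000) {
  n_iter <- length(resolution)
  # Assumption: a single value is recycled across all iterations.
  if (length(num_features) == 1) {
    num_features <- rep(num_features, n_iter)
  }
  if (length(num_features) != n_iter) {
    stop("num_features must have length 1 or the same length as resolution")
  }
  data.frame(
    iteration = seq_len(n_iter),
    resolution = resolution,
    num_features = num_features
  )
}

# Defaults from the usage section: three iterations of 3000 features each.
plan <- plan_iterations(c(1e-04, 3e-04, 5e-04), c(3000, 3000, 3000))
print(plan)
```

A mismatched pair, such as three resolution values with two feature counts, stops with an error instead of silently recycling.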
an updated cell_data_set object with a reduced dimension LSI object and clusters object
Granja, J. M., et al. (2019). Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nature Biotechnology, 37(12), 1458–1465.
UMAP: McInnes, L, Healy, J, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, ArXiv e-prints 1802.03426, 2018
tSNE: Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. J. Mach. Learn. Res., 9(Nov):2579–2605, 2008.
Cusanovich, D. A., Reddington, J. P., Garfield, D. A., Daza, R. M., Aghamirzaie, D., Marco-Ferreres, R., et al. (2018). The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature, 555(7697), 538–542.