View source: R/cluster_genes.R
find_gene_modules | R Documentation |
Cluster genes into modules that are co-expressed across cells.
find_gene_modules(
cds,
reduction_method = c("UMAP"),
max_components = 2,
umap.metric = "cosine",
umap.min_dist = 0.1,
umap.n_neighbors = 15L,
umap.fast_sgd = FALSE,
umap.nn_method = "annoy",
k = 20,
leiden_iter = 1,
partition_qval = 0.05,
weight = FALSE,
resolution = NULL,
random_seed = 0L,
cores = 1,
verbose = FALSE,
preprocess_method = c("PCA", "LSI"),
nn_control = list(),
...
)
cds |
the cell_data_set upon which to perform this operation |
reduction_method |
The dimensionality reduction method used to generate the lower dimensional space in which genes will be clustered. Currently only UMAP is supported. |
max_components |
The number of dimensions in which to cluster genes into modules. |
umap.metric |
Metric used by UMAP for measuring similarity between genes . |
umap.min_dist |
Minimum distance parameter passed to UMAP. |
umap.n_neighbors |
Number of nearest neighbors used by UMAP. |
umap.fast_sgd |
Whether to allow UMAP to perform fast stochastic gradient descent. Defaults to TRUE. Setting FALSE will result in slower, but deterministic behavior (if cores=1). |
umap.nn_method |
The method used for nearest neighbor network construction during UMAP. |
k |
number of kNN used in creating the k nearest neighbor graph for Louvain clustering. The number of kNN is related to the resolution of the clustering result, bigger number of kNN gives low resolution and vice versa. Default to be 20 |
leiden_iter |
Integer number of iterations used for Leiden clustering. The clustering result with the largest modularity score is used as the final clustering result. Default to be 1. |
partition_qval |
Significance threshold used in Louvain community graph partitioning. |
weight |
A logic argument to determine whether or not we will use Jaccard coefficient for two nearest neighbors (based on the overlapping of their kNN) as the weight used for Louvain clustering. Default to be FALSE. |
resolution |
Resolution parameter passed to Louvain. Can be a list. If so, this method will evaluate modularity at each resolution and use the one with the highest value. |
random_seed |
the seed used by the random number generator in Leiden. |
cores |
number of cores computer should use to execute function |
verbose |
Whether or not verbose output is printed. |
preprocess_method |
a string specifying the low-dimensional space to use for gene loadings, currently either PCA or LSI. Default is "PCA". |
nn_control |
An optional list of parameters used to make the nearest neighbor index. See the set_nn_control help for detailed information. |
... |
Additional arguments passed to UMAP and Louvain analysis. |
A dataframe with genes and the modules to which they are assigned.
## Not run:
expression_matrix <- readRDS(system.file('extdata',
'worm_l2/worm_l2_expression_matrix.rds',
package='monocle3'))
cell_metadata <- readRDS(system.file('extdata',
'worm_l2/worm_l2_coldata.rds',
package='monocle3'))
gene_metadata <- readRDS(system.file('extdata',
'worm_l2/worm_l2_rowdata.rds',
package='monocle3'))
cds <- new_cell_data_set(expression_data=expression_matrix,
cell_metadata=cell_metadata,
gene_metadata=gene_metadata)
cds <- preprocess_cds(cds, num_dim = 100)
cds <- reduce_dimension(cds)
cds <- cluster_cells(cds, resolution=1e-5)
colData(cds)$assigned_cell_type <- as.character(partitions(cds))
colData(cds)$assigned_cell_type <- dplyr::recode(colData(cds)$assigned_cell_type,
"1"="Germline",
"2"="Body wall muscle",
"3"="Unclassified neurons",
"4"="Vulval precursors",
"5"="Failed QC",
"6"="Seam cells",
"7"="Pharyngeal epithelia",
"8"="Coelomocytes",
"9"="Am/PH sheath cells",
"10"="Failed QC",
"11"="Touch receptor neurons",
"12"="Intestinal/rectal muscle",
"13"="Pharyngeal neurons",
"14"="NA",
"15"="flp-1(+) interneurons",
"16"="Canal associated neurons",
"17"="Ciliated sensory neurons",
"18"="Other interneurons",
"19"="Pharyngeal gland",
"20"="Failed QC",
"21"="Ciliated sensory neurons",
"22"="Oxygen sensory neurons",
"23"="Ciliated sensory neurons",
"24"="Ciliated sensory neurons",
"25"="Ciliated sensory neurons",
"26"="Ciliated sensory neurons",
"27"="Oxygen sensory neurons",
"28"="Ciliated sensory neurons",
"29"="Unclassified neurons",
"30"="Socket cells",
"31"="Failed QC",
"32"="Pharyngeal gland",
"33"="Ciliated sensory neurons",
"34"="Ciliated sensory neurons",
"35"="Ciliated sensory neurons",
"36"="Failed QC",
"37"="Ciliated sensory neurons",
"38"="Pharyngeal muscle")
neurons_cds <- cds[,grepl("neurons", colData(cds)$assigned_cell_type, ignore.case=TRUE)]
pr_graph_test_res <- graph_test(neurons_cds, neighbor_graph="knn")
pr_deg_ids <- row.names(subset(pr_graph_test_res, q_value < 0.05))
gene_module_df <- find_gene_modules(neurons_cds[pr_deg_ids,], resolution=1e-2)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.