find_gene_modules: Cluster genes into modules that are co-expressed across...

View source: R/cluster_genes.R

find_gene_modulesR Documentation

Cluster genes into modules that are co-expressed across cells.

Description

Cluster genes into modules that are co-expressed across cells.

Usage

find_gene_modules(
  cds,
  reduction_method = c("UMAP"),
  max_components = 2,
  umap.metric = "cosine",
  umap.min_dist = 0.1,
  umap.n_neighbors = 15L,
  umap.fast_sgd = FALSE,
  umap.nn_method = "annoy",
  k = 20,
  leiden_iter = 1,
  partition_qval = 0.05,
  weight = FALSE,
  resolution = NULL,
  random_seed = 0L,
  cores = 1,
  verbose = FALSE,
  preprocess_method = c("PCA", "LSI"),
  nn_control = list(),
  ...
)

Arguments

cds

the cell_data_set upon which to perform this operation

reduction_method

The dimensionality reduction method used to generate the lower dimensional space in which genes will be clustered. Currently only UMAP is supported.

max_components

The number of dimensions in which to cluster genes into modules.

umap.metric

Metric used by UMAP for measuring similarity between genes .

umap.min_dist

Minimum distance parameter passed to UMAP.

umap.n_neighbors

Number of nearest neighbors used by UMAP.

umap.fast_sgd

Whether to allow UMAP to perform fast stochastic gradient descent. Defaults to TRUE. Setting FALSE will result in slower, but deterministic behavior (if cores=1).

umap.nn_method

The method used for nearest neighbor network construction during UMAP.

k

number of kNN used in creating the k nearest neighbor graph for Louvain clustering. The number of kNN is related to the resolution of the clustering result, bigger number of kNN gives low resolution and vice versa. Default to be 20

leiden_iter

Integer number of iterations used for Leiden clustering. The clustering result with the largest modularity score is used as the final clustering result. Default to be 1.

partition_qval

Significance threshold used in Louvain community graph partitioning.

weight

A logic argument to determine whether or not we will use Jaccard coefficient for two nearest neighbors (based on the overlapping of their kNN) as the weight used for Louvain clustering. Default to be FALSE.

resolution

Resolution parameter passed to Louvain. Can be a list. If so, this method will evaluate modularity at each resolution and use the one with the highest value.

random_seed

the seed used by the random number generator in Leiden.

cores

number of cores computer should use to execute function

verbose

Whether or not verbose output is printed.

preprocess_method

a string specifying the low-dimensional space to use for gene loadings, currently either PCA or LSI. Default is "PCA".

nn_control

An optional list of parameters used to make the nearest neighbor index. See the set_nn_control help for detailed information.

...

Additional arguments passed to UMAP and Louvain analysis.

Value

A dataframe with genes and the modules to which they are assigned.

Examples

  ## Not run: 
     expression_matrix <- readRDS(system.file('extdata',
                                              'worm_l2/worm_l2_expression_matrix.rds',
                                              package='monocle3'))
     cell_metadata <- readRDS(system.file('extdata',
                                          'worm_l2/worm_l2_coldata.rds',
                                          package='monocle3'))
     gene_metadata <- readRDS(system.file('extdata',
                                          'worm_l2/worm_l2_rowdata.rds',
                                          package='monocle3'))

     cds <- new_cell_data_set(expression_data=expression_matrix,
                              cell_metadata=cell_metadata,
                              gene_metadata=gene_metadata)

    cds <- preprocess_cds(cds, num_dim = 100)
    cds <- reduce_dimension(cds)
    cds <- cluster_cells(cds, resolution=1e-5)
    colData(cds)$assigned_cell_type <- as.character(partitions(cds))
    colData(cds)$assigned_cell_type <- dplyr::recode(colData(cds)$assigned_cell_type,
                                                    "1"="Germline",
                                                    "2"="Body wall muscle",
                                                    "3"="Unclassified neurons",
                                                    "4"="Vulval precursors",
                                                    "5"="Failed QC",
                                                    "6"="Seam cells",
                                                    "7"="Pharyngeal epithelia",
                                                    "8"="Coelomocytes",
                                                    "9"="Am/PH sheath cells",
                                                    "10"="Failed QC",
                                                    "11"="Touch receptor neurons",
                                                    "12"="Intestinal/rectal muscle",
                                                    "13"="Pharyngeal neurons",
                                                    "14"="NA",
                                                    "15"="flp-1(+) interneurons",
                                                    "16"="Canal associated neurons",
                                                    "17"="Ciliated sensory neurons",
                                                    "18"="Other interneurons",
                                                    "19"="Pharyngeal gland",
                                                    "20"="Failed QC",
                                                    "21"="Ciliated sensory neurons",
                                                    "22"="Oxygen sensory neurons",
                                                    "23"="Ciliated sensory neurons",
                                                    "24"="Ciliated sensory neurons",
                                                    "25"="Ciliated sensory neurons",
                                                    "26"="Ciliated sensory neurons",
                                                    "27"="Oxygen sensory neurons",
                                                    "28"="Ciliated sensory neurons",
                                                    "29"="Unclassified neurons",
                                                    "30"="Socket cells",
                                                    "31"="Failed QC",
                                                    "32"="Pharyngeal gland",
                                                    "33"="Ciliated sensory neurons",
                                                    "34"="Ciliated sensory neurons",
                                                    "35"="Ciliated sensory neurons",
                                                    "36"="Failed QC",
                                                    "37"="Ciliated sensory neurons",
                                                    "38"="Pharyngeal muscle")
    neurons_cds <- cds[,grepl("neurons", colData(cds)$assigned_cell_type, ignore.case=TRUE)]
    pr_graph_test_res <- graph_test(neurons_cds, neighbor_graph="knn")
    pr_deg_ids <- row.names(subset(pr_graph_test_res, q_value < 0.05))
    gene_module_df <- find_gene_modules(neurons_cds[pr_deg_ids,], resolution=1e-2)
  
## End(Not run)


cole-trapnell-lab/monocle3 documentation built on April 7, 2024, 9:24 p.m.