filter_by_dot_prod: Filter clusters from a ClusterSet object using the dot product.

View source: R/filter_gene_clusters.R

filter_by_dot_prod    R Documentation

Filter clusters from a ClusterSet object using the dot product.

Description

This function filters gene clusters based on dot products. It aims to remove clusters that contain many zeros and are supported by only a few cells or spots. To do this, the function first converts the gene expression data of each cluster into a binary form (values greater than 1 are set to 1). It then computes the dot product of this binary matrix with its transpose, which yields a gene-by-gene matrix giving the number of cells/spots in which each pair of genes is expressed together. Finally, it takes the maximum concordance of each gene and computes the median of these maxima across all genes in the cluster. This value is used to decide whether the cluster is filtered out.
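The computation described above can be illustrated on a toy matrix. This is a minimal base-R sketch, not the package's implementation: the matrix `expr`, the variable names, the exclusion of the diagonal (a gene compared with itself), and the final comparison with the threshold are all assumptions made for illustration.

```r
# Toy expression matrix for one hypothetical cluster:
# rows are genes, columns are cells/spots.
expr <- matrix(
  c(2, 0, 3, 0,
    4, 0, 2, 0,
    0, 0, 5, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:4))
)

# Binarize: values greater than 1 become 1, all other values become 0
# (setting the remaining values to 0 is an assumption of this sketch).
bin <- (expr > 1) * 1

# Gene-by-gene matrix: entry [i, j] counts the cells/spots in which
# gene i and gene j are both expressed.
co_expr <- bin %*% t(bin)

# Assumption: ignore the diagonal so a gene is not compared with itself.
diag(co_expr) <- NA

# Maximum concordance of each gene, then the median across genes.
max_conc <- apply(co_expr, 1, max, na.rm = TRUE)
score <- median(max_conc)

# Hypothetical decision rule: keep the cluster unless its score is
# below the threshold (here the documented default of 2).
av_dot_prod_min <- 2
keep <- score >= av_dot_prod_min
```

In this toy case geneA and geneB are co-expressed in two cells and geneC shares one cell with each of them, so the per-gene maxima are 2, 2 and 1, the median is 2, and the cluster would be kept at a threshold of 2 but discarded at a threshold of 5.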

Usage

filter_by_dot_prod(object = NULL, av_dot_prod_min = 2)

Arguments

object

A ClusterSet object.

av_dot_prod_min

Any cluster with an average dot product below this value is discarded. This allows removing clusters whose correlation is driven by very few samples (typically 1).

Examples

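# Load the example dataset
load_example_dataset("7871581/files/pbmc3k_medium_clusters")
pbmc3k_medium_clusters <- top_genes(pbmc3k_medium_clusters)
# Number of clusters before filtering
nclust(pbmc3k_medium_clusters)
# Discard clusters whose dot product value is below 5
obj <- filter_by_dot_prod(pbmc3k_medium_clusters, av_dot_prod_min = 5)
# Number of clusters after filtering
nclust(obj)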

dputhier/dbfmcl documentation built on Dec. 20, 2024, 1:59 a.m.