select_genes | R Documentation |
This function selects genes based on k-nearest neighbour analysis. It takes a Seurat object or a gene expression matrix as input and computes the distance to the k-th nearest neighbour (DKNN) for each gene/feature. A selection threshold is then set based on permutation analysis and FDR computation.
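The idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the toy matrix, the helper `dknn()`, the use of 1 - Pearson correlation as the distance, and the simple quantile cutoff standing in for the FDR step are all hypothetical choices made for this example.

```r
# Illustrative sketch only; not the code used by select_genes().
set.seed(123)

# Toy expression matrix: 60 genes (rows) x 40 cells (columns).
n_genes <- 60
n_cells <- 40
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Make the first 20 genes co-expressed by adding a shared signal.
shared <- rnorm(n_cells, sd = 3)
expr[1:20, ] <- sweep(expr[1:20, ], 2, shared, "+")

# Hypothetical helper: distance from each gene to its k-th nearest
# neighbour, with 1 - Pearson correlation as the distance (an assumption).
dknn <- function(m, k) {
  d <- 1 - cor(t(m), method = "pearson")
  diag(d) <- NA                          # a gene is not its own neighbour
  apply(d, 1, function(x) sort(x)[k])    # sort() drops NA by default
}

k <- 10
observed <- dknn(expr, k)

# Simulated DKNN values: shuffle each gene's values across cells
# independently, which destroys any co-expression structure.
permuted <- t(apply(expr, 1, sample))
simulated <- dknn(permuted, k)

# Crude empirical cutoff standing in for the FDR computation:
# keep genes whose observed DKNN is below a low quantile of the
# simulated values.
cutoff <- quantile(simulated, probs = 0.05)
selected <- names(observed)[observed < cutoff]
```

Genes that belong to a co-expressed group have close neighbours and therefore a small DKNN, whereas isolated (noise-like) genes have a DKNN similar to the permuted values and are filtered out.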
select_genes(
data = NULL,
distance_method = c("pearson", "cosine", "euclidean", "spearman", "kendall"),
noise_level = 5e-05,
k = 80,
row_sum = 1,
fdr = 0.005,
which_slot = c("data", "sct", "counts"),
no_dknn_filter = FALSE,
no_anti_cor = FALSE,
seed = 123
)
data |
A matrix, data.frame or Seurat object. |
distance_method |
A character string indicating the method for computing distances (one of "pearson", "cosine", "euclidean", "spearman" or "kendall"). |
noise_level |
This parameter controls the fraction of genes with high DKNN (i.e. noise) whose neighborhood (i.e. associated distances) will be used to compute simulated DKNN values. A value of 0 means that all genes are used. A value close to 1 means that only genes with high DKNN (i.e. close to noise) are used. |
k |
An integer specifying the size of the neighborhood. |
row_sum |
A feature/gene whose row sum is below this threshold will be discarded. Use -Inf to keep all genes. |
fdr |
A numeric value indicating the false discovery rate threshold (range: 0 to 100). |
which_slot |
A character string indicating which slot to use from the input scRNA-seq object (one of "data", "sct" or "counts"). |
no_dknn_filter |
A logical indicating whether to skip the k-nearest-neighbors (KNN) filter. If TRUE, the filter is skipped and all genes are kept for the next steps. |
no_anti_cor |
If TRUE, correlations below 0 are set to zero (applies to "pearson", "cosine", "spearman" and "kendall"). This may increase the relative weight of positive correlations (as true anti-correlation may be rare). |
seed |
An integer specifying the random seed to use. |
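As a rough illustration of what the row_sum and no_anti_cor arguments describe, the following base R sketch applies the same kind of filtering and clipping to a toy matrix. The data, the variable names and the conversion of correlation to distance as 1 - correlation are assumptions made for this example; the function's internal code may differ.

```r
# Illustrative sketch only; toy data and variable names are assumptions.
counts <- rbind(
  geneA = c(0, 0, 1, 0),   # row sum 1
  geneB = c(3, 0, 2, 4),   # row sum 9
  geneC = c(5, 1, 0, 2),   # row sum 8
  geneD = c(0, 4, 1, 0)    # row sum 5
)

# row_sum: genes whose row sum is below the threshold are discarded.
row_sum <- 5
kept <- counts[rowSums(counts) >= row_sum, , drop = FALSE]
rownames(kept)  # "geneB" "geneC" "geneD"

# With -Inf as the threshold, every gene passes the filter.
kept_all <- counts[rowSums(counts) >= -Inf, , drop = FALSE]

# no_anti_cor = TRUE: negative correlations are set to zero before
# being turned into distances (here 1 - correlation, an assumption).
cor_mat <- cor(t(kept), method = "pearson")
cor_mat[cor_mat < 0] <- 0
dist_mat <- 1 - cor_mat
```

After clipping, anti-correlated genes are treated like uncorrelated ones: their distance is capped at 1 instead of extending up to 2.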
A ClusterSet class object.
Julie Bavais, Sebastien Nin, Lionel Spinelli and Denis Puthier
- Lopez F., Textoris J., Bergon A., Didier G., Remy E., Granjeaud S., Imbert J., Nguyen C. and Puthier D. TranscriptomeBrowser: a powerful and flexible toolbox to explore productively the transcriptional landscape of the Gene Expression Omnibus database. PLoS ONE, 2008;3(12):e4001.
# Restrict verbosity to info messages only.
set_verbosity(1)
# Load a dataset
load_example_dataset("7871581/files/pbmc3k_medium")
# Select informative genes
res <- select_genes(pbmc3k_medium,
                    distance_method = "pearson",
                    row_sum = 5)
# Result is a ClusterSet object
is(res)
slotNames(res)
# The selected genes
nrow(res)
head(row_names(res))