gene_search: gene_search
In MarioniLab/geneBasisR

R Documentation

Description

Main function of the package. Returns an optimal gene library of the selected size.

Usage

gene_search(
  sce,
  genes_base = NULL,
  n_genes_total,
  batch = NULL,
  n.neigh = 5,
  p.minkowski = 3,
  nPC.selection = NULL,
  nPC.all = 50,
  genes.discard = NULL,
  genes.discard_prefix = NULL,
  verbose = TRUE,
  stat_all = NULL
)

Arguments

`sce`	SingleCellExperiment object containing the gene counts matrix (stored in the 'logcounts' assay).
`genes_base`	Character vector specifying the base genes used to construct the first Selection graph. Default genes_base=NULL, i.e. no base genes are supplied.
`n_genes_total`	Scalar specifying the total number of genes to be selected (this includes the base genes).
`batch`	Name of the field in colData(sce) that specifies the batch. Default batch=NULL, i.e. no batch is specified.
`n.neigh`	Positive integer > 1 specifying the number of neighbors to use for the kNN graph. Default n.neigh=5.
`p.minkowski`	Order of the Minkowski distance. Default p.minkowski=3.
`nPC.selection`	Scalar specifying the number of PCs to use for Selection graphs. Default nPC.selection=NULL. We advise setting it to 50 if `length(genes.selection) > 50`.
`nPC.all`	Scalar specifying the number of PCs to use for the True graph. Default nPC.all=50.
`genes.discard`	Character vector containing genes to be excluded from the candidates. Note that these genes will still be used for graph construction; to exclude them from graph construction as well, remove them from the sce object beforehand. Default genes.discard=NULL, i.e. no genes are discarded.
`genes.discard_prefix`	Character vector containing prefixes of genes to be excluded from the candidates (e.g. Rpl for L ribosomal proteins). Note that these genes will still be used for graph construction; to exclude them from graph construction as well, remove them from the sce object beforehand. Default genes.discard_prefix=NULL, i.e. no genes are discarded.
`verbose`	Boolean specifying whether intermediate output should be printed. Default verbose=TRUE.
`stat_all`	If the True graph and the corresponding Minkowski distances have been calculated prior to the search, provide them here. This is useful when gene_search is run repeatedly (e.g. to select multiple libraries with different inputs such as n_genes_total and genes_base). Ensure that colnames = c("gene", "dist_all"). Default stat_all=NULL, i.e. this information is not supplied.
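
For reference, the Minkowski distance of order p between two vectors x and y is (sum_i |x_i - y_i|^p)^(1/p); p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. The base R sketch below illustrates what the order set by `p.minkowski` means. The helper name `minkowski_dist` is hypothetical and not part of geneBasisR, and the package's internal computation may differ.

```r
# Hypothetical helper (not part of geneBasisR): Minkowski distance of order p.
minkowski_dist <- function(x, y, p = 3) {
  stopifnot(length(x) == length(y), p >= 1)
  sum(abs(x - y)^p)^(1 / p)
}

x <- c(0, 0)
y <- c(3, 4)
minkowski_dist(x, y, p = 1)  # Manhattan distance
minkowski_dist(x, y, p = 2)  # Euclidean distance
minkowski_dist(x, y, p = 3)  # same order as the default p.minkowski = 3

# Cross-check against stats::dist(), which supports method = "minkowski".
as.numeric(dist(rbind(x, y), method = "minkowski", p = 3))
```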

Value

data.frame containing the selected genes and their corresponding ranks. If genes_base are supplied, their ranks are assigned according to the order in which they appear in the supplied character vector.
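
As a rough illustration of working with such a result, the sketch below builds a mock data.frame and extracts the top-ranked genes. The column names `gene` and `rank` and the gene names are assumptions made for this example; check them against the actual output of gene_search.

```r
# Mock of the returned data.frame; the column names `gene` and `rank` are
# assumptions for illustration and should be checked against real output.
out_mock <- data.frame(
  gene = c("GeneC", "GeneA", "GeneB"),
  rank = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Order by rank and keep the two top-ranked genes.
out_sorted <- out_mock[order(out_mock$rank), ]
top_genes <- head(out_sorted$gene, 2)
top_genes
```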

Examples

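library(SingleCellExperiment)
library(geneBasisR)
# Simulate a logcounts matrix with 1000 genes (rows) and 100 cells (columns)
n_row = 1000
n_col = 100
sce = SingleCellExperiment(assays = list(logcounts = matrix(rnorm(n_row*n_col), ncol=n_col)))
# Use character (not factor or integer) gene and cell names
rownames(sce) = as.character(1:n_row)
colnames(sce) = as.character(1:n_col)
sce$cell = colnames(sce)
# Select a library of 5 genes
out = gene_search(sce, n_genes_total = 5)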
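
To see which genes a set of prefixes would cover before running the search, the prefixes can be expanded into an explicit gene vector in base R. This is a minimal sketch with made-up gene names. It assumes literal, case-sensitive matching at the start of the name, which may not be exactly how `genes.discard_prefix` is matched internally.

```r
# Hypothetical gene names; in practice use rownames(sce).
genes <- c("Rpl3", "Rpl10a", "Rps6", "Actb", "Gapdh", "mt-Co1")
prefixes <- c("Rpl", "mt-")

# TRUE for every gene that starts with at least one of the prefixes.
has_prefix <- Reduce(`|`, lapply(prefixes, function(p) startsWith(genes, p)))
genes_to_discard <- genes[has_prefix]
genes_to_discard

# The explicit vector could then be passed via the documented argument, e.g.:
# out <- gene_search(sce, n_genes_total = 5, genes.discard = genes_to_discard)
```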

MarioniLab/geneBasisR documentation built on June 30, 2023, 2:04 p.m.