Description Usage Arguments Value References Examples
View source: R/refine_markers.R
This function refines a prior marker gene list by combining information from bulk tissue data, based on the penalized principal component analysis. The current implementation computes on one cell type at a time. To get marker genes for multiple cell types, call this function iteratively.
1 2 3 4 5 6 7 8 9 10 11 12 | refine_markers(
mat_exp,
range,
markers,
lambda,
w = 1.5,
thresh = 0.001,
alpha = 0.01,
maxit = 1000,
eps = 1e-04,
verbose = 0
)
|
mat_exp |
The gene expression matrix in the original scale (not logarithm-transformed), with rows standing for observations and columns for genes. The matrix should include gene names as column names. |
range |
A character vector of gene names, representing the range of genes in which markers are sought. |
markers |
A character vector of gene names giving the prior marker gene list. |
lambda |
A tuning parameter to control the number of selected marker genes. A larger value typically means a smaller number of genes. |
w |
Tuning parameter to control the weight on prior information. Larger w means genes not in the prior list are less likely to be selected as markers. |
thresh |
Below this threshold small factor loadings are treated as zeros. |
alpha |
Step size of the optimization algorithm. |
maxit |
Maximum number of iterations. |
eps |
Tolerance parameter for convergence. |
verbose |
Level of verbosity. |
A list containing the following components:
The sparse PCA result as in pca_pen()
.
A character vector of selected markers genes.
The estimated factor loadings for the associated genes.
Qiu, Y., Wang, J., Lei, J., & Roeder, K. (2020). Identification of cell-type-specific marker genes from co-expression patterns in tissue samples.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | # Data used in the vignette
load(system.file("examples", "gene_expr.RData", package = "markerpen"))
load(system.file("examples", "published_markers.RData", package = "markerpen"))
load(system.file("examples", "markers_range.RData", package = "markerpen"))
# Get expression matrix - rows are observations, columns are genes
ind = match(rownames(dat), markerpen::gene_mapping$name)
ind = na.omit(ind)
ensembl = markerpen::gene_mapping$ensembl[ind]
mat_exp = t(dat[markerpen::gene_mapping$name[ind], ])
colnames(mat_exp) = ensembl
# We compute the marker genes for two cell types with a reduced problem size
# See the vignette for the full example
# Markers for astrocytes
set.seed(123)
search_range = intersect(markers_range$astrocytes, ensembl)
search_range = sample(search_range, 300)
prior_markers = intersect(pub_markers$astrocytes, search_range)
ast_re = refine_markers(
mat_exp, search_range, prior_markers,
lambda = 0.35, w = 1.5, maxit = 500, eps = 1e-3, verbose = 0
)
# Remove selected markers from the expression matrix
mat_rest = mat_exp[, setdiff(colnames(mat_exp), ast_re$markers)]
# Markers for microglia
search_range = intersect(markers_range$microglia, ensembl)
search_range = sample(search_range, 300)
prior_markers = intersect(pub_markers$microglia, search_range)
mic_re = refine_markers(
mat_exp, search_range, prior_markers,
lambda = 0.35, w = 1.5, maxit = 500, eps = 1e-3, verbose = 0
)
# Refined markers
markers_re = list(astrocytes = ast_re$markers,
microglia = mic_re$markers)
# Visualize the correlation matrix
cor_markers = cor(mat_exp[, unlist(markers_re)])
image(cor_markers, asp = 1)
# Post-process the selected markers
# Pick the first 20 ordered markers
markers_ord = sort_markers(cor_markers, markers_re)
markers_ord = lapply(markers_ord, head, n = 20)
# Visualize the correlation matrix
image(cor(mat_exp[, unlist(markers_ord)]), asp = 1)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.