binarize.st | R Documentation |
Cluster the gene expression levels into highly-expressed and low-expressed groups.
binarize.st( count, gene.name, cluster.method = c("rank", "GMC", "k-means"), percentage.rank = 0.3 )
count |
An n-by-p numeric matrix Y that denotes the relative gene expression count table. Entry (i, j) is the relative count for gene j collected at spot i, so rows correspond to spots and columns to genes. |
gene.name |
A character string that specifies the gene in the expression count table to dichotomise. |
cluster.method |
An optional character string that specifies the clustering technique: one of "rank", "GMC", or "k-means". The default is "rank" for rank-based clustering. |
percentage.rank |
An optional numeric value that specifies the cutoff for the rank-based clustering method. The default is 0.30, which labels counts above the 70% quantile as highly-expressed. |
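As a rough illustration of how percentage.rank maps to a quantile cutoff, the following base R sketch dichotomises a single expression vector. The helper rank_binarize() is hypothetical and is not part of the package; it assumes a strict "greater than" comparison against the default quantile() estimate and skips the filtering steps that binarize.st() applies, so its output may differ from the package's on real data.

```r
# Hypothetical helper (not the package's implementation): rank-based
# dichotomisation of one gene's expression levels across spots.
rank_binarize <- function(y, percentage.rank = 0.3) {
  stopifnot(is.numeric(y), percentage.rank > 0, percentage.rank < 1)
  # Cutoff at the (1 - percentage.rank) quantile, e.g. the 70% quantile for 0.3.
  cutoff <- stats::quantile(y, probs = 1 - percentage.rank, names = FALSE)
  # 1 = highly-expressed, 0 = low-expressed (assumed strict inequality).
  as.numeric(y > cutoff)
}

y <- c(3, 9, 1, 7, 10, 2, 8, 4, 6, 5)  # toy expression levels at 10 spots
rank_binarize(y)                        # the three largest values are labelled 1
```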
After some filtering steps, a clustering method is applied to de-noise the expression levels of the given gene by dichotomising the spots into low-expressed (=0) and highly-expressed (=1) groups. This preprocessing step produces the data type required by some of the model-fitting procedures and makes the analysis more robust to over-dispersion and zero-inflation.
For the given gene, the expression level at each spot is converted to a binary format: instead of the numeric count, a binary indicator records the group affiliation (0 = low-expressed, 1 = highly-expressed). Three clustering methods are available: (1) rank ("rank"), a quantile-based approach that simply applies a cutoff at the specified percentage; (2) Gaussian mixture clustering ("GMC"), a model-based approach that fits a two-component Gaussian mixture model (GMM) with unequal variances; and (3) k-means ("k-means"), a distance-based approach that is implicitly based on the pairwise distances of the expression levels.
See Jiang et al. (2021) for more information on the filtering steps and the last two clustering methods.
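To make the two clustering-based options more concrete, the sketch below dichotomises one expression vector with base R only. Both helpers, kmeans_binarize() and gmm_binarize(), are hypothetical names introduced here for illustration. The mixture fit is a small hand-written EM loop for a two-component univariate Gaussian mixture with unequal variances; it is not the package's implementation, applies none of the filtering steps, and has no safeguards against degenerate data such as constant or heavily tied values.

```r
# Hypothetical helper: k-means with two clusters on a 1-D expression vector.
kmeans_binarize <- function(y) {
  # Start from the two extremes so the result is deterministic
  # (assumes y is not constant).
  fit <- stats::kmeans(y, centers = c(min(y), max(y)))
  # The cluster with the larger centre is the highly-expressed group (= 1).
  as.numeric(fit$cluster == which.max(fit$centers))
}

# Hypothetical helper: two-component Gaussian mixture with unequal
# variances, fitted by a plain EM loop.
gmm_binarize <- function(y, max.iter = 200, tol = 1e-8) {
  # Initialise the components by splitting at the median.
  z <- y > stats::median(y)
  pi1 <- mean(z)
  mu <- c(mean(y[!z]), mean(y[z]))
  sigma <- c(stats::sd(y[!z]), stats::sd(y[z]))
  loglik.old <- -Inf
  for (iter in seq_len(max.iter)) {
    # E-step: posterior probability that each spot belongs to component 2.
    d0 <- (1 - pi1) * stats::dnorm(y, mu[1], sigma[1])
    d1 <- pi1 * stats::dnorm(y, mu[2], sigma[2])
    resp <- d1 / (d0 + d1)
    # M-step: update mixing weight, means, and standard deviations.
    pi1 <- mean(resp)
    mu <- c(sum((1 - resp) * y) / sum(1 - resp), sum(resp * y) / sum(resp))
    sigma <- c(sqrt(sum((1 - resp) * (y - mu[1])^2) / sum(1 - resp)),
               sqrt(sum(resp * (y - mu[2])^2) / sum(resp)))
    # Stop once the log-likelihood of the previous parameters stabilises.
    loglik <- sum(log(d0 + d1))
    if (abs(loglik - loglik.old) < tol) break
    loglik.old <- loglik
  }
  labels <- as.numeric(resp > 0.5)
  # Make sure label 1 refers to the component with the larger mean.
  if (mu[2] < mu[1]) labels <- 1 - labels
  labels
}

# Toy data: 20 low-expressed and 20 highly-expressed spots.
y <- c(seq(0.5, 1.5, length.out = 20), seq(5, 7, length.out = 20))
table(kmeans = kmeans_binarize(y), gmm = gmm_binarize(y))
```

On well-separated toy data like this the two helpers agree; on real counts they can differ, since k-means splits near the midpoint between the two centres while the mixture model also accounts for the differing spread of each group.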
A binary numeric vector (entries 0 or 1) that represents the dichotomisation results. See "Details" for more information on how to interpret the entries.
Jiang, X., Li, Q., & Xiao, G. (2021). Bayesian Modeling of Spatial Transcriptomics Data via a Modified Ising Model. arXiv preprint arXiv:2104.13957.
st.plot() for plotting the dichotomised expression levels.