README.md

COSG in R

Accurate and fast cell marker gene identification with COSG

COSG is a cosine similarity-based method for more accurate and scalable marker gene identification.

The method and benchmarking results are described in Dai et al. (2022). A preprint is available on bioRxiv.

This is the R version of COSG; the Python version is hosted at https://github.com/genecell/COSG.

Installation

# install.packages('remotes')
remotes::install_github(repo = 'genecell/COSGR')

Usage

Please check out the vignette and the PBMC10K tutorial to get started.

suppressMessages(library(Seurat))
data('pbmc_small',package='Seurat')
# Check cell groups:
table(Idents(pbmc_small))
#> 
#>  0  1  2 
#> 36 25 19 
#######
# Run COSG:
marker_cosg <- cosg(
 pbmc_small,
 groups='all',
 assay='RNA',
 slot='data',
 mu=1,
 n_genes_user=100)
#######
# Check the marker genes:
head(marker_cosg$names)
#>       0      1     2
#> 1   CD7 S100A8 MS4A1
#> 2  CCL5   TYMP CD79A
#> 3  GNLY S100A9 TCL1A
#> 4 LAMP1  FCGRT  NT5C
#> 5  GZMA IFITM3 CD79B
#> 6   LCK   LST1 FCER2
head(marker_cosg$scores)
#>           0         1         2
#> 1 0.6391917 0.8954042 0.6922908
#> 2 0.6391267 0.8312083 0.5832425
#> 3 0.6328148 0.8120045 0.5757478
#> 4 0.6164937 0.7755955 0.5533107
#> 5 0.5846589 0.7413060 0.5163446
#> 6 0.5795238 0.7380483 0.5115180
#######
# Run COSG for selected groups, i.e., '0' and '2':
marker_cosg <- cosg(
 pbmc_small,
 groups=c('0', '2'),
 assay='RNA',
 slot='data',
 mu=1,
 n_genes_user=100)
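
The result printed above has two components, `names` and `scores`, with one column per cell group and one row per rank. As a minimal sketch of how such a result could be reshaped, the base R snippet below rebuilds the first three rows shown above as a stand-in object (so it runs without Seurat or COSG installed) and derives a per-group list of top markers plus a long-format table. It assumes both components behave like data frames, which the printed output suggests but this README does not state; `top_markers` and `marker_table` are illustrative names, not part of the package.

```r
# Stand-in for the first rows of marker_cosg, copied from the output above.
# Assumption: the real object's `names` and `scores` are data frames.
marker_cosg <- list(
  names = data.frame(
    `0` = c("CD7", "CCL5", "GNLY"),
    `1` = c("S100A8", "TYMP", "S100A9"),
    `2` = c("MS4A1", "CD79A", "TCL1A"),
    check.names = FALSE, stringsAsFactors = FALSE
  ),
  scores = data.frame(
    `0` = c(0.6391917, 0.6391267, 0.6328148),
    `1` = c(0.8954042, 0.8312083, 0.8120045),
    `2` = c(0.6922908, 0.5832425, 0.5757478),
    check.names = FALSE
  )
)

# Top two marker genes per group, as a named list (names are group labels).
top_markers <- lapply(marker_cosg$names, head, n = 2)

# Long-format table: one row per (group, rank) pair.
n_rank  <- nrow(marker_cosg$names)
n_group <- ncol(marker_cosg$names)
marker_table <- data.frame(
  group = rep(colnames(marker_cosg$names), each = n_rank),
  rank  = rep(seq_len(n_rank), times = n_group),
  gene  = unlist(marker_cosg$names, use.names = FALSE),
  score = unlist(marker_cosg$scores, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(marker_table)
```

The long format is convenient for filtering by score or merging with other per-gene annotations.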

Tip

  1. To identify more specific marker genes, set mu to a larger value, such as mu=10 or mu=100.
  2. Set remove_lowly_expressed to TRUE to exclude genes that are lowly expressed in the target cell group, and use expressed_pct to adjust the percentage threshold. For example:
# seo: a Seurat object with a 'peaks' assay (not defined in this README)
marker_region <- cosg(
  seo,
  groups='all',
  assay='peaks',
  slot='data',
  mu=100,
  n_genes_user=100,
  remove_lowly_expressed=TRUE,
  expressed_pct=0.1
)
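
To give some intuition for the first tip, the toy example below compares each gene's expression vector with a 0/1 indicator vector for each group using cosine similarity, then applies a penalty weighted by mu for similarity to the other groups. This is only a sketch: the scoring formula in `toy_score` is an illustrative assumption and is not taken from this README, so see Dai et al. (2022) for the actual definition. The matrix `expr`, the group labels, and all function names are hypothetical.

```r
# Toy expression matrix: 2 genes x 6 cells; cells 1-3 are group "A", 4-6 "B".
expr <- rbind(
  specific = c(5, 4, 6, 0, 0, 0),  # expressed only in group A
  shared   = c(5, 4, 6, 3, 3, 3)   # also expressed in group B
)
groups <- c("A", "A", "A", "B", "B", "B")

# Cosine similarity between two numeric vectors.
cosine <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Similarity of each gene (rows) to each group's 0/1 indicator (columns).
group_ids <- unique(groups)
cos_sim <- sapply(group_ids, function(g) {
  apply(expr, 1, cosine, y = as.numeric(groups == g))
})

# Hypothetical score (an assumption, not the package's documented formula):
# similarity to the target group, scaled down as mu times the squared
# similarity to all other groups grows.
toy_score <- function(cos_sim, mu = 1) {
  sq <- cos_sim^2
  other <- rowSums(sq) - sq  # squared similarity to the non-target groups
  cos_sim * sq / (sq + mu * other)
}

score_mu1  <- toy_score(cos_sim, mu = 1)
score_mu10 <- toy_score(cos_sim, mu = 10)
print(round(score_mu1, 3))
print(round(score_mu10, 3))
```

Under this assumed formula, the score of the gene expressed only in group A does not change with mu, while the score of the gene shared with group B drops as mu increases, which is the sense in which a larger mu favors more specific markers.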

Citation

If COSG is useful for your research, please consider citing Dai, M., Pei, X., Wang, X.-J., 2022. Accurate and fast cell marker gene identification with COSG. Brief. Bioinform. bbab579.



genecell/COSGR documentation built on Jan. 3, 2023, 10:57 a.m.