clustify_lists: Main function to compare scRNA-seq data to gene lists.

View source: R/main.R

Description

Main function to compare scRNA-seq data to gene lists.

Usage

clustify_lists(input, ...)

## Default S3 method:
clustify_lists(
  input,
  marker,
  marker_inmatrix = TRUE,
  metadata = NULL,
  cluster_col = NULL,
  if_log = TRUE,
  per_cell = FALSE,
  topn = 800,
  cut = 0,
  genome_n = 30000,
  metric = "hyper",
  output_high = TRUE,
  lookuptable = NULL,
  obj_out = TRUE,
  seurat_out = TRUE,
  rename_prefix = NULL,
  threshold = 0,
  low_threshold_cell = 0,
  ...
)

## S3 method for class 'seurat'
clustify_lists(
  input,
  metadata = NULL,
  cluster_col = NULL,
  if_log = TRUE,
  per_cell = FALSE,
  topn = 800,
  cut = 0,
  marker,
  marker_inmatrix = TRUE,
  genome_n = 30000,
  metric = "hyper",
  output_high = TRUE,
  dr = "umap",
  seurat_out = TRUE,
  obj_out = TRUE,
  threshold = 0,
  rename_prefix = NULL,
  ...
)

## S3 method for class 'Seurat'
clustify_lists(
  input,
  metadata = NULL,
  cluster_col = NULL,
  if_log = TRUE,
  per_cell = FALSE,
  topn = 800,
  cut = 0,
  marker,
  marker_inmatrix = TRUE,
  genome_n = 30000,
  metric = "hyper",
  output_high = TRUE,
  dr = "umap",
  seurat_out = TRUE,
  obj_out = TRUE,
  threshold = 0,
  rename_prefix = NULL,
  ...
)

## S3 method for class 'SingleCellExperiment'
clustify_lists(
  input,
  metadata = NULL,
  cluster_col = NULL,
  if_log = TRUE,
  per_cell = FALSE,
  topn = 800,
  cut = 0,
  marker,
  marker_inmatrix = TRUE,
  genome_n = 30000,
  metric = "hyper",
  output_high = TRUE,
  dr = "umap",
  seurat_out = TRUE,
  obj_out = TRUE,
  threshold = 0,
  rename_prefix = NULL,
  ...
)

Arguments

input

single-cell expression matrix, Seurat object, or SingleCellExperiment object

...

additional arguments passed to matrixize_markers

marker

matrix or data.frame of candidate genes for each cluster

marker_inmatrix

whether marker genes are already in preprocessed matrix form

metadata

cell cluster assignments, supplied as a vector or data.frame. If a data.frame is supplied, cluster_col must be set. Not required when running correlation per cell.

cluster_col

column in metadata that holds the cluster assignments

if_log

whether the input data is natural-log transformed; if TRUE, averaging is done on unlogged data

per_cell

if TRUE, compare per cell; if FALSE (the default), compare per cluster

topn

number of top-expressed genes to keep from the input matrix

cut

expression cutoff applied to the input matrix

genome_n

number of genes in the genome

metric

scoring metric: "hyper" for the adjusted p-value of a hypergeometric test, or "jaccard" for the Jaccard index

output_high

if TRUE (the default, for consistency with the rest of the package), -log10 transform the p-value

lookuptable

lookup table for object parsing; if not supplied, the built-in table is used

obj_out

whether to output the object instead of the correlation matrix

seurat_out

whether to output the called Seurat object instead of the correlation matrix (deprecated, use obj_out instead)

rename_prefix

prefix to add to type and r column names

threshold

minimum correlation score required to call an identity; only used when obj_out = TRUE

low_threshold_cell

minimum number of cells per cluster; clusters with fewer cells are removed

dr

name of the stored dimension reduction
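
As a rough illustration of how if_log, topn, cut, genome_n, metric and output_high relate to one another, the base R sketch below averages toy log-transformed data per cluster on the unlogged scale, keeps the top-expressed genes above a cutoff, and scores the overlap with a marker list. It is not the clustifyr implementation: the data, the helper top_genes(), and the use of log1p/expm1 are assumptions made for this example, and no multiple-testing adjustment is applied to the p-value.

```r
# Toy log-transformed expression: 5 genes x 4 cells (hypothetical data).
expr <- matrix(
  log1p(c(9, 0, 4, 1, 0,
          7, 0, 6, 0, 1,
          0, 8, 0, 0, 5,
          1, 6, 0, 0, 9)),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)
clusters <- c("A", "A", "B", "B")

# if_log = TRUE idea: undo the log, average per cluster, then log again.
# (Assumes a log1p transform; the package may treat logged data differently.)
avg <- sapply(split(seq_along(clusters), clusters), function(idx) {
  log1p(rowMeans(expm1(expr[, idx, drop = FALSE])))
})

# topn / cut idea: keep the topn highest genes whose average exceeds cut.
# top_genes() is a hypothetical helper, not a clustifyr function.
top_genes <- function(x, topn, cut) {
  x <- x[x > cut]
  names(head(sort(x, decreasing = TRUE), topn))
}
top_a <- top_genes(avg[, "A"], topn = 2, cut = 0)

# Hypothetical marker list for one cell type.
marker_genes <- c("gene1", "gene3", "gene4")
genome_n <- 30000

overlap <- length(intersect(top_a, marker_genes))

# metric = "hyper" idea: P(overlap >= observed) with genome_n genes as universe.
p_hyper <- phyper(
  overlap - 1,
  length(marker_genes),
  genome_n - length(marker_genes),
  length(top_a),
  lower.tail = FALSE
)

# output_high = TRUE idea: -log10 so that larger values mean stronger overlap.
score_hyper <- -log10(p_hyper)

# metric = "jaccard" idea: intersection size over union size.
jaccard <- overlap / length(union(top_a, marker_genes))
```

With a large genome_n, even a small overlap between short lists gives a very small hypergeometric p-value, which is one reason the -log10 scale is convenient for comparison across cell types.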

Value

matrix of numeric values, with clusters from input as row names and cell types from marker as column names
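
A matrix of this shape can be read row by row to find the best-scoring cell type per cluster. The sketch below uses a small made-up score matrix and a hypothetical cutoff to show the idea behind the threshold argument; it is an assumption-laden illustration, not the package's own identity-calling routine.

```r
# Hypothetical score matrix: clusters as rows, cell types as columns.
res <- matrix(
  c(3.6, 0.1, 0.0,
    1.6, 0.0, 0.2),
  nrow = 3,
  dimnames = list(c("c1", "c2", "c3"), c("T cell", "B cell"))
)

# Hypothetical cutoff playing the role of the threshold argument.
threshold <- 0.5

# Best-scoring column and its score for each row.
best_type <- colnames(res)[max.col(res, ties.method = "first")]
best_score <- apply(res, 1, max)

# Clusters whose best score falls below the cutoff stay unassigned.
calls <- ifelse(best_score >= threshold, best_type, "unassigned")
```

Here only c1 passes the cutoff and is labeled "T cell"; c2 and c3 remain "unassigned".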

Examples

# Annotate a matrix and metadata
clustify_lists(
    input = pbmc_matrix_small,
    marker = cbmc_m,
    metadata = pbmc_meta,
    cluster_col = "classified",
    verbose = TRUE
)

# Annotate using a different method
clustify_lists(
    input = pbmc_matrix_small,
    marker = cbmc_m,
    metadata = pbmc_meta,
    cluster_col = "classified",
    verbose = TRUE,
    metric = "jaccard"
)

Example output

                  CD4 T      CD8 T Memory CD4 T CD14+ Mono Naive CD4 T
Naive CD4 T  0.06713589 0.06713589            0   3.609737           0
Memory CD4 T 0.06713589 0.00000000            0   3.609737           0
CD14+ Mono   0.00000000 0.00000000            0   3.609737           0
B            0.00000000 0.00000000            0   3.609737           0
CD8 T        0.06713589 0.00000000            0   1.600060           0
FCGR3A+ Mono 0.00000000 0.00000000            0   3.609737           0
NK           0.00000000 0.06713589            0   3.609737           0
DC           0.00000000 0.00000000            0   3.609737           0
Platelet     0.12621349 0.00000000            0   3.523004           0
                     NK         B CD16+ Mono CD34+ Eryth       Mk        DC
Naive CD4 T  3.60973732 0.0000000 0.00000000     0     0 0.000000 0.0000000
Memory CD4 T 3.60973732 0.0000000 0.00000000     0     0 0.000000 0.0000000
CD14+ Mono   1.60005988 0.0000000 0.00000000     0     0 0.000000 0.0000000
B            1.60005988 1.6000599 0.00000000     0     0 0.000000 0.0000000
CD8 T        3.60973732 0.0000000 0.00000000     0     0 0.000000 0.0000000
FCGR3A+ Mono 0.02934733 0.0000000 0.02934733     0     0 0.000000 0.0000000
NK           3.60973732 0.0000000 0.00000000     0     0 0.000000 0.0000000
DC           1.60005988 0.1085286 0.00000000     0     0 0.000000 0.1085286
Platelet     1.58060198 1.5806020 0.00000000     0     0 3.523004 0.0000000
                pDCs
Naive CD4 T  0.00000
Memory CD4 T 0.00000
CD14+ Mono   0.00000
B            0.00000
CD8 T        0.00000
FCGR3A+ Mono 0.00000
NK           0.00000
DC           1.60006
Platelet     0.00000
                   CD4 T       CD8 T Memory CD4 T  CD14+ Mono Naive CD4 T
Naive CD4 T  0.001246883 0.001246883            0 0.003750000           0
Memory CD4 T 0.001246883 0.000000000            0 0.003750000           0
CD14+ Mono   0.000000000 0.000000000            0 0.003750000           0
B            0.000000000 0.000000000            0 0.003750000           0
CD8 T        0.001246883 0.000000000            0 0.002496879           0
FCGR3A+ Mono 0.000000000 0.000000000            0 0.003750000           0
NK           0.000000000 0.001246883            0 0.003750000           0
DC           0.000000000 0.000000000            0 0.003750000           0
Platelet     0.001166861 0.000000000            0 0.003508772           0
                      NK           B  CD16+ Mono CD34+ Eryth          Mk
Naive CD4 T  0.003750000 0.000000000 0.000000000     0     0 0.000000000
Memory CD4 T 0.003750000 0.000000000 0.000000000     0     0 0.000000000
CD14+ Mono   0.002496879 0.000000000 0.000000000     0     0 0.000000000
B            0.002496879 0.002496879 0.000000000     0     0 0.000000000
CD8 T        0.003750000 0.000000000 0.000000000     0     0 0.000000000
FCGR3A+ Mono 0.001246883 0.000000000 0.001246883     0     0 0.000000000
NK           0.003750000 0.000000000 0.000000000     0     0 0.000000000
DC           0.002496879 0.001246883 0.000000000     0     0 0.000000000
Platelet     0.002336449 0.002336449 0.000000000     0     0 0.003508772
                      DC        pDCs
Naive CD4 T  0.000000000 0.000000000
Memory CD4 T 0.000000000 0.000000000
CD14+ Mono   0.000000000 0.000000000
B            0.000000000 0.000000000
CD8 T        0.000000000 0.000000000
FCGR3A+ Mono 0.000000000 0.000000000
NK           0.000000000 0.000000000
DC           0.001246883 0.002496879
Platelet     0.000000000 0.000000000

clustifyr documentation built on Nov. 8, 2020, 5:32 p.m.