trim_small_groups_and_low_expression_genes: trim_small_groups_and_low_expression_genes


View source: R/loading_helper_functions.r

Description

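Filter a SummarizedExperiment object (dataset_se) on several metrics (library size per cell, group size, and the number of samples in which each gene is detected) and return the filtered object.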

Usage

trim_small_groups_and_low_expression_genes(dataset_se,
  min_lib_size = 1000, min_group_membership = 5,
  min_reads_in_sample = 1, min_detected_by_min_samples = 5)

Arguments

dataset_se

SummarizedExperiment object containing count data. 'ID' and 'group' must also be set within the cell information (see colData()).

min_lib_size

Minimum library size. Cells with fewer than this many reads are removed. Default = 1000

min_group_membership

Groups/clusters with fewer than this many cells are discarded. An appropriate value may depend on experiment size. Default = 5

min_reads_in_sample

Require this many reads to consider a gene detected in a sample. Default = 1

min_detected_by_min_samples

Keep genes detected in at least this many samples. An appropriate value may depend on experiment size. Default = 5
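To make the four thresholds concrete, here is a minimal base R sketch that applies the same kinds of filter to a toy count matrix. It is an illustration only, not the function's actual implementation: the toy data, the scaled-down threshold values, the use of a plain matrix instead of a SummarizedExperiment, and the order in which the filters are applied are all assumptions.

```r
# Toy counts: 4 genes (rows) x 7 cells (columns). Values are made up.
counts <- matrix(
  c(5, 6, 7, 5, 6, 0, 9,
    5, 5, 5, 5, 5, 1, 9,
    0, 0, 1, 0, 0, 0, 0,
    3, 0, 0, 2, 4, 0, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), paste0("c", 1:7))
)
# Hypothetical group label for each cell, in column order.
group <- c("A", "A", "A", "B", "B", "B", "C")

# Thresholds scaled down to suit the toy data (not the function defaults).
min_lib_size                <- 10
min_group_membership        <- 2
min_reads_in_sample         <- 1
min_detected_by_min_samples <- 3

# 1. Drop cells whose library size (column total) is below min_lib_size.
keep_cells <- colSums(counts) >= min_lib_size
counts <- counts[, keep_cells, drop = FALSE]
group  <- group[keep_cells]

# 2. Drop groups with fewer than min_group_membership remaining cells.
group_sizes <- table(group)
keep_groups <- names(group_sizes)[group_sizes >= min_group_membership]
keep_cells  <- group %in% keep_groups
counts <- counts[, keep_cells, drop = FALSE]
group  <- group[keep_cells]

# 3. A gene counts as detected in a cell if it has at least
#    min_reads_in_sample reads there; keep genes detected in at least
#    min_detected_by_min_samples cells.
detected_in <- rowSums(counts >= min_reads_in_sample)
filtered <- counts[detected_in >= min_detected_by_min_samples, , drop = FALSE]

print(dim(filtered))
```

In this toy case cell c6 fails the library size filter, group C is left with a single cell and is dropped, and gene g3 is detected in too few of the remaining cells.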

Details

If it hasn't been done already, it is highly recommended to use this function to filter out genes with no or low total counts. There can be many of these, especially in single cell data; without expression they are not useful and may reduce statistical power.

Likewise, very small groups (<5 cells) are unlikely to give useful results with this method, and cells with abnormally small library sizes may not be desirable.

Of course, 'reasonable' thresholds for filtering cells/genes are subjective. The defaults are moderately sensible starting points.
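Because the thresholds are subjective, it can help to look at the data before choosing them. The following base R sketch summarises the three quantities the thresholds act on. The toy matrix, the group labels, and the candidate cutoff of 100 are made up for illustration; with a real SummarizedExperiment the counts and group labels would first need to be extracted from the object.

```r
# Toy counts: 3 genes (rows) x 6 cells (columns). Values are made up.
counts <- matrix(
  c(120, 300, 5, 250, 0, 410,
     80, 150, 2, 100, 1, 220,
      0,   0, 0,   3, 0,   0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), paste0("c", 1:6))
)
# Hypothetical group label for each cell, in column order.
group <- c("A", "A", "A", "B", "B", "B")

# Library size per cell: informs min_lib_size.
lib_sizes <- colSums(counts)
print(summary(lib_sizes))

# Cells per group: informs min_group_membership.
group_sizes <- table(group)
print(group_sizes)

# Number of cells in which each gene has at least 1 read:
# informs min_detected_by_min_samples.
cells_detecting <- rowSums(counts >= 1)
print(cells_detecting)

# How many cells a candidate library size cutoff would remove.
n_below_cutoff <- sum(lib_sizes < 100)
print(n_below_cutoff)
```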

Value

A filtered dataset_se, ready for use.

Examples

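library(celaref)

# Filter the demo query dataset using the default thresholds.
demo_query_se.trimmed  <-
   trim_small_groups_and_low_expression_genes(demo_query_se)

# Filter the demo reference dataset, requiring at least 10 cells per group.
demo_query_se.trimmed2 <-
   trim_small_groups_and_low_expression_genes(demo_ref_se,
                                              min_group_membership = 10)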

celaref documentation built on Nov. 8, 2020, 5:03 p.m.