filter_by_read_count: filter_by_read_count
In russHyde/reeq: Provides Some Helper Functions For RNA-Seq Analysis / IO

filter_by_read_count(dge, threshold = 0, count_type = c("counts",
  "cpm"), fraction_of_samples = 1, use_sample_average = FALSE,
  keep_lib_sizes = FALSE)

`dge`	An 'edgeR::DGEList'.
`threshold`	A single numeric value that decides which rows are kept in the returned 'DGEList'. The other arguments dictate how the threshold is applied: if 'use_sample_average' is 'TRUE', rows whose average value is >= the threshold are kept; otherwise, rows with a value >= the threshold in at least a fraction 'fraction_of_samples' of the columns are kept. With the remaining arguments at their defaults, only rows where every column has a value >= the threshold are kept. Default: 0.
`count_type`	A choice of 'counts' (the default) or 'cpm'. Should the current filter / threshold be applied to the raw counts or to the counts-per-million?
`fraction_of_samples`	The fraction of the input samples (i.e., columns) that must meet the 'threshold' for a given feature (i.e., row) to pass the filter. Default: 1.
`use_sample_average`	Should rows be filtered based on whether the average value across a row is >= the 'threshold'? If this is 'TRUE', the 'fraction_of_samples' value is ignored. Default: 'FALSE'.
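`keep_lib_sizes`	Boolean. Indicates whether the existing library sizes should be kept for each sample after any features have been filtered out. If 'FALSE', the library sizes are recomputed from the remaining features. Passed through to the 'edgeR' subsetting function. Default: 'FALSE'.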
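
The row-selection rule described by 'threshold', 'fraction_of_samples' and 'use_sample_average' can be illustrated with a minimal base-R sketch. This is not the package's implementation: `keep_rows_sketch` is a hypothetical helper that works on a plain numeric matrix (raw counts or CPM values, whichever is supplied) rather than on a 'DGEList', and the toy matrix is invented for illustration.

```r
# Hypothetical helper (not part of reeq): returns a logical vector marking
# which rows of a numeric matrix would pass the filter as documented above.
keep_rows_sketch <- function(values, threshold = 0, fraction_of_samples = 1,
                             use_sample_average = FALSE) {
  if (use_sample_average) {
    # Compare the per-row average to the threshold;
    # 'fraction_of_samples' is ignored in this mode.
    return(unname(rowMeans(values) >= threshold))
  }
  # Fraction of samples (columns) in each row that meet the threshold.
  passing_fraction <- rowMeans(values >= threshold)
  unname(passing_fraction >= fraction_of_samples)
}

# Toy count matrix: 3 features (rows) by 3 samples (columns).
counts <- matrix(
  c(10, 12, 11,   # geneA: at or above 5 in every sample
     0,  9, 30,   # geneB: at or above 5 in two of three samples
     0,  0,  1),  # geneC: below 5 in every sample
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# Default behaviour: every sample must meet the threshold.
keep_all <- keep_rows_sketch(counts, threshold = 5)

# Relaxed: at least half of the samples must meet the threshold.
keep_half <- keep_rows_sketch(counts, threshold = 5, fraction_of_samples = 0.5)

# Average mode: the row mean must meet the threshold.
keep_mean <- keep_rows_sketch(counts, threshold = 5, use_sample_average = TRUE)
```

With the default 'fraction_of_samples' of 1, only geneA is kept. Lowering the fraction to 0.5, or switching to the sample average, also keeps geneB, whose single zero count is outweighed by the other two samples.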