se_detected_rows | R Documentation |
SummarizedExperiment heuristics to define detected rows
se_detected_rows(
se,
assay_name = 1,
group_colnames,
normgroup_colname = NULL,
detect_mincounts = 0,
detect_totalreps = 1,
detect_minreps = 2,
detect_minpct = 0.65,
detect_mingroups = 1,
isamples = colnames(se),
verbose = FALSE,
...
)
se |
|
assay_name |
|
group_colnames |
|
normgroup_colname |
|
detect_mincounts |
|
detect_totalreps |
|
detect_minreps |
|
detect_minpct |
|
detect_mingroups |
|
isamples |
|
verbose |
|
... |
additional arguments are ignored. |
This function is intended to help apply common logical rules to define valid, "detected" rows for downstream analysis.
The rules:
minimum value at or above which a measurement is "valid"
minimum total replicates with "valid" measurement, across all sample columns
minimum replicates with "valid" measurement required in any sample group
minimum percent replicates with "valid" measurement required in any sample group
minimum sample groups with "valid" criteria above required
Consider an experiment with 7 groups of n=3 replicates each, for 21 total samples.
Assume one row of data that contains 6 "valid" measurements.
If these 6 "valid" measurements are found in only 2 groups, both groups contain n=3 "valid" measurements. This row may have sufficient data for a statistical comparison across these two groups.
However, if the 6 "valid" measurements are instead spread across 6 different groups, one per group, the row may not be suitable for statistical testing.
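The example above can be sketched in base R. This is only an illustration of how the five rules might combine, not the jamses implementation: the helper `detect_rows_sketch()` and its argument names are hypothetical. It assumes that a group passes only when it meets both the replicate count and the percent threshold, and that NA values are never "valid".

```r
# Hypothetical helper, not part of jamses: applies the five rules to a
# plain numeric matrix with one row per feature and one column per sample.
detect_rows_sketch <- function(x, groups, mincounts = 0, totalreps = 1,
   minreps = 2, minpct = 0.65, mingroups = 1) {
   groups <- as.character(groups)
   # rule 1: a measurement is "valid" at or above mincounts (assumes NA is not valid)
   valid <- !is.na(x) & x >= mincounts
   # valid replicates per group: one row per group, one column per row of x
   valid_n <- rowsum(t(valid) + 0, group = groups)
   group_size <- as.vector(table(groups)[rownames(valid_n)])
   valid_pct <- valid_n / group_size
   # rules 3 and 4: assumed here to both be required within a group
   group_ok <- valid_n >= minreps & valid_pct >= minpct
   # rule 2 (total valid replicates) and rule 5 (number of passing groups)
   rowSums(valid) >= totalreps & colSums(group_ok) >= mingroups
}

# 7 groups with n=3 replicates each, 21 samples in total
groups <- rep(LETTERS[1:7], each = 3)
x <- matrix(0, nrow = 2, ncol = 21,
   dimnames = list(c("concentrated", "scattered"),
      paste0(groups, "_rep", 1:3)))
# 6 valid values in 2 groups, 3 per group
x["concentrated", 1:6] <- 10
# 6 valid values in 6 groups, 1 per group
x["scattered", seq(1, 16, by = 3)] <- 10

detected <- detect_rows_sketch(x, groups, mincounts = 1)
print(detected)
```

With these settings the "concentrated" row passes because two groups each have 3 of 3 valid replicates, while the "scattered" row fails because no single group reaches 2 valid replicates.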
Detection can be carried out within "normgroups", which are independent subsets of sample columns. In most cases this method is not necessary, but it is intended for cases where the detected rows should be calculated independently for two or more subsets of sample columns.
A specific example might be an experiment that measures treatment effects in two very different tissue types, like lung and muscle. The genes detected in lung may well not be the same as the genes detected in muscle. In fact, statistical comparisons may not be intended to compare muscle and lung directly. (That judgement is left to the analyst.) One may define a column in colData(se) that represents tissue type, with values "muscle" and "lung", then supply this column name with argument normgroup_colname. The detection will be done within each independent normgroup, returned as a list named "detected_normgroup". The detected rows are also combined into "detected_rows", which returns rows detected across all normgroups.
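The normgroup idea can be sketched in base R as follows. This is not the jamses implementation: `simple_detect()` is a hypothetical, simplified stand-in for the per-group rules. The sketch also assumes that "combined" means a row is reported when it is detected in at least one normgroup (a union), which the text above does not state explicitly.

```r
# Hypothetical simplified rule: a row is detected when at least `minreps`
# samples have a value at or above `mincounts`.
simple_detect <- function(x, mincounts = 1, minreps = 2) {
   rowSums(!is.na(x) & x >= mincounts) >= minreps
}

# one normgroup label per sample column, as might be stored in colData(se)
normgroups <- c(rep("lung", 3), rep("muscle", 3))
x <- matrix(c(
   9, 8, 7, 0, 0, 0,   # measured only in lung
   0, 0, 0, 6, 5, 4,   # measured only in muscle
   0, 1, 0, 0, 0, 1),  # one valid value in each tissue
   nrow = 3, byrow = TRUE,
   dimnames = list(c("lung_gene", "muscle_gene", "sparse_gene"),
      paste0("s", 1:6)))

# detect independently within each normgroup
detected_normgroup <- lapply(
   split(seq_len(ncol(x)), normgroups),
   function(icols) simple_detect(x[, icols, drop = FALSE]))

# assumption: combine as the union of rows detected in any normgroup
detected_rows <- rownames(x)[Reduce(`|`, detected_normgroup)]
print(detected_rows)
```

Here "sparse_gene" has two valid values overall, but only one within each tissue, so it is not detected in either normgroup.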
list with the following elements:
detected_rows is a character vector of detected rownames(se)
detected_normgroup is a list of logical vectors for each normgroup, where the vectors encode whether a row is detected within each normgroup.
detected_df is a data.frame with summary information for each normgroup.
Other jamses SE utilities: make_se_test(), se_collapse_by_column(), se_collapse_by_row(), se_normalize(), se_rbind(), se_to_rowcoldata()