View source: R/collapseMutantsByAA.R
collapseMutantsBySimilarity | R Documentation |
These functions can be used to collapse variants, either by similarity or
according to a pre-defined grouping. The functions collapseMutants and
collapseMutantsByAA assume that a grouping variable is available as a
column in rowData(se) (collapseMutantsByAA is a convenience function for
the case when this column is "mutantNameAA", and is provided for backwards
compatibility). The collapseMutantsBySimilarity function generates the
grouping variable based on user-provided thresholds on the sequence
similarity (defined by the Hamming distance), and subsequently collapses
the variants based on the derived grouping.
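To make the idea of collapsing by a pre-defined grouping concrete, here is a minimal base R sketch on a toy count matrix. It is not the package's implementation: the matrix, the row names and the `group` vector are made up for illustration, and summing with `rowsum()` is an assumption about how counts would be aggregated.

```r
# Toy count matrix: rows are codon-level variants, columns are samples.
# All names and values are invented for this illustration.
counts <- matrix(
    c(10, 2,
       5, 1,
       7, 3,
       0, 4),
    ncol = 2, byrow = TRUE,
    dimnames = list(
        c("GCT", "GCC", "TGG", "TGT"),
        c("sample1", "sample2")
    )
)

# Hypothetical grouping variable with one entry per row. It plays the role
# of a grouping column in rowData(se), such as "mutantNameAA".
group <- c("Ala", "Ala", "Trp", "Cys")

# Sum the counts of all rows that share a group label.
collapsed <- rowsum(counts, group = group)
collapsed
```

The two rows labelled "Ala" are merged into a single row, while "Cys" and "Trp" each keep their own row.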
collapseMutantsBySimilarity(
  se,
  assayName,
  scoreMethod = "rowSum",
  sequenceCol = "sequence",
  collapseMaxDist = 0,
  collapseMinScore = 0,
  collapseMinRatio = 0,
  verbose = TRUE
)
collapseMutantsByAA(se)
collapseMutants(se, nameCol)
se |
A |
assayName |
The name of the assay that will be used to calculate a "score" (typically derived from the read counts) for each variant. |
scoreMethod |
Character scalar giving the approach used to calculate ranking scores from the assay defined by assayName. |
sequenceCol |
Character scalar giving the name of the column in
|
collapseMaxDist |
Numeric scalar defining the tolerance for collapsing
similar sequences. If the value is in [0, 1), it defines the maximal
Hamming distance in terms of a fraction of sequence length:
( |
collapseMinScore |
Numeric scalar, indicating the minimum score for the sequence to be considered for collapsing with similar sequences. |
collapseMinRatio |
Numeric scalar. During collapsing of similar sequences, a low-frequency sequence will be collapsed with a higher-frequency sequence only if the ratio between the high-frequency and the low-frequency scores is at least this high. The default value of 0 indicates that no such check is performed. |
verbose |
Logical, whether to print progress messages. |
nameCol |
A character scalar providing the column of
|
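The interplay between the distance tolerance and the score ratio described for collapseMaxDist and collapseMinRatio can be sketched in base R as follows. This is only an illustration under stated assumptions, not the algorithm used by collapseMutantsBySimilarity: the helper `hammingDist`, the greedy assignment order, the toy scores and the threshold values are all hypothetical, the tolerance is treated as an absolute number of mismatches, and collapseMinScore is not modelled.

```r
# Hamming distance between two equal-length strings:
# the number of positions at which they differ.
hammingDist <- function(a, b) {
    stopifnot(nchar(a) == nchar(b))
    sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Toy sequences with made-up scores (e.g. row sums of a count assay).
scores <- c(ACGT = 100, ACGA = 4, TTTT = 50, ACCA = 30)

# Hypothetical thresholds, chosen only for this illustration.
maxDist <- 1    # maximal number of mismatches allowed for collapsing
minRatio <- 5   # required ratio of high-score to low-score sequence

# Greedy assignment (an assumption): visit sequences by decreasing score and
# attach each one to the first existing representative that is close enough
# and has a sufficiently higher score; otherwise it becomes a representative.
ord <- names(sort(scores, decreasing = TRUE))
representative <- character(0)
for (s in ord) {
    hit <- NA_character_
    for (r in unique(representative)) {
        if (hammingDist(s, r) <= maxDist &&
            scores[[r]] / scores[[s]] >= minRatio) {
            hit <- r
            break
        }
    }
    representative[s] <- if (is.na(hit)) s else hit
}
representative

# Aggregate the scores within each derived group.
collapsedScores <- tapply(scores, representative[names(scores)], sum)
collapsedScores
```

In this toy example only "ACGA" is merged (into "ACGT"): it is one mismatch away and its score is 25 times lower. "ACCA" stays separate because it differs from "ACGT" at two positions.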
A SummarizedExperiment where counts have been aggregated by the mutated amino acid(s).
Charlotte Soneson, Michael Stadler
se <- readRDS(system.file("extdata", "GSE102901_cis_se.rds",
                          package = "mutscan"))[1:200, ]
## The rows of this object correspond to individual codon variants
dim(se)
head(rownames(se))
## Collapse by amino acid
sec <- collapseMutantsByAA(se)
## The rows of the collapsed object correspond to amino acid variants
dim(sec)
head(rownames(sec))
## The mutantName column contains the individual codon variants that were
## collapsed
head(SummarizedExperiment::rowData(sec))
## Collapse similar sequences
sec2 <- collapseMutantsBySimilarity(
    se = se, assayName = "counts", scoreMethod = "rowSum",
    sequenceCol = "sequence", collapseMaxDist = 2,
    collapseMinScore = 0, collapseMinRatio = 0)
dim(sec2)
head(rownames(sec2))
head(SummarizedExperiment::rowData(sec2))
## collapsed count matrix
SummarizedExperiment::assay(sec2, "counts")