summarizeSequenceStatus: Summarize sequence status
In LTLA/RepertoireUtils: Utility Functions for Analyzing Repertoire Sequencing Data



Obtain a quick summary of the status of the sequences for a particular component.

summarizeSequenceStatus(x, group = NULL)

`x`	A SplitDataFrameList where each DataFrame corresponds to a cell and each row in that DataFrame is a sequence in that cell.
`group`	Factor of the same length as `x` (one entry per cell) indicating the group to which each cell belongs. If `NULL`, results are reported per cell.

By default, this assumes that the fields are named in the same manner as the annotation files produced by CellRanger. Future iterations will provide support for more standardized formats like AIRR.
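To illustrate what this assumption looks like in practice, the sketch below builds a small table using the field names from the example on this page (`full_length`, `high_confidence` and `productive`, stored as "True"/"False" strings) and converts them to logical vectors. The toy values and the `as_flag` helper are hypothetical and are not part of the package.

```r
# Toy annotation table; the values are invented for illustration.
contigs <- data.frame(
    cell.id = c("A", "A", "B", "C"),
    full_length = c("True", "False", "True", "False"),
    high_confidence = c("True", "True", "False", "False"),
    productive = c("True", "False", "True", "False"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: treat the string "True" (any case) as TRUE,
# and every other string as FALSE.
as_flag <- function(x) tolower(x) == "true"

# Convert the three status fields to logical columns.
status.fields <- c("full_length", "high_confidence", "productive")
flags <- data.frame(
    cell.id = contigs$cell.id,
    lapply(contigs[status.fields], as_flag)
)
```

Whether the function performs a conversion of this kind internally is not stated here; the sketch only shows the shape of the input it expects.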

If group=NULL, a DataFrame is returned with one row per cell, indicating whether that cell has:

any sequence
multiple sequences
any productive sequence
any full-length sequence
any high-confidence sequence
any awesome (productive, full-length and high-confidence) sequence

If group is specified, the DataFrame instead contains one row per level of group. Each value then represents the proportion of cells in that group with any sequence, multiple sequences, etc.
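The two output modes can be sketched in base R as follows. This is a minimal illustration of the logic described above, not the package's implementation: the toy data, the `any_by_cell` helper and the output column names are all assumptions.

```r
# Toy per-sequence flags, already logical; the values are invented.
seqs <- data.frame(
    cell.id = c("A", "A", "B", "C", "C"),
    productive = c(TRUE, FALSE, TRUE, FALSE, FALSE),
    full_length = c(TRUE, TRUE, FALSE, FALSE, TRUE),
    high_confidence = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)
all.cells <- c("A", "B", "C", "D")  # cell "D" has no sequences

cell <- factor(seqs$cell.id, levels = all.cells)
n <- as.integer(table(cell))        # number of sequences per cell

# Hypothetical helper: does each cell have at least one flagged sequence?
any_by_cell <- function(flag) as.integer(table(cell[flag])) > 0L

awesome <- seqs$productive & seqs$full_length & seqs$high_confidence

# One row per cell, one logical column per status (as for group=NULL).
per.cell <- data.frame(
    row.names = all.cells,
    any = n > 0L,
    multi = n > 1L,
    productive = any_by_cell(seqs$productive),
    full_length = any_by_cell(seqs$full_length),
    high_confidence = any_by_cell(seqs$high_confidence),
    awesome = any_by_cell(awesome)
)

# One row per group level; the mean of a logical column is a proportion.
group <- factor(c("g1", "g1", "g2", "g2"))
per.group <- aggregate(per.cell, by = list(group = group), FUN = mean)
```

In this toy case, `per.group$any` is 1 for `g1` and 0.5 for `g2`, because one of the two cells in `g2` has no sequences at all.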

Aaron Lun

countSequencesPerCell, for which this function is a wrapper.

df <- data.frame(
    cell.id=sample(LETTERS, 30, replace=TRUE),
    clonotype=sample(paste0("clonotype_", 1:5), 30, replace=TRUE),
    full_length=sample(c("True", "False"), 30, replace=TRUE),
    high_confidence=sample(c("True", "False"), 30, replace=TRUE),
    productive=sample(c("True", "False"), 30, replace=TRUE),
    umi=pmax(1, rpois(30, 5))
)

Y <- splitDataFrameByCell(df, "cell.id")
summarizeSequenceStatus(Y)

# 'group' needs one entry per cell, i.e., per element of Y.
summarizeSequenceStatus(Y, group=sample(1:3, length(Y), replace=TRUE))