readToUmiPerCell: Read to UMI per cell


View source: R/readToUmiPerCell.R

Description

Compute the read-to-UMI ratio for each cell.

Usage

readToUmiPerCell(x, read.field, umi.field)

Arguments

x

A SplitDataFrameList in which each DataFrame corresponds to a cell and each row to a sequence.

read.field

String specifying the name of the column that contains the read count data.

umi.field

String specifying the name of the column that contains the UMI count data.

Details

This function is designed to evaluate the degree of redundancy in the read coverage of each UMI. High values indicate that the reads are highly redundant such that little can be gained from further sequencing.

Note that, in repertoire data, the definition of “high” is somewhat different from usual. This is because only deeply sequenced transcripts will survive the assembly and annotation process, such that the reported sequences are likely to be biased towards very high read-to-UMI ratios. Values around 1000 seem to be typical.

If a cell has multiple sequences, their counts are simply added together across sequences to compute the per-cell ratio.
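The pooling described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the `toy` data frame and its values are hypothetical, and a plain data.frame with `tapply` stands in for the SplitDataFrameList.

```r
# Hypothetical toy data: three sequences in cell "A", one in cell "B".
toy <- data.frame(
    cell.id = c("A", "A", "A", "B"),
    reads   = c(1000, 200, 1800, 800),
    umis    = c(2, 1, 3, 1)
)

# Add the read counts and the UMI counts across all sequences of each cell.
total.reads <- tapply(toy$reads, toy$cell.id, sum)
total.umis  <- tapply(toy$umis, toy$cell.id, sum)

# The per-cell ratio is the ratio of the pooled totals,
# not the average of the per-sequence ratios.
ratio <- total.reads / total.umis
ratio
```

For cell "A" the pooled ratio is 3000 / 6 = 500, whereas averaging the per-sequence ratios (500, 200 and 600) would give roughly 433. In this sketch, a cell with zero total UMIs would yield `Inf` or `NaN`; how the function itself treats that case is not stated here.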

Value

A numeric vector containing the ratio of reads to UMIs for each cell.

Author(s)

Aaron Lun

Examples

# Simulate 30 sequences, each randomly assigned to a cell.
df <- data.frame(
    cell.id=sample(LETTERS, 30, replace=TRUE),
    v_gene=sample(c("TRAV1", "TRAV2", "TRAV3"), 30, replace=TRUE),
    j_gene=sample(c("TRAJ4", "TRAJ5", "TRAJ6"), 30, replace=TRUE),
    reads=rnbinom(30, mu=20, size=0.5),
    umis=rnbinom(30, mu=2, size=1)
)

# Split into one DataFrame per cell, then compute the per-cell ratio.
y <- splitDataFrameByCell(df, field="cell.id")
readToUmiPerCell(y, "reads", "umis")

LTLA/RepertoireUtils documentation built on Feb. 9, 2020, 12:51 p.m.