makeCountMatrix: Make a count matrix
In DropletUtils: Utilities for Handling Single-Cell Droplet Data


Construct a count matrix from per-molecule information, typically the cell and gene of origin.

makeCountMatrix(gene, cell, all.genes=NULL, all.cells=NULL, value=NULL)

`gene`	An integer or character vector specifying the gene to which each molecule was assigned.
`cell`	An integer or character vector specifying the cell to which each molecule was assigned.
`all.genes`	A character vector containing the names of all genes in the dataset.
`all.cells`	A character vector containing the names of all cells in the dataset.
`value`	A numeric vector containing values for each molecule.

Each element of the vectors gene, cell and (if specified) value contains information for a single transcript molecule. Each entry of the output matrix corresponds to a single gene and cell combination. If multiple molecules share the same combination, their values in value are summed, and the sum is used as the entry of the output matrix.

If value=NULL, it defaults to a vector of ones. Each entry of the output matrix then represents the number of molecules with the corresponding combination, i.e., the UMI count. Users can instead pass other metrics, such as the number of reads covering each molecule, which would yield a read count matrix rather than a UMI count matrix.
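The summation described above can be illustrated with a minimal sketch. This is not the function's own implementation: it uses sparseMatrix from the Matrix package, which sums the values of repeated row/column pairs, and the data (including the reads vector) are made up for illustration.

```r
library(Matrix)

# Hypothetical per-molecule data: the first two molecules share the
# same gene/cell combination (gene 1, cell 1).
gene  <- c(1L, 1L, 2L, 1L)
cell  <- c(1L, 1L, 1L, 2L)
reads <- c(10, 4, 7, 2)  # made-up number of reads per molecule

# A weight of 1 per molecule gives molecule (UMI) counts.
umi.mat <- sparseMatrix(i=gene, j=cell, x=rep(1, length(gene)), dims=c(2, 2))

# Using the per-molecule read numbers as weights gives read counts.
read.mat <- sparseMatrix(i=gene, j=cell, x=reads, dims=c(2, 2))

as.matrix(umi.mat)   # entry [1, 1] is 2: two molecules
as.matrix(read.mat)  # entry [1, 1] is 14: 10 + 4 reads
```

The two matrices have the same sparsity pattern and differ only in the weight given to each molecule.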

If all.genes is not specified and gene is an integer vector, all.genes is kept as NULL. If all.genes is not specified and gene is a character vector, all.genes is defined as the sorted unique values of gene. The same applies to cell and all.cells.

If gene is an integer vector, its values should be positive and, if all.genes is not NULL, no greater than length(all.genes). If gene is a character vector, its values should be a subset of those in all.genes. The same requirements apply to cell and all.cells.
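The rules for gene and all.genes can be sketched in base R as follows. This is only an illustration of the stated requirements under made-up identifiers, not the checks that the function itself performs; the variable names inferred, row.idx and gene.int are hypothetical.

```r
# Hypothetical character identifiers for four molecules.
gene <- c("Gene3", "Gene1", "Gene3", "Gene2")

# Without a reference set, the sorted unique values are used.
inferred <- sort(unique(gene))
match(gene, inferred)

# With a reference set, every identifier should be present in it.
all.genes <- c("Gene1", "Gene2", "Gene3", "Gene4")
stopifnot(all(gene %in% all.genes))
row.idx <- match(gene, all.genes)

# Integer identifiers should be positive and no greater than
# the length of the reference set.
gene.int <- c(3L, 1L, 3L, 2L)
stopifnot(all(gene.int >= 1L), all(gene.int <= length(all.genes)))
```

The same logic would apply to cell and all.cells.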

A sparse matrix where rows are genes, columns are cells and entries are the sum of value for each gene/cell combination. Rows are named if gene is a character vector or if all.genes is specified; likewise, columns are named if cell is a character vector or if all.cells is specified.

Aaron Lun

read10xMolInfo

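library(DropletUtils)
# Simulate 100 molecules, each assigned to a random gene (A-Z) and cell (1-20).
nmolecules <- 100
gene.id <- sample(LETTERS, nmolecules, replace=TRUE)
cell.id <- sample(20, nmolecules, replace=TRUE)
# Build the UMI count matrix (rows are genes, columns are cells).
makeCountMatrix(gene.id, cell.id)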