readSparseCounts: Read sparse count matrix from file
In LTLA/scuttle: Single-Cell RNA-Seq Analysis Utilities

readSparseCounts

R Documentation

Read sparse count matrix from file

Description

Reads a sparse count matrix from a file that stores the counts in a dense tabular format.

Usage

readSparseCounts(
  file,
  sep = "\t",
  quote = NULL,
  comment.char = "",
  row.names = TRUE,
  col.names = TRUE,
  ignore.row = 0L,
  skip.row = 0L,
  ignore.col = 0L,
  skip.col = 0L,
  chunk = 1000L
)

Arguments

`file`	A string containing a file path to a count table, or a connection object opened in read-only text mode.
`sep`	A string specifying the delimiter between fields in `file`.
`quote`	A string specifying the quote character, e.g., in column or row names.
`comment.char`	A string specifying the comment character after which values are ignored.
`row.names`	A logical scalar specifying whether row names are present.
`col.names`	A logical scalar specifying whether column names are present.
`ignore.row`	An integer scalar specifying the number of rows to ignore at the start of the file, before the column names.
`skip.row`	An integer scalar specifying the number of rows to ignore at the start of the file, after the column names.
`ignore.col`	An integer scalar specifying the number of columns to ignore at the start of each line, before the row names.
`skip.col`	An integer scalar specifying the number of columns to ignore at the start of each line, after the row names.
`chunk`	An integer scalar indicating the chunk size to use, i.e., number of rows to read at any one time.

Details

This function provides a convenient method for reading dense arrays from flat files into a sparse matrix in memory. The file is read chunk rows at a time, so peak memory usage can be further reduced by setting chunk to a smaller positive value.
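The effect of chunk on the result can be checked on a toy file. The following is a minimal sketch, assuming the scuttle and Matrix packages are installed; the table and the chunk size of 2 are illustrative choices only, and on a file this small any difference in memory usage is negligible.

```r
library(scuttle)
library(Matrix)

# Illustrative toy table: 5 genes (rows) by 3 cells (columns).
chunkfile <- tempfile()
write.table(data.frame(A=1:5, B=0, C=0:4, row.names=letters[1:5]),
    file=chunkfile, col.names=NA, sep="\t", quote=FALSE)

# With the default chunk size, all 5 rows fit in a single chunk.
full <- readSparseCounts(chunkfile)

# With chunk=2L, at most 2 rows are read at any one time.
small <- readSparseCounts(chunkfile, chunk=2L)

# The chunk size should only affect how the file is read, not the result.
all.equal(as.matrix(full), as.matrix(small))
```

A smaller chunk means that fewer rows are read at any one time, which is the trade-off to weigh for large files.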

The ignore.* and skip.* parameters allow irrelevant rows or columns to be skipped. Note that the distinction between the two sets of parameters is only relevant when row.names=TRUE (for skipping/ignoring columns) or col.names=TRUE (for rows), as otherwise there are no names for the skipped rows or columns to be positioned before or after.
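As an illustration of these parameters, the sketch below reads a file that has one line of free text before the header and one annotation column after the row names. The file layout, the "desc" column and the gene names are hypothetical, made up for this example, and the scuttle package is assumed to be installed.

```r
library(scuttle)

# Hypothetical layout: a free-text line, then the header, then the data.
# The second field of each data line is a non-numeric annotation.
annotfile <- tempfile()
writeLines(c(
    "exported by a hypothetical tool",
    "\tdesc\tA\tB",
    "g1\tx\t1\t0",
    "g2\ty\t0\t2"
), con=annotfile)

# ignore.row=1L drops the free-text line before the column names.
# skip.col=1L drops the annotation column after the row names.
annotated <- readSparseCounts(annotfile, ignore.row=1L, skip.col=1L)

dim(annotated)
rownames(annotated)
colnames(annotated)
```

Under these assumptions, only the numeric columns A and B should remain in the output, with g1 and g2 as row names.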

Value

A dgCMatrix containing double-precision values (usually counts) for each row (gene) and column (cell).
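The returned object can be inspected like any other matrix from the Matrix package. The following is a minimal sketch, assuming scuttle and Matrix are installed; it compares the sparse result against a dense read of the same illustrative file using read.table.

```r
library(scuttle)
library(Matrix)

# Illustrative toy table: 5 genes (rows) by 3 cells (columns).
valuefile <- tempfile()
write.table(data.frame(A=1:5, B=0, C=0:4, row.names=letters[1:5]),
    file=valuefile, col.names=NA, sep="\t", quote=FALSE)

sparse <- readSparseCounts(valuefile)

# Class of the result and storage type of its stored values.
class(sparse)
typeof(sparse@x)

# Dense read of the same file for comparison.
dense <- as.matrix(read.table(valuefile, header=TRUE, row.names=1, sep="\t"))

# Both representations should hold the same values.
all(as.matrix(sparse) == dense)
```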

Author(s)

Aaron Lun

Examples

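# Write a small dense table (5 genes by 3 cells) to a tab-delimited file.
# col.names=NA leaves a blank header field above the row names.
outfile <- tempfile()
write.table(data.frame(A=1:5, B=0, C=0:4, row.names=letters[1:5]), 
    file=outfile, col.names=NA, sep="\t", quote=FALSE)

# Read it back as a sparse matrix.
readSparseCounts(outfile)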
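
The file argument may also be a connection, and sep may be changed for other delimiters. The following is a minimal sketch of both, assuming scuttle is installed. The comma-delimited file is an illustrative choice, and the defensive close at the end is an assumption made here because the text above does not state whether a caller-supplied connection is closed by the function.

```r
library(scuttle)

# Same toy table as above, written as comma-delimited text.
csvfile <- tempfile(fileext=".csv")
write.table(data.frame(A=1:5, B=0, C=0:4, row.names=letters[1:5]),
    file=csvfile, col.names=NA, sep=",", quote=FALSE)

# Open a connection in read-only text mode and pass it in place of a path.
con <- file(csvfile, open="r")
fromcon <- readSparseCounts(con, sep=",")

# Close the connection if it is still open; errors are swallowed in case
# it has already been closed.
invisible(tryCatch(close(con), error=function(e) NULL))

fromcon
```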

LTLA/scuttle documentation built on Oct. 28, 2024, 9:45 a.m.