ReadVartrix: Load vartrix data

View source: R/ReadVartrix.R

ReadVartrix    R Documentation

Load vartrix data

Description

Reads the vartrix REF and ALT matrices and computes the frequency (FREQ) and consensus (VAR) matrices, then adds them to a genotype object that already contains the reference VCF data.

Usage

ReadVartrix(
  genotype,
  ref,
  alt,
  barcodes = c(),
  tolerance.percent = 5,
  strip.suffix = TRUE
)

Arguments

genotype

A genotype object, which contains the reference vcf file used to compute vartrix matrices.

ref

character(1), the file name for the reference allele count matrix.

alt

character(1), the file name for the alternative allele count matrix.

barcodes

character(n), either the name of the cell barcodes file or the vector of cell barcodes for which vartrix was run. The barcodes are used as column names of the matrices. Use c() for no names. Default: c().

tolerance.percent

numeric(1), the percentage of alt or ref counts at a single (cell, variant) pair that may disagree with the rest without affecting the call. Default: 5 (i.e. 5%).

strip.suffix

logical(1), whether the "-X" suffix should be stripped from the barcodes if it is the same for all barcodes. Default: TRUE.

Details

The FREQ sparse matrix is offset by 1 in order to efficiently distinguish zero-frequency (value 1) from missing data (value 0, not stored in the sparse matrix).
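
The offset encoding can be illustrated with a minimal base R sketch. It assumes that the frequency is the alt fraction ALT / (REF + ALT), which this page does not state explicitly, and it uses small dense matrices for readability. The helper names encode_freq and decode_freq are hypothetical and are not part of the package.

```r
# Hypothetical helpers illustrating the "+1" offset; not part of the package.
# Assumption: frequency = alt / (ref + alt).
encode_freq <- function(ref, alt) {
  total <- ref + alt
  freq <- matrix(0, nrow = nrow(ref), ncol = ncol(ref))  # 0 = missing data
  covered <- total > 0
  # Covered entries store frequency + 1, so a true frequency of 0 becomes 1.
  freq[covered] <- alt[covered] / total[covered] + 1
  freq
}

decode_freq <- function(freq) {
  # Undo the offset; entries equal to 0 had no reads and become NA.
  ifelse(freq > 0, freq - 1, NA)
}

ref <- matrix(c(3, 0, 0, 5), nrow = 2)  # rows = variants, columns = cells
alt <- matrix(c(1, 0, 4, 0), nrow = 2)

freq <- encode_freq(ref, alt)
decoded <- decode_freq(freq)
```

Here the entry with no reads stays at 0, while the entry with 5 ref reads and 0 alt reads is stored as 1. In a sparse matrix only the non-zero entries would take up storage.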

The consensus matrix makes a call for each (cell, variant) pair: no data (value 0, not stored in the sparse matrix), ref/ref (value 1), alt/alt (value 2) or ref/alt (value 3). Because scRNA-seq reads are likely to be slightly contaminated by extracellular RNA from dead cells or debris, the tolerance can be set above 0. For example, at a tolerance of 5%, up to 1 read in 20 can differ from the rest without switching the call to heterozygous. The original vartrix consensus calculation corresponds to a tolerance of 0.
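
The tolerance rule can be sketched in base R as follows. This is an illustration of the behavior described above under the assumption that the minority allele is ignored when its share of the reads is at most tolerance.percent; it is not the package's actual implementation, and the function name call_consensus is hypothetical.

```r
# Hypothetical sketch of a tolerance-based consensus call; not the package code.
# ref, alt: numeric vectors or matrices of read counts with the same shape.
call_consensus <- function(ref, alt, tolerance.percent = 5) {
  total <- ref + alt
  calls <- total
  calls[] <- 3                                          # default: ref/alt
  # Comparing counts * 100 avoids floating-point division at the threshold.
  calls[alt * 100 <= tolerance.percent * total] <- 1    # ref/ref
  calls[ref * 100 <= tolerance.percent * total] <- 2    # alt/alt
  calls[total == 0] <- 0                                # no data
  calls
}

ref <- c(20, 19, 10, 0, 1, 0)
alt <- c(0, 1, 10, 0, 19, 7)

calls_tol5 <- call_consensus(ref, alt, tolerance.percent = 5)
calls_tol0 <- call_consensus(ref, alt, tolerance.percent = 0)
```

With a tolerance of 5, the pairs with 19 ref and 1 alt read (and 1 ref and 19 alt reads) are still called homozygous. With a tolerance of 0, both become heterozygous.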

Equivalence between vartrix, human, and vcf genotype naming conventions:

"0" for "no call", "./."

"1" for "ref/ref", "0/0"

"2" for "alt/alt", "1/1"

"3" for "ref/alt", "0/1"

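The equivalence above can be written as a small lookup, for instance to relabel consensus values for plotting or export. This is a sketch only; the names genotype_labels and decode_consensus are hypothetical and are not part of the package.

```r
# Hypothetical lookup from vartrix consensus codes to other naming conventions.
genotype_labels <- data.frame(
  code  = c(0, 1, 2, 3),
  human = c("no call", "ref/ref", "alt/alt", "ref/alt"),
  vcf   = c("./.", "0/0", "1/1", "0/1"),
  stringsAsFactors = FALSE
)

decode_consensus <- function(codes, to = c("vcf", "human")) {
  to <- match.arg(to)
  # match() returns the row of each code; unknown codes give NA.
  genotype_labels[[to]][match(codes, genotype_labels$code)]
}

codes <- c(0, 1, 2, 3, 1)
vcf_calls <- decode_consensus(codes, to = "vcf")
human_calls <- decode_consensus(codes, to = "human")
```

Codes outside 0 to 3 map to NA rather than raising an error, which is a design choice of this sketch.
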
Value

Returns the genotype object with a populated vartrix slot, containing a list of sparse matrices (dgCMatrix) named REF, ALT, FREQ and VAR (the consensus). Rows are variants and columns are cells/barcodes.

Examples

MyGenotypes <- ReadVartrix(MyGenotypes, "vartrix_ref_matrix.mtx.gz", "vartrix_alt_matrix.mtx.gz", barcodes=MyCellBarcodes, tolerance.percent=2)

nbroguiere/burgertools documentation built on Jan. 30, 2024, 3:48 a.m.