CreateGenotypeObject: Create a genotype object

View source: R/CreateGenotypeObject.R

CreateGenotypeObject R Documentation

Create a genotype object

Description

Genotype objects are designed to handle the contents of (mouse/human) VCF files efficiently in R, i.e. variant metadata and several genotypes. They are also meant to be augmented with additional information such as annotations (e.g. VEP) and single-cell analysis statistics.

Usage

CreateGenotypeObject(
  matrix = NULL,
  metadata = NULL,
  vartrix = list(),
  variants = NULL,
  ...
)

Arguments

matrix

A dgCMatrix containing genotype data in vartrix conventions.

metadata

A data.frame containing variant metadata.

vartrix

A list of vartrix matrices.

variants

A character vector listing the variants described in the object, in the same order as the rows of the matrix and metadata slots. Leave NULL to use the rownames of the matrix and metadata, which should match.

variants_by_coverage

A character vector listing the variants sorted from max coverage (number of cells in which there is data) to min.

variants_by_information

A character vector listing the variants sorted from max information (excess entropy in single cell data) to min.

informative_variants

A character vector listing the most informative variants, used for downstream clustering analysis.

Details

For efficient handling, the genotypes are stored as a sparse numeric matrix. When creating the object with ReadVcf, the vartrix conventions (from 10X genomics) are used, namely:

0 for "no call" or in vcf "./."
1 for "ref/ref" or in vcf "0/0"
2 for "alt/alt" or in vcf "1/1"
3 for "ref/alt" or in vcf "0/1"
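As an illustration of this encoding, the sketch below recodes a small table of VCF genotype strings into the integer codes listed above and stores the result as a sparse matrix. It relies only on the Matrix package; the variant names, cell names, and toy calls are hypothetical, and this is not necessarily how ReadVcf performs the conversion internally.

```r
library(Matrix)

# Map VCF GT strings to the vartrix integer codes described above.
gt_codes <- c("./." = 0, "0/0" = 1, "1/1" = 2, "0/1" = 3)

# Hypothetical toy calls: 3 variants (rows) x 4 cells (columns).
gt <- matrix(
  c("./.", "0/0", "./.", "0/1",
    "1/1", "./.", "./.", "./.",
    "./.", "./.", "0/1", "0/0"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("var1", "var2", "var3"), paste0("cell", 1:4))
)

# Recode each string, keeping the shape and dimnames of the input.
codes <- matrix(
  unname(gt_codes[as.vector(gt)]),
  nrow = nrow(gt),
  dimnames = dimnames(gt)
)

# "No call" is coded 0, so it takes no storage in the sparse matrix.
m <- as(codes, "CsparseMatrix")
```

Because "no call" maps to 0, only the cells with an actual call are stored, which is what makes the sparse representation compact for single-cell data.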

The objects can instead be created manually with CreateGenotypeObject().

The genotype data is found in the matrix slot, and information concerning the variants is contained in the metadata slot. The list of variants currently in the object is found in the variants slot.

Additional slots can be populated with further data relevant to single-cell analysis: the most informative variants (informative_variants slot), and variants sorted by coverage (variants_by_coverage slot) or by entropy (variants_by_information slot).
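For instance, a coverage ranking of the kind stored in variants_by_coverage can be sketched directly from a genotype matrix, taking coverage to be the number of cells with a non-zero code. The toy matrix below is hypothetical, and the package may compute this ranking differently; the entropy-based ranking is not shown because its exact definition is not given here.

```r
library(Matrix)

# Hypothetical toy genotype matrix in vartrix encoding (variants x cells).
m <- sparseMatrix(
  i = c(1, 1, 2, 3, 3, 3),
  j = c(2, 4, 1, 1, 3, 4),
  x = c(1, 3, 2, 1, 3, 1),
  dims = c(3, 4),
  dimnames = list(c("var1", "var2", "var3"), paste0("cell", 1:4))
)

# Coverage of a variant: number of cells with a call (non-zero code).
coverage <- setNames(Matrix::rowSums(m != 0), rownames(m))

# Variant names ordered from max coverage to min.
variants_by_coverage <- rownames(m)[order(coverage, decreasing = TRUE)]
```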

Standard generic methods can be used on the genotype object. In particular, the object can be subset with object[i,j] to obtain a genotype object with the corresponding subset of variants and genotypes (with the full associated metadata, and restricted ranked variant lists). object[[i,j(,drop)]] gives read and write access to the genotype matrix, and rowSums, colSums, rownames, colnames, nrow, ncol, and dim are also available. The $ operator gives direct read and write access to metadata columns.
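A minimal usage sketch of these accessors follows, assuming the burgertools package is installed and that CreateGenotypeObject accepts a matrix and metadata with matching rownames as described under Arguments. The toy data and the metadata column name gene are hypothetical, and the calls are guarded so that the snippet does nothing when the package is unavailable.

```r
library(Matrix)

# Hypothetical toy inputs: 3 variants x 4 cells, vartrix encoding.
m <- sparseMatrix(
  i = c(1, 1, 2, 3, 3, 3),
  j = c(2, 4, 1, 1, 3, 4),
  x = c(1, 3, 2, 1, 3, 1),
  dims = c(3, 4),
  dimnames = list(c("var1", "var2", "var3"), paste0("cell", 1:4))
)

# Metadata rows must be in the same order as the matrix rows.
# The "gene" column is a made-up annotation for illustration only.
md <- data.frame(
  gene = c("GeneA", "GeneB", "GeneC"),
  row.names = rownames(m)
)

if (requireNamespace("burgertools", quietly = TRUE)) {
  # variants is left NULL so that the shared rownames are used.
  obj <- burgertools::CreateGenotypeObject(matrix = m, metadata = md)

  dim(obj)              # dimensions of the genotype matrix
  sub <- obj[1:2, 1:3]  # genotype object restricted to 2 variants, 3 cells
  obj$gene              # read a metadata column
}
```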

Value

Returns a genotype object.


nbroguiere/burgertools documentation built on Jan. 30, 2024, 3:48 a.m.