genotype-class | R Documentation |
Genotype objects are designed to handle efficiently, in R, the contents of (mouse/human) VCF files, i.e. variant metadata and several genotypes. They can also be augmented with additional information such as annotations (e.g. VEP) and single-cell analysis statistics.
## S4 method for signature 'genotype'
show(object)
## S4 method for signature 'genotype,ANY,ANY,ANY'
x[i, j, ..., drop = F]
## S4 replacement method for signature 'genotype,ANY,ANY,ANY'
x[i, j, ...] <- value
## S4 method for signature 'genotype'
x$name
## S4 method for signature 'genotype,ANY,ANY'
x[[i, j, ...]]
## S4 replacement method for signature 'genotype'
x$name <- value
## S4 method for signature 'genotype'
colnames(x)
## S4 replacement method for signature 'genotype'
colnames(x) <- value
## S4 method for signature 'genotype'
rownames(x)
## S4 replacement method for signature 'genotype'
rownames(x) <- value
## S4 method for signature 'genotype'
nrow(x)
## S4 method for signature 'genotype'
ncol(x)
## S4 method for signature 'genotype'
dim(x)
## S4 method for signature 'genotype'
rowSums(x)
## S4 method for signature 'genotype'
colSums(x)
For efficient handling, the genotypes are stored as a sparse numeric matrix. When the object is created with ReadVcf, the vartrix conventions (from 10X Genomics) are used, namely:
0 for "no call" (in VCF: "./.")
1 for "ref/ref" (in VCF: "0/0")
2 for "alt/alt" (in VCF: "1/1")
3 for "ref/alt" (in VCF: "0/1")
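As a rough illustration of this encoding, the sketch below maps VCF-style GT strings to vartrix codes and stores them in a sparse matrix. The helper gt_to_vartrix() and the toy data are hypothetical and not part of the package. Treating phased calls ("0|1") like unphased ones and mapping unrecognized strings to 0 are assumptions made here, not documented behavior of ReadVcf.

```r
library(Matrix)

# Hypothetical helper (not a package function): VCF GT strings -> vartrix codes.
gt_to_vartrix <- function(gt) {
  gt <- gsub("|", "/", as.vector(gt), fixed = TRUE)  # assumption: ignore phasing
  codes <- c("./." = 0, "0/0" = 1, "1/1" = 2, "0/1" = 3, "1/0" = 3)
  out <- unname(codes[gt])
  out[is.na(out)] <- 0  # assumption: unrecognized strings count as "no call"
  out
}

# Toy data: 2 variants (rows) x 3 genotypes (columns).
gt <- matrix(c("0/0", "./.", "0/1",
               "1/1", "./.", "./."),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("var1", "var2"), c("cellA", "cellB", "cellC")))

# Recode, then store sparsely: "no call" entries (0) take no storage.
codes <- matrix(gt_to_vartrix(gt), nrow = nrow(gt), dimnames = dimnames(gt))
sparse_gt <- Matrix(codes, sparse = TRUE)
```

Because "no call" is coded as 0, the sparse representation stores only the entries for which a genotype is available, which is typically a small fraction of a single-cell matrix.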
The objects can also be created manually, typically with CreateGenotypeObject().
The genotype data is found in the matrix slot, and information concerning the variants is contained in the metadata slot. The list of variants currently in the object is found in the variants slot.
Further slots can hold data relevant to single-cell analysis: the most informative variants (informative_variants slot), and variants sorted by coverage (variants_by_coverage slot) or by entropy (variants_by_information slot).
Standard generic methods can be used on the genotype object. In particular, the object can be subset with object[[i,j]] to obtain a genotype object with the corresponding subset of variants and genotypes (with the full associated metadata, and ranked variant lists restricted accordingly). object[i,j(,drop)] gives read and write access to the genotype matrix; rowSums, colSums, rownames, colnames, nrow, ncol, and dim are also available. The $ operator gives direct read and write access to metadata columns.
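The division of labor between the accessors documented on this page (`[` for matrix values, `[[` for subsetting the object, `$` for metadata columns) can be illustrated with a toy S4 class. This is only a sketch: the class toyGenotype and its methods are hypothetical stand-ins written for this example, not the package's implementation, and they cover read access only.

```r
library(Matrix)

# Toy class for illustration only; NOT the package's 'genotype' class.
setClass("toyGenotype",
         slots = c(matrix = "dgCMatrix", metadata = "data.frame",
                   variants = "character"))

# `[` reads values from the genotype matrix.
setMethod("[", "toyGenotype", function(x, i, j, ..., drop = FALSE) {
  x@matrix[i, j, drop = drop]
})

# `[[` returns a smaller object, keeping matrix, metadata and variants in sync.
setMethod("[[", "toyGenotype", function(x, i, j, ...) {
  m <- x@matrix[i, j, drop = FALSE]
  new("toyGenotype",
      matrix = m,
      metadata = x@metadata[rownames(m), , drop = FALSE],
      variants = rownames(m))
})

# `$` reads a metadata column.
setMethod("$", "toyGenotype", function(x, name) x@metadata[[name]])

# Toy data: 3 variants x 2 genotypes, vartrix-coded.
m <- Matrix(matrix(c(1, 0, 3, 2, 0, 0), nrow = 3,
                   dimnames = list(c("v1", "v2", "v3"), c("c1", "c2"))),
            sparse = TRUE)
meta <- data.frame(CHROM = c("chr1", "chr1", "chr2"),
                   row.names = c("v1", "v2", "v3"))
g <- new("toyGenotype", matrix = m, metadata = meta, variants = rownames(m))

vals <- g[1:2, 1:2]    # a sparse matrix of genotype codes
sub  <- g[[1:2, 1:2]]  # a toyGenotype restricted to v1 and v2
chr  <- g$CHROM        # a metadata column
```

Keeping `[` for the matrix and `[[` for the object means that quick inspection of genotype codes never has to carry the metadata along, while object subsetting always keeps the slots consistent.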
show(genotype)
: Show a summary of the contents of a genotype object.
x[i, j, ..., drop = F]
: Access matrix values in a genotype object.
`[`(x = genotype, i = ANY, j = ANY) <- value
: Assign genotype matrix values in a genotype object.
$
: Access metadata columns in a genotype object.
x[[i, j, ...]]
: Subset a genotype object.
`$`(genotype) <- value
: Assign values to genotype object metadata columns.
colnames(genotype)
: Retrieve column names (i.e. genotype names) from a genotype object.
colnames(genotype) <- value
: Assign column names (i.e. genotype names) to a genotype object.
rownames(genotype)
: Retrieve row names (i.e. variant names) from a genotype object.
rownames(genotype) <- value
: Assign row names (i.e. variant names) to a genotype object.
nrow(genotype)
: Retrieve the number of rows (i.e. number of variants) from a genotype object.
ncol(genotype)
: Retrieve the number of columns (i.e. number of genotypes) from a genotype object.
dim(genotype)
: Retrieve the dimension, i.e. number of variants and genotypes, from a genotype object.
rowSums(genotype)
: Compute the number of genotypes which cover a given variant, from a genotype object.
colSums(genotype)
: Compute the number of variants covered in each genotype, from a genotype object.
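The coverage counts described for rowSums and colSums can be reproduced on a plain sparse matrix, under the assumption that "covered" means a non-zero vartrix code (i.e. anything other than "no call"). The package's own methods may be implemented differently; the toy matrix below is made up for illustration.

```r
library(Matrix)

# Toy vartrix-coded matrix: 3 variants (rows) x 4 genotypes (columns); 0 = no call.
m <- Matrix(matrix(c(1, 0, 3, 0,
                     0, 0, 2, 0,
                     3, 1, 1, 2),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("var1", "var2", "var3"),
                                   c("g1", "g2", "g3", "g4"))),
            sparse = TRUE)

called <- m != 0                               # TRUE wherever there is a call
variant_coverage  <- Matrix::rowSums(called)   # genotypes covering each variant
genotype_coverage <- Matrix::colSums(called)   # variants covered in each genotype
```

Counting non-zero entries, rather than summing the codes themselves, avoids weighting "alt/alt" (2) and "ref/alt" (3) calls more heavily than "ref/ref" (1).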
matrix
A dgCMatrix containing genotype data in vartrix conventions.
metadata
A data.frame containing variant metadata.
variants
A character vector listing the variants described in the object, in the same order as the rows of the matrix and metadata slots.
vartrix
A list of vartrix matrices.
variants_by_coverage
A character vector listing the variants sorted from max coverage (number of cells in which there is data) to min.
variants_by_information
A character vector listing the variants sorted from max information (excess entropy in single cell data) to min.
informative_variants
A character vector listing the most informative variants, used for downstream clustering analysis.
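A ranking like the one held in the variants_by_coverage slot can be sketched as follows, assuming that coverage is the number of cells with a non-zero vartrix code. The toy matrix is made up, and this is not necessarily how the package fills the slot. The entropy-based ranking is not sketched because this page does not define the measure.

```r
library(Matrix)

# Toy vartrix-coded matrix: 3 variants (rows) x 4 cells (columns); 0 = no call.
m <- Matrix(matrix(c(0, 0, 1, 0,
                     3, 2, 1, 1,
                     0, 2, 3, 0),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("v1", "v2", "v3"),
                                   c("c1", "c2", "c3", "c4"))),
            sparse = TRUE)

coverage <- Matrix::rowSums(m != 0)  # cells with data, per variant
ranked <- rownames(m)[order(coverage, decreasing = TRUE)]  # max to min coverage
```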
genotypes <- ReadVcf("Myfolder/MyVcf.vcf")
genotypes # Preview the contents of the genotype object
genotypes[1:3,1:6] # Check the genotypes matrix
genotypes[1:2,1:5] <- 2 # Modify the genotypes matrix
head(genotypes@metadata) # Check the genotypes metadata
genotypes$CHROM[1:5] <- "chrZ" # Modify a metadata column
genotypes$CHROM[1:10] # Retrieve elements of a metadata column
genotypes$CHROM <- NULL # Delete a metadata column
colnames(genotypes)[1:3] <- c("a","b","c") # Rename some genotypes
colnames(genotypes) # Access genotype names
rownames(genotypes)[1:2] <- c("a","b") # Rename some variants
nrow(genotypes) # Get the number of variants
ncol(genotypes) # Get the number of genotypes
dim(genotypes) # Get both
genotypes@informative_variants <- genotypes@variants[1:20] # Set the informative variants list
genotypes[[1:10,1:5]] # Subset the genotype object
genotypes[[rownames(genotypes)[1:5],colnames(genotypes)[1:5]]] # Subset the object by variant and genotype name (returns a genotype object)
genotypes[rownames(genotypes)[1:5],colnames(genotypes)[1:5]] # Access the genotype matrix by variant and genotype name