cnr R Documentation

cnr: copy number, rounded

Description

The cnr object is a list built around four relational matrices (bins, genes, annotation, and qc), together with the chromInfo and gene.index tables. The structure is inspired by Scanpy's AnnData, which cleverly integrates complex data into a simple architecture.

The object also stores the processing results required for data exploration, visualization, and genetic analyses: specifically, the distance matrix, phylogenetic analysis, pseudobulk, and heterogeneity analysis.
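As a rough illustration of how these pieces relate, the sketch below builds a toy list that reuses the component names listed under Format. All values are invented, and the column names inside Y, qc, and chromInfo are assumptions made for the example; the real object is loaded with data(cnr).

```r
# Toy stand-in for a cnr-like list; all values are invented for illustration.
set.seed(1)
n_bins  <- 6
n_cells <- 4
cells   <- paste0("cell", seq_len(n_cells))

# X: integer matrix of bins x cells
X <- matrix(sample(0:4, n_bins * n_cells, replace = TRUE),
            nrow = n_bins, ncol = n_cells,
            dimnames = list(NULL, cells))

# Y and qc: one row per cell (column names here are hypothetical)
Y  <- data.frame(cellID = cells, sample = c("A", "A", "B", "B"))
qc <- data.frame(cellID = cells,
                 ReadsKept = c(1.2e6, 9.5e5, 3.0e5, 1.1e6),
                 qc.status = c("PASS", "PASS", "FAIL", "PASS"))

# chromInfo: one row per bin, in genome order (column names are hypothetical)
chromInfo <- data.frame(chrom = rep(c("1", "2"), each = 3),
                        bin.start = rep(c(1, 1001, 2001), 2),
                        bin.end = rep(c(1000, 2000, 3000), 2))

toy <- list(X = X, Y = Y, qc = qc, chromInfo = chromInfo,
            cells = cells, bulk = FALSE)

# The components stay aligned through shared cell order and bin order
stopifnot(ncol(toy$X) == nrow(toy$Y),
          ncol(toy$X) == nrow(toy$qc),
          nrow(toy$X) == nrow(toy$chromInfo))

# Example of a relational lookup: keep only cells that passed QC
pass   <- toy$qc$qc.status == "PASS"
X_pass <- toy$X[, pass, drop = FALSE]
dim(X_pass)
```

Keeping cells in the same order across X, Y, and qc is what lets a logical vector computed on one table subset another.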

Usage

data(cnr)

Format

An object of class list containing a rounded CNR

  • X, an integer matrix of bins x n.cells containing copy number estimates for each bin[i] and cell[j], where bins represent a common genomic segment across all cells (either fixed width, variable binning, or .seg data). This data can be constructed using variable-length bins and CBS (the Varbin algorithm; Baslan et al. 2012), as implemented in Ginkgo, or from hmmCopy. These are upstream analyses to the package.

    For mutations in single cells, the cnq can be a binary incidence (0,1) matrix representing the presence or absence of specific mutations, or a ternary (0,1,2) matrix representing genotypes as the number of alternate allele copies.

    We noticed that estimates for deletions and losses don't follow standard numeric rounding. We set thresholds of < 0.2 (the average quantal estimate of the Y chromosome in females) for deletions (i.e. 0 copies); between 0.2 and 1.2 for losses (i.e. 1 copy); and between 1.2 and 2.5 for 2 copies. Everything else follows standard numeric rounding.

  • genes, gene copy number interpolation from bins. The genes matrix is an interpolated, transposed, expansion of bins. The expansion is constructed internally using the expand2genes function.

  • Y, phenotypic data of single cells; contains cells as rows and phenotypes as columns. Phenotypes can be information about individual samples or, if same-cell methods were used, the RNA expression from the same cell.

  • qc, quality control metrics. This matrix contains additional technical metadata, e.g. number of reads, MAPD estimates, and the PASS/FAIL qc.status for individual cells. It contains cells as rows and metadata as columns.

  • chromInfo, ordered chromosome information for the bins

  • gene.index, table to map bins to genes

  • cells, a list of cells

  • bulk, logical, whether the data is bulk DNA or single cells. If TRUE, data will not be rounded and is assumed to be log-ratio data. If FALSE, data is treated as single-cell and copy numbers are treated as integers: estimates are rounded to the nearest integer (see above). ...
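The rounding thresholds described for X can be sketched as a small function. This is a minimal illustration, not the package's implementation: the name round_cn_sketch is hypothetical, and treating each interval as closed on the left (for example, exactly 0.2 counts as a loss) is an assumption, since the text does not say how boundary values are handled.

```r
# Hypothetical helper: apply the thresholds described above to quantal
# copy number estimates. Left-closed interval boundaries are an assumption.
round_cn_sketch <- function(x) {
  out <- round(x)                 # default; note R's round() rounds half to even
  out[x < 0.2] <- 0               # deletions (0 copies)
  out[x >= 0.2 & x < 1.2] <- 1    # losses (1 copy)
  out[x >= 1.2 & x < 2.5] <- 2    # 2 copies
  storage.mode(out) <- "integer"  # keep matrix dimensions, store as integer
  out
}

# Invented estimates for two cells across four bins
est <- matrix(c(0.05, 0.3, 1.1, 1.3,
                2.4,  2.7, 3.6, 5.2),
              nrow = 4,
              dimnames = list(NULL, c("cell1", "cell2")))

rounded <- round_cn_sketch(est)
rounded
```

The difference from plain round() shows up at the low end: round(0.3) is 0 and round(1.3) is 1, whereas the thresholds above assign 1 and 2 copies respectively.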

Value

  • cdb, pairwise cell dissimilarity using Bray-Curtis

  • hcdb, hierarchical clustering of cells based on cdb

  • phylo, cell phylogenetic tree. Analysis is produced with ape. Default is "balanced minimum evolution"

  • tree.height, height cutoff of the tree, set as intersection between the total number of multi-cell clusters and one-cell clusters

  • ccp, output from ConsensusClusterPlus. The first element is a color map. Elements 2:(40) each contain five outputs: consensusMatrix, consensusTree, consensusClass, ml, and clrs. Each element corresponds to k in k-means clustering

  • kStats, spectral analysis of consensus clustering

  • eigenVals, spectral analysis eigenvalues

  • optK, optimum k-parameter (kCC) and stable K (sK)

  • cluster_heterogeneity, summary metrics of clones/clusters

  • uclust, unique clusters with 3 or more cells

  • DDRC.df, bin pseudobulk profiles of each clone (or other chosen group)

  • DDRC.g, gene pseudobulk profiles

  • vdj.cells, list of vdj.cells produced by genotype_vdj

  • DDRC.dist, Bray-Curtis dissimilarity of clones

  • DDRC.phylo, phylogenetic analysis of clones
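To make the relationship between cdb, hcdb, and the pseudobulk profiles more concrete, here is a minimal base R sketch on invented data. It is not the package's code: the function name bray_curtis_sketch is hypothetical, and the average linkage, the choice of k = 2, and the per-bin median used for the pseudobulk are all assumptions made for illustration.

```r
# Bray-Curtis dissimilarity between cells (columns) of a bins x cells matrix:
# sum(|x - y|) / sum(x + y). Hypothetical helper written in base R.
bray_curtis_sketch <- function(X) {
  n <- ncol(X)
  d <- matrix(0, n, n, dimnames = list(colnames(X), colnames(X)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(X[, i] - X[, j])) / sum(X[, i] + X[, j])
    }
  }
  as.dist(d)
}

# Invented copy numbers: 6 bins x 4 cells, two obvious groups
X <- cbind(cell1 = c(2, 2, 1, 2, 3, 2),
           cell2 = c(2, 2, 1, 2, 3, 2),
           cell3 = c(2, 4, 2, 2, 2, 0),
           cell4 = c(2, 4, 2, 2, 2, 1))

cdb   <- bray_curtis_sketch(X)             # pairwise cell dissimilarity
hcdb  <- hclust(cdb, method = "average")   # linkage method is an assumption
clone <- cutree(hcdb, k = 2)               # k chosen by hand for the toy data

# Pseudobulk per clone: per-bin median across member cells (assumed summary)
pseudobulk <- sapply(split(colnames(X), clone),
                     function(ids) apply(X[, ids, drop = FALSE], 1, median))
pseudobulk
```

For real data, an established implementation such as vegan::vegdist(method = "bray") computes the same quantity; it works on rows, so a bins x cells matrix would need to be transposed first.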

Source

https://github.com/SingerLab/gac


SingerLab/gac documentation built on March 23, 2024, 5:15 a.m.