isoDeconvMM: Cell Type Deconvolution using RNA Isoform-Level Expression
In hheiling/IsoDeconvMM: Performs Cell Type Deconvolution using Differential Isoform Information

Calculates the proportions of pure cell type components in heterogeneous cell type samples of RNA-seq data, utilizing isoform-level expression differences.

IsoDeconvMM(directory = NULL, mix_files, pure_ref_files, fraglens_files,
  bedFile, knownIsoforms, discrim_genes, readLen, lmax = 600,
  eLenMin = 1, mix_names = NULL, initPts = NULL,
  optim_options = optimControl())

`directory`	an optional character string denoting the path to the directory where all of the mix_files, pure_ref_files, fraglens_files, and bedFile are located. The working directory is set to this directory. If this argument is left as `NULL`, then all of the relevant files must either (a) be located in the current working directory or (b) have their full path specified.
`mix_files`	a vector of the file names for the text files recording the number of RNA-seq fragments per exon set. Each file should have 2 columns, "count" and "exons", without a header. For example, three rows of such a file: `37 chr18_109|ENSMUSG00000024491|4;chr18_109|ENSMUSG00000024491|5;`, `17 chr18_109|ENSMUSG00000024491|5;`, and `88 chr18_109|ENSMUSG00000024491|5;chr18_109|ENSMUSG00000024491|6;`. There should be one file for each of the samples containing mixtures of cells. The second column lists exon sets, where "chr18_109" indicates a transcript cluster, "ENSMUSG00000024491" is the Ensembl gene ID, and the number at the end is the exon ID. Directions to create these count files can be found in the Step_0_Processes directory of the GitHub repo hheiling/deconvolution, <https://github.com/hheiling/deconvolution>.
`pure_ref_files`	a two-column matrix. The first column contains the file names for the text files recording the number of RNA-seq fragments per exon set (see 'mix_files' for additional description), one for each of the pure reference cell type samples. The second column contains the character names of the pure cell type associated with each sample. See the Step_0_Processes directory in <https://github.com/hheiling/deconvolution> for directions on how to create these files.
`fraglens_files`	a vector of the file names for the text files recording the distribution of the fragment lengths. Each file should have 2 columns, "Frequency" and "Length", without a header. For example, five rows of such a file: `20546 75`, `40465 76`, `37486 77`, `27533 78`, and `25344 79`. Directions to create these fragment length files are also available in the Step_0_Processes directory in the GitHub repo hheiling/deconvolution, <https://github.com/hheiling/deconvolution>.
`bedFile`	file name of the .bed file recording information on non-overlapping exons, which has 6 columns, "chr", "start", "end", "exon", "score", and "strand", without a header. For example, two rows of such a file: `chr1 3044314 3044814 ENSMUSG00000090025:1 666 +` and `chr1 3092097 3092206 ENSMUSG00000064842:1 666 +`. Directions to create this .bed file can be found in the Create_BED_knownIsoforms_Files directory in the GitHub repo hheiling/deconvolution, <https://github.com/hheiling/deconvolution>.
`knownIsoforms`	character string for the name of an .RData object that contains the known isoform information. When loaded, this object is a list where each component is a binary matrix that specifies a set of possible isoforms (e.g., isoforms from annotations). Specifically, it is a binary matrix of k rows and m columns, where k is the number of non-overlapping exons and m is the number of isoforms. `isoforms[i,j] = 1` indicates that the i-th exon belongs to the j-th isoform. For example, the following matrix describes the three isoforms of one gene, ENSMUSG00000000003. Its columns are `ENSMUST00000000003`, `ENSMUST00000166366`, and `ENSMUST00000114041`, and its rows are `[1,] 1 1 1`, `[2,] 1 1 1`, `[3,] 1 1 1`, `[4,] 1 1 0`, `[5,] 1 1 1`, `[6,] 1 1 1`, `[7,] 1 1 1`, and `[8,] 1 0 0`. Instructions for creating such an RData object can be found in the Create_BED_knownIsoforms_Files directory in the GitHub repo hheiling/deconvolution, <https://github.com/hheiling/deconvolution>.
`discrim_genes`	vector of genes that are suspected to have differential gene expression. This gene list could come from Cufflinks output, `isoform` package output, or something similar.
`readLen`	numeric value of the length of a read in the RNA-seq experiment
`lmax`	numeric value of the maximum fragment length of the experiment
`eLenMin`	numeric value of the minimum effective length. If the effective length of an exon or exon junction is smaller than eLenMin, i.e., if this exon is not included in the corresponding isoform, it is set to eLenMin. This accounts for possible sequencing or mapping errors.
`mix_names`	an optional vector of the desired nicknames of the mixture samples, given in the same order as the mix_files vector. If left as the default `NULL` value, the nicknames used will be the names given in mix_files minus the .txt extension.
`initPts`	an optional matrix of initial probability estimates for the cell composition of the mixture samples, to be used in the optimization procedure. The matrix should have J columns, where J is the number of pure cell types of interest. Each row corresponds to a different combination of initial probability values. The column names of the matrix must be provided and must correspond to the pure cell type names given in the second column of the pure_ref_files object (no particular ordering needed).
`optim_options`	a list inheriting from class `optimControl` containing optimization control parameters. See the function `optimControl` for more details.
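
The layout of `pure_ref_files` and `initPts` described above can be sketched in R as follows. This is only an illustration: every file name and cell type label is hypothetical, and the requirement that each row of `initPts` sums to 1 is an assumption based on the rows being described as probability estimates, not something the documentation states.

```r
# Hypothetical count file names for two mixture samples.
mix_files <- c("mix_sample1_counts.txt", "mix_sample2_counts.txt")

# pure_ref_files: column 1 = count file names, column 2 = cell type labels.
# File names and labels are placeholders.
pure_ref_files <- cbind(
  c("cellA_rep1_counts.txt", "cellA_rep2_counts.txt",
    "cellB_rep1_counts.txt", "cellB_rep2_counts.txt"),
  c("cellA", "cellA", "cellB", "cellB")
)

# The J pure cell types are the distinct labels in the second column.
cell_types <- unique(pure_ref_files[, 2])

# initPts: one row per starting point, one named column per cell type.
initPts <- cbind(c(0.25, 0.50, 0.75),
                 c(0.75, 0.50, 0.25))
colnames(initPts) <- cell_types

# Sanity checks on the structure before passing these to IsoDeconvMM().
stopifnot(
  is.character(pure_ref_files),
  ncol(pure_ref_files) == 2,
  ncol(initPts) == length(cell_types),
  setequal(colnames(initPts), cell_types),
  all(abs(rowSums(initPts) - 1) < 1e-8)  # assumption: rows are proportions
)

# These objects would then be supplied as, for example:
# IsoDeconvMM(mix_files = mix_files, pure_ref_files = pure_ref_files,
#             initPts = initPts, ...)  # remaining arguments as described above
```

Deriving the column names of `initPts` from the second column of `pure_ref_files` keeps the two objects consistent, which is the matching the argument description asks for.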

A list object with the following structure: the first layer of the list has elements associated with each of the mixture samples; the second layer of the list has elements associated with each transcript cluster used in the analysis, determined by the genes in the discrim_genes vector. Each of these transcript cluster elements is itself a list with the following elements:

`info`
`candiIsoform`
`I`	Number of isoforms utilized in transcript cluster
`E`	Number of exons in transcript cluster
`X`	ExI matrix of effective lengths for each of the E exon sets within each of the I isoforms
`info_status`
`y_mix, other y vectors for each pure cell type reference sample`	Ex1 vectors of read counts at each exon set for the given mixture or pure cell type sample
`countN_mix, other countN values for each pure cell type reference sample`
`mix`	a list with the elements rds_exons_t (vector of length E+1 where the last E elements are y_mix, and the first element is the total read counts for the mixture sample minus the sum of y_mix), gamma.est ((I-1)xK matrix of isoform expression parameters for each cell type k), tau.est (vector of length K of gene expression parameters in cell type k), p.est (vector of length K containing estimated proportions based on the given transcript cluster), and pm.rds.exons (ExK matrix containing posterior means for each of E exon sets in each of K cell types)
`"cellType1","cellType2" ...`
`l_tilde`	Ix1 vector of total effective lengths of each of the I isoforms. Each element of the vector, denoted l_i, is a column sum from the matrix `X`
`X.fin`	edited design matrix for new gamma parameters, where the ith column of the new matrix is `X.fin_i = (X_i-[l_i/l_I]X_I)` for i = 1,...,(I-1) and `X.fin_I = X_I/l_I`
`X.prime`	first (I-1) columns of X.fin, pertaining to the gamma parameters
`alpha.est`	IxK hyperparameters governing average isoform expression levels and variances within cells of type k
`beta.est`	2xK hyperparameters governing gene expression levels within cells of type k
`CellType_Order`	For outputs giving K different estimates for each of the K cell types, these outputs are ordered with respect to CellType_Order
`WARN`	An integer indicating the following information: `0` - Optimization Complete; `1` - Iteration Limit Reached; `4` - Error in Optimization Routine (Error in mixture sample fit); `5` - Optimization not conducted (Error in pure sample fit)
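
The relationship between `X`, `l_tilde`, `X.fin`, and `X.prime` described above can be worked through on a small example. The sketch below applies the stated formulas to a made-up matrix of effective lengths; the numbers are purely illustrative and the code is not taken from the package source.

```r
# Toy E x I matrix of effective lengths (E = 3 exon sets, I = 3 isoforms).
# matrix() fills column by column, so each group of three is one isoform.
X <- matrix(c(100, 200,   0,
              100,   0, 300,
               50, 150, 150),
            nrow = 3, ncol = 3)

n_iso <- ncol(X)        # I in the notation above
l_tilde <- colSums(X)   # total effective length l_i of each isoform

# X.fin_i = X_i - (l_i / l_I) * X_I  for i = 1, ..., I - 1
# X.fin_I = X_I / l_I
X.fin <- X
for (i in seq_len(n_iso - 1)) {
  X.fin[, i] <- X[, i] - (l_tilde[i] / l_tilde[n_iso]) * X[, n_iso]
}
X.fin[, n_iso] <- X[, n_iso] / l_tilde[n_iso]

# X.prime: the first I - 1 columns of X.fin.
X.prime <- X.fin[, seq_len(n_iso - 1), drop = FALSE]

print(l_tilde)
print(round(X.fin, 4))
```

One consequence of these formulas is that each of the first I-1 columns of `X.fin` sums to zero, since l_i - (l_i/l_I) l_I = 0, while the last column sums to one.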

hheiling/IsoDeconvMM documentation built on March 11, 2020, 7:28 p.m.
