diffExpressedVariants | R Documentation |
Function that retrieves condition-specific variants in RNA-seq data.
diffExpressedVariants(countsData, conditions, pvalue = 1,
filterLowCountsVariants = 10, flagLowCountsConditions = 10,
technicalReplicates = FALSE,
nbCore = 1)
countsData |
a data frame containing the counts in the appropriate format (see Details below). |
conditions |
a character vector containing the experimental conditions. |
pvalue |
a numerical value indicating the p-value threshold below which the events will be kept in the final data frame. |
filterLowCountsVariants |
a numerical value indicating the global variant count value (see Details below) below which events are filtered out in order to increase the statistical power of the analysis. Both variants must have a read coverage below this value for the event to be removed. This filter is applied after the normalization and the overdispersion estimation. |
flagLowCountsConditions |
a numerical value indicating the global condition count value (see Details below) below which events are flagged as 'lowCounts' in the final data frame. At least n-1 conditions (out of n conditions) must have low counts for the event to be flagged as 'lowCounts'. |
technicalReplicates |
a boolean value indicating if the counts in
|
nbCore |
an integer indicating the number of cores to use for the model fitting step. |
The countsData data frame must be formatted as follows:
Column 1: names of the events
Column 2: lengths (in bp) of the variants
Columns 3 to n: counts of the variant for each replicate of each experimental condition
Each row corresponds to one variant, so an event corresponds to two rows:
the longest variant (or inclusion variant) in the first row (denoted as
upper path: UP) and the shortest variant (or exclusion variant) in the second
row (denoted as lower path: LP).
This data frame can be obtained using the kissplice2counts function.
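As an illustration of the layout described above, the following sketch builds a toy countsData data frame by hand in base R and checks its structure. The column names and all count values are made up for the example; only the column order and the UP/LP row pairing follow the description.

```r
# Toy countsData following the documented layout (all values are invented).
# Column names are hypothetical; only the column order matters here.
countsData <- data.frame(
  events.names  = rep(c("event1", "event2"), each = 2),
  events.length = c(250, 100, 310, 120),  # UP row (longest) first, then LP
  C1_r1 = c(40, 12, 3, 2),
  C1_r2 = c(35, 15, 1, 4),
  C2_r1 = c(10, 50, 2, 0),
  C2_r2 = c(8, 46, 0, 3),
  stringsAsFactors = FALSE
)

# One condition label per count column (columns 3 to n).
conditions <- c("C1", "C1", "C2", "C2")

# Structural checks: two rows per event, UP listed before LP,
# and as many condition labels as count columns.
upRows <- seq(1, nrow(countsData), by = 2)
lpRows <- upRows + 1
stopifnot(
  nrow(countsData) %% 2 == 0,
  all(countsData$events.names[upRows] == countsData$events.names[lpRows]),
  all(countsData$events.length[upRows] >= countsData$events.length[lpRows]),
  length(conditions) == ncol(countsData) - 2
)
```

Checks of this kind can be useful when the table is assembled by hand rather than produced by kissplice2counts.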
The global variant count is the minimal number of reads that cover one or the
other variant across all the replicates (sum by variant).
The global condition count is the minimal number of reads that cover one or
the other condition (sum by replicates for each condition).
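The two counts can be sketched in base R as follows. This is one reading of the definitions above, not the package's internal code: it works on raw, made-up counts (the documented filter is applied after normalization), it assumes that the reads of both variants are pooled when summing by condition, and the names `counts`, `thr`, `filteredOut` and `lowCounts` are hypothetical.

```r
# Toy counts: rows alternate UP/LP for event1 and event2; columns are replicates.
counts <- matrix(
  c(40, 35, 10,  8,   # event1 UP
    12, 15, 50, 46,   # event1 LP
     3,  1,  2,  0,   # event2 UP
     2,  4,  0,  3),  # event2 LP
  nrow = 4, byrow = TRUE
)
conditions <- c("C1", "C1", "C2", "C2")
thr <- 10  # plays the role of both thresholds in this sketch

upRows <- seq(1, nrow(counts), by = 2)
lpRows <- upRows + 1

# Sum by variant across all replicates.
upSum <- rowSums(counts[upRows, , drop = FALSE])
lpSum <- rowSums(counts[lpRows, , drop = FALSE])

# An event is removed only if both variants fall below the threshold.
filteredOut <- upSum < thr & lpSum < thr

# Sum by condition over replicates (assumption: UP and LP reads are pooled).
eventCounts <- counts[upRows, , drop = FALSE] + counts[lpRows, , drop = FALSE]
condSums <- t(rowsum(t(eventCounts), group = conditions))

# An event is flagged when at least n-1 of the n conditions are below the threshold.
nCond <- ncol(condSums)
lowCounts <- rowSums(condSums < thr) >= nCond - 1
```

With these toy values, only the second event has both variant sums below the threshold, and it is also the only one flagged as having low counts.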
diffExpressedVariants returns a list of 6 objects:
finalTable |
a data frame containing the columns
|
correctedPval |
a numeric vector containing p-values after correction for multiple testing |
uncorrectedPVal |
a numeric vector containing p-values before correction for multiple testing |
resultFitNBglmModel |
a data frame containing the results of the fitting of the model to the data |
f/psiTable |
a data frame containing the allele frequency (f)/Percent Spliced In (PSI) of each replicate |
k2rgFile |
a string containing either the |
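Because f/psiTable is not a syntactic R name, it has to be accessed with `[[ ]]` or with backticks. The sketch below shows this on a stand-in list that only reuses the element names listed above; its contents are empty placeholders, not real output of diffExpressedVariants.

```r
# Stand-in for the returned list: documented element names, placeholder contents.
res <- list(
  finalTable          = data.frame(),
  correctedPval       = numeric(0),
  uncorrectedPVal     = numeric(0),
  resultFitNBglmModel = data.frame(),
  `f/psiTable`        = data.frame(),
  k2rgFile            = ""
)

# Regular names work with `$`; the non-syntactic one needs [[ ]] or backticks.
final <- res$finalTable
psi   <- res[["f/psiTable"]]
same  <- identical(psi, res$`f/psiTable`)
```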
Lopez-Maestre et al., 2016. SNP calling from RNA-seq data without a reference genome: identification, quantification, differential analysis and impact on the protein sequence. Nucleic Acids Research, 44(19):e148. doi:10.1093/nar/gkw655
fpath1 <- system.file("extdata", "output_kissplice_SNV.fa", package = "kissDE")
mySNVcounts <- kissplice2counts(fpath1, counts = 0, pairedEnd = TRUE)
mySNVconditions <- c("EUR", "EUR", "TSC", "TSC")
# diffSNV <- diffExpressedVariants(mySNVcounts, mySNVconditions)
fpath2 <- system.file("extdata", "table_counts_alt_splicing.txt",
package = "kissDE")
mySplicingconditions <- c("C1", "C1", "C2", "C2")
mySplicingcounts <- read.table(fpath2, header = TRUE)
# diffSplicing <- diffExpressedVariants(mySplicingcounts, mySplicingconditions)