previewDipDistribution: Plots the distribution of the gene set's dip values and dip...
In jsieker/DipEx: Analysis of RNAseq data distributions through visualization and Hartigan's Dip test


Shows how the dataset's dip values and dip p-values are distributed, to aid in choosing the region break lines and the number of regions.
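As a rough illustration of what a per-gene dip value and dip p-value are, the sketch below runs Hartigan's dip test from the diptest package on each row of a toy matrix. The toy data and the names `toy` and `dipDF` are hypothetical, and this is not necessarily how previewDipDistribution computes its results internally.

```r
# Sketch only: toy data and object names are hypothetical, not taken from DipEx.
library(diptest)

set.seed(1)
# Rows are genes and columns are samples, mirroring the layout expected for 'RNAdata'.
toy <- rbind(
  unimodalGene = rnorm(100, mean = 5, sd = 1),
  bimodalGene  = c(rnorm(50, mean = 0, sd = 1), rnorm(50, mean = 10, sd = 1))
)

# One dip test per gene (row); keep the dip statistic and its p-value.
dipDF <- data.frame(
  dip  = apply(toy, 1, function(g) unname(dip.test(g)$statistic)),
  pval = apply(toy, 1, function(g) dip.test(g)$p.value),
  row.names = rownames(toy)
)
print(dipDF)
```

A clearly bimodal gene should show a larger dip value and a smaller dip p-value than a unimodal one.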

previewDipDistribution(RNAdata, rawRNAdata, minimumCounts, barLine, adjustVal)

`RNAdata`	The RNAseq dataset to be analyzed, as either raw or normalized data. If the analysis will use a 'minimumCounts' screen to filter out low-expression genes, 'RNAdata' should be the normalized expression data. The rows must be genes (with gene names as row names) and the columns must be samples (ideally with sample names as column names, although missing column names will not stop the function). Columns with non-numerical data (or data not relating to a sample) should be removed before any analysis is attempted.
`rawRNAdata`	An optional argument used only when a 'minimumCounts' filter is to be applied. Each gene's highest expression level is extracted from 'rawRNAdata'. If that maximum does not exceed the value supplied by 'minimumCounts', the gene is excluded from the analysis of the normalized counts. Note that this is not the dataset to be analyzed; it serves only as part of an optional filter. The rows must be genes (with gene names as row names) and the columns must be samples, and both must correspond directly to the rows and columns of the normalized data supplied as 'RNAdata'. Columns with non-numerical data (or data not relating to a sample) should be removed before any analysis is attempted.
`minimumCounts`	If 'rawRNAdata' is supplied, 'minimumCounts' is the threshold that each gene's maximum raw expression value must exceed to remain in the normalized RNA data for the analysis.
`barLine`	This selects the x-intercept of the bar that can be overlaid on the graph. It will default to 0 if no value is supplied.
`adjustVal`	ggplot2 value for adjusting the sharpness of the resulting plots: ranges from 0 to 1. The default is 1.
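The 'minimumCounts' screen described for 'rawRNAdata' and 'minimumCounts' can be sketched in base R as follows. The matrices `raws` and `norm`, the log2 transform standing in for normalization, and the threshold of 50 are all hypothetical; the package's internal implementation may differ.

```r
# Sketch only: 'raws', 'norm' and the threshold are hypothetical toy values.
raws <- matrix(c(  0,  2,  1,
                 120, 80, 95,
                  10, 60,  5),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("s1", "s2", "s3")))
norm <- log2(raws + 1)   # stand-in for normalized expression data
minimumCounts <- 50

# Highest raw expression level of each gene across all samples.
maxRaw <- apply(raws, 1, max)

# Keep only genes whose maximum raw count exceeds the threshold,
# and subset the normalized data (the data actually analyzed) accordingly.
keep <- maxRaw > minimumCounts
filtered <- norm[keep, , drop = FALSE]
print(rownames(filtered))
```

Here geneA never exceeds 50 raw counts in any sample, so only geneB and geneC remain in the normalized data.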

This function returns a dataframe (DF) with each gene's dip value and dip p-value, a density plot of the genes' dip values (DipPlot), and a density plot of the genes' dip p-values (PvalPlot).
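To give a feel for how 'barLine' and 'adjustVal' might shape such density plots, here is a minimal base R sketch. It assumes that 'adjustVal' acts like the `adjust` bandwidth multiplier of `stats::density` (which ggplot2's `geom_density` also accepts) and that 'barLine' is drawn as a vertical line; the simulated `pvals` vector is a hypothetical stand-in for real dip p-values.

```r
# Sketch only: simulated values stand in for real dip p-values.
set.seed(42)
pvals <- runif(500)
barLine <- 0.05          # hypothetical cutoff to overlay on the plot

# 'adjust' multiplies the default bandwidth: smaller values give a sharper,
# less smoothed curve; 1 leaves the default bandwidth unchanged.
dSmooth <- density(pvals, adjust = 1)
dSharp  <- density(pvals, adjust = 0.25)

plot(dSharp, main = "Density of dip p-values", xlab = "dip p-value")
lines(dSmooth, lty = 2)           # smoother curve for comparison
abline(v = barLine, col = "red")  # overlaid line at x = barLine
```

Comparing the two curves can help when judging whether an apparent break in the distribution is real or an artifact of the smoothing.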

Software authors: Jeremy Sieker, Sohyon Lee, Kristin Baldwin

Martin Maechler (2016). diptest: Hartigan's Dip Test Statistic for Unimodality - Corrected. R package version 0.75-7. https://CRAN.R-project.org/package=diptest

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.

x <- paste("https://www.ebi.ac.uk/gxa/experiments-content",
  "/E-GEOD-70484/resources/BaselineProfilesWriterService.RnaSeq/tsv", sep = "")
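ss <- read.table(url(x), sep = '\t', header = TRUE)
#keeping only the odd-numbered rows, cutting the data in half to speed up the examples
ss <- ss[(as.numeric(row.names(ss)) %% 2 == 1),]
#if the previous line fails, make sure the modulo operator is written as %%
#with no backslashes (the Rd source escapes each percent sign) and try again
row.names(ss) <- ss$Gene.ID #using the gene IDs as row names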
ss <- ss[,-c(1:2)] #removing everything that is not expression data

mod <- previewDipDistribution(RNAdata = ss)


#In many datasets, genes with extremely low dip values (or high dip p-values)
#will appear in their expression plots as normal distributions with a mean around zero.
#These are typically just genes that have no registered counts in any of the samples.
#To remove these genes, there are a few options.
#One can supply both normalized and raw counts (i.e. both the RNAdata
#and rawRNAdata arguments) and employ the 'minimumCounts' filter.
#Alternatively, pre-filter your data to drop genes that do not pass
#your desired expression threshold, then simply
#use that data for your RNAdata argument and leave the rawRNAdata and
#minimumCounts arguments blank.

#Due to the difficulty of finding publicly available datasets that
#have paired raw and normalized counts, the filter is not
#demonstrated in this example. However, if you had a raw counts set
#called raws and a normalized counts set called ss, the
#code would be along the lines of

#mod1 <- previewDipDistribution(RNAdata = ss, rawRNAdata = raws, minimumCounts = 50)

jsieker/DipEx documentation built on May 17, 2019, 2:10 p.m.
