carpools.read.distribution: QC: Plot Readcount Distribution
In caRpools: CRISPR AnalyzeR for Pooled CRISPR Screens

Description Usage Arguments Details Value Note Author(s) Examples

'carpools.read.distribution' plots the read-count distribution of an NGS data set. The plot makes it possible to check the data for skewness and to estimate the overall assay quality. For further details see '?carpools.read.distribution'.

carpools.read.distribution(dataset, namecolumn=1, fullmatchcolumn=2, breaks="",
  title="Title", xlab="X-Axis", ylab="Y-Axis", statistics=TRUE,
  col=rgb(0, 0, 0, alpha = 0.65), extractpattern=expression("^(.+?)_.+"),
  plotgene=NULL, type="distribution", logscale=TRUE)

`dataset`	Data frame of read-count data as created by load.file(). Default: none. Values: a data frame.
`namecolumn`	Column that contains the sgRNA identifiers. Default: 1. Values: column number (numeric).
`fullmatchcolumn`	Column that contains the read counts. Default: 2. Values: column number (numeric).
`breaks`	Histogram breaks, see '?hist'. Default: "" as in the usage, in which case the breaks are calculated according to the dataset length. Values: (numeric).
`title`	Main title of the plot. Default: "Title". Values: "The title you want" (character).
`xlab`	Label of the X-axis. Default: "X-Axis". Values: "Label of X-Axis" (character).
`ylab`	Label of the Y-axis. Default: "Y-Axis". Values: "Label of Y-Axis" (character).
`statistics`	Whether basic statistics are shown in the plot. Default: TRUE. Values: TRUE, FALSE (boolean).
`col`	Color of the plotted data. Can be any R color or RGB object; see '?rgb' for further information. Default: rgb(0, 0, 0, alpha = 0.65). Values: any R color name or RGB color object (character OR color object).
`extractpattern`	PERL regular expression used to retrieve the gene identifier from the full sgRNA identifier. For example, from AAK1_107_0 it extracts AAK1, the gene identifier belonging to this sgRNA identifier. See also: Read-Count Data Files. Default: expression("^(.+?)_.+") as in the usage, which works for most available libraries. Values: PERL regular expression with parentheses marking the gene identifier (expression).
`plotgene`	Restricts the plot to the read-count distribution of the sgRNAs belonging to a single gene, which is passed to the function via plotgene. Default: NULL. Values: NULL or gene identifier (character).
`type`	Plots the read-count distribution either as a histogram or as a box-and-whisker plot. Default: "distribution". Values: "distribution" to plot a histogram, or "whisker" to plot a box-and-whisker plot (character).
`logscale`	Whether the read counts are plotted on a logarithmic scale. Default: TRUE. Values: TRUE, FALSE (boolean).
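
The behaviour of an extraction pattern can be checked outside the package with base R. The following is a minimal sketch, assuming the pattern is applied as a PERL-style substitution that keeps the first capture group; the package's internal call may differ, and the sgRNA identifiers are made-up examples.

```r
# Hypothetical sgRNA identifiers following the <gene>_<...> naming scheme
sgrna_ids <- c("AAK1_107_0", "AAK1_108_0", "TP53_5_1")

# Pattern from the usage above, stored here as a plain character string
pattern <- "^(.+?)_.+"

# Keep only the first capture group, i.e. the gene identifier
gene_ids <- sub(pattern, "\\1", sgrna_ids, perl = TRUE)

print(gene_ids)
```

If an identifier comes back unchanged, the pattern did not match it and the library probably needs a different extractpattern.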

Details: none

carpools.read.distribution returns a generic plot that can be passed on to any device.
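
Passing a plot to a device usually means opening the device, drawing, and closing it again. The sketch below assumes the plot is drawn with base graphics and uses a plain hist() call on simulated counts as a stand-in for the package function, so it runs without caRpools; the file name, the simulated data, and the pseudocount are assumptions.

```r
# Simulated read counts (stand-in data, not taken from the package)
set.seed(1)
counts <- rnbinom(1000, mu = 200, size = 2)

# Open a PDF device, draw the plot, then close the device to write the file
outfile <- file.path(tempdir(), "readcount_distribution.pdf")
pdf(outfile, width = 7, height = 5)
hist(log2(counts + 1), breaks = 50,
     main = "Control 1", xlab = "log2 Readcount", ylab = "# sgRNAs")
dev.off()
```

The same pattern should apply to other devices such as png() or svg(), with the hist() call replaced by the call that draws the plot.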

Note: none

Jan Winter

data(caRpools)

carpools.read.distribution(CONTROL1, fullmatchcolumn=2, breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs", statistics=TRUE)

carpools.read.distribution(CONTROL1, fullmatchcolumn=2, breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs", statistics=TRUE,
  type="whisker")
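
To see what the whisker view and a single-gene restriction amount to without loading the package, the idea can be approximated in base R. This is only a sketch under stated assumptions: the read-count table is simulated, the gene names are hypothetical, and the pseudocount of 1 before the log2 transform is a choice made here, not necessarily what caRpools does internally.

```r
# Simulated two-column read-count table: sgRNA identifier, read count
set.seed(42)
reads <- data.frame(
  sgRNA = sprintf("GENE%03d_%d_0", rep(1:100, each = 5), rep(1:5, times = 100)),
  count = rnbinom(500, mu = 300, size = 1.5),
  stringsAsFactors = FALSE
)

# Derive the gene identifier from each sgRNA identifier
reads$gene <- sub("^(.+?)_.+", "\\1", reads$sgRNA, perl = TRUE)

# log2 scale; the pseudocount of 1 (an assumption) avoids log2(0)
reads$log2count <- log2(reads$count + 1)

# Box-and-whisker view of all sgRNAs
boxplot(reads$log2count, horizontal = TRUE,
        xlab = "log2 Readcount", main = "All sgRNAs")

# Restrict to the sgRNAs of one gene, analogous to what plotgene describes
one_gene <- reads[reads$gene == "GENE007", ]
boxplot(one_gene$log2count, horizontal = TRUE,
        xlab = "log2 Readcount", main = "GENE007")
```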