stat.mageck: Analysis: Analysis of pooled CRISPR screening data using a...
In caRpools: CRISPR AnalyzeR for Pooled CRISPR Screens

Description Usage Arguments Details Value Note Author(s) Examples

CaRpools also uses MAGeCK to look for enriched or depleted genes within your screening data. Please note that MAGeCK needs to be installed correctly, this can be tested by 'check.caRpools'.

Within this approach, the read counts of all sgRNAs in one dataset are first normalized by the function set in the MIACCS file. By default, normalization is done by read count division with the dataset median. Then, the fold change of each population of sgRNAs for a gene is tested against the population of either the non-targeting controls or randomly picked sgRNAs, as defined by the random picks option within the MIACCS file, using a two-sided Mann-Whitney-U test. P-values are corrected for multiple testing using FDR.

1
2
3

stat.mageck(untreated.list, treated.list, namecolumn=1, fullmatchcolumn=2,
norm.fun=median, extractpattern=expression("^(.+?)_.+"), mageckfolder=NULL,
sort.criteria="neg", adjust.method="fdr", filename=NULL, fdr.pval=0.05)

`untreated.list`	A list of untreated sample data frames of read-count data as created by load.file(). Default none Values A list of data frames of the untreated samples
`treated.list`	A list of treated sample data frames of read-count data as created by load.file(). Default none Values A list of data frames of the treated samples
`namecolumn`	In which column are the sgRNA identifiers? Default 1 Values column number (numeric)
`fullmatchcolumn`	In which column are the read counts? Default 2 Values column number (numeric)
`extractpattern`	PERL regular expression that is used to retrieve the gene identifier from the overall sgRNA identifier. e.g. in AAK1_107_0 it will extract AAK1, since this is the gene identifier beloning to this sgRNA identifier. Please see: Read-Count Data Files Default expression("^(.+?)(_.+)"), will work for most available libraries. Values PERL regular expression with parenthesis indicating the gene identifier (expression)
`sort.criteria`	MAGeCK argument –sort-criteria Default "neg" Values see MAGeCK documentation
`mageckfolder`	Folder for MAGeCK raw data output (internally used). Default NULL Value (character)
`filename`	Filename of raw MAGeCK data output. Default "data" Values (character)
`adjust.method`	Method to adjust p-value for multiple testing. See MAGeCK documentation. Default "fdr" Values see MAGeCK documentation
`fdr.pval`	FDR used for correction. Default 0.05 Values (numeric)
`norm.fun`	The mathematical function to normalize data. By default, the median is used. Default median Values Any mathematical function of R (function)

none

stat.mageck retrieves a list of two data frames. One with gene information, the other with sgRNA information.

none

Jan Winter

data(caRpools)
data.mageck = stat.mageck(untreated.list = list(CONTROL1, CONTROL2),
treated.list = list(TREAT1,TREAT2), namecolumn=1, fullmatchcolumn=2,
norm.fun="median", extractpattern=expression("^(.+?)(_.+)"),
mageckfolder=NULL, sort.criteria="neg", adjust.method="fdr",
filename = "TEST" , fdr.pval = 0.05)

knitr::kable(data.mageck$genes[1:10,])