R/RRAalpha.R
In gCrisprTools: Suite of Functions for Pooled Crispr Screen QC and Analysis

Documented in ct.alphaBeta ct.makeRhoNull ct.numcores ct.RRAalpha ct.RRAalphaBatch ct.RRAaPvals

## This document contains functions for the RRAa algorithm. 

##' @title Checks and Possibly Sets the Number of Cores to be Used in Parallel Processing
##' @description This function determines the number of cores that the user is expecting to 
##' use during parallel processing operations, and if absent, sets the \code{mc.cores} option
##' to the maximum value. Users who do not wish to use all available cores during parallel 
##' processing should do so by invoking \code{options()} from the command line prior to analysis. 
##' 
##' @return Nothing, but invisibly sets \code{options(mc.cores)} if currently \code{NULL}. 
##' @keywords internal
##' @author Russell Bainer, Pete Haverty
ct.numcores <- function()  {
    .Deprecated()
    if( .Platform$OS.type == "windows" ){
        options(mc.cores = 1)
    }
    if(is.null(getOption('mc.cores'))){
        numcores = detectCores()
        options(mc.cores = numcores)
    }
    invisible()
}

##' @title Aggregation of P-value Ranks using a Beta Distribution and Alpha Cutoff  
##' @description This function calculates an alpha-modified rho statistic from a set of normalized ranks by comparing them to a uniform distribution. 
##' Specifically, the ranks are ordered and p-values calculated at each position in the ordered vector by comparison to a Beta distribution. The rho value 
##' returned is the smallest p-value identified in this way across all scores. Should not be used by end users.
##' @param p.in A single column matrix of rank scores, with row.names indicating the gRNA labels.
##' @return A numeric rho value corresponding to the minimum rank order P. 
##' @author Russell Bainer, modified from code from the \code{RobustRankAggreg} package. 
##' 
##' Citation: 
##' Kolde, R. et al, Bioinformatics. 2012 Feb 15;28(4):573-80. doi: 10.1093/bioinformatics/btr709.
##' @import RobustRankAggreg
##' @keywords internal
##' @examples
##' library(RobustRankAggreg)
##' testp <- runif(20)
##'
##' min(betaScores(testp))
##' ct.alphaBeta(testp)  
##' @export
ct.alphaBeta <- function(p.in){ 
    n <- length(p.in)  
    return(min(pbeta(p.in, 1:n, n - (1:n) + 1)))
}


##' @title Aggregation of P-value Ranks using a Beta Distribution and Alpha Cutoff  
##' @description This function is called internally as a single instance of the beta aggregation step in RRAa. Users should not interact with it directly. 
##' The expected input is a list of rank statistics, and a paired \code{alpha} argument defining which values to consider in downstream analyses (see below). 
##' @param p A single column matrix of rank statistics, with \code{row.names} indicating the gRNA labels. 
##' @param g.key data.frame with guide and gene names
##' @param shuffle Logical indicating whether to shuffle the rank statistics prior to calculating the rho statistics (useful for permutation).
##' @return Nothing, or a named list of target-level P-values, which are treated as a rho statistic in the permutation step. 
##' @author Russell Bainer
##' @keywords internal
##' @examples 
##' data('fit')
##' data('ann')
##' geneScores <- ct.RRAalpha(fit$p.value, ann, shuffle = FALSE)
##' @export
ct.RRAalpha <- function(p, g.key, shuffle = FALSE) {
    stopifnot(identical(rownames(p),rownames(g.key)))

    if(shuffle){
        p <- sample(p, nrow(p))
    }

    symbol = g.key$geneSymbol
    ord = order( symbol, p )
    p.collect <- split(p[ord], symbol[ord])
    rhoscores <- vapply(p.collect, ct.alphaBeta, numeric(1))
    return(rhoscores)
}  

##' @title gRNA signal aggregation via RRAa
##' @description This is a wrapper function implementing the RRAalpha p-value aggregation algorithm. Takes in a set of gRNA rank scores (formatted as a single-column 
##' numeric matrix with row.names indicating the guide names) and a list object of gRNA annotations (names are the gene targets, and each element of the list contains 
##' a vector of the corresponding guide names). The rank scores are converted to gene-level statistics that are thenm transformed into empirical p-values by permutation. 
##' @param p A single column matrix of ranking scores, with row.names indicating the gRNA labels
##' @param g.key An annotation data frame of gRNAs, minimally containing a factorized "geneSymbol" column indicating the target names. This is typically generated by calling the \code{ct.buildKeyFromAnnotation()} function.  
##' @param permute Number of permutations to be used during empirical p-value estimation.
##' @param permutation.seed numeric seed for permutation reproducibility.
##'   Default: \code{NULL} means to not set any seed.
##' @param multi.core Deprecated, does nothing
##' @return A named list of target-level empirical P-values. 
##' @author Russell Bainer
##' @keywords internal
##' @examples 
##' data('fit')
##' data('ann')
##' genePvals <- ct.RRAaPvals(fit$p.value, ann, permute = 100)
##' @export
ct.RRAaPvals <- function(p, g.key, permute, permutation.seed = NULL, multi.core=NULL) {
    
    ##Input checks
    if (!is.null(multi.core)) {
        warning("The multi.core argument of ct.RRAaPvals is deprecated. This function is now sufficiently fast on one core.")
    }
    if( ! (is.matrix(p) && ncol(p) == 1) ) {
        stop("P-values should be input as a single-column matrix with row names contained in the gs.list")
    }
    if(ncol(p) > 1) {
        p = p[,1,drop=FALSE]
        warning(paste('Multiple columns are present in the p-value matrix. Using the first column:',colnames(p)))
    }
    if(!is.data.frame(g.key)){stop("The annotation provided must be a data frame.")}
    if(!("geneSymbol" %in% names(g.key))){stop("The provided annotation does not contain a geneSymbol column.")}
    if(!identical(row.names(g.key), row.names(p))){stop("Provided p-value list have mismatched names.")}

    if (anyNA(p)) {
        notna = ! is.na(p)
        p = p[ notna ]
        g.key = g.key[ notna, ]
    }
    
    is.null(permutation.seed) || is.numeric(permutation.seed) ||
        stop("'permutation.seed' must be numeric or NULL.")
    set.seed(permutation.seed) # default NULL will have no effect

    ##Everything apparently ok, generate P-values. 
    obs <- ct.RRAalpha(p, g.key)

    message(paste("Permuting", permute, 'times, this may take a minute ...'))

    n.guides = table(g.key$geneSymbol)
    n.guides.table = sort.int(unique(n.guides))
    rho.nulls = lapply(n.guides.table,ct.makeRhoNull,p,permute)
    names(rho.nulls) = as.character(n.guides.table)
    stopifnot(identical( names(n.guides), names(obs) ))
    out = unlist(mapply(
        obs, as.character(n.guides),
        FUN=function(x,y) {
            mean( rho.nulls[[y]] <= x )
        }, SIMPLIFY=FALSE),recursive=FALSE)

    return(out)
}

##' Make null distribution for RRAalpha tests
##'
##' Makes random distribution of Rho value by taking nperm random samples of n rank stats, p.
##' @param n single integer, number of guides per gene
##' @param p numeric vector of rank statistics
##' @param nperm single integer, how many random samples to take.
##' @return numeric vector of Rho values
##' @export 
ct.makeRhoNull <- function(n,p,nperm) {
    message(paste("Making Rho null distribution for",n,"guides per gene."))
    vapply(1:nperm,
           function(i) {
               p.in = sort.int(sample(p,n,replace=FALSE))
               ct.alphaBeta(p.in)
           },
           numeric(1)
           )
}

##' @title Create Batches of Null Permutations for a Crispr Screen
##' @description This is a wrapper function to partition batches of calls to \code{ct.RRAalpha()} for multicore processing. 
##' It is called internally as a single instance of the beta aggregation step in RRAa. Users should not interact with it directly. 
##' @param p A single column matrix of rank statistics, with row.names indicating the gRNA labels. 
##' @param g.key data.frame with guide and gene names
##' @param result.environment The target environment containing the quasi-global variables incremented during the permutations in the child functions. 
##' @param batch.size Number of iterations to deploy to each daughter process. 
##' @param permutation.seed numeric seed for permutation reproducibility. Default is \code{NULL}, in which case no seed is set.
##' @return An integer vector indicating the number of iterations in which each gene's score was better than those indicated in \code{result.environment$obs}. 
##' @keywords internal
##' @author Russell Bainer
##' @export
ct.RRAalphaBatch <- function(p, g.key, result.environment, batch.size = 100, 
                             permutation.seed = NULL){

    .Deprecated("ct.makeRhoNull",msg="Constructing null distributions is now fast, so ct.RRAalphaBatch is unnecessary.")
    ##make a batch environment
    batch.env <- new.env()
    batch.env$obs <- result.environment$obs
    batch.env$target.positive.iterations <- rep.int(0, result.environment$ngenes)
    ## run permutations and increment as needed
    set.seed(permutation.seed) # default NULL will have no effect

    invisible(replicate(batch.size, ct.RRAalpha(p, g.key, shuffle = TRUE)))
    
    return(batch.env$target.positive.iterations)
}

Any scripts or data that you put into this service are public.

gCrisprTools documentation built on Nov. 8, 2020, 8:17 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

gCrisprTools
Suite of Functions for Pooled Crispr Screen QC and Analysis

R/RRAalpha.R
In gCrisprTools: Suite of Functions for Pooled Crispr Screen QC and Analysis

Defines functions ct.RRAalphaBatch ct.makeRhoNull ct.RRAaPvals ct.RRAalpha ct.alphaBeta ct.numcores

Documented in ct.alphaBeta ct.makeRhoNull ct.numcores ct.RRAalpha ct.RRAalphaBatch ct.RRAaPvals

Try the gCrisprTools package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

gCrisprTools Suite of Functions for Pooled Crispr Screen QC and Analysis

R/RRAalpha.R In gCrisprTools: Suite of Functions for Pooled Crispr Screen QC and Analysis

Defines functions ct.RRAalphaBatch ct.makeRhoNull ct.RRAaPvals ct.RRAalpha ct.alphaBeta ct.numcores

Documented in ct.alphaBeta ct.makeRhoNull ct.numcores ct.RRAalpha ct.RRAalphaBatch ct.RRAaPvals

Try the gCrisprTools package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

gCrisprTools
Suite of Functions for Pooled Crispr Screen QC and Analysis

R/RRAalpha.R
In gCrisprTools: Suite of Functions for Pooled Crispr Screen QC and Analysis