R/process.sample.contamination.checks.R

Defines functions process.sample.contamination.checks

Documented in process.sample.contamination.checks

#' Process sample contamination checks 
#' 
#' @description
#' Takes *selfSM reports generated by VerifyBamID during alignment, and returns a data frame of freemix scores.
#' The freemix score is a sequence-only estimate of sample contamination that ranges from 0 to 1.
#'
#' Note: Targeted panels are often too small for this step to work properly.
#' 
#' @inheritParams get.coverage.by.sample.statistics
#'
#' @return freemix.scores Data frame with one row per sample, giving the sample ID (column sample.id) and the sample contamination score (column freemix).
#'
#' @references \url{https://genome.sph.umich.edu/wiki/VerifyBamID}
process.sample.contamination.checks <- function(project.directory) {
    
    sample.contamination.check.paths <- system.ls(pattern = "*/*selfSM", directory = project.directory, error = TRUE);	
    sample.ids <- extract.sample.ids(sample.contamination.check.paths, from.filename = TRUE);
    
    freemix.scores <- list();
    
    for(i in seq_along(sample.contamination.check.paths)) {
        
        path <- sample.contamination.check.paths[i];
        sample.id <- sample.ids[i];
        
        # Each report is a single-row table: the header gives the variable names and the row gives their values.
        # The sample contamination score is stored in the column called FREEMIX.
        # For more information, see https://genome.sph.umich.edu/wiki/VerifyBamID#Column_information_in_the_output_files
        contamination.check <- utils::read.delim(
            path, 
            sep = "\t", 
            as.is = TRUE,
            header = TRUE,
            stringsAsFactors = FALSE
        );
        
        freemix.scores[[ sample.id ]] <- data.frame(
            "sample.id" = sample.id,		
            "freemix" = contamination.check[1, "FREEMIX"]
        );
    }
    
    freemix.scores <- do.call(rbind, freemix.scores);
    
    return(freemix.scores);
}
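
The core of the loop above (read a one-row selfSM table, keep the FREEMIX value, and row-bind the per-sample results) can be tried without a project directory. The following is a minimal sketch using base R only. The file contents, sample IDs, and the helper `write.example.selfsm` are hypothetical, and the header is an illustrative subset of the columns a real VerifyBamID report contains.

```r
# Hypothetical helper: writes a tiny selfSM-like file with a header line and one row.
# Only a subset of columns is included; real reports contain more.
write.example.selfsm <- function(sample.id, freemix) {
    path <- tempfile(pattern = paste0(sample.id, '_'), fileext = '.selfSM');
    writeLines(
        c(
            '#SEQ_ID\tRG\tCHIP_ID\t#SNPS\t#READS\tAVG_DP\tFREEMIX',
            paste(sample.id, 'ALL', 'NA', 1000, 50000, 50.0, freemix, sep = '\t')
        ),
        path
    );
    return(path);
}

# Made-up sample IDs and scores, for illustration only
sample.ids <- c('sampleA', 'sampleB');
paths <- c(
    write.example.selfsm('sampleA', 0.00123),
    write.example.selfsm('sampleB', 0.0451)
);

freemix.scores <- list();

for (i in seq_along(paths)) {
    # read.delim does not treat '#' as a comment character by default,
    # so the header line starting with '#SEQ_ID' is read as column names
    contamination.check <- utils::read.delim(
        paths[i],
        sep = '\t',
        header = TRUE,
        stringsAsFactors = FALSE
    );

    freemix.scores[[ sample.ids[i] ]] <- data.frame(
        sample.id = sample.ids[i],
        freemix = contamination.check[1, 'FREEMIX']
    );
}

# Combine the single-row data frames into one data frame, one row per sample
freemix.scores <- do.call(rbind, freemix.scores);
print(freemix.scores);
```

Collecting single-row data frames in a list and binding them once at the end avoids growing a data frame inside the loop.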

varitas documentation built on Nov. 14, 2020, 1:07 a.m.