R/read_gemma_assoc.R

Defines functions read_gemma_assoc

Documented in read_gemma_assoc

#' Read GEMMA statistics table
#'
#' Parses the output generated by GEMMA (genetic assocation test).
#' Returned table has column names standardized for convenience.
#' Validated with GEMMA version 0.98.1.
#'
#' @param file The file path to read, possibly missing the extension.
#' @param ext The expected file extension.
#' Set to `NA` to prevent adding any extension to `file`.
#' The default "assoc.txt" extension matches what GEMMA generates.
#' @param verbose If `TRUE` (default) function reports the path of the file being loaded (after autocompleting the extensions).
#'
#' @return The table as a `tibble`.
#' Original names (as they appear in the header line of the file) are modified by these specific mappings to ensure that columns shared by BIM table have the same names as those returned by [genio::read_bim()], the association p-value is "p", and the regression coefficient is "beta" ("orig" -> "new"):
#' - "rs" -> "id"
#' - "ps" -> "pos"
#' - "allele1" -> "alt"
#' - "allele0" -> "ref"
#' - "p_wald" -> "p"
#'
#' @examples 
#' \dontrun{
#' data <- read_gemma_assoc( file )
#' }
#'
#' @seealso
#' [gemma_lmm()] wrapper for executing GEMMA and reading the results.
#' 
#' @export
read_gemma_assoc <- function( file, ext = 'assoc.txt', verbose = TRUE ) {
    if ( missing( file ) )
        stop( '`file` is required!' )

    # add .ext and/or .gz if missing and needed
    file <- genio:::add_ext_read(file, ext)
    
    # announce it if needed
    if (verbose)
        message('Reading: ', file)

    # read table!
    data <- readr::read_tsv(
                       file,
                       col_types = 'cciiccdddddd'
                   )

    # normalize column names
    names( data )[ names( data ) == 'rs' ] <- 'id'
    names( data )[ names( data ) == 'ps' ] <- 'pos'
    names( data )[ names( data ) == 'allele1' ] <- 'alt'
    names( data )[ names( data ) == 'allele0' ] <- 'ref'
    names( data )[ names( data ) == 'p_wald' ] <- 'p'
    
    # return tibble
    return( data )
}
OchoaLab/genbin documentation built on Nov. 14, 2024, 7:33 p.m.