translate_header: Translate column names into standard names

translate_header    R Documentation

Translate column names into standard names

Description

Translates non-standard column names into the standard names used by QC_GWAS and other functions. It can also translate the standard names into user-specified names (via the out_header argument of QC_GWAS).

Usage

translate_header(header,
    standard = c("MARKER", "CHR", "POSITION", "EFFECT_ALL",
                 "OTHER_ALL", "STRAND", "EFFECT", "STDERR",
                 "PVALUE", "EFF_ALL_FREQ", "HWE_PVAL",
                 "CALLRATE", "N_TOTAL", "IMPUTED",
                 "USED_FOR_IMP", "IMP_QUALITY"),
    alternative)

Arguments

header

character vector; the header to be translated.

standard

character vector; the names header should be translated into.

alternative

translation table; see 'Details' for more information.

Details

In a nutshell: the header argument contains the names you have; the standard argument contains the names you want; and alternative is the conversion table.

The table in alternative should have two columns. The left column contains the standard names; the right column contains the possible alternatives. List only one alternative name per row; a standard name with several alternatives takes several rows. translate_header automatically converts the contents of header to uppercase, so standard and the right column of alternative should be uppercase as well.
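The lookup described above can be illustrated with a short base-R sketch. This is not the QCGWAS implementation: sketch_translate is a hypothetical helper written for this illustration, and it assumes that matching is a plain uppercase comparison against the right column of the table.

```r
# Hypothetical helper for illustration only; NOT the QCGWAS source code.
sketch_translate <- function(header, standard, alternative) {
  header_up <- toupper(header)                  # header is uppercased first
  # Look up each uppercased name in the right (alternative) column
  idx <- match(header_up, alternative[[2]])
  # Matched names take the standard name from the left column;
  # unmatched names keep their original spelling for now
  translated <- ifelse(is.na(idx), header, alternative[[1]][idx])
  # Names that already equal a standard name once uppercased
  is_std <- is.na(idx) & header_up %in% standard
  translated[is_std] <- header_up[is_std]
  translated
}

# One alternative per row: left = standard name, right = alternative
alt <- data.frame(standard    = c("MARKER", "CHR"),
                  alternative = c("SNP", "CHROM"),
                  stringsAsFactors = FALSE)

result <- sketch_translate(c("snp", "chrom", "effect", "misc"),
                           standard    = c("MARKER", "CHR", "EFFECT"),
                           alternative = alt)
print(result)
```

Because the comparison is made on the uppercased header, a lowercase entry in the right column of the table would never match in this sketch, which is why the table should be uppercase.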

A sample translation table is provided in the package data folder. It can be loaded via data("header_translations"). An editable .txt version can be found in the "R\library\QCGWAS\doc" folder.

Value

translate_header returns an object of class 'list' with 6 components:

header_h

character vector; the translated header. Unknown columns are included under their old names.

missing_h

character vector; the standard column names that were not found. If none, this returns NULL.

unknown_h

character vector; column names that could not be converted to a standard name. Note that these columns are also included in header_h. If none, this returns NULL.

header_N, missing_N, unknown_N

integer; the lengths of header_h, missing_h and unknown_h, respectively.
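As a rough illustration of how the components relate to each other, the sketch below derives missing and unknown names from a translated header using base R set operations. The input vectors are made up for the example, and the use of setdiff is an assumption about the logic, not a description of the package internals.

```r
# Illustrative values only; not output from translate_header itself.
standard   <- c("MARKER", "CHR", "EFFECT", "STDERR")
translated <- c("MARKER", "CHR", "EFFECT", "misc")  # what header_h might look like

# Standard names that do not occur in the translated header
missing_h <- setdiff(standard, translated)
# Translated names that are not standard names
unknown_h <- setdiff(translated, standard)

# The documentation states that empty results are returned as NULL
if (length(missing_h) == 0L) missing_h <- NULL
if (length(unknown_h) == 0L) unknown_h <- NULL

# Counts corresponding to header_N, missing_N and unknown_N
counts <- c(header_N  = length(translated),
            missing_N = length(missing_h),
            unknown_N = length(unknown_h))
print(counts)
```

In practice, checking that missing_N is zero before continuing is a simple way to confirm that every required column was recognised.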

See Also

header_translations for a sample translation table.

identify_column

Examples

  sample_data <-
    data.frame(SNP = paste("rs", 1:10, sep = ""),
               chrom = 2,
               effect = 1:10/10,
               misc = NA,
               stringsAsFactors = FALSE)
  # Creates a table with four columns:
  #   SNP, chrom, effect and misc.

  ( alt_headers <-
      data.frame(
        standard = c("MARKER", "MARKER", "CHR", "CHR"),
        alternative = c("MARKER", "SNP", "CHR", "CHROM"),
        stringsAsFactors = FALSE) )
  # Creates the translation table
  #  with the standard names in column 1 and the alternatives
  #  in column 2.
  
  ( header_info <- 
      translate_header(header = names(sample_data),
        standard = c("MARKER", "CHR", "EFFECT"),
        alternative = alt_headers)  )
        
  # Despite not being in the translation table, EFFECT is
  #  changed to uppercase because it is present in standard.
  #  misc is in neither standard nor the translation table, so
  #  it is marked as unknown and left unchanged.

  names(sample_data) <- header_info$header_h

QCGWAS documentation built on May 30, 2022, 5:05 p.m.