extract_snps_from_bgzip: Extract a subset of a block-gzipped (and indexed) file by CHR...

View source: R/variant_extraction.R

Description

Extracts the rows of a block-gzipped, tabix-indexed file that match a set of SNP positions. snps is expected to have two columns, 'CHR' and 'POS', giving the chromosome and the (1-based) position of each SNP in question. 'tabix' is assumed to be installed on the system and directly available through a system call.
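The two preconditions above can be checked before calling the function. The following is only a minimal sketch: the SNP positions are made-up values for illustration, and the name tabix_available is hypothetical rather than part of MRutils.

```r
# Hypothetical query: two SNP positions (values invented for illustration).
snps <- data.frame(
  CHR = c("1", "3"),
  POS = c(1000000L, 45000000L),
  stringsAsFactors = FALSE
)

# Both required columns must be present.
stopifnot(all(c("CHR", "POS") %in% names(snps)))

# Sys.which() returns "" when the executable is not found on the PATH.
tabix_available <- nzchar(Sys.which("tabix"))
if (!tabix_available) {
  message("tabix was not found on the PATH; the extraction would fail.")
}
```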

Usage

extract_snps_from_bgzip(
  outcome,
  snps,
  mapping_function = identity,
  comment_char = "",
  validate = TRUE,
  chr_action = "both"
)

Arguments

outcome

Block-gzipped file containing genomic positions and data. It should already be indexed by tabix.

snps

A data frame containing at least the columns 'CHR' and 'POS'. These positions will be queried in the outcome file.

mapping_function

A function that maps the gwas, as read from disk, to a data frame that is then sanity-checked as being a gwas.

comment_char

The comment character used in the file (default: "").

validate

A boolean indicating whether to validate the resulting gwas after applying the mapping function (TRUE by default). Use FALSE for debugging.

chr_action

Indicates how to handle the "chr" prefix on contig names in the query SNPs: add it where absent ("add"), remove it where present ("remove"), leave the names as they are ("leave"), or try both ("both", the default).
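To make the four chr_action options concrete, here is a rough sketch of how such prefix handling could work. The helper apply_chr_action is hypothetical and is not the MRutils implementation; in particular, treating "both" as "query the names with and without the prefix" is an assumption.

```r
# Hypothetical helper, not part of MRutils: illustrates the four options.
apply_chr_action <- function(chr, chr_action = c("both", "leave", "remove", "add")) {
  chr_action <- match.arg(chr_action)
  chr <- as.character(chr)
  stripped <- sub("^chr", "", chr)      # drop a leading "chr" where present
  prefixed <- paste0("chr", stripped)   # ensure exactly one "chr" prefix
  switch(chr_action,
    leave  = chr,
    remove = stripped,
    add    = prefixed,
    both   = unique(c(stripped, prefixed))  # assumption: try both spellings
  )
}

apply_chr_action(c("chr1", "2"), "remove")  # "1" "2"
apply_chr_action(c("chr1", "2"), "add")     # "chr1" "chr2"
```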

Value

A gwas containing the requested subset of the outcome file. The first non-commented line is assumed to be the header line; it is retained and used as the column names of the resulting sub-gwas.
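The header rule can be illustrated on a small throwaway file. This is a sketch under assumptions, not the package's own code: the file contents are invented, the tab delimiter is assumed, and a plain gzip file stands in for a bgzip one (bgzip output is gzip-compatible).

```r
# Write a tiny gzipped file with one comment line, a header and one data row.
tmp <- tempfile(fileext = ".txt.gz")
con <- gzfile(tmp, "w")
writeLines(c("## a comment line", "CHR\tPOS\tP", "1\t100\t0.5"), con)
close(con)

comment_char <- "#"

# readLines() on a path opens it with file(), which detects gzip compression.
lines <- readLines(tmp)

# An empty comment_char means that no line is treated as a comment.
is_comment <- nzchar(comment_char) & startsWith(lines, comment_char)

# The first non-commented line supplies the column names.
header <- strsplit(lines[!is_comment][1], "\t", fixed = TRUE)[[1]]
header  # "CHR" "POS" "P"
```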

Examples

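# Assumes MRutils is attached, so that demo_data and required_headers are available.

# Locate the example GWAS shipped with MRutils and (re)build its tabix index:
# column 1 holds the chromosome, column 2 is both the start and the end position.
partial_gwas <- system.file("extdata", "COVID_partial_gwas_hg37.txt.gz", package = "MRutils")
system(glue::glue("tabix -s 1 -b 2 -e 2 -f '{partial_gwas}'"))

# Map the columns of the raw file to the names expected of a gwas,
# then keep only the required columns.
mapping_function <- function(x) {
  x <- dplyr::mutate(x,
    CHR = as.character(X.CHR),
    beta = all_inv_var_meta_beta,
    P = all_inv_var_meta_p,
    EA = ALT,
    NEA = REF,
    EAF = all_meta_AF,
    SE = all_inv_var_meta_sebeta
  )
  subset(x, select = required_headers)
}

# Query the positions in demo_data, stripping any "chr" prefix from the contig names.
extract_snps_from_bgzip(partial_gwas, demo_data, mapping_function = mapping_function, chr_action = "remove")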

richardslab/MRutils documentation built on Dec. 22, 2021, 4 p.m.