pre_allocate: Pre-allocate a list of overlapping SNP windows

pre_allocate    R Documentation

Pre-allocate a list of overlapping SNP windows

Description

Converts SNP data into a list of overlapping windows with the specified size and shift. Preparing this data structure in advance is useful for parallelizing SKAT.

Usage

pre_allocate(
  raw_file_path,
  window_size,
  window_shift,
  pre_allocated_dir,
  impute_to_mean = TRUE,
  remove_novar_SNPs = TRUE,
  missing_cutoff = 0.15
)

Arguments

raw_file_path

Complete file path to SNP data in '.traw' format (see PLINK documentation)

window_size

An integer, indicating the size of each SNP window (in base pairs)

window_shift

An integer, indicating the number of base pairs by which each rolling window slides; in other words, the distance between the start (or end) positions of adjacent overlapping windows

pre_allocated_dir

A directory where pre-allocated SNP window lists are kept

impute_to_mean

If 'TRUE', NA values for each SNP are replaced with the mean alternative allele count for the given SNP

remove_novar_SNPs

If 'TRUE', SNPs with no variation will be removed

missing_cutoff

A numeric threshold on the missing rate, where the missing rate of a SNP is the proportion of genotypes with missing data for that SNP. For all remaining missing values, imputation to the mean is performed, either by 'pre_allocate' or by 'SKAT' itself
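
The effect of 'missing_cutoff', 'impute_to_mean' and 'remove_novar_SNPs' can be illustrated on a toy genotype matrix. This is a minimal base R sketch, not the package's implementation: the matrix, the cutoff value of 0.25, the order of the three steps, and the assumption that SNPs above the cutoff are dropped are all chosen for illustration.

```r
# Toy genotype matrix (illustrative data): rows are SNPs, columns are
# genotypes; entries are alternative allele counts (0, 1, 2) or NA.
geno <- rbind(
  snp1 = c(0, 1, 2, NA, 1),
  snp2 = c(NA, NA, 1, 0, 2),
  snp3 = c(1, 1, 1, 1, 1),
  snp4 = c(2, 0, 1, 1, 0)
)

missing_cutoff <- 0.25  # illustrative value, not the default of 0.15

# Missing rate per SNP: proportion of genotypes with NA for that SNP
missing_rate <- rowMeans(is.na(geno))

# Assumption: SNPs whose missing rate exceeds the cutoff are dropped
geno <- geno[missing_rate <= missing_cutoff, , drop = FALSE]

# Impute remaining NAs to the mean alternative allele count of each SNP
snp_means <- rowMeans(geno, na.rm = TRUE)
na_idx <- which(is.na(geno), arr.ind = TRUE)
geno[na_idx] <- snp_means[na_idx[, "row"]]

# Remove SNPs with no variation across genotypes
has_var <- apply(geno, 1, function(x) length(unique(x)) > 1)
geno <- geno[has_var, , drop = FALSE]

print(geno)
```

In this toy example, snp2 is dropped for exceeding the cutoff, the single NA in snp1 is replaced by that SNP's mean, and snp3 is dropped for having no variation.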

Value

A list of lists, with each sub-list containing elements as described in the documentation for extract_window
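
Each sub-list corresponds to one window. As a rough sketch of how 'window_size' and 'window_shift' could determine the window layout, the hypothetical helper below (not part of the package) computes window boundaries. It assumes that the first window starts at the first position given, that end positions are inclusive, and that a window is started at every shift up to the last position; the package's actual boundary handling may differ.

```r
# Hypothetical helper, for illustration only (not part of the package)
window_bounds <- function(first_pos, last_pos, window_size, window_shift) {
  # One window starts every 'window_shift' base pairs
  starts <- seq(from = first_pos, to = last_pos, by = window_shift)
  # Each window spans 'window_size' base pairs (inclusive end position)
  data.frame(start = starts, end = starts + window_size - 1)
}

# Illustrative positions, with the size and shift used in the Examples
bounds <- window_bounds(first_pos = 14460000, last_pos = 14470000,
                        window_size = 3000, window_shift = 1000)
head(bounds, 3)
```

With a size of 3000 and a shift of 1000, adjacent windows in this sketch share 2000 base pairs, which is why the same SNP can appear in more than one sub-list.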

Examples


## Not run: 
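# Locate the example '.traw' file shipped with the package
raw_file_path <- system.file("extdata",
  "poplar_SNPs_Chr10_14460to14550kb.traw",
  package = "SKATMCMT")

# Build 3000 bp windows that slide 1000 bp at a time;
# the remaining arguments keep their defaults
pre_allocate(pre_allocated_dir = tempdir(),
  raw_file_path = raw_file_path,
  window_size = 3000,
  window_shift = 1000)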

## End(Not run)


naglemi/mtmcskat documentation built on Aug. 23, 2023, 5:35 p.m.