dive_effects2mash: Wrapper to run mash given saved GWAS effects

View source: R/dive_effects2mash.R

Description

Running GWAS, preparing mash inputs, and running mash step by step gives you the most flexibility and the most opportunities to check your results for errors. Once those sanity checks are complete, this function lets you go directly from the GWAS effects saved by 'dive_phe2effects' or 'dive_phe2mash', for the phenotypes you want to compare, to a mash result. Some exception handling has been built into this function, but the user should stay cautious and skeptical of any results that seem 'too good to be true'.

Usage

dive_effects2mash(
  effects,
  snp,
  metadata,
  phe = c(1:nrow(metadata)),
  suffix = "",
  outputdir = ".",
  ncores = NA,
  thr.r2 = 0.2,
  thr.m = c("max", "sum"),
  num.strong = 1000,
  num.random = NA,
  scale.phe = TRUE,
  U.ed = NA,
  U.hyp = NA,
  verbose = TRUE
)

Arguments

effects

FBM (filebacked big matrix) created using 'dive_phe2effects' or 'dive_phe2mash'. Saved under the name "gwas_effects_suffix.rds"; it can be loaded into R using the bigstatsr function big_attach().

snp

A "bigSNP" object; load with snp_attach().

metadata

Metadata created using 'dive_phe2effects' or 'dive_phe2mash'. Saved under the name "gwas_effects_suffix_associated_metadata.csv".

phe

Optional numeric vector of phenotypes to include in the mash run, given as the row numbers of metadata that you would like to keep. Default is to include all phenotypes specified in effects and metadata.
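As a minimal sketch of choosing phe, the row numbers can be looked up by phenotype name. The metadata table and its column name `phe` below are hypothetical stand-ins; check the columns of your own metadata file before relying on them.

```r
# Toy stand-in for the metadata table; the column name "phe" is an assumption.
metadata <- data.frame(
  phe = c("height", "flowering_time", "biomass", "tiller_count"),
  stringsAsFactors = FALSE
)

# Row numbers of the phenotypes to keep in the mash run.
keep <- which(metadata$phe %in% c("height", "biomass"))
keep  # 1 3
```

The resulting vector would then be passed as `phe = keep`.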

suffix

Optional character vector to give saved files a unique search string/name.

outputdir

Optional file path to save output files.

ncores

Optional integer to specify the number of cores to be used for parallelization. You can specify this with bigparallelr::nb_cores().

thr.r2

Value between 0 and 1. Threshold of r2 measure of linkage disequilibrium. Markers in higher LD than this will be subset using clumping.

thr.m

"sum" or "max". Type of threshold to use to clump values for mash inputs. "sum" sums the -log10(p-values) across phenotypes and uses the maximum of this value as the threshold. "max" uses the maximum -log10(p-value) for each SNP across all of the univariate GWAS.
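The difference between the two options can be illustrated on a toy matrix. This is a sketch of the per-SNP summaries only, using made-up values; it assumes rows are SNPs and columns are phenotypes, and it does not reproduce the package's clumping step.

```r
# Toy -log10(p-value) matrix: rows are SNPs, columns are phenotypes.
# All values and names are made up for illustration.
log10p <- matrix(
  c(1.0, 6.0, 2.0,
    4.0, 4.0, 4.0,
    0.5, 0.2, 8.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("snp1", "snp2", "snp3"), c("pheA", "pheB", "pheC"))
)

# "max": the strongest single-phenotype signal for each SNP.
stat_max <- apply(log10p, 1, max)

# "sum": the combined signal across all phenotypes for each SNP.
stat_sum <- rowSums(log10p)

# The two summaries can rank SNPs differently.
names(which.max(stat_max))  # "snp3"
names(which.max(stat_sum))  # "snp2"
```

Here "max" favors a SNP with one very strong association, while "sum" favors a SNP with moderate associations in every phenotype.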

num.strong

Integer. Number of SNPs used to derive data-driven covariance matrix patterns, using markers with strong effects on phenotypes.

num.random

Integer. Number of SNPs used to derive the correlation structure of the null tests, and the mash fit on the null tests.

scale.phe

Logical. Should effects for each phenotype be scaled to fall between -1 and 1? Default is TRUE.
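One way to bring effects into the range -1 to 1 is to divide by the largest absolute effect. The sketch below assumes that approach and uses made-up numbers; the package's own scaling may differ in detail.

```r
# Made-up effect estimates and standard errors for one phenotype.
effects_raw <- c(-0.8, 0.1, 2.4, -1.2)
se_raw      <- c(0.4, 0.2, 0.6, 0.3)

# Assumed scaling: divide by the largest absolute effect.
scale_factor   <- max(abs(effects_raw))
effects_scaled <- effects_raw / scale_factor

# Dividing the standard errors by the same factor leaves each
# effect / standard error ratio unchanged.
se_scaled <- se_raw / scale_factor

range(effects_scaled)  # lies within [-1, 1]
```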

U.ed

Mash data-driven covariance matrices. Specify these as a list or a path to a file saved as an .rds. Creating these can be time-consuming, and generating these once and reusing them for multiple mash runs can save time.
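Reusing covariance matrices across runs amounts to saving the list once and reading it back later. The sketch below uses base R's saveRDS() and readRDS() with placeholder matrices; the list names and values are hypothetical and are not real data-driven covariances.

```r
# Placeholder list standing in for data-driven covariance matrices.
U_ed <- list(
  ED_1 = diag(3),
  ED_2 = matrix(c(1.0, 0.5, 0.0,
                  0.5, 1.0, 0.0,
                  0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
)

# Save once (a temporary file is used here; choose a real path in practice).
ued_file <- tempfile(fileext = ".rds")
saveRDS(U_ed, ued_file)

# Later, either reload the list or pass the file path itself as U.ed.
U_ed_reloaded <- readRDS(ued_file)
```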

U.hyp

Other covariance matrices for mash. Specify these as a list. These matrices must have dimensions that match the number of phenotypes where univariate GWAS ran successfully.
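A minimal sketch of building such a list by hand, with a check that every matrix matches the number of phenotypes. The phenotype count, matrix names, and chosen patterns are illustrative assumptions, not defaults of this function.

```r
# Number of phenotypes where univariate GWAS ran successfully (assumed here).
n_phe <- 3

U_hyp <- list(
  identity      = diag(n_phe),                # independent effects
  equal_effects = matrix(1, n_phe, n_phe),    # identical effects in all phenotypes
  first_only    = diag(c(1, rep(0, n_phe - 1)))  # effect in the first phenotype only
)

# Each matrix must be square, symmetric, and n_phe x n_phe.
dims_ok <- vapply(
  U_hyp,
  function(U) all(dim(U) == n_phe) && isSymmetric(U),
  logical(1)
)
all(dims_ok)  # TRUE
```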

verbose

Output some information on the iterations? Default is TRUE.

Value

A mash object made up of all phenotypes where univariate GWAS ran successfully.
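The arguments described above can be combined into a call along the following lines. This is a sketch, not a tested workflow: all three file paths are hypothetical placeholders, and the call is skipped unless the files and the required packages are present.

```r
# Hypothetical file paths; replace with the files written by your own
# 'dive_phe2effects' or 'dive_phe2mash' run and your own bigSNP backing file.
effects_file  <- "gwas_effects_example.rds"
metadata_file <- "gwas_effects_example_associated_metadata.csv"
snp_file      <- "example_genotypes.rds"

pkgs <- c("snpdiver", "bigsnpr", "bigstatsr")
inputs_ready <-
  all(file.exists(c(effects_file, metadata_file, snp_file))) &&
  all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (inputs_ready) {
  effects  <- bigstatsr::big_attach(effects_file)  # FBM of GWAS effects
  snp      <- bigsnpr::snp_attach(snp_file)        # "bigSNP" object
  metadata <- read.csv(metadata_file)

  mash_result <- snpdiver::dive_effects2mash(
    effects    = effects,
    snp        = snp,
    metadata   = metadata,
    suffix     = "example",
    outputdir  = ".",
    thr.r2     = 0.2,
    thr.m      = "max",
    num.strong = 1000
  )
}
```

Arguments left out of the call keep the defaults shown in Usage.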


Alice-MacQueen/snpdiver documentation built on Dec. 17, 2021, 8:41 a.m.