pairscan_null: Generate a null distribution for the pairscan.


pairscan_null    R Documentation

Generate a null distribution for the pairscan.

Description

This function generates a null distribution for the pairscan. For each permutation, it runs a single scan and selects the top N markers. It then uses these markers to perform a permutation of the pairscan. The null distribution generated here uses a fixed number of the top-ranking markers from the permuted single scan; it has not yet been updated to use select_markers_for_pairscan. If marker_selection_method is "netwas", you need to provide a list of genes from the netWAS analysis.
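The permutation loop described above can be sketched in base R. This is a toy illustration, not cape's implementation: the simulated genotypes and phenotype, the use of lm() t-statistics as the scan statistic, and the names top_n, null_size, and null_stats are all assumptions made for the example.

```r
set.seed(42)

# Toy data (hypothetical): 100 individuals, 20 binary markers, one phenotype
n_ind <- 100
n_markers <- 20
top_n <- 5        # number of top-ranking markers kept per permutation
null_size <- 25   # requested number of null statistics

geno <- matrix(rbinom(n_ind * n_markers, 1, 0.5), nrow = n_ind,
               dimnames = list(NULL, paste0("m", seq_len(n_markers))))
pheno <- rnorm(n_ind)

null_stats <- numeric(0)
while (length(null_stats) < null_size) {
  # Shuffle the phenotype to break any genotype-phenotype association
  perm_pheno <- sample(pheno)

  # Single scan: absolute t-statistic of each marker's main effect
  single_t <- apply(geno, 2, function(g) {
    abs(summary(lm(perm_pheno ~ g))$coefficients["g", "t value"])
  })

  # Keep a fixed number of top-ranking markers from the permuted single scan
  top_markers <- names(sort(single_t, decreasing = TRUE))[seq_len(top_n)]

  # Pairscan on the same permuted phenotype: interaction t-statistic per pair
  pair_t <- combn(top_markers, 2, FUN = function(p) {
    fit <- lm(perm_pheno ~ geno[, p[1]] * geno[, p[2]])
    coef(summary(fit))[4, "t value"]  # row 4 is the interaction term
  })

  null_stats <- c(null_stats, pair_t)
}
```

Each pass contributes choose(top_n, 2) statistics, so the loop stops at the first multiple of that number that reaches null_size.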

Usage

pairscan_null(
  data_obj,
  geno_obj = NULL,
  scan_what = c("eigentraits", "raw_traits"),
  pairscan_null_size = NULL,
  max_pair_cor = NULL,
  min_per_geno = NULL,
  model_family = "gaussian",
  marker_selection_method = c("top_effects", "uniform", "effects_dist", "by_gene"),
  run_parallel = FALSE,
  n_cores = 4,
  verbose = FALSE
)

Arguments

data_obj

a Cape object

geno_obj

a genotype object

scan_what

A character string uniquely identifying whether eigentraits or raw traits should be scanned. Options are "eigentraits" and "raw_traits".

pairscan_null_size

The total size of the null distribution. This is different from the number of permutations to run. Each permutation generates n choose 2 elements for the pairscan, where n is the number of markers tested. For example, a permutation that tests 100 markers generates a null distribution of size 4950. This process is repeated until the total null size is reached. If the null size is set to 5000, two permutations of 100 markers are needed, because a single permutation yields only 4950 elements.
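The arithmetic behind this example can be checked in base R. The helper perms_needed is hypothetical (it is not part of cape) and assumes that every marker pair in a permutation is tested, that is, none are dropped by max_pair_cor or min_per_geno.

```r
# Hypothetical helper, not part of cape: number of permutations needed to
# reach a requested null size when each permutation tests n_markers markers.
perms_needed <- function(pairscan_null_size, n_markers) {
  pairs_per_perm <- choose(n_markers, 2)  # n choose 2 marker pairs
  ceiling(pairscan_null_size / pairs_per_perm)
}

pairs_per_perm <- choose(100, 2)        # 4950 pairs from 100 markers
n_perm <- perms_needed(5000, 100)       # 2 permutations
total_null <- n_perm * pairs_per_perm   # 9900 null elements generated
```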

max_pair_cor

A numeric value between 0 and 1 indicating the maximum Pearson correlation allowed between two markers. If the correlation between a pair of markers exceeds this threshold, the pair is not tested. If this value is set to NULL, min_per_geno must have a numeric value.
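A minimal sketch of this kind of filter in base R, using toy data. The genotype matrix, the marker names, and the choice to compare the absolute correlation against the threshold are assumptions for illustration; cape's internal filtering may differ.

```r
set.seed(1)

# Toy genotype matrix (hypothetical): individuals in rows, markers in columns
n_ind <- 50
geno <- matrix(sample(c(0, 0.5, 1), n_ind * 4, replace = TRUE), nrow = n_ind,
               dimnames = list(NULL, paste0("m", 1:4)))
geno[, "m2"] <- geno[, "m1"]  # make m1 and m2 perfectly correlated

max_pair_cor <- 0.5

# All marker pairs, one pair per row
pairs <- t(combn(colnames(geno), 2))

# Pearson correlation for each pair
pair_cor <- apply(pairs, 1, function(p) cor(geno[, p[1]], geno[, p[2]]))

# Assumption: the threshold is applied to the absolute correlation
keep <- abs(pair_cor) <= max_pair_cor
tested_pairs <- pairs[keep, , drop = FALSE]
```

Here the pair (m1, m2) is dropped because its correlation exceeds the threshold; the remaining pairs are kept only if their correlation is at or below 0.5.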

min_per_geno

The minimum number of individuals allowable per genotype. If, for a given marker pair, one of the genotypes is underrepresented, the marker pair is not tested. If this value is NULL, max_pair_cor must have a numeric value.
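One way to picture this check is to count individuals in each two-marker genotype combination. The sketch below uses toy 0/1 genotypes and a hypothetical helper, enough_per_geno; reading "genotype" as the joint genotype of the pair is an assumption, and cape's own check may be defined differently.

```r
# Toy genotypes for three markers, coded 0/1 (hypothetical data)
m1 <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
m2 <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 0)
m3 <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 0)

min_per_geno <- 2

# Hypothetical helper: TRUE if every two-marker genotype combination
# is carried by at least min_n individuals.
enough_per_geno <- function(a, b, min_n) {
  counts <- table(factor(a, levels = c(0, 1)), factor(b, levels = c(0, 1)))
  all(counts >= min_n)
}

test_m1_m2 <- enough_per_geno(m1, m2, min_per_geno)  # counts 2, 2, 3, 3
test_m1_m3 <- enough_per_geno(m1, m3, min_per_geno)  # counts 4, 0, 1, 5
```

Under this reading, the pair (m1, m2) would be tested, while (m1, m3) would be skipped because two of its genotype combinations have fewer than two individuals.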

model_family

Indicates the model family of the phenotypes. This can be either "gaussian" or "binomial".

marker_selection_method

A character string indicating how markers are selected for the pairscan. Options are "top_effects", "uniform", "effects_dist", and "by_gene".

run_parallel

Whether to run the analysis on multiple CPUs

n_cores

The number of CPUs to use if run_parallel is TRUE

verbose

Whether to write progress to the screen


cape documentation built on May 29, 2024, 5:11 a.m.