run_stego: Similarity Test for Genetic Outliers


Description

This function runs the analysis. The only required argument is a genotype matrix with entries in {0, 1} for phased data or {0, 1, 2} for unphased data.
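As an illustration of the expected input, the sketch below builds small genotype matrices in base R. It assumes variants in rows and samples in columns, which is consistent with the description of blocksize but not stated outright here. The layout of phased data (two haplotype columns per sample) is also an assumption, and all object names are hypothetical.

```r
# Hypothetical toy data: 20 variants (rows) and 6 samples.
set.seed(1)
n_variants <- 20
n_samples <- 6

# Unphased: one column per sample; each entry counts copies of an allele (0, 1 or 2).
unphased <- matrix(rbinom(n_variants * n_samples, size = 2, prob = 0.2),
                   nrow = n_variants, ncol = n_samples)

# Phased: each entry is 0 or 1.
# ASSUMPTION: two haplotype columns per sample; check the package data for the real layout.
phased <- matrix(rbinom(n_variants * 2 * n_samples, size = 1, prob = 0.2),
                 nrow = n_variants, ncol = 2 * n_samples)

# Sanity checks on the allowed values before running an analysis.
stopifnot(all(unphased %in% 0:2), all(phased %in% 0:1))
```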

Usage

run_stego(genotypes, phased = T, groups = "all.together",
  sampleNames = NULL, labels = NA, super = NA, minVariants = 5,
  blocksize = NA, simFun = NULL, saveDir = NA, verbose = F,
  cores = NULL)

Arguments

genotypes

data object containing the phased or unphased genotypes by samples

phased

logical indicating whether the genotypes are phased (TRUE) or unphased (FALSE)

groups

character specifying how samples are grouped for the analysis: one of "all.together", "each.separately" or "pairwise.within.superpop". Default is "all.together", which runs the analysis on all samples at once.

sampleNames

character vector with unique identifiers for each sample

labels

character covariates, such as population membership. This is unused if groups is "all.together".

super

character covariates, such as superpopulation membership. This is used only if groups is "pairwise.within.superpop".

minVariants

integer specifying the minimum number of occurrences of the minor allele required for a variant to be included in the analysis. Default is 5; the minimum allowed is 2.

blocksize

integer specifying the number of consecutive rows in the data matrix to be treated as one LD block. One variant is chosen from each block for the analysis. Default is NA (no LD pruning, equivalent to blocksize=1).

simFun

function for similarity comparison, such as cor or cov. Default is NULL.

saveDir

location in which to save the results output. Default is NA (no saving).

verbose

logical indicating whether to output status updates during analysis run
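The effect of minVariants and blocksize can be pictured with a short base R sketch. This is not the package's implementation: the helper name `filter_and_prune` is hypothetical, the order of the two steps is an assumption, and picking the variant at random within each block is an assumption, since the selection rule is not described here.

```r
# Hypothetical helper illustrating minor-allele filtering followed by block pruning.
filter_and_prune <- function(genotypes, phased = TRUE, minVariants = 5, blocksize = NA) {
  stopifnot(is.matrix(genotypes), minVariants >= 2)

  # Each entry carries 1 allele when phased (0/1) and 2 alleles when unphased (0/1/2).
  alleles_per_entry <- if (phased) 1 else 2
  alt_count <- rowSums(genotypes)
  minor_count <- pmin(alt_count, alleles_per_entry * ncol(genotypes) - alt_count)

  # Keep variants (rows) whose minor allele occurs at least minVariants times.
  kept <- genotypes[minor_count >= minVariants, , drop = FALSE]
  if (is.na(blocksize) || blocksize <= 1 || nrow(kept) == 0) return(kept)

  # Treat each run of `blocksize` consecutive rows as one LD block.
  block_id <- ceiling(seq_len(nrow(kept)) / blocksize)

  # ASSUMPTION: one variant is drawn at random from each block.
  chosen <- vapply(split(seq_len(nrow(kept)), block_id),
                   function(idx) idx[sample.int(length(idx), 1)],
                   integer(1))
  kept[chosen, , drop = FALSE]
}

set.seed(42)
toy <- matrix(rbinom(30 * 40, size = 1, prob = 0.3), nrow = 30)
pruned <- filter_and_prune(toy, phased = TRUE, minVariants = 5, blocksize = 3)
dim(pruned)
```

With blocksize = 3, at most one third of the retained variants (rounded up) survive pruning, which is the trade-off between reducing LD and keeping informative variants.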

Value

List with class "stego" containing

summary

Summary statistics, including p-values, FDR, and kinship coefficient estimates for all pairs of individuals

s_matrix_dip

A matrix of pairwise s statistics between all individuals

s_matrix_hap

For phased data only, a matrix of pairwise s statistics between all haplotypes

var_s_dip

numeric estimate of the variance of pairwise subject test statistics

var_s_hap

numeric estimate of the variance of pairwise haploid test statistics, for phased data only

simMat

If simFun is supplied, a similarity matrix between all individuals

analysisType

character indicating how the subjects were grouped in the analysis

pkweightsMean

numeric value for the whole dataset as a function of the observed allele frequencies
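To give a sense of what a similarity matrix such as simMat might look like, the sketch below applies cor and cov, the two functions named for simFun, directly to a toy genotype matrix. It assumes samples are in columns, so that both functions return a sample-by-sample matrix. How run_stego actually calls simFun internally is not documented here, and the object names are hypothetical.

```r
# Hypothetical toy data: 50 variants (rows) by 8 samples (columns), unphased coding.
set.seed(7)
geno <- matrix(rbinom(50 * 8, size = 2, prob = 0.3), nrow = 50,
               dimnames = list(NULL, paste("Sample", 1:8)))

# cor() and cov() operate column-wise, so each result is an 8 x 8 matrix
# with one row and one column per sample.
sim_cor <- cor(geno)
sim_cov <- cov(geno)

# Both matrices are symmetric, as expected for a pairwise similarity measure.
stopifnot(isSymmetric(sim_cor), isSymmetric(sim_cov))
```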

Author(s)

Dan Schlauch dschlauch@fas.harvard.edu

Examples

data(toyGenotypes)
sampleNames <- paste("Sample",1:100)

res <- run_stego(toyGenotypes, sampleNames=sampleNames)
plot(res, plotname="All Samples")

labels <- paste("Group",c(LETTERS[rep(1:5,20)]))
res <- run_stego(toyGenotypes, groups="each.separately", labels=labels)
plot(res)

labels <- paste("Group",c(LETTERS[rep(1:5,10)],LETTERS[rep(6:10,10)]))
super <- c(rep("Super A",50), rep("Super B",50))
res <- run_stego(toyGenotypes, groups="pairwise.within.superpop", labels=labels, super=super)
plotFromGSM(res)

dschlauch/WESTGO documentation built on May 15, 2019, 2:58 p.m.