f4blockdat_from_geno: f4 from genotype data

f4blockdat_from_geno R Documentation

f4 from genotype data

Description

Compute per-block f4-statistics directly from genotype data.

Usage

f4blockdat_from_geno(
  pref,
  popcombs = NULL,
  left = NULL,
  right = NULL,
  auto_only = TRUE,
  blgsize = 0.05,
  block_lengths = NULL,
  f4mode = TRUE,
  allsnps = FALSE,
  poly_only = FALSE,
  snpwt = NULL,
  keepsnps = NULL,
  verbose = TRUE
)

Arguments

pref

Prefix of genotype files

popcombs

A data frame with one population combination per row, and columns pop1, pop2, pop3, pop4. If there is an additional integer column named model and allsnps = FALSE, only SNPs present in every population in any given model will be used to compute f4-statistics for that model.
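As an illustration of the popcombs layout, the following base R sketch builds a data frame with the optional model column and lists the populations that belong to each model. The population labels (PopA to PopE) are made-up placeholders, and the grouping step only shows which populations a model spans; it is not the package's internal SNP selection code.

```r
# Hypothetical population labels; replace them with labels from your own data.
popcombs <- data.frame(
  pop1  = c("PopA", "PopA", "PopA"),
  pop2  = c("PopB", "PopB", "PopC"),
  pop3  = c("PopC", "PopD", "PopD"),
  pop4  = c("PopD", "PopE", "PopE"),
  model = c(1L, 1L, 2L),            # optional integer column grouping rows into models
  stringsAsFactors = FALSE
)

# Populations spanned by each model. With allsnps = FALSE, the documentation
# states that only SNPs present in all of these populations are used for
# the f4-statistics of that model.
pop_cols <- c("pop1", "pop2", "pop3", "pop4")
pops_by_model <- lapply(
  split(popcombs[, pop_cols], popcombs$model),
  function(rows) sort(unique(unlist(rows)))
)

print(pops_by_model)
```

Here model 1 spans five populations and model 2 spans four, so the two models may end up using different SNP sets when data are missing.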

left

Populations on the left side of f4 (pop1 and pop2). Can be provided together with right in place of popcombs.

right

Populations on the right side of f4 (pop3 and pop4). Can be provided together with left in place of popcombs.

auto_only

Use only chromosomes 1 to 22.

blgsize

SNP block size in Morgan. Default is 0.05 (5 cM). If blgsize is 100 or greater, it will be interpreted as a base-pair distance rather than a genetic distance.
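The unit rule for blgsize can be written down as a tiny helper. The function name blgsize_unit is hypothetical and not part of the package; it only restates the threshold described above.

```r
# Hypothetical helper: report how a blgsize value would be interpreted,
# following the rule stated in the argument description.
blgsize_unit <- function(blgsize) {
  stopifnot(is.numeric(blgsize), length(blgsize) == 1, blgsize > 0)
  if (blgsize >= 100) {
    "base pairs"   # values of 100 or more are read as physical distance
  } else {
    "Morgan"       # smaller values are read as genetic distance
  }
}

blgsize_unit(0.05)   # the default: 0.05 Morgan, i.e. 5 cM
blgsize_unit(2e6)    # a 2 Mb physical block
```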

block_lengths

An optional vector with block lengths. If NULL, block lengths will be computed.

f4mode

If TRUE, f4 is computed from allele frequencies a, b, c, and d as (a-b)*(c-d). If FALSE, D-statistics are computed instead, defined as (a-b)*(c-d) / ((a + b - 2*a*b) * (c + d - 2*c*d)), which is the same as (P(ABBA) - P(BABA)) / (P(ABBA) + P(BABA)).
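The two formulas can be evaluated per SNP in a few lines of base R. This is a sketch with made-up allele frequencies; the function names are hypothetical, and how the package combines SNPs within a block is not shown here.

```r
# Per-SNP f4 from allele frequencies in pop1..pop4 (f4mode = TRUE formula).
f4_from_freqs <- function(a, b, c, d) {
  (a - b) * (c - d)
}

# Per-SNP D from the same frequencies (f4mode = FALSE formula).
# The result is NaN when the denominator is zero, e.g. when a and b
# are both fixed for the same allele.
d_from_freqs <- function(a, b, c, d) {
  (a - b) * (c - d) / ((a + b - 2 * a * b) * (c + d - 2 * c * d))
}

# Two illustrative SNPs (rows) with frequencies in four populations (columns).
freqs <- data.frame(
  a = c(0.8, 0.5),
  b = c(0.2, 0.5),
  c = c(0.6, 0.9),
  d = c(0.4, 0.1)
)

f4_vals <- with(freqs, f4_from_freqs(a, b, c, d))
d_vals  <- with(freqs, d_from_freqs(a, b, c, d))

print(data.frame(freqs, f4 = f4_vals, D = d_vals))
```

The second SNP has a equal to b, so both statistics are zero for it regardless of c and d.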

allsnps

Use all SNPs with allele frequency estimates in every population of any given population quadruple. If FALSE (the default), only SNPs that are present in all populations in popcombs (or in any given model in it) will be used. Setting allsnps = TRUE in the presence of large amounts of missing data might lead to false positive results.

poly_only

Only keep SNPs with mean allele frequency not equal to 0 or 1.
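A filter of this kind might look as follows in base R. The sketch assumes that the mean is taken across populations and that missing frequencies are ignored; both are assumptions made for illustration, and the frequency matrix is made up.

```r
# Illustrative allele frequency matrix: SNPs in rows, populations in columns.
afs <- matrix(
  c(0.0, 0.0, 0.0,    # monomorphic, fixed at 0
    0.2, 0.5, 0.1,    # polymorphic
    1.0, 1.0, 1.0,    # monomorphic, fixed at 1
    0.0, 1.0, 0.5),   # polymorphic across populations
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("snp", 1:4), c("PopA", "PopB", "PopC"))
)

# Assumption: average across populations, skipping missing values.
mean_af <- rowMeans(afs, na.rm = TRUE)

# Keep SNPs whose mean allele frequency is neither 0 nor 1.
keep <- mean_af > 0 & mean_af < 1
afs_poly <- afs[keep, , drop = FALSE]

print(rownames(afs_poly))
```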

snpwt

A vector of SNP weights

keepsnps

A vector of SNP IDs to keep

verbose

Print progress updates

Value

A data frame with per-block f4-statistics for each population quadruple.
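A call might look like the sketch below. The file prefix and population labels are placeholders, and the call is guarded so that it only runs when the admixtools package is installed and files matching the prefix exist. The columns of the returned data frame are not listed here, so the example only inspects the result with str().

```r
# Placeholder inputs: point pref at your own genotype files and
# use population labels that occur in them.
pref <- "path/to/genotype_prefix"
popcombs <- data.frame(
  pop1 = c("PopA", "PopA"),
  pop2 = c("PopB", "PopC"),
  pop3 = c("PopC", "PopB"),
  pop4 = c("PopD", "PopD"),
  stringsAsFactors = FALSE
)

# Guard: run only if the package and the input files are available.
have_pkg  <- requireNamespace("admixtools", quietly = TRUE)
have_data <- length(Sys.glob(paste0(pref, ".*"))) > 0

if (have_pkg && have_data) {
  f4blocks <- admixtools::f4blockdat_from_geno(
    pref,
    popcombs = popcombs,
    blgsize  = 0.05,    # 5 cM blocks (the default)
    f4mode   = TRUE,    # f4 rather than D
    verbose  = FALSE
  )
  str(f4blocks)         # per-block f4-statistics for each population quadruple
}
```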


uqrmaie1/admixtools documentation built on March 20, 2024, 8:24 a.m.