extract_afs_simple: Compute and store blocked allele frequency data
In uqrmaie1/admixtools: Inferring demographic history from genetic data

extract_afs_simple

R Documentation

Compute and store blocked allele frequency data

Description

Prepares data for various ADMIXTOOLS 2 functions. Reads data from PACKEDANCESTRYMAP or PLINK files, computes allele frequencies for selected populations, and stores them as .rds files in outdir.

Usage

extract_afs_simple(
  pref,
  outdir,
  inds = NULL,
  pops = NULL,
  blgsize = 0.05,
  cols_per_chunk = 10,
  maxmiss = 0,
  minmaf = 0,
  maxmaf = 0.5,
  minac2 = FALSE,
  outpop = NULL,
  transitions = TRUE,
  transversions = TRUE,
  keepsnps = NULL,
  format = NULL,
  poly_only = FALSE,
  adjust_pseudohaploid = TRUE,
  verbose = TRUE
)

Arguments

`pref`	Prefix of PLINK/EIGENSTRAT/PACKEDANCESTRYMAP files. EIGENSTRAT/PACKEDANCESTRYMAP files have to end in `.geno`, `.snp`, `.ind`; PLINK files have to end in `.bed`, `.bim`, `.fam`.
`outdir`	Directory where data will be stored.
`inds`	Individuals for which data should be extracted
`pops`	Populations for which data should be extracted. If both `pops` and `inds` are provided, they should have the same length and will be matched by position. If only `pops` is provided, all individuals from the `.ind` or `.fam` file in those populations will be extracted. If only `inds` is provided, each individual will be assigned to its own population of the same name. If neither `pops` nor `inds` is provided, all individuals and populations in the `.ind` or `.fam` file will be extracted.
`blgsize`	SNP block size in Morgan. Default is 0.05 (5 cM). If `blgsize` is 100 or greater, it will be interpreted as a base pair distance rather than a genetic distance.
`cols_per_chunk`	Number of populations per chunk. Lowering this number will lower the memory requirements when running `afs_to_f2`, but more chunk pairs will have to be computed.
`maxmiss`	Discard SNPs which are missing in a fraction of populations higher than `maxmiss`
`minmaf`	Discard SNPs with minor allele frequency less than `minmaf`
`maxmaf`	Discard SNPs with minor allele frequency greater than `maxmaf`
`minac2`	Discard SNPs with allele count lower than 2 in any population (default `FALSE`). This option should be set to `TRUE` when computing f3-statistics where one population consists mostly of pseudohaploid samples. Otherwise heterozygosity estimates and thus f3-estimates can be biased. `minac2 == 2` will discard SNPs with allele count lower than 2 in any non-singleton population (this option is experimental and is based on the hypothesis that using SNPs with allele count lower than 2 only leads to biases in non-singleton populations). While the `minac2` option discards SNPs with allele count lower than 2 in any population, the `qp3pop` function will only discard SNPs with allele count lower than 2 in the first (target) population (when the first argument is the prefix of a genotype file).
`outpop`	Keep only SNPs which are heterozygous in this population
`transitions`	Set this to `FALSE` to exclude transition SNPs
`transversions`	Set this to `FALSE` to exclude transversion SNPs
`keepsnps`	SNP IDs of SNPs to keep. Overrides other SNP filtering options
`format`	Supply this if the prefix can refer to genotype data in different formats and you want to choose which one to read. Should be `plink` to read `.bed`, `.bim`, `.fam` files, or `eigenstrat` or `packedancestrymap` to read `.geno`, `.snp`, `.ind` files.
`poly_only`	Specify whether SNPs with identical allele frequencies in every population should be discarded (`poly_only = TRUE`), or whether they should be used (`poly_only = FALSE`). By default (`poly_only = c("f2")`), these SNPs will be used to compute FST and allele frequency products, but not to compute f2 (this is the default option in the original ADMIXTOOLS).
`adjust_pseudohaploid`	Genotypes of pseudohaploid samples are usually coded as `0` or `2`, even though only one allele is observed. `adjust_pseudohaploid` ensures that the observed allele count increases only by `1` for each pseudohaploid sample. If `TRUE` (default), samples that don't have any genotypes coded as `1` among the first 1000 SNPs are automatically identified as pseudohaploid. This leads to slightly more accurate estimates of f-statistics. Setting this parameter to `FALSE` treats all samples as diploid and is equivalent to the ADMIXTOOLS `inbreed: NO` option. Setting `adjust_pseudohaploid` to an integer `n` will check the first `n` SNPs instead of the first 1000 SNPs.
`verbose`	Print progress updates
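
The `adjust_pseudohaploid` behaviour described above can be illustrated with a small base R sketch. This is not the package's implementation. The helper names `detect_pseudohaploid` and `allele_freq` and the toy genotype matrix are hypothetical, and the sketch assumes genotypes are coded as counts of one allele (0, 1, 2, with `NA` for missing) for the samples of a single population.

```r
# Hypothetical helpers, not part of admixtools.
# geno: SNPs x samples matrix of genotypes coded 0, 1, 2 (NA = missing).

# Flag samples with no heterozygous call (1) among the first n_check SNPs.
detect_pseudohaploid <- function(geno, n_check = 1000) {
  rows <- seq_len(min(n_check, nrow(geno)))
  colSums(geno[rows, , drop = FALSE] == 1, na.rm = TRUE) == 0
}

# Allele frequency per SNP, counting one allele per pseudohaploid sample
# and two alleles per diploid sample.
allele_freq <- function(geno, pseudohaploid) {
  ploidy <- ifelse(pseudohaploid, 1, 2)
  # Pseudohaploid genotypes 0/2 become allele counts 0/1.
  ac <- sweep(geno, 2, ploidy / 2, `*`)
  # Number of observed alleles per SNP (missing genotypes contribute 0).
  an <- as.vector((!is.na(geno) * 1) %*% ploidy)
  rowSums(ac, na.rm = TRUE) / an
}

# Toy data: s1 and s3 have no genotype coded 1, s2 does.
geno <- cbind(s1 = c(0, 2, 2, 0),
              s2 = c(0, 1, 2, 2),
              s3 = c(2, NA, 0, 2))

ph <- detect_pseudohaploid(geno)
af <- allele_freq(geno, ph)
```

Treating every sample as diploid (the equivalent of `adjust_pseudohaploid = FALSE`) would leave the frequency estimates of complete SNPs in this toy example different, because each pseudohaploid sample would contribute two alleles instead of one.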

Value

SNP metadata (invisibly)

Examples

## Not run: 
pref = 'my/genofiles/prefix'
outdir = 'dir/for/afdata/'
extract_afs_simple(pref, outdir)

## End(Not run)
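
To make the role of `blgsize` more concrete, the following base R sketch assigns SNPs to blocks by genetic distance. It is an illustration under assumptions, not the package's code: the function `assign_blocks` is hypothetical, positions are assumed to be sorted within each chromosome and given in Morgan, and a new block is assumed to start at a chromosome change or once the distance from the start of the current block reaches `blgsize`.

```r
# Hypothetical helper, not part of admixtools.
# chr: chromosome of each SNP; pos: genetic position in Morgan (sorted per chromosome).
assign_blocks <- function(chr, pos, blgsize = 0.05) {
  stopifnot(length(chr) == length(pos))
  block <- integer(length(chr))
  current <- 0L
  start <- NA_real_
  for (i in seq_along(chr)) {
    new_chr <- i == 1L || chr[i] != chr[i - 1L]
    # Open a new block at a chromosome change or when the block is full.
    if (new_chr || pos[i] - start >= blgsize) {
      current <- current + 1L
      start <- pos[i]
    }
    block[i] <- current
  }
  block
}

# Toy data: six SNPs on two chromosomes.
chr <- c(1, 1, 1, 1, 2, 2)
pos <- c(0.00, 0.02, 0.06, 0.07, 0.01, 0.03)
blocks <- assign_blocks(chr, pos)
print(blocks)
```

With the default block size of 0.05 Morgan, the first two SNPs fall into one block, the next two into a second block, and the SNPs on the second chromosome into a third.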

uqrmaie1/admixtools documentation built on July 16, 2025, 4:01 p.m.
