extract_f2_large | R Documentation |
extract_f2_large does the same as extract_f2, but it requires less memory and is slower. outdir has to be set in extract_f2_large.
extract_f2_large(
pref,
outdir,
inds = NULL,
pops = NULL,
blgsize = 0.05,
cols_per_chunk = 10,
maxmiss = 0,
minmaf = 0,
maxmaf = 0.5,
minac2 = FALSE,
outpop = NULL,
outpop_scale = TRUE,
transitions = TRUE,
transversions = TRUE,
keepsnps = NULL,
snpblocks = NULL,
overwrite = FALSE,
format = NULL,
adjust_pseudohaploid = TRUE,
afprod = TRUE,
fst = TRUE,
poly_only = c("f2"),
apply_corr = TRUE,
verbose = TRUE
)
pref |
Prefix of PLINK/EIGENSTRAT/PACKEDANCESTRYMAP files.
EIGENSTRAT/PACKEDANCESTRYMAP have to end in |
outdir |
Directory where data will be stored. |
inds |
Individuals for which data should be extracted |
pops |
Populations for which data should be extracted. If both |
blgsize |
SNP block size in Morgan. Default is 0.05 (5 cM). If |
cols_per_chunk |
Number of populations per chunk. Lowering this number will lower the memory requirements when running |
maxmiss |
Discard SNPs which are missing in a fraction of populations higher than |
minmaf |
Discard SNPs with minor allele frequency less than |
maxmaf |
Discard SNPs with minor allele frequency greater than |
minac2 |
Discard SNPs with allele count lower than 2 in any population (default |
outpop |
Keep only SNPs which are heterozygous in this population |
outpop_scale |
Scale f2-statistics by the inverse |
transitions |
Set this to |
transversions |
Set this to |
keepsnps |
SNP IDs of SNPs to keep. Overrides other SNP filtering options |
overwrite |
Overwrite existing files in |
format |
Supply this if the prefix can refer to genotype data in different formats
and you want to choose which one to read. Should be |
adjust_pseudohaploid |
Genotypes of pseudohaploid samples are usually coded as |
afprod |
Write files with allele frequency products for every population pair. Setting this to FALSE can make |
fst |
Write files with pairwise FST for every population pair. Setting this to FALSE can make |
poly_only |
Specify whether SNPs with identical allele frequencies in every population should be discarded ( |
apply_corr |
Apply small-sample-size correction when computing f2-statistics (default |
verbose |
Print progress updates |
extract_f2_large requires less memory because it writes allele frequency data to disk, and doesn't store the allele frequency matrix for all populations and SNPs in memory. If you still run out of memory, reduce cols_per_chunk. This function is a wrapper around extract_afs and afs_to_f2, and is slower than extract_f2. It may be faster to call extract_afs and afs_to_f2 directly, parallelizing over the different calls to afs_to_f2.
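To get a feel for how much work there is to parallelize, the sketch below counts the units of work for a given number of populations and cols_per_chunk. It is a base R illustration only: chunk_pairs is a hypothetical helper, not part of the package, and it assumes that populations are split into chunks of cols_per_chunk and that each unordered pair of chunks (including a chunk paired with itself) corresponds to one independent afs_to_f2 call.

```r
# Hypothetical helper, not part of the package.
# Assumption: one afs_to_f2 call per unordered pair of population chunks.
chunk_pairs <- function(npops, cols_per_chunk = 10) {
  # Number of chunks needed to hold all populations
  numchunks <- ceiling(npops / cols_per_chunk)
  # All ordered pairs, then keep chunk1 <= chunk2 to avoid duplicates
  pairs <- expand.grid(chunk1 = seq_len(numchunks),
                       chunk2 = seq_len(numchunks))
  pairs <- pairs[pairs$chunk1 <= pairs$chunk2, ]
  rownames(pairs) <- NULL
  pairs
}

# 35 populations in chunks of 10 give 4 chunks
pairs <- chunk_pairs(npops = 35, cols_per_chunk = 10)
print(pairs)
print(nrow(pairs))
```

Under these assumptions, lowering cols_per_chunk reduces the memory needed per call but increases the number of chunk pairs, so there are more (smaller) jobs to distribute across workers.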
SNP metadata (invisibly)
extract_f2
## Not run:
pref = 'my/genofiles/prefix'
f2dir = 'my/f2dir/'
extract_f2_large(pref, f2dir, pops = c('popA', 'popB', 'popC'))
## End(Not run)