OneDimSFS | R Documentation |
Calculates a one-dimensional site frequency spectrum (SFS) from a VCF file. The file is processed in batches to keep memory usage low. Either a folded or an unfolded spectrum can be returned.
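To make the folded/unfolded distinction concrete, the following is a minimal sketch on a toy genotype matrix. It does not use GenoPop and is not the package's implementation; the matrix `geno`, the diploid assumption, and the choice to keep the monomorphic classes (0 and all copies) are assumptions made for illustration only.

```r
# Toy genotype matrix (hypothetical data): rows are variants, columns are
# diploid individuals, entries are the number of ALT allele copies (0, 1 or 2).
geno <- matrix(
  c(0, 1, 0, 0,
    1, 1, 0, 2,
    2, 2, 1, 2,
    0, 0, 1, 0,
    2, 1, 2, 2),
  ncol = 4, byrow = TRUE
)

n_chrom <- 2 * ncol(geno)        # number of sampled chromosomes
alt_counts <- rowSums(geno)      # ALT allele count per variant

# Unfolded SFS: number of variants with i ALT copies, for i = 0..n_chrom.
# This treats ALT as the derived allele.
unfolded <- tabulate(alt_counts + 1, nbins = n_chrom + 1)
names(unfolded) <- 0:n_chrom

# Folded SFS: use the minor allele count, so i runs from 0 to n_chrom / 2.
minor_counts <- pmin(alt_counts, n_chrom - alt_counts)
folded <- tabulate(minor_counts + 1, nbins = n_chrom %/% 2 + 1)
names(folded) <- 0:(n_chrom %/% 2)

print(unfolded)
print(folded)
```

Folding is the usual choice when the ancestral allele is unknown, because it only relies on which allele is rarer in the sample.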
OneDimSFS(
vcf_path,
folded = FALSE,
batch_size = 10000,
threads = 1,
write_log = FALSE,
logfile = "log.txt",
exclude_ind = NULL
)
vcf_path |
Path to the VCF file. |
folded |
Logical, deciding if folded (TRUE) or unfolded (FALSE) SFS is returned. |
batch_size |
The number of variants to be processed in each batch (default of 10,000 should be suitable for most use cases). |
threads |
Number of threads to use for parallel processing. |
write_log |
Logical, indicating whether to write progress logs. |
logfile |
Path to the log file where progress will be logged. |
exclude_ind |
Optional vector of individual IDs to exclude from the analysis. If provided, these individuals are removed from the genotype matrix before the SFS is calculated. Default is NULL, meaning no individuals are excluded. |
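The effect of `exclude_ind` can be pictured as dropping columns from a genotype matrix by sample ID. The sketch below is an illustration in base R under that assumption; the sample names and the matrix are hypothetical, and the package's internal handling may differ.

```r
# Hypothetical genotype matrix with one named column per individual.
geno <- matrix(
  c(0, 1, 2,
    1, 1, 0,
    2, 0, 0,
    0, 0, 1),
  nrow = 3,
  dimnames = list(NULL, c("ind1", "ind2", "ind3", "ind4"))
)

# IDs to drop, in the same form as the exclude_ind argument.
exclude_ind <- c("ind2", "ind4")

# Keep only the columns whose ID is not listed; drop = FALSE keeps the
# result a matrix even if a single individual remains.
keep <- !(colnames(geno) %in% exclude_ind)
geno_kept <- geno[, keep, drop = FALSE]

print(colnames(geno_kept))
print(rowSums(geno_kept))   # per-variant allele counts after exclusion
```

Note that removing individuals also reduces the number of sampled chromosomes, so the resulting spectrum has fewer frequency classes.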
The site frequency spectrum, returned as a named vector.
vcf_file <- system.file("tests/testthat/sim.vcf.gz", package = "GenoPop")
index_file <- system.file("tests/testthat/sim.vcf.gz.tbi", package = "GenoPop")
sfs <- OneDimSFS(vcf_file, folded = FALSE)