View source: R/read_vcf_parallel.R
| read_vcf_parallel | R Documentation |
Read a VCF file in parallel across one or more threads.
If tilewidth is not specified, the size of each chunk will be
determined by total genome size divided by ntile.
By default, ntile is equal to the number of threads, nThread.
For further discussion on how this function was optimised, see here and here.
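The relationship between genome size, ntile and tilewidth described above can be illustrated with a small base R sketch. It is not the package's implementation: `tile_bounds()` and the toy genome size of 1000 are hypothetical names and values introduced only for illustration, and the sketch assumes simple contiguous, near-equal tiling.

```r
# Hypothetical helper (not part of MungeSumstats): split a genome of
# `genome_size` bases into contiguous tiles of near-equal width.
tile_bounds <- function(genome_size, ntile = 1L, tilewidth = NULL) {
  if (!is.null(tilewidth)) {
    # A requested width determines the number of tiles instead of `ntile`.
    ntile <- ceiling(genome_size / tilewidth)
  }
  # Evenly spaced end positions; the last tile always ends at genome_size.
  ends <- ceiling(seq_len(ntile) * genome_size / ntile)
  starts <- c(1, head(ends, -1) + 1)
  data.frame(start = starts, end = ends, width = ends - starts + 1)
}

# With no tilewidth, the chunk size is genome size divided by ntile,
# and ntile defaults to the number of threads.
nThread <- 4L
tiles <- tile_bounds(genome_size = 1000, ntile = nThread)
print(tiles)

# With a tilewidth, the effective width never exceeds the requested one.
tiles_w <- tile_bounds(genome_size = 1000, tilewidth = 300)
print(tiles_w)
```

Under these assumptions, raising nThread without setting ntile or tilewidth gives more, smaller chunks.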
read_vcf_parallel(
  path,
  samples = 1,
  which = NULL,
  use_params = TRUE,
  as_datatable = TRUE,
  sampled_rows = 10000L,
  include_xy = FALSE,
  download = TRUE,
  vcf_dir = tempdir(),
  download_method = "download.file",
  force_new = FALSE,
  tilewidth = NULL,
  mt_thresh = 100000L,
  nThread = 1,
  ntile = nThread,
  verbose = TRUE
)
| path | Path to local or remote VCF file. |
| samples | Which samples to use: |
| which | Genomic ranges to be added if supplied. Default is NULL. |
| use_params | When |
| as_datatable | Return the data as a data.table (default: |
| sampled_rows | First N rows to sample. Set |
| download | Download the VCF (and its index file) to a temp folder before reading it into R. This is important to keep |
| vcf_dir | Directory in which to save the original VCF downloaded from Open GWAS. WARNING: This is set to |
| download_method | |
| force_new | If a formatted file of the same names as |
| tilewidth | The desired tile width. The effective tile width might be slightly different but is guaranteed to never be more than the desired width. |
| mt_thresh | When the number of rows (variants) in the VCF is |
| nThread | Number of threads to use for parallel processes. |
| ntile | The number of tiles to generate. |
| verbose | Print messages. |
VCF file.
path <- "https://gwas.mrcieu.ac.uk/files/ieu-a-298/ieu-a-298.vcf.gz"
#### Single-threaded ####
vcf <- MungeSumstats:::read_vcf_parallel(path = path)
#### Parallel ####
vcf2 <- MungeSumstats:::read_vcf_parallel(path = path, nThread = 11)
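The thread count in the parallel example is machine-specific. One possible way to choose nThread from the available hardware is sketched below, using `parallel::detectCores()` from base R. Leaving one core free is a common convention assumed here, not a recommendation from the package.

```r
# The parallel package ships with base R.
n_cores <- parallel::detectCores()

# detectCores() can return NA on some platforms; fall back to one thread.
if (is.na(n_cores)) n_cores <- 1L

# Assumption: leave one core free for the rest of the system,
# but never request fewer than one thread.
nThread <- max(1L, n_cores - 1L)
print(nThread)

# The value could then be passed on (not run here, as it needs network access):
# vcf2 <- MungeSumstats:::read_vcf_parallel(path = path, nThread = nThread)
```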