View source: R/read_vcf_parallel.R
read_vcf_parallel | R Documentation |
Read a VCF file across 1 or more threads in parallel.
If tilewidth is not specified, the size of each chunk will be determined by total genome size divided by ntile.
By default, ntile is equal to the number of threads, nThread.
For further discussion on how this function was optimised, see here and here.
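The chunk-sizing rule described above can be illustrated with a small base R sketch. The helper chunk_width() and the genome length used here are hypothetical and are not part of MungeSumstats; the sketch only mirrors the stated rule (total genome size divided by ntile, with ntile defaulting to nThread) and may differ from how the package computes tiles internally.

```r
# Hypothetical helper (NOT part of MungeSumstats): mirrors the documented
# rule that chunk size = total genome size / ntile, with ntile = nThread
# by default.
chunk_width <- function(genome_size, nThread = 1, ntile = nThread) {
  stopifnot(genome_size > 0, ntile >= 1)
  # Round up so that ntile chunks always cover the whole genome.
  ceiling(genome_size / ntile)
}

# Assumed, approximate genome length in base pairs (illustrative only).
genome_size <- 3.1e9

chunk_width(genome_size)                          # single thread: one chunk
chunk_width(genome_size, nThread = 4)             # ntile defaults to nThread
chunk_width(genome_size, nThread = 4, ntile = 8)  # more tiles than threads
```

Setting ntile higher than nThread gives smaller chunks per thread, which may help when chunks vary a lot in how many variants they contain.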
read_vcf_parallel(
path,
samples = 1,
which = NULL,
use_params = TRUE,
as_datatable = TRUE,
sampled_rows = 10000L,
include_xy = FALSE,
download = TRUE,
vcf_dir = tempdir(),
download_method = "download.file",
force_new = FALSE,
tilewidth = NULL,
mt_thresh = 100000L,
nThread = 1,
ntile = nThread,
verbose = TRUE
)
path |
Path to local or remote VCF file. |
samples |
Which samples to use:
|
which |
Genomic ranges to be added if supplied. Default is NULL. |
use_params |
When |
as_datatable |
Return the data as a
data.table (default: |
sampled_rows |
First N rows to sample.
Set |
download |
Download the VCF (and its index file)
to a temp folder before reading it into R.
This is important to keep |
vcf_dir |
Directory in which to store the original VCF downloaded from Open GWAS.
WARNING: This is set to |
download_method |
|
force_new |
If a formatted file of the same names as |
tilewidth |
The desired tile width. The effective tile width might be slightly different but is guaranteed to never be more than the desired width. |
mt_thresh |
When the number of rows (variants) in the VCF is
|
nThread |
Number of threads to use for parallel processes. |
ntile |
The number of tiles to generate. |
verbose |
Print messages. |
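The tilewidth guarantee (the effective width is never more than the desired width) can be illustrated with a short sketch. The helper effective_tilewidth() is hypothetical and assumes a single sequence with an integer tilewidth; it shows one way such a guarantee can hold, not necessarily the method the package uses.

```r
# Hypothetical helper (NOT part of MungeSumstats): one way to pick an
# effective tile width that never exceeds the desired width.
effective_tilewidth <- function(seq_length, tilewidth) {
  stopifnot(seq_length > 0, tilewidth >= 1)
  # Use just enough tiles that none needs to be wider than `tilewidth` ...
  n_tiles <- ceiling(seq_length / tilewidth)
  # ... then spread the sequence evenly across that many tiles.
  ceiling(seq_length / n_tiles)
}

# Desired width does not divide the length evenly: effective width shrinks.
effective_tilewidth(seq_length = 1000, tilewidth = 300)
# Desired width divides the length evenly: effective width is unchanged.
effective_tilewidth(seq_length = 1000, tilewidth = 250)
```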
VCF file.
path <- "https://gwas.mrcieu.ac.uk/files/ieu-a-298/ieu-a-298.vcf.gz"
#### Single-threaded ####
vcf <- MungeSumstats:::read_vcf_parallel(path = path)
#### Parallel ####
vcf2 <- MungeSumstats:::read_vcf_parallel(path = path, nThread = 11)