read_vcf | R Documentation |
Read a VCF file into R as either a VCF object or a data.table. The VCF/data.table can optionally be saved as well.
read_vcf(
  path,
  as_datatable = TRUE,
  save_path = NULL,
  tabix_index = FALSE,
  samples = 1,
  which = NULL,
  use_params = TRUE,
  sampled_rows = 10000L,
  download = TRUE,
  vcf_dir = tempdir(),
  download_method = "download.file",
  force_new = FALSE,
  mt_thresh = 100000L,
  nThread = 1,
  verbose = TRUE
)
path |
Path to local or remote VCF file. |
as_datatable |
Return the data as a
data.table (default: |
save_path |
File path to save formatted data. Defaults to
|
tabix_index |
Index the formatted summary statistics with tabix for fast querying. |
samples |
Which samples to use:
|
which |
Genomic ranges to be added if supplied. Default is NULL. |
use_params |
When |
sampled_rows |
First N rows to sample.
Set |
download |
Download the VCF (and its index file)
to a temp folder before reading it into R.
This is important to keep |
vcf_dir |
Where to download the original VCF from Open GWAS.
WARNING: This is set to |
download_method |
|
force_new |
If a formatted file of the same name as |
mt_thresh |
When the number of rows (variants) in the VCF is
|
nThread |
Number of threads to use for parallel processes. |
verbose |
Print messages. |
The VCF file in data.table format.
#### Benchmarking ####
library(VCFWrenchR)
library(VariantAnnotation)
path <- "https://gwas.mrcieu.ac.uk/files/ubm-a-2929/ubm-a-2929.vcf.gz"
vcf <- VariantAnnotation::readVcf(file = path)
## Benchmark on the first N variants only
N <- 1e5
vcf_sub <- vcf[seq_len(N), ]
## Time three ways of converting the VCF object to a table (one run each)
res <- microbenchmark::microbenchmark(
  "vcf2df" = {dat1 <- MungeSumstats:::vcf2df(vcf = vcf_sub)},
  "VCFWrenchR" = {dat2 <- as.data.frame(x = vcf_sub)},
  "VRanges" = {dat3 <- data.table::as.data.table(
    methods::as(vcf_sub, "VRanges"))},
  times = 1
)
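The benchmarking snippet above stores its timings in res but does not display them. A minimal sketch of how such a result might be inspected follows. It uses a toy benchmark (res_toy, a hypothetical stand-in) so that it runs without downloading the 250Mb VCF; the same summary() call should apply to res.

```r
#### Inspecting a microbenchmark result (sketch) ####
## Assumption: the microbenchmark package is installed.
## res_toy is a hypothetical stand-in for `res` from the benchmark above.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  res_toy <- microbenchmark::microbenchmark(
    "sort" = sort(runif(1e4)),
    "order" = order(runif(1e4)),
    times = 5
  )
  ## summary() returns one row per benchmarked expression
  timing <- summary(res_toy, unit = "ms")
  print(timing[, c("expr", "median", "neval")])
}
```

With times=1, as in the benchmark above, each expression is evaluated only once, so the reported timings are a rough indication rather than a stable estimate.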
Discussion on VariantAnnotation GitHub
#### Local file ####
path <- system.file("extdata","ALSvcf.vcf", package="MungeSumstats")
sumstats_dt <- read_vcf(path = path)
#### Remote file ####
## Small GWAS (0.2Mb)
# path <- "https://gwas.mrcieu.ac.uk/files/ieu-a-298/ieu-a-298.vcf.gz"
# sumstats_dt2 <- read_vcf(path = path)
## Large GWAS (250Mb)
# path <- "https://gwas.mrcieu.ac.uk/files/ubm-a-2929/ubm-a-2929.vcf.gz"
# sumstats_dt3 <- read_vcf(path = path, nThread=11)
## Very large GWAS (500Mb)
# path <- "https://gwas.mrcieu.ac.uk/files/ieu-a-1124/ieu-a-1124.vcf.gz"
# sumstats_dt4 <- read_vcf(path = path, nThread=11)
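As a follow-up to the local-file example, the sketch below shows one way the returned object might be inspected, and how the as_datatable argument could be used to request a VCF object instead. It assumes that MungeSumstats and data.table are installed. That as_datatable = FALSE returns a VCF object is inferred from the description above rather than confirmed here, so treat that call as illustrative.

```r
#### Inspecting the result (sketch) ####
## Guarded so the snippet is skipped when MungeSumstats is not installed.
if (requireNamespace("MungeSumstats", quietly = TRUE)) {
  path <- system.file("extdata", "ALSvcf.vcf", package = "MungeSumstats")

  ## Default: as_datatable = TRUE, so a data.table is expected
  sumstats_dt <- MungeSumstats::read_vcf(path = path, verbose = FALSE)
  print(data.table::is.data.table(sumstats_dt))
  print(dim(sumstats_dt))   # number of variants (rows) and columns
  print(head(sumstats_dt))  # first few variants

  ## Assumption: as_datatable = FALSE returns the data as a VCF object
  vcf_obj <- MungeSumstats::read_vcf(path = path,
                                     as_datatable = FALSE,
                                     verbose = FALSE)
  print(class(vcf_obj))
}
```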