WhopGenome provides fast read access to Variant Call Format (VCF) files through C functions, with many specialised output formats and a configurable filtering engine. It can index FASTA files and any file format using tab-separated columns, such as GFF, VCF and METAL, in preparation for high-speed access, and it can quickly read specified subsections of indexed FASTA files. It also provides easy-to-use methods to access the UCSC Genome Browser SQL servers, the AmiGO gene ontology databases, PLINK .PED files and Bioconductor's organism annotation databases.
Package: WhopGenome
Type: Package
Version: 1.0
Date: 2013-01-24
License: GPL-2
- Open a VCF file with handle <- vcf_open("filename")
- Set a region of interest (chromosome/contig ID, start position, end position) with vcf_setregion(handle, "X", 200000, 300000)
- Select samples of interest (in this case the first 10): vcf_selectsamples(handle, vcf_getSampleNames(handle)[1:10])
- Read from the file via resvec <- vcf_readLineVec(handle)
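The four steps above can be combined into one small helper. The following is a minimal sketch, not code from the package: read_first_variant() and its arguments are hypothetical names chosen for illustration, the check on the value returned by vcf_open() is a guess at how a failed open is signalled, and vcf_close() is assumed to release the handle. The VCF file is also assumed to be bgzip-compressed and tabix-indexed, since region queries rely on Tabix.

```r
# Hypothetical helper (not part of WhopGenome): open a VCF file, restrict it
# to a region and to the first n_samples samples, and read one line.
read_first_variant <- function(path, contig, from, to, n_samples = 10) {
  if (!file.exists(path)) {
    stop("VCF file not found: ", path)
  }
  if (!requireNamespace("WhopGenome", quietly = TRUE)) {
    stop("the WhopGenome package is not installed")
  }

  handle <- WhopGenome::vcf_open(path)
  # Assumption: a failed open is reported as NULL or FALSE.
  if (is.null(handle) || identical(handle, FALSE)) {
    stop("could not open VCF file: ", path)
  }
  # Assumption: vcf_close() releases the handle; run it even on error.
  on.exit(WhopGenome::vcf_close(handle), add = TRUE)

  # Region of interest: contig ID, start position, end position.
  WhopGenome::vcf_setregion(handle, contig, from, to)

  # Keep only the first n_samples samples (fewer if the file has fewer).
  sample_names <- WhopGenome::vcf_getSampleNames(handle)
  WhopGenome::vcf_selectsamples(handle, head(sample_names, n_samples))

  # Read one line from the selected region as a vector.
  WhopGenome::vcf_readLineVec(handle)
}
```

A call such as resvec <- read_first_variant("filename", "X", 200000, 300000) would then mirror the steps listed above. Using head() instead of [1:10] avoids NA sample names when the file holds fewer than ten samples.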
Ulrich Wittelsbuerger ulrich.wittelsbuerger@uni-duesseldorf.de
The 1000 Genomes Project http://1000genomes.org/
The 1000 Genomes Project Consortium (2010), A map of human genome variation from population-scale sequencing. Nature 467, 1061-1073.
Heng Li (2011), Tabix: Fast retrieval of sequence features from generic TAB-delimited files, Bioinformatics, doi: 10.1093/bioinformatics/btq671
The Variant Call Format http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
#vcfh <- .Call("VCF_open","/data/vcf/1000g/ALL.Chromosome1.consensus.vcf.gz",PACKAGE="WhopGenome")