Reads files containing single nucleotide variants (SNV) and structural genomic variants (SV): vcf.gz files generated by the speedseq aligner and variant caller. The function outputs PNG figures that show variants (blue dots) at their genomic coordinates (x axis). The ratio of alternative reads to depth (y axis) indicates the type of variant: homozygous alternative (expected ratio 1) or heterozygous (expected ratio 0.5). Green dots represent rare variants that pass the filters: coding/UTR, nonsynonymous variant with dbSNP frequency < 0.01 and ExAC frequency < 0.01. Orange vertical lines mark the position of the centromere. Orange dots depict structural and copy number variants that overlap a coding region and are of relatively good quality (QUAL > 0). The red curve shows the moving average of the alternative reads/depth ratio; high values of this curve (exceeding 0.75) can suggest potential homozygous/deleterious regions.
In addition, the function generates files containing tables of rare SNV and SV variants only. The tables include the variants that passed the filters specified above, with annotations (UniProt, RefSeq and others). The function analyzes a whole genome in about 30 minutes on a desktop computer.
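To make the ratio and the red curve concrete, here is a minimal sketch in base R. It is not the package's implementation: the read counts are made up, `moving_average` is a hypothetical helper, and the window is assumed to be counted in consecutive variants.

```r
# Hypothetical per-variant read counts (illustrative only, not package data).
alt_reads <- c(15, 14, 16, 30, 29, 30, 31, 15, 14, 16)
depth     <- c(30, 28, 32, 30, 30, 30, 31, 30, 28, 32)

# Ratio of alternative reads to depth:
# about 0.5 for heterozygous, about 1 for homozygous alternative variants.
ratio <- alt_reads / depth

# Centered moving average over `window` consecutive variants (assumption:
# the window is measured in variants, not in base pairs).
moving_average <- function(x, window) {
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 2))
}

ma <- moving_average(ratio, window = 3)

# Indices where the smoothed ratio exceeds 0.75, the threshold the
# description associates with potential homozygous regions.
high <- which(!is.na(ma) & ma > 0.75)
```

The first and last positions of `ma` are `NA` because a centered window of width 3 has no complete neighbourhood there. A larger window, such as the values recommended for `MA_Window`, smooths over more variants.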
chromosomeVis(sample, sv_sample, dbSNP_file, Exac_file, chromosomes, pngWidth, pngHeight, caller, MA_Window, coding_regions_file, annotation_file, uniprot_file)
sample | Name of the SNV sample file to be analyzed.
sv_sample | Name of an additional SV sample file. If not specified, structural variants are discarded.
dbSNP_file | dbSNP database file. If not specified, chromosome 19 dbSNP is used.
Exac_file | ExAC database file. If not specified, chromosome 19 ExAC is used.
chromosomes | A vector of strings indicating the chromosomes to be analyzed.
pngWidth | A number indicating the pixel width of the output PNG files. Default is 1600.
pngHeight | A number indicating the pixel height of the output PNG files. Default is 1200.
caller | A string indicating the VCF caller. Default is "speedseq"; "GATK" is also supported.
MA_Window | A number indicating the window size for the moving average function. The recommended value is 2000 for a genome and 20 for an exome. Default is 1000.
coding_regions_file | A BED file indicating coding regions.
annotation_file | Text file indicating the positions of the genes (from UCSC).
uniprot_file | Text file indicating gene functions and related diseases (from UniProt).
comp1 | The function plots a static visualization of genomic variants on all chromosomes, annotates and filters them, and reports the output variants in tables.
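The rare-variant filter behind the reported tables can be sketched as follows. This is only an illustration under stated assumptions: the data frame, its column names, and the `freq_below` helper are hypothetical. Treating a missing database frequency as "rare" is also an assumption, not documented package behaviour.

```r
# Hypothetical annotated variants; column names are illustrative only.
variants <- data.frame(
  position      = c(9001200, 9004510, 9010033, 9020771),
  coding        = c(TRUE, TRUE, FALSE, TRUE),
  nonsynonymous = c(TRUE, TRUE, TRUE, FALSE),
  dbsnp_freq    = c(0.002, 0.150, 0.001, 0.003),
  exac_freq     = c(0.004, 0.120, NA, 0.001)
)

# Assumption: a variant absent from a database (NA frequency) counts as rare.
freq_below <- function(freq, threshold = 0.01) {
  is.na(freq) | freq < threshold
}

# Keep coding, nonsynonymous variants that are rare in both databases.
rare <- subset(
  variants,
  coding & nonsynonymous & freq_below(dbsnp_freq) & freq_below(exac_freq)
)
```

Only variants that satisfy every condition remain in `rare`; in the function itself, the corresponding variants are the ones drawn as green dots and written to the output tables.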
Adam Gudys and Tomasz Stokowy
# analyze chromosome 19 from example genome
sample = system.file("extdata", "CoriellIndex_S1_chr19_9-10_S1.vcf.recode.vcf.gz",
                     package = "RareVariantVis")
sv_sample = system.file("extdata", "CoriellIndex_S1.sv.vcf.gz",
                        package = "RareVariantVis")
chromosomeVis(sample=sample, sv_sample=sv_sample, chromosomes=c("19"))

# without sv data
# sample = system.file("extdata", "CoriellIndex_S1_chr19_9-10_S1.vcf.recode.vcf.gz",
#                      package = "RareVariantVis")
# chromosomeVis(sample=sample, chromosomes=c("19"))

# analyze entire genome (use external full-genome dbSNP and ExAC)
# it takes approximately 30 mins on a desktop computer
# large example data and all necessary hg19 references can be downloaded from:
# https://github.com/agudys/DataRareVariantVis
# dbSNP_file = "All_20160601.vcf.gz"
# Exac_file = "ExAC.r0.3.1.sites.vep.vcf.gz"
# chromosomeVis(sample=sample, sv_sample=sv_sample,
#               dbSNP_file=dbSNP_file, Exac_file=Exac_file,
#               chromosomes=c(as.character(1:22), "X", "Y"), MA_Window = 2000,
#               coding_regions_file = "nexterarapidcapture_exome_targetedregions_v1.2.bed",
#               annotation_file = "UCSC_hg19_refSeq_160702.txt",
#               uniprot_file = "uniprot-all.txt")