oldRuntimeOptions | R Documentation |
Runtime options for the most current API version of the Ensembl Variant Effect Predictor.
VEPParam objects store the runtime options for querying the Ensembl
Variant Effect Predictor (VEP). This page describes only the most current
runtime options and is a condensed version of what is listed on the
Ensembl web site:
http://uswest.ensembl.org/info/docs/tools/vep/script/vep_options.html
Runtime options for archived versions can be found on the corresponding archive page:
http://useast.ensembl.org/info/website/archives/index.html
Data in a VEPParam object are organized into the following categories:
‘basic’, ‘input’, ‘cache’, ‘output’,
‘identifier’, ‘colocatedVariants’, ‘dataformat’,
‘filterqc’, ‘database’ and ‘advanced’. Each category
is a list of runtime options. logical options are turned
on/off with TRUE/FALSE. character and numeric options are
‘on’ when a value is provided and ‘off’ when
they contain an empty value (i.e., character() or numeric()).
‘identifier’, ‘colocatedVariants’ and ‘dataformat’ are supported for VEPParam73 and later.
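As a rough illustration of the on/off rule described above, the sketch below uses a plain R list in place of a real category. The names is_on and basic_opts are hypothetical and introduced only for this example; they are not part of ensemblVEP.

```r
# Hypothetical helper (not part of ensemblVEP): reports whether an option
# value counts as 'on' under the rule described above.
is_on <- function(x) {
  if (is.logical(x)) {
    isTRUE(x)          # logical options: TRUE is on, FALSE is off
  } else {
    length(x) > 0      # character/numeric options: empty value is off
  }
}

# A plain list standing in for the 'basic' category.
basic_opts <- list(
  verbose = TRUE,
  quiet   = FALSE,
  config  = character(),
  fork    = numeric()
)

# Named logical vector with one entry per option.
on_status <- vapply(basic_opts, is_on, logical(1))
print(on_status)
```

Here only verbose is reported as on; config and fork are off because they hold empty vectors.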
basic
list of the following options:
verbose: logical, default FALSE; output status messages
quiet: logical, default FALSE; suppress status/warnings
no_progress: logical, default FALSE; don't show progress bars
config: character, default character(); name of config file
everything: logical, default FALSE; shortcut to switch on 12 options (sift, polyphen, ccds, hgvs, hgnc, numbers, domains, regulatory, cell_type, canonical, protein and gmaf)
fork: numeric, default numeric(); enable forking
input
list of the following options:
species: character, default 'homo_sapiens'; species for the data
assembly: character, default character(); select assembly version if more than one is available
format: character, default character(); one of the following input file formats: 'ensembl', 'vcf', 'pileup', 'hgvs', 'id' or 'vep'. By default the script auto-detects the input file format.
output_file: character, default writes to temp file; path and file name of output file
force_overwrite: logical, default FALSE; overwrite the output file if it currently exists
stats_file: character, default character(); summary stats file name
no_stats: logical, default FALSE; do not generate a stats file
stats_text: logical, default FALSE; generate a plain text stats file instead of html
html: logical, default FALSE; generate html version of the output file
cache
list of the following options:
cache: logical, default FALSE; enable use of cache
dir: character, default '$HOME/.vep/'; cache/plugin directory to be used
dir_cache: character, default '$HOME/.vep/'; cache directory to be used
dir_plugins: character, default '$HOME/.vep/'; plugin directory to be used
offline: logical, default FALSE; enable offline mode, no database connections will be made
fasta: character, default character(); FASTA file name or directory of files to use for reference sequences
cache_version: character, default character(); use a different cache version than the assumed default
show_cache_info: logical, default FALSE; show source version information for selected cache and quit
output
list of the following options:
variant_class: logical, default FALSE; output the sequence ontology variant class
sift: character, default character(); output prediction, score or both, valid strings are 'p', 's' or 'b'
polyphen: character, default character(); output prediction, score or both, valid strings are 'p', 's' or 'b'
humdiv: logical, default FALSE; retrieve the humDiv PolyPhen prediction instead of humVar
gene_phenotype: logical, default FALSE; indicates if overlapped gene is associated with a phenotype, disease or trait
regulatory: logical, default FALSE; identify overlaps with regulatory regions
cell_type: character, default character(); only report regulatory regions found in the given cell type(s)
custom: character, default character(); name of custom annotation file to add to output. Currently only a single annotation is supported.
plugin: character, default character(); name of plugin module. Currently only a single module is supported.
individual: character, default character(); consider only alternate alleles present in the genotypes of 'all' or a character vector of specified individuals
phased: logical, default FALSE; force VCF genotypes to be interpreted as phased
allele_number: logical, default FALSE; identify allele number from VCF input (1=first ALT, 2=second ALT, etc.)
total_length: character, default character(); cDNA, CDS and protein positions as position/length
numbers: logical, default FALSE; output affected exon and intron numbering, format is Number/Total
domains: logical, default FALSE; output names of overlapping protein domains
no_escape: logical, default FALSE; don't URI escape HGVS string
keep_csq: logical, default FALSE; don't overwrite existing CSQ entry in VCF INFO field
vcf_info_field: character, default 'CSQ'; change the name of the INFO key that VEP writes the consequences to in the VCF output
terms: character, default 'so'; type of consequence terms to output, valid strings are 'ensembl' or 'so'
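The sift and polyphen entries above accept only 'p', 's' or 'b'. A minimal sketch of how a value could be checked before it is stored in a parameter list is given below. The function check_prediction is a hypothetical name introduced here, not something ensemblVEP provides; it relies only on base R's match.arg.

```r
# Hypothetical validator (not part of ensemblVEP) for the 'sift' and
# 'polyphen' options: an empty value means the option is off; otherwise
# the value must be one of 'p' (prediction), 's' (score) or 'b' (both).
check_prediction <- function(x) {
  if (length(x) == 0) {
    return(x)                      # option left 'off'
  }
  match.arg(x, c("p", "s", "b"))   # errors on any other string
}

# A plain list standing in for part of the 'output' category.
output_opts <- list(
  sift     = check_prediction("b"),
  polyphen = check_prediction(character())
)
str(output_opts)
```

Calling check_prediction("x") stops with an error from match.arg, which lists the accepted choices.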
identifier
list of the following options:
hgvs: logical, default FALSE; add HGVS IDs
shift_hgvs: [0/1], default 1 (shift); enable or disable 3' shifting of HGVS notations
protein: logical, default FALSE; add Ensembl protein IDs
symbol: logical, default FALSE; add gene symbol (e.g. HGNC) (where available) to the output
ccds: logical, default FALSE; add CCDS transcript IDs
uniprot: logical, default FALSE; adds identifiers for translated protein products from three UniProt-related databases
tsl: logical, default FALSE; adds the transcript support level for this transcript
canonical: logical, default FALSE; indicate if transcript is the canonical transcript for the gene
biotype: logical, default FALSE; add biotype of transcript
xref_seq: logical, default FALSE; output aligned RefSeq mRNA ID
colocatedVariants
list of the following options:
check_existing: logical, default FALSE; check for co-located variants
check_alleles: logical, default FALSE; when checking for co-located variants only report them if none of the alleles supplied are novel
check_svs: logical, default FALSE; check for structural variants that overlap the input variants
gmaf: logical, default FALSE; add global minor allele frequency (MAF) from 1000 Genomes Phase 1 data
maf_1kg: logical, default FALSE; add MAF from continental populations of 1000 Genomes Phase 1 data; must be used with --cache
maf_esp: logical, default FALSE; add MAF from NHLBI-ESP populations; must be used with --cache
old_maf: logical, default FALSE; for maf_1kg and maf_esp report only the frequency (no allele) and convert so it is always a minor frequency, i.e. < 0.5
pubmed: logical, default FALSE; report PubMed IDs for publications that cite existing variant; must be used with --cache
failed: logical, default FALSE; when checking for co-located variants, include or exclude variants that have been flagged as failed
dataformat
list of the following options:
vcf: logical, default FALSE; write output in vcf format
json: logical, default FALSE; write output in json format
gvf: logical, default FALSE; write output in gvf format
fields: character, default fields are 'Uploaded_variation', 'Location', 'Allele', 'Gene', 'Feature', 'Feature_type', 'Consequence', 'cDNA_position', 'CDS_position', 'Protein_position', 'Amino_acids', 'Codons' and 'Extra'. See
http://www.ensembl.org/info/docs/variation/vep/vep_formats.html#sv
for details.
convert: character, default character(); converts input file to one of 'ensembl', 'vcf', or 'pileup'
minimal: logical, default FALSE; convert alleles to their most minimal representation before consequence calculation
filterqc
list of the following options:
check_ref: logical, default FALSE; force check of supplied reference allele against the sequence stored in Ensembl Core database
coding_only: logical, default FALSE; return consequences in coding regions only
chr: character, default character(); select a subset of chromosomes to be analyzed
no_intergenic: logical, default FALSE; do not include intergenic consequences
pick: logical, default FALSE; pick one line of consequence data per variant
pick_allele: logical, default FALSE; pick one line of consequence data per variant allele
flag_pick: logical, default FALSE; as per --pick, but adds the PICK flag to the chosen block of consequence data and retains others
flag_pick_allele: logical, default FALSE; as per --pick_allele, but adds the PICK flag to the chosen block of consequence data and retains others
per_gene: logical, default FALSE; output only the most severe consequence per gene
pick_order: character, see Ensembl web page for default order; customise the order of criteria applied when choosing a block of annotation data with e.g. --pick
most_severe: logical, default FALSE; output only most severe consequence per variation
summary: logical, default FALSE; output a comma-separated list of all observed consequences per variation, transcript-specific columns will be left blank
filter_common: logical, default FALSE; shortcut flag to turn on filters, see web page for details
check_frequency: logical, default FALSE; turn on frequency filtering, must also specify all of the --freq_* flags. See web page for details.
freq_pop: character, default character(); population to use in frequency filter
freq_freq: numeric, default numeric(); MAF to use in frequency filter
freq_gt_lt: character, default character(); specify whether the frequency of the co-located variant must be greater than or less than the value specified in the freq_freq option. Values are 'gt' or 'lt'.
freq_filter: character, default character(); specify whether to exclude or include variants that pass the frequency filter. Values are 'exclude' or 'include'.
allow_non_variant: logical, default FALSE; when using VCF format as input and output, by default VEP will skip all non-variant lines of input (i.e., where the ALT is NULL). When this option is enabled, lines will be printed in the VCF output with no consequence data added.
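The check_frequency entry above requires that all of the freq_* options are set as well. The sketch below shows one way this dependency could be checked on a plain R list before running VEP. The function check_freq_options is a hypothetical helper introduced here, not part of ensemblVEP, and the freq_pop value is only a placeholder; consult the Ensembl web page for valid population names.

```r
# Hypothetical helper (not part of ensemblVEP): verifies that a list of
# 'filterqc'-style options is consistent with the check_frequency rule.
check_freq_options <- function(opts) {
  freq_names <- c("freq_pop", "freq_freq", "freq_gt_lt", "freq_filter")
  if (!isTRUE(opts$check_frequency)) {
    return(invisible(TRUE))        # filtering is off, nothing to check
  }
  # Options that are missing or hold an empty value count as unset.
  unset <- freq_names[lengths(opts[freq_names]) == 0]
  if (length(unset) > 0) {
    stop("check_frequency also requires: ", paste(unset, collapse = ", "))
  }
  stopifnot(opts$freq_gt_lt %in% c("gt", "lt"),
            opts$freq_filter %in% c("exclude", "include"))
  invisible(TRUE)
}

filterqc_opts <- list(
  check_frequency = TRUE,
  freq_pop    = "1KG_ALL",   # placeholder population name (assumption)
  freq_freq   = 0.01,
  freq_gt_lt  = "gt",
  freq_filter = "exclude"
)
check_freq_options(filterqc_opts)
```

With this list the call returns silently; emptying any of the four freq_* entries makes it stop with a message naming the unset options.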
database
list of the following options:
database: logical, default TRUE; enable the VEP to use local or remote databases
host: character, default character(); database host. When left empty, the default defined by VEP, 'ensembldb.ensembl.org', is used. Users in the US may find connection and transfer speeds quicker using the East coast mirror, 'useastdb.ensembl.org'.
user: character, default character(); database user
password: character, default character(); database password
port: numeric, default numeric(); database port
genomes: logical, default FALSE; override default connection settings with those for the Ensembl Genomes public MySQL server
gencode_basic: logical, default FALSE; limit analysis to transcripts in GENCODE basic set
refseq: logical, default FALSE; use otherfeatures database to retrieve transcripts
merged: logical, default FALSE; use the merged Ensembl and RefSeq cache
all_refseq: logical, default FALSE; include e.g. CCDS and Ensembl EST transcripts
lrg: logical, default FALSE; map input variants to LRG coordinates
db_version: numeric, default numeric(); force connection to specific version
registry: character, default character(); provide file to override default connection settings
advanced
list of the following options:
no_whole_genome: logical, default FALSE; run in non-whole genome mode, variants analyzed one at a time, no caching
buffer_size: numeric, default 5000; internal buffer size corresponding to number of variations read into memory simultaneously
write_cache: logical, default FALSE; enable writing to the cache
build: character, default character(); build cache for the selected species from the database (see --chr flag)
compress: character, default character(); specify utility to decompress cached files (zcat is default)
skip_db_check: logical, default FALSE; force the script to use a cache built from a different host than specified with --host
cache_region_size: numeric, default numeric(); size in base-pairs of the region covered by one file in the cache, see full description of this flag on the web site for details
Valerie Obenchain
The ensemblVEP function man page.
The VEPParam class man page.
## See ?VEPParam for examples of constructing instances of a
## VEPParam object with different runtime options.