VariantFilteringParam-class: VariantFiltering parameter class
In rcastelo/VariantFiltering: Filtering of coding and non-coding genetic variants

Description

The class VariantFilteringParam eases the configuration of calls to the functions that filter input genetic variants according to a desired segregating inheritance model (xLinked(), autosomalRecessiveHomozygous(), etc.).

Usage

VariantFilteringParam(vcfFilename, pedFilename=NA_character_,
                      bsgenome="BSgenome.Hsapiens.1000genomes.hs37d5",
                      orgdb="org.Hs.eg.db",
                      txdb="TxDb.Hsapiens.UCSC.hg19.knownGene",
                      snpdb="SNPlocs.Hsapiens.dbSNP144.GRCh37",
                      weightMatricesFilenames=NA,
                      weightMatricesLocations=rep(list(variantLocations()), length(weightMatricesFilenames)),
                      weightMatricesStrictLocations=rep(list(FALSE), length(weightMatricesFilenames)),
                      radicalAAchangeFilename=file.path(system.file("extdata",
                                                                    package="VariantFiltering"),
                                                        "AA_chemical_properties_HanadaGojoboriLi2006.tsv"),
                      codonusageFilename=file.path(system.file("extdata",
                                                               package="VariantFiltering"),
                                                   "humanCodonUsage.txt"),
                      geneticCode=getGeneticCode("SGC0"),
                      allTranscripts=FALSE,
                      regionAnnotations=list(CodingVariants(), IntronVariants(),
                                             FiveSpliceSiteVariants(), ThreeSpliceSiteVariants(),
                                             PromoterVariants(), FiveUTRVariants(), ThreeUTRVariants()),
                      intergenic=FALSE,
                      otherAnnotations=c("MafDb.1Kgenomes.phase1.hs37d5",
                                         "PolyPhen.Hsapiens.dbSNP131",
                                         "SIFT.Hsapiens.dbSNP137",
                                         "phastCons100way.UCSC.hg19",
                                         "humanGenesPhylostrata"),
                      geneKeytype=NA_character_,
                      yieldSize=NA_integer_)
## S4 method for signature 'VariantFilteringParam'
show(object)
## S4 method for signature 'VariantFilteringParam'
x$name
## S4 method for signature 'VariantFilteringParam'
names(x)

Arguments

`vcfFilename`	Character string of the input VCF file name.
`pedFilename`	Character string of the pedigree file name in PED format.
`bsgenome`	Character string of a genome annotation package (`BSgenome.Hsapiens.1000genomes.hs37d5` by default).
`orgdb`	Character string of a gene-centric annotation package (`org.Hs.eg.db` by default).
`txdb`	Character string of a transcript-centric annotation package (`TxDb.Hsapiens.UCSC.hg19.knownGene` by default). The package `GenomicFeatures` provides infrastructure to build such annotation packages from different sources such as online UCSC tracks, Biomart tables, or `GFF` files.
`snpdb`	Character string of a SNP-centric annotation package (`SNPlocs.Hsapiens.dbSNP144.GRCh37` by default).
`weightMatricesFilenames`	Character vector of filenames of position weight matrices for binding site recognition. The default `NA` value indicates that no binding sites will be scored. To use this feature to score, for instance, splice sites in human, assign to this argument the value returned by the function `spliceSiteMatricesHuman()`. See the files (`hsap.donors.hcmc10_15_1.ibn` and `hsap.acceptors.hcmc10_15_1.ibn`) returned by this function for details on their format.
`weightMatricesLocations`	Keywords of the annotated variant locations under which a weight matrix will be used for scoring binding sites. This argument is only used when `weightMatricesFilenames` is not `NA`. In that case, when more than one matrix is provided, this argument should be a list of character vectors with as many elements as matrices given in `weightMatricesFilenames`. The possible values can be obtained by typing `variantLocations()`.
`weightMatricesStrictLocations`	Logical vector flagging whether a weight matrix should score binding sites strictly within the boundaries of the given locations. This argument is only used when `weightMatricesFilenames` is not `NA`. In that case, when more than one matrix is provided, this argument should be a list of logical vectors with as many elements as matrices given in `weightMatricesFilenames`.
`radicalAAchangeFilename`	Name of a tab-separated text file containing chemical properties of amino acids. These properties are interpreted such that amino acid changes within a property are considered "conservative" and between properties are considered "radical". See the default file (`AA_chemical_properties_HanadaGojoboriLi2006.tsv`) for details on its format.
`codonusageFilename`	Name of a text file containing the codon usage.
`geneticCode`	Named character vector of length 64 describing the genetic code. The default value is `getGeneticCode("SGC0")`, the standard genetic code. An alternative genetic code, for instance, is `getGeneticCode("SGC1")`, the vertebrate mitochondrial genetic code. See `getGeneticCode` in the Biostrings package for further details.
`allTranscripts`	Logical. This option allows the user to choose between working with all the transcripts affected by the variant (`allTranscripts=TRUE`) or with only one transcript per variant.
`regionAnnotations`	List of `VariantType-class` objects defining what regions to annotate.
`intergenic`	Logical. When `TRUE`, the intergenic variants are also annotated.
`otherAnnotations`	Character vector of names of annotation packages or annotation objects.
`geneKeytype`	Character string with the type of gene identifier provided by the transcript-centric (TxDb) annotation package and used to interrogate the organism-centric (OrgDb) annotation package. The default value (`NA_character_`) indicates that identifiers will be assumed to be Entrez identifiers, unless the values in the `GENEID` column returned by the TxDb package start with `ENSG`, in which case they will be assumed to be Ensembl gene identifiers, or with one of `NM_, NP_, NR_, XM_, XP_, XR_ or YP_`, in which case they will be assumed to be RefSeq gene identifiers.
`yieldSize`	Number of variants to yield each time the input VCF file is read. This argument is passed to the `TabixFile` function when opening the input VCF file and allows iterating through the variants in chunks of the given size to limit the memory requirements. Its default value (`NA_integer_`) implies that the whole input VCF file will be read into main memory.
`object`	A VariantFilteringParam object created through `VariantFilteringParam()`.
`x`	A VariantFilteringParam object created through `VariantFilteringParam()`.
`name`	Slot name of a VariantFilteringParam object. Use `names()` to find out what these slots are.
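
The default behaviour of `geneKeytype` described above can be illustrated with a small base R sketch. The helper `guessGeneKeytype()` below is hypothetical and is not part of VariantFiltering. It only mimics the documented prefix rule, and it assumes that the rule is applied to all non-missing identifiers and that the resulting key types are named `ENSEMBL`, `REFSEQ` and `ENTREZID`, as in typical OrgDb packages.

```r
## Hypothetical helper, NOT part of VariantFiltering: guesses the type of
## gene identifier from its prefix, following the rule documented for the
## default value of 'geneKeytype'.
guessGeneKeytype <- function(geneids) {
  geneids <- geneids[!is.na(geneids)]

  ## Ensembl gene identifiers start with "ENSG"
  if (length(geneids) > 0 && all(grepl("^ENSG", geneids)))
    return("ENSEMBL")

  ## RefSeq identifiers start with one of these two-letter prefixes
  ## followed by an underscore
  if (length(geneids) > 0 &&
      all(grepl("^(NM|NP|NR|XM|XP|XR|YP)_", geneids)))
    return("REFSEQ")

  ## otherwise, assume Entrez identifiers
  "ENTREZID"
}

guessGeneKeytype(c("ENSG00000139618", "ENSG00000141510"))
guessGeneKeytype(c("NM_000059", "NR_046018"))
guessGeneKeytype(c("675", "7157"))
```

When the identifiers in the TxDb package do not follow any of these conventions, setting `geneKeytype` explicitly avoids relying on the guess.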

Details

The class VariantFilteringParam serves the purpose of simplifying the call to the inheritance model function and its subsequent annotation and filtering steps. It also groups all the parameters that the user can customize (e.g., newer versions of the annotation packages, when available).
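
As a sketch of this kind of customization, the call below overrides a few defaults while leaving the rest untouched. It assumes that VariantFiltering and the default annotation packages listed in the Usage section are installed. The chunk size of 1000 variants is an arbitrary illustrative value, not a recommendation.

```r
library(VariantFiltering)

## example VCF and PED files shipped with the package
vcf <- system.file("extdata", "CEUtrio.vcf.bgz", package="VariantFiltering")
ped <- system.file("extdata", "CEUtrio.ped", package="VariantFiltering")

vfpar <- VariantFilteringParam(
  vcfFilename=vcf, pedFilename=ped,
  ## skip the SNP-centric and other optional annotations
  snpdb=character(0), otherAnnotations=character(0),
  ## score human splice sites with the matrices shipped with the package
  weightMatricesFilenames=spliceSiteMatricesHuman(),
  ## annotate every transcript affected by a variant
  allTranscripts=TRUE,
  ## read the VCF file in chunks (illustrative chunk size)
  yieldSize=1000L)

vfpar
```

Arguments that are not given explicitly, such as `bsgenome`, `orgdb` and `txdb`, keep the defaults shown in the Usage section.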

The constructor function VariantFilteringParam() creates a VariantFilteringParam object used as an input argument to other functions such as autosomalRecessiveHomozygous(), etc.
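
A minimal sketch of that workflow follows, using the example trio data shipped with the package. It assumes that the default annotation packages (`bsgenome`, `orgdb` and `txdb`) are installed, and the object name `reHo` is chosen here only for illustration. The annotation step can take a while to run.

```r
library(VariantFiltering)

## build the parameter object from the example VCF and PED files
vfpar <- VariantFilteringParam(
  system.file("extdata", "CEUtrio.vcf.bgz", package="VariantFiltering"),
  system.file("extdata", "CEUtrio.ped", package="VariantFiltering"),
  snpdb=character(0), otherAnnotations=character(0))

## pass it to one of the inheritance model functions
reHo <- autosomalRecessiveHomozygous(vfpar)
reHo
```

The same parameter object can be passed to the other inheritance model functions, such as xLinked(), without being rebuilt.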

The method names() allows one to see the names of the slots from a VariantFilteringParam object. Using the $ operator, one can retrieve the values of these slots in an analogous way to a list.

Value

A VariantFilteringParam object is returned by the constructor function VariantFilteringParam().

Author(s)

D.M. Elurbe, P. Puigdevall and R. Castelo

Examples

vfpar <- VariantFilteringParam(system.file("extdata", "CEUtrio.vcf.bgz", package="VariantFiltering"),
                               system.file("extdata", "CEUtrio.ped", package="VariantFiltering"),
                               snpdb=character(0), otherAnnotations=character(0))
vfpar
names(vfpar)
vfpar$vcfFiles

rcastelo/VariantFiltering documentation built on July 5, 2025, 5:38 a.m.