importGenomicVariants: Import genomic variants into the ProteoDiscography

View source: R/import_genomicVariants.R

importGenomicVariants    R Documentation

Import genomic variants into the ProteoDiscography

Description

Imports the genomic variants (SNVs, MNVs and InDels) present within the supplied VCF/MAF files into the ProteoDiscography as a VRanges. These genomic variants can be incorporated into transcript sequences at a later stage.

Usage

importGenomicVariants(
  ProteoDiscography,
  files,
  samplenames = NULL,
  removeExisting = FALSE,
  overwriteDuplicateSamples = TRUE,
  performAnchorCheck = TRUE,
  ignoreNonMatch = FALSE,
  threads = 1
)

Arguments

ProteoDiscography

(ProteoDiscography): ProteoDiscography object which stores the annotation and genomic sequences.

files

(character): Path(s) to VCF or MAF files.

samplenames

(character): Descriptive sample name(s) for the VCF/MAF file(s), in the same order as the input file(s). If NULL, the basename of each file is used instead.

removeExisting

(logical): Should previous mutations within the ProteoDiscography be removed?

overwriteDuplicateSamples

(logical): Replace duplicate samples (TRUE) or throw an error if duplicate samples are found (FALSE).

performAnchorCheck

(logical): Should the reference anchor be checked for consistency with the given genomic sequences?

ignoreNonMatch

(logical): Should non-matching reference anchors be ignored? These mutations will be removed prior to appending.

threads

(integer): Number of threads.
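
The idea behind performAnchorCheck and ignoreNonMatch can be illustrated with a small base-R sketch. This is a conceptual illustration only, not ProteoDisco's implementation: the toy reference sequence, the variants data.frame and all object names below are hypothetical, and the package itself works on VRanges and the genomic sequences stored in the ProteoDiscography.

```r
# Toy reference sequence standing in for a chromosome (hypothetical data).
reference <- c(chrToy = "ACGTACGTTAGC")

# Toy variants: 1-based start position, reference allele and alternative allele.
variants <- data.frame(
  seqnames = "chrToy",
  start = c(2, 5, 9),
  ref = c("C", "AC", "G"),
  alt = c("T", "A", "T"),
  stringsAsFactors = FALSE
)

# Extract the reference bases that each variant claims to replace.
observed <- substring(
  reference[variants$seqnames],
  variants$start,
  variants$start + nchar(variants$ref) - 1
)

# A variant passes the check when its REF allele equals the reference bases.
variants$anchorMatches <- unname(observed) == variants$ref

# Analogous in spirit to ignoreNonMatch = TRUE: drop the non-matching variants.
variantsKept <- variants[variants$anchorMatches, ]
```

In this toy data the third variant reports a reference base that differs from the sequence at that position, so it is the one that would be dropped.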

Value

ProteoDiscography with additional imported SNVs, MNVs and InDels.

Author(s)

Job van Riet j.vanriet@erasmusmc.nl

Wesley van de Geer w.vandegeer@erasmusmc.nl

Examples


ProteoDiscography.hg19 <- ProteoDisco::generateProteoDiscography(
  TxDb = TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene, 
  genomeSeqs = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19
)

# Supply the ProteoDiscography with genomic variants to incorporate in downstream analysis. This can be one or multiple VCF / MAF files.
# Additional manual sequences and exon-exon mapping (i.e., splice junctions) can also be given as shown in the sections below.
ProteoDiscography.hg19 <- ProteoDisco::importGenomicVariants(
  ProteoDiscography = ProteoDiscography.hg19,
  # Provide the VCF / MAF files; if more than one, supply a vector of files and corresponding samplenames.
  files = system.file('extdata', 'validationSet_hg19.vcf', package = 'ProteoDisco'), 
  # We can replace the original samples within the VCF with nicer names.
  samplenames = 'Validation Set (GRCh37)',
  # Number of threads used for parallelization.
  # We run samples sequentially and parallelize within (variant-wise multi-threading).
  threads = 1, 
  # To increase import-speed for this example, do not check for validity of the reference anchor with the given reference sequences.
  performAnchorCheck = FALSE
)
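
When several files are imported and samplenames is left as NULL, the argument description states that the basename of each file is used. A minimal base-R sketch of that fallback follows; the file paths are hypothetical, and whether the package keeps or alters the file extension is not stated here, so the sketch uses basename() unchanged.

```r
# Hypothetical input files (these paths do not need to exist for basename()).
files <- c("/data/cohort/tumorA.vcf", "/data/cohort/tumorB.maf")

# No sample names supplied by the user.
samplenames <- NULL

# Fall back to the file basenames, as described for samplenames = NULL.
if (is.null(samplenames)) {
  samplenames <- basename(files)
}

# Sample names and files must pair up one-to-one and in the same order.
stopifnot(length(samplenames) == length(files))
names(files) <- samplenames
```

Supplying explicit, unique samplenames avoids two files with the same basename in different directories being treated as duplicate samples.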
 

ErasmusMC-CCBC/ProteoDisco documentation built on Dec. 9, 2022, 8:41 a.m.