haplotype: Function to load tumour allele counts from a text file or...
In gavinha/TitanCNA: Subclonal copy number and LOH prediction from whole genome sequencing of tumours

Description Usage Arguments Value Author(s) References See Also Examples

Function to load in the allele counts from tumour sequencing data from a delimited text file or data.frame object.

    loadHaplotypeAlleleCounts(inCounts, cnfile, fun = "sum", haplotypeBinSize = 1e5, 
      minSNPsInBin = 3, chrs = c(1:22, "X"), minNormQual = 200, 
      genomeStyle = "NCBI", sep = "\t", header = TRUE, seqinfo = NULL,
      mapWig = NULL, mapThres = 0.9, centromere = NULL, minDepth = 10, maxDepth = 1000)
    
    getHaplotypesFromVCF(vcfFile, chrs = c(1:22, "X"), build = "hg19", genomeStyle = "NCBI",
      filterFlags = c("PASS", "10X_RESCUED_MOLECULE_HIGH_DIVERSITY"), 
      minQUAL = 100, minDepth = 10, minVAF = 0.25, altCountField = "AD", 
      keepGenotypes = c("1|0", "0|1", "0/1"), snpDB = NULL)
      
    loadBXcountsFromBEDDir(bxDir, chrs = c(1:22, "X", "Y"), minReads = 2)

`inCounts`	Path to a text file, or a data.frame, containing tumour allele count data. `inCounts` must contain 6 columns: chromosome, position, reference base, reference read counts, non-reference base, non-reference read counts. The ‘chromosome’ column can be in ‘NCBI’ or ‘UCSC’ genome style; only autosomes, sex chromosomes, and the mitochondrial chromosome are included (e.g. 1-22, X, Y, MT). The reference and non-reference base columns can contain any arbitrary character; they are not used by TitanCNA.
`cnfile`	Path to file containing GC-bias and mappability corrected molecule coverage for the given bin size.
`vcfFile`	Path to phased variant VCF file from LongRanger 2.1. The file name must have the suffix `*phase_variants.vcf.gz`.
`bxDir`	Path to directory containing the tumour BED files, one per chromosome, with BX tags.
`fun`	The function (‘SNP’, ‘sum’, ‘mean’) used to summarize SNPs within each user-defined bin (set by `haplotypeBinSize`) and haplotype block (defined by the phaseSet ID from the 9th column of `inCounts`). ‘SNP’ - uses the phased allele counts of each individual SNP; the phased allele of the higher coverage haplotype (determined within each bin) is chosen. ‘sum’ - uses the read count sum across all phased SNPs for the higher coverage haplotype within a bin, normalized by the total depth across all SNPs in the bin; each SNP in the bin is assigned this fraction. ‘mean’ - uses the mean (rounded) read count across all phased SNPs for the higher coverage haplotype within a bin, normalized by the mean (rounded) depth across all SNPs in the bin; each SNP in the bin is assigned this rounded count and depth.
`haplotypeBinSize`	Bin size used to summarize SNPs based on phased haplotypes. See `fun` for the summarization approaches within a bin.
`minSNPsInBin`	The minimum number of SNPs required in each `haplotypeBinSize` for analysis. See `fun` for the summarization approaches within a bin.
`chrs`	Vector containing list of chromosomes to include in output.
`minNormQual`	Quality threshold to use for filtering; SNPs with quality lower than this value are excluded. This quality can be any metric that provides the confidence of the locus being a true germline heterozygous SNP.
`minReads`	Minimum number of reads per barcode.
`genomeStyle`	The genome style to use for chromosomes. Use one of ‘NCBI’ or ‘UCSC’. The style found in `inCounts` does not matter; `genomeStyle` will be the style returned. Invokes `setGenomeStyle`.
`build`	Human genome reference build. Default: hg19.
`snpDB`	Path to SNP VCF file to use for specifying sites to retain.
`minQUAL`	Variants with quality (QUAL field) greater or equal to this value will be retained.
`minDepth`	Variants with read depth greater than or equal to this value will be retained.
`maxDepth`	Variants with read depth lower than or equal to this value will be retained.
`minVAF`	Variants with a variant/reference allele fraction of greater than or equal to this value will be retained.
`altCountField`	Specify the alternate count field name. Default is "AD".
`keepGenotypes`	Genotypes to retain. Default is to keep these genotype strings: 1|0, 0|1, 0/1
`filterFlags`	Specify the FILTER flags to retain.
`sep`	Character indicating the column delimiter used in the `inCounts` file. Default is tab-delimited, "\t".
`header`	`logical` to indicate if the input tumour counts file contains a header line.
`seqinfo`	`Seqinfo-class` object describing chromosome information. If `NULL`, the seqinfo for hg19 is loaded from `system.file('extdata', 'Seqinfo_hg19.rda', package='TitanCNA')`.
`mapWig`	Mappability score WIG file for binned data.
`mapThres`	Minimum mappability score of region/sequence overlapping variants to retain.
`centromere`	File containing reference genome gap file representing centromere locations. Usually obtained from UCSC.
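
The three `fun` options can be illustrated on a toy bin. The following is a minimal base R sketch of the summaries as described above, not the TitanCNA implementation: the helper `summarizeBin` and the counts are hypothetical, and it assumes all SNPs fall in a single bin and phase set, with `hapA` holding the read counts of one phased haplotype at each SNP.

```r
# Toy sketch of the `fun` options for one bin / phase set.
# `summarizeBin` and its inputs are hypothetical, not part of TitanCNA.
summarizeBin <- function(hapA, depth, fun = c("sum", "mean", "SNP")) {
  fun <- match.arg(fun)
  hapB <- depth - hapA
  # The higher coverage haplotype is determined once for the whole bin.
  high <- if (sum(hapA) >= sum(hapB)) hapA else hapB
  n <- length(depth)
  switch(fun,
    # 'SNP': keep per-SNP counts of the higher coverage haplotype.
    SNP  = data.frame(ref = high, tumDepth = depth),
    # 'sum': every SNP gets the bin-wide sums.
    sum  = data.frame(ref = rep(sum(high), n),
                      tumDepth = rep(sum(depth), n)),
    # 'mean': every SNP gets the rounded bin-wide means.
    mean = data.frame(ref = rep(round(mean(high)), n),
                      tumDepth = rep(round(mean(depth)), n)))
}

hapA  <- c(12, 30, 25)  # reads supporting haplotype A at three phased SNPs
depth <- c(40, 50, 45)  # total read depth at the same SNPs

binSum  <- summarizeBin(hapA, depth, "sum")
binMean <- summarizeBin(hapA, depth, "mean")
binSNP  <- summarizeBin(hapA, depth, "SNP")

# Haplotype fraction shared by every SNP in the bin under 'sum'.
binSum$ref / binSum$tumDepth
```

In this sketch the lower coverage haplotype count would be `tumDepth - ref`, matching the definition of `nonRef` in the Value section.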

loadHaplotypeAlleleCounts returns a data.table containing the following components:

`chr`	Chromosome; character, `genomeStyle` naming convention
`posn`	Position; integer
`phaseSet`	Phase block identifier, numeric or character
`refOriginal`	Reference allele read count at SNP; numeric
`tumDepthOriginal`	Coverage at SNP; numeric
`ref`	Phased allele count values of higher coverage haplotype based on approach used (SNP, sum, mean); numeric
`nonRef`	Phased allele count values of lower coverage haplotype; tumDepth minus ref; numeric
`tumDepth`	Mean or sum of SNP read coverage; numeric
`HaplotypeRatio`	Sum of read coverage of phased alleles of higher coverage haplotype normalized by `tumDepth`; numeric
`haplotypeCount`	Phased allele read count; numeric

getHaplotypesFromVCF returns a list containing 2 components

`vcf.filtered`	VCF object containing the list of heterozygous variants after filtering.
`geno.gr`	GRanges object containing the genotype information of the VCF
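
The variants kept in `vcf.filtered` are those that pass the thresholds listed under Arguments. The sketch below applies the same kind of filter to a toy table in base R. It is an illustration only: the real function operates on a VCF file, the column names here are hypothetical stand-ins for VCF fields, and the allele fraction is assumed to be the alternate count divided by read depth.

```r
# Toy variant table; column names are hypothetical stand-ins for VCF fields.
variants <- data.frame(
  FILTER   = c("PASS", "PASS", "LowQual", "PASS", "PASS"),
  QUAL     = c(250, 80, 300, 150, 120),
  DP       = c(40, 35, 50, 8, 30),
  altCount = c(18, 15, 25, 4, 5),
  GT       = c("0|1", "1|0", "0|1", "0/1", "1|1"),
  stringsAsFactors = FALSE
)

# Thresholds set to the defaults shown in Usage.
filterFlags   <- c("PASS", "10X_RESCUED_MOLECULE_HIGH_DIVERSITY")
keepGenotypes <- c("1|0", "0|1", "0/1")
minQUAL  <- 100
minDepth <- 10
minVAF   <- 0.25

# Assumption: allele fraction = alternate allele count / read depth.
vaf <- variants$altCount / variants$DP

# A variant is retained only if it passes every criterion.
keep <- variants$FILTER %in% filterFlags &
  variants$QUAL >= minQUAL &
  variants$DP >= minDepth &
  vaf >= minVAF &
  variants$GT %in% keepGenotypes

filtered <- variants[keep, ]
```

Each excluded row in the toy table fails a different criterion (quality, FILTER flag, depth, or genotype and allele fraction), which makes it easy to check the effect of changing one threshold at a time.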

Gavin Ha <gavinha@gmail.com>

Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E., Biele, J., Ding, J., Le, A., Rosner, J., Shumansky, K., Marra, M. A., Huntsman, D. G., McAlpine, J. N., Aparicio, S. A. J. R., and Shah, S. P. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumour whole genome sequence data. Genome Research, 24: 1881-1893. (PMID: 25060187)

loadDefaultParameters, plotHaplotypeFraction

  ## Not run: 
  infile <- "test_alleleCounts_chr2_with_phaseInfo.txt"
  haplotypeBinSize <- 1e5
  phaseSummarizeFun <- "sum"
  ## will load seqinfo_hg19 provided by TitanCNA package
  data <- loadHaplotypeAlleleCounts(infile, fun = phaseSummarizeFun,
      haplotypeBinSize = haplotypeBinSize, minSNPsInBin = 3, 
      chrs = c(1:22, "X"), minNormQual = 200, 
      genomeStyle = "NCBI", seqinfo = NULL)
  
## End(Not run)
  
  ## Not run: 
  vcfFile <- "test.vcf"
  hap <- getHaplotypesFromVCF(vcfFile, chrs = c(1:22,"X"), build = "hg19",
    filterFlags = c("PASS", "10X_RESCUED_MOLECULE_HIGH_DIVERSITY"), 
    minQUAL = 100, minDepth = 10, minVAF = 0.25, 
    keepGenotypes = c("1|0", "0|1", "0/1"))
  
  
## End(Not run)

gavinha/TitanCNA documentation built on April 22, 2021, 9:38 a.m.
