addAnno: Read annotation file

View source: R/annotate.R

Description

Read an annotation file, merge it with a data.table containing the information needed to estimate expression from the inactivated X chromosome, and filter out SNPs with low coverage.

Usage

addAnno(dt, seqm_annotate = TRUE, read_count_cutoff = 20,
  het_cutoff = 3, filter_pool_cutoff = 3, anno_file = NULL)

Arguments

dt

A data.table object.

seqm_annotate

A logical. If set to TRUE, the seqminer package will be used to annotate dt. If set to FALSE, this function is a simple read count filtering step.

read_count_cutoff

A numeric. Keep only SNPs that have at least that many reads.

het_cutoff

A numeric. Keep only SNPs that have at least that many reads on each allele.

filter_pool_cutoff

A numeric. Keep only SNPs that have at least that many reads on each allele across all samples. See details for more information.

anno_file

A character. The name of a file containing annotations.
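
The two per-sample cutoffs described above can be illustrated with a small sketch. This is not XCIR's implementation: the data frame and its column names (snp, ref, alt) are hypothetical, and base R is used so that the snippet runs without extra packages. It only shows one reading of "at least that many reads" for read_count_cutoff and het_cutoff.

```r
# Toy per-SNP allele counts for a single sample.
# Column names are hypothetical and are not XCIR's own.
snps <- data.frame(
  snp = c("snpA", "snpB", "snpC"),
  ref = c(18, 15, 2),
  alt = c(5, 3, 40)
)

read_count_cutoff <- 20  # minimum total reads at the SNP
het_cutoff <- 3          # minimum reads on each allele

# Total coverage is the sum of the reads on both alleles.
snps$total <- snps$ref + snps$alt

# A SNP passes if it has enough reads overall and on its weaker allele.
enough_reads <- snps$total >= read_count_cutoff
both_alleles <- pmin(snps$ref, snps$alt) >= het_cutoff
kept <- snps[enough_reads & both_alleles, ]

print(kept)
```

In this toy input, snpB fails the total read count and snpC fails the per-allele count, so only snpA remains.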

Details

If all samples have the same genotype (e.g. technical replicates), filter_pool_cutoff sums counts across samples and keeps SNPs that pass the cutoff on both the reference and alternate alleles. This may leave individual samples with 0 counts on either allele, but it prevents the removal of heterozygous sites with lower coverage (especially in skewed samples).

Setting seqm_annotate to TRUE calls annotatePlain from the seqminer package. For convenience, seqminer's necessary annotation sources can be copied into XCIR's extdata folder. See ?annotatePlain for more information.
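
The pooled filtering idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code XCIR runs: the data frame, its column names (snp, sample, ref, alt) and the use of base R's ave() are all choices made for this example.

```r
# Toy allelic counts: two SNPs measured in two technical replicates.
# Column names are hypothetical and are not XCIR's own.
counts <- data.frame(
  snp    = c("snp1", "snp1", "snp2", "snp2"),
  sample = c("rep1", "rep2", "rep1", "rep2"),
  ref    = c(30, 25, 40, 35),
  alt    = c(0, 4, 1, 0),
  stringsAsFactors = FALSE
)

filter_pool_cutoff <- 3

# Sum each allele's reads across samples, separately for every SNP.
counts$ref_pool <- ave(counts$ref, counts$snp, FUN = sum)
counts$alt_pool <- ave(counts$alt, counts$snp, FUN = sum)

# Keep SNPs whose pooled counts reach the cutoff on both alleles.
keep <- counts$ref_pool >= filter_pool_cutoff &
  counts$alt_pool >= filter_pool_cutoff
pooled <- counts[keep, ]

# For comparison: applying the same cutoff to each sample on its own.
per_sample <- counts[counts$ref >= filter_pool_cutoff &
                       counts$alt >= filter_pool_cutoff, ]

print(pooled)
print(per_sample)
```

Here snp1 is kept in both replicates because its pooled alternate count is 4, even though rep1 has 0 alternate reads, whereas a per-sample filter would keep only rep2. snp2 is dropped either way because its pooled alternate count is 1.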

Value

A data.table object that contains allelic coverage, genotype and annotations at the covered SNPs.

See Also

annotatePlain

Examples

# Example workflow for documentation

vcff <- system.file("extdata/AD_example.vcf", package = "XCIR")
# Reading functions
vcf <- readRNASNPs(vcff)
vcf <- readVCF4(vcff)

# Annotation functions
# Using seqminer (requires additional annotation files)

anno <- addAnno(vcf)

# Using biomaRt
anno <- annotateX(vcf)
# Do not remove SNPs with 0 count on minor allele
anno0 <- annotateX(vcf, het_cutoff = 0)

# Summarise read counts per gene
# Assuming data is phased, reads can be summed across genes.
genic <- getGenicDP(anno, highest_expr = FALSE)
# Unphased data, select SNP with highest overall expression.
genic <- getGenicDP(anno, highest_expr = TRUE)

XCIR documentation built on Nov. 8, 2020, 7:41 p.m.