getGenicDP: Get expression at the gene level


View source: R/getGenic.R

Description

Calculate allele-specific expression for each gene in each sample, using either only the most highly expressed SNP or all SNPs (when phasing has been performed).
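
The two summarisation modes described above can be illustrated on a toy table. This is only a conceptual sketch written with data.table, not the code used inside getGenicDP; the column names (sample, gene, snp, a1, a2) and the counts are hypothetical and do not reflect the actual layout of the table returned by addAnno.

```r
library(data.table)

# Toy per-SNP read counts for a single sample.
# All column names and values are hypothetical, for illustration only.
# a1 and a2 are the read counts of the two alleles; for the summed mode
# they are assumed to be phased, so a1 always refers to the same haplotype.
snps <- data.table(
  sample = "s1",
  gene   = c("G1", "G1", "G2"),
  snp    = c("rs1", "rs2", "rs3"),
  a1     = c(10L, 30L, 5L),
  a2     = c(5L, 20L, 15L)
)

# Phased data: sum the allele counts over every SNP within each gene.
summed <- snps[, .(a1 = sum(a1), a2 = sum(a2)), by = .(sample, gene)]

# Unphased data: keep only the SNP with the highest total depth
# (both alleles combined) within each gene.
top <- snps[, .SD[which.max(a1 + a2)], by = .(sample, gene)]

print(summed)
print(top)
```

In this toy table, gene G1 has two SNPs: the summed mode adds their counts together, while the highest-expression mode keeps only rs2, the SNP with the larger combined depth.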

Usage

getGenicDP(dt_anno, highest_expr = TRUE, pool = FALSE,
  gender_file = NULL)

Arguments

dt_anno

A data.table. An annotated table of read counts for each SNP, as returned by addAnno.

highest_expr

A logical. If FALSE, the read counts of all SNPs are summed within each gene. This should only be set to FALSE when high-quality phasing information is available. If TRUE, only the most highly expressed SNP (across both alleles) in each gene is used.

pool

A logical. Only works when highest_expr is set to TRUE. If set to TRUE, the read counts are pooled across all samples for each SNP. Only use this if the samples come from the same subject.

gender_file

A character or NULL. Leave NULL if dt_anno already contains a gender column. Otherwise, the file must contain at least a "sample" and a "gender" column, with sample names matching those in dt_anno.
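
As a rough sketch of what a gender file could look like, the snippet below writes a two-column table to a temporary file and checks that the required columns are present. The sample names, the gender labels ("female", "male") and the tab-separated format are all assumptions made for illustration; the documentation above only requires a "sample" and a "gender" column.

```r
library(data.table)

# Hypothetical sample names and gender labels; replace with your own.
# The labels and the tab delimiter are assumptions, not documented values.
gender <- data.table(
  sample = c("sample1", "sample2"),
  gender = c("female", "male")
)

# Write the table to a temporary tab-separated file.
gender_path <- tempfile(fileext = ".tsv")
fwrite(gender, gender_path, sep = "\t")

# Read the file back and confirm the two required columns exist.
check <- fread(gender_path)
stopifnot(all(c("sample", "gender") %in% names(check)))
```

The resulting path could then be passed as gender_file = gender_path, provided the sample names match those in dt_anno.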

Value

A data.table that should be used as input for betaBinomXI.

See Also

betaBinomXI, addAnno

Examples

# Example workflow for documentation

vcff <- system.file("extdata/AD_example.vcf", package = "XCIR")
# Reading functions
vcf <- readRNASNPs(vcff)
vcf <- readVCF4(vcff)

# Annotation functions
# Using seqminer (requires additional annotation files)

anno <- addAnno(vcf)

# Using biomaRt
anno <- annotateX(vcf)
# Do not remove SNPs with 0 count on minor allele
anno0 <- annotateX(vcf, het_cutoff = 0)

# Summarise read counts per gene
# Assuming data is phased, reads can be summed across all SNPs within each gene.
genic <- getGenicDP(anno, highest_expr = FALSE)
# Unphased data, select SNP with highest overall expression.
genic <- getGenicDP(anno, highest_expr = TRUE)

XCIR documentation built on Nov. 8, 2020, 7:41 p.m.