phase: Phasing of alleles


Description

phase phases alleles using allelic counts or genotype calls from single-cell RNA-seq data.

Usage

phase(acset, input = "ac", weigh = FALSE, method = "exhaust",
  nvars_max = 10, verbosity = -1, bp_param = BiocParallel::SerialParam())

Arguments

acset

An acset list created by the new_acset function. It must contain either a genotype matrix or two matrices holding the reference and alternative allele counts, respectively.

input

A character string specifying whether allele counts or genotype calls should be used for phasing. Two values are allowed: 'gt' (use genotype calls) or 'ac' (use allele counts).
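To illustrate how the two input types relate, here is a minimal sketch that derives crude genotype calls from allele counts. The helper call_gt_sketch is hypothetical and not part of scphaser, and the encoding (0 = only reference reads, 1 = reads from both alleles, 2 = only alternative reads, NA = no reads) is an assumption; the package may call genotypes differently.

```r
## Hypothetical helper, NOT part of scphaser: derive a crude genotype call
## from reference and alternative read counts (variants x cells).
## Assumed encoding: 0 = only reference reads, 2 = only alternative reads,
## 1 = reads from both alleles, NA = no reads.
call_gt_sketch = function(refcount, altcount) {
  gt = matrix(NA_real_, nrow = nrow(refcount), ncol = ncol(refcount),
              dimnames = dimnames(refcount))
  gt[refcount > 0 & altcount == 0] = 0
  gt[refcount == 0 & altcount > 0] = 2
  gt[refcount > 0 & altcount > 0] = 1
  gt
}

## toy count matrices: two variants (rows) in two cells (columns)
dn = list(c('v1', 'v2'), c('c1', 'c2'))
refcount = matrix(c(5, 0, 3, 0), nrow = 2, dimnames = dn)
altcount = matrix(c(0, 7, 2, 0), nrow = 2, dimnames = dn)
gt_sketch = call_gt_sketch(refcount, altcount)
print(gt_sketch)
```

With input = 'ac' the counts themselves are used, so the count scale is available to the phasing, whereas a call such as the one above discards it.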

weigh

A logical specifying whether the sample size, that is, the scale of the allele counts at a variant, should be taken into account. If TRUE, variants with high counts are given greater weight, as they are more reliable.

method

A character string specifying the clustering method to be used for the phasing. Two values are allowed, 'exhaust' or 'pam'.

nvars_max

An integer specifying the number of variants within a feature (e.g. a gene) above which 'pam' clustering will be used, even if the method argument was set to 'exhaust'.
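The interplay between method and nvars_max can be sketched as a small decision rule. The function choose_method_sketch below is hypothetical and only mirrors the rule described above; it is not how scphaser is necessarily implemented internally.

```r
## Hypothetical helper, NOT an scphaser function: pick the clustering
## method for one feature following the rule described for nvars_max.
choose_method_sketch = function(nvars, method = 'exhaust', nvars_max = 10) {
  ## 'exhaust' is overridden only when the feature has MORE than
  ## nvars_max variants; 'pam' is always left unchanged.
  if (method == 'exhaust' && nvars > nvars_max) {
    return('pam')
  }
  method
}

small_feat = choose_method_sketch(5)    # feature with 5 variants
large_feat = choose_method_sketch(25)   # feature with 25 variants
print(c(small_feat, large_feat))
```

A plausible motivation for such a cap is that an exhaustive search over which variants to flip grows exponentially with the number of variants in a feature.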

verbosity

An integer specifying the verbosity level. Higher values increase the verbosity.

bp_param

A BiocParallelParam instance, see bplapply.

Details

The function phases alleles within each feature. As phasing is not done between features, inferred haplotypes should only be used within features.
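To make the idea of phasing within a single feature concrete, the following base R sketch exhaustively tries every pattern of allele swaps and keeps the one with the lowest variability. It is an illustration only, not the scphaser implementation: the function phase_exhaust_sketch, the 0/2 genotype coding, and the score (summed within-cell variance across variants) are all assumptions.

```r
## Hypothetical sketch, NOT the scphaser implementation.
## gt: genotype matrix (variants x cells) for ONE feature, coded 0 / 2.
phase_exhaust_sketch = function(gt) {
  nvars = nrow(gt)
  stopifnot(nvars >= 2)
  ## All flip patterns with the first variant kept fixed: flipping every
  ## variant at once gives the same haplotype pair with labels swapped.
  flips = as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), nvars - 1)))
  flips = cbind(FALSE, flips)
  score_flip = function(flip) {
    g = gt
    g[flip, ] = 2 - g[flip, ]   # swap ref/alt for the flipped variants
    ## assumed variability score: summed within-cell variance
    sum(apply(g, 2, var, na.rm = TRUE))
  }
  scores = apply(flips, 1, score_flip)
  best = flips[which.min(scores), ]
  list(varflip = rownames(gt)[best], score = min(scores))
}

## toy feature: cells express either the paternal or the maternal haplotype
paternal = c(0, 2, 0, 0, 2)
maternal = c(2, 0, 2, 2, 0)
gt = cbind(paternal, maternal, paternal, maternal)
rownames(gt) = paste0('v', 1:5)
res = phase_exhaust_sketch(gt)
print(res$varflip)
```

For this toy matrix the sketch swaps the alleles of v2 and v5, after which every cell carries a single value across all five variants and the score is zero. Because the search is run separately per feature, nothing ties the haplotype labels of one feature to those of another.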

Value

An acset list with the following elements added by the phasing function:

'phasedfeat': Data-frame with six columns. The first four columns, 'feat', 'var', 'ref' and 'alt', are taken from the featdata data-frame of the input acset. The last two columns, 'hapA' and 'hapB', contain the haplotype sequence of alleles.

'args': List where each element corresponds to an argument value supplied to the phase function.

'varflip': Character vector with the names of variants where the alleles were swapped.

'score': Numeric vector with a variability score per feature after phasing.

'gt_phased': Character matrix of genotypes after having swapped alleles according to the inferred phase.

'weights': Numeric matrix with a weight per variant and cell. Only added if the arguments are set as weigh == TRUE and input == 'gt'.

Examples

##load the package that provides new_acset and phase
library(scphaser)

##create a small artificial genotype matrix (variants in rows, cells in columns)
ncells = 10
paternal = c(0, 2, 0, 0, 2)
maternal = c(2, 0, 2, 2, 0)
gt = as.matrix(as.data.frame(rep(list(paternal, maternal), ncells / 2)))
vars = 1:nrow(gt)
colnames(gt) = 1:ncells
rownames(gt) = vars

##feature annotation data-frame
nvars = nrow(gt)
featdata = as.data.frame(matrix(cbind(rep('jfeat', nvars),
    as.character(1:nvars), rep('dummy', nvars), rep('dummy', nvars)),
    ncol = 4, dimnames = list(vars, c('feat', 'var', 'ref', 'alt'))),
    stringsAsFactors = FALSE)

##create acset
acset = new_acset(featdata, gt = gt)

##phase
acset = phase(acset, input = 'gt', weigh = FALSE, method = 'exhaust',
    verbosity = 0)

##The haplotype output is contained in an element of the acset list that was
##added by the phasing function and is named "phasedfeat":
head(acset[['phasedfeat']])

edsgard/scphaser documentation built on May 15, 2019, 11:02 p.m.