equal_compare: Fair base call comparisons.

Description

DeepSNV calls minority variants relative to a plasmid control, which means that the reference base is often (but not always) the major allele in the sample. The DeepSNV script provides all alleles present at a variable site. In this data set there are only ever two, but there could be more. When comparing two samples, it is likely that sample 1 contains a minority variant at some position where sample 2 has only one allele. If the allele in sample 2 is the plasmid reference allele, there will be no record of it in the data frame. This function adds that record so that it is clear whether the samples share the same major allele or have different major alleles. If both samples contain only the major allele and it is the same allele, an empty data frame is returned, because we are not interested in cases where the alleles are the same and there is no minor allele.
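To make the missing-reference case concrete, here is a minimal base R sketch. The helper `add_reference_sketch` is hypothetical and is not part of HIVEr. It assumes that any frequency not accounted for by the recorded alleles belongs to the plasmid reference allele, which may not match how `equal_compare` computes its result.

```r
# Hypothetical helper, not part of HIVEr: append a row for the reference
# allele when it is absent from the records for a single position.
add_reference_sketch <- function(position) {
  ref_base <- position$ref[1]
  if (ref_base %in% position$var) {
    return(position)  # the reference allele is already recorded
  }
  # Copy the first row and overwrite the allele-specific columns.
  ref_row <- position[1, , drop = FALSE]
  ref_row$var <- ref_base
  ref_row$mutation <- paste0(ref_row$chr, "_", ref_base, ref_row$pos, ref_base)
  # Assumption: the unrecorded frequency belongs to the reference allele.
  ref_row$freq1 <- 1 - sum(position$freq1)
  ref_row$freq2 <- 1 - sum(position$freq2)
  rbind(position, ref_row)
}

position <- data.frame(
  ENROLLID1 = 300294, ENROLLID2 = 300293,
  mutation = "M_C806A", freq1 = 0, freq2 = 0.99,
  chr = "M", pos = 806, ref = "C", var = "A",
  stringsAsFactors = FALSE
)
filled <- add_reference_sketch(position)
print(filled)
```

With the reference row in place, it is visible that sample 1 is dominated by the reference allele while sample 2 is dominated by the variant.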

Usage

equal_compare(position)

Arguments

position

A data frame containing one row for each base called at a locus for the two samples in question.

Details

If there is only one allele but, given its frequency, we might expect two, we warn the user. This can happen when one allele is removed because of poor quality. We do not adjust the frequency of the remaining alleles in this case. If the allele present matches the only allele in the other sample, the position is removed, as the data suggest that both samples carry the same single allele.
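The behaviour described above can be sketched roughly as follows. The helper `drop_if_shared_single_allele` is hypothetical and is not the HIVEr implementation. It assumes that a single recorded allele with a nonzero frequency in both samples counts as "the same single allele", and that a frequency below 1 is what triggers the warning.

```r
# Hypothetical helper, not part of HIVEr: return an empty data frame when
# both samples carry the same single recorded allele.
drop_if_shared_single_allele <- function(position) {
  single_shared <- nrow(position) == 1 &&
    position$freq1 > 0 && position$freq2 > 0
  if (!single_shared) {
    return(position)  # nothing to drop
  }
  # Assumption: a frequency below 1 hints that a second allele was removed.
  if (position$freq1 < 1 || position$freq2 < 1) {
    warning("Only one allele recorded, but its frequency is below 1.")
  }
  # Frequencies are left untouched; the position is simply removed.
  position[0, , drop = FALSE]
}

position <- data.frame(
  ENROLLID1 = 300294, ENROLLID2 = 300293,
  mutation = "M_C806A", freq1 = 1, freq2 = 0.99,
  chr = "M", pos = 806, ref = "C", var = "A",
  stringsAsFactors = FALSE
)
result <- suppressWarnings(drop_if_shared_single_allele(position))
print(nrow(result))
```

Returning a zero-row data frame with the original columns keeps the result easy to combine with the output for other positions.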

Value

A data frame with either the missing allele updated (in the appropriate sample), the missing allele added as a new entry in the data frame, or an empty data frame (in the case where there is only one allele and it is fixed in both samples).

Examples

# Here we add the reference base in the correct row.
position <- dplyr::tibble(ENROLLID1 = c(300294,300294),
                       ENROLLID2 = c(300293,300293),
                       mutation = c("M_C806A","M_C806C"),
                       freq1 = c(0,0),
                       freq2 = c(0.9,0.1),
                       chr = c("M","M"),
                       pos = c(806,806),
                       ref = c("C","C"),
                       var = c("A","C"))
print(position)
equal_compare(position)

# Here we add a row when the major alleles differ.
position <- dplyr::tibble(ENROLLID1 = c(300294),
                       ENROLLID2 = c(300293),
                       mutation = c("M_C806A"),
                       freq1 = c(0),
                       freq2 = c(0.99),
                       chr = c("M"),
                       pos = c(806),
                       ref = c("C"),
                       var = c("A"))
print(position)
equal_compare(position)

# Here we remove the entry as the samples are the same
position <- dplyr::tibble(ENROLLID1 = c(300294),
                       ENROLLID2 = c(300293),
                       mutation = c("M_C806A"),
                       freq1 = c(1),
                       freq2 = c(0.99),
                       chr = c("M"),
                       pos = c(806),
                       ref = c("C"),
                       var = c("A"))
print(position)
equal_compare(position)

jtmccr1/HIVEr documentation built on May 29, 2019, 1:50 a.m.