indelDiff: main function to call indel difference between the bam files

Description Usage Arguments Value Author(s) References Examples

View source: R/indelDiff.R

Description

test indel-read count differences at a given indel position between the two bam files. The indel position are obtained by samtools+bcftools first, and count the number of reads that span no less than 3bp of the indel boundary. The read-count matrix at a given indel region from the two bam files are tested by fisher exact test and euclidean distance. If nothing difference, NULL will be returned.

Usage

1
indelDiff(bam1, bam2, refFsa, regChr, regStart, regEnd, minBaseQuality = 13, minMapQuality = 0, nCores = 1, pValueCutOff= 0.05,gtDistCutOff = 0.1,verbose = TRUE)

Arguments

bam1

the first bam file to be compared

bam2

the second bam file to be compared

refFsa

the reference fasta file used for bam1 and bam2 alignments

regChr

chromosome name of the region of interest, it should match the chromosome name in reference name

regStart

the start position (1-based) of the region of interest

regEnd

the end position (1-based) of the region of interest

minBaseQuality

the minimum base quality to be used for indel-read count

minMapQuality

the minimum read mapping quality to be used for indel-read count

nCores

number of thread used for calculate in parallel

pValueCutOff

p.value cutoff from fisher.test to display output. If there is no difference between two compared positions (p.value = 1 and d.value = 0), NULL will be returned even setting pValueCutOff = 1.

gtDistCutOff

euclidean distance cutoff from dist(,method='euclidean') to display output. If there is no difference between two compared positions (p.value = 1 and d.value = 0), NULL will be returned even setting gtDistCutOff = 0.

verbose

print progress on screen, default = TRUE.

Value

indelDiff: returns a data.frame with difference information: chromosome, position, reference genenotype, two alt genotypes, and their indel-read count for two bam files, p.value (fisher exact test of these read counts) and d.value (euclidean distance of these read counts)

Author(s)

Xiaobin Xing, <email:xiaobinxing0316@gmail.com>

References

Li H.*, Handsaker B.*, Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics, 25, 2078-9. [PMID: 19505943]

Examples

1
2
3
4
5
bam1 <- system.file(package='SICtools','extdata','example1.bam')
bam2 <- system.file(package='SICtools','extdata','example2.bam')
refFsa <- system.file(package='SICtools','extdata','example.ref.fasta')

indelDiff(bam1,bam2,refFsa,'chr07',828514,828914,pValueCutOff=1,gtDistCutOff=0)

SICtools documentation built on Nov. 8, 2020, 5:12 p.m.