phaseChromosome: Wrapper function for StrandPhaseR to phase a single...

View source: R/phaseChromsome.R

phaseChromosome R Documentation

Wrapper function for StrandPhaseR to phase a single chromosome.

Description

This function iterates over the BAM files in a folder and performs several processing steps (see Details).

Usage

phaseChromosome(
  inputfolder,
  outputfolder = "./StrandPhaseR_analysis",
  positions = NULL,
  WCregions = NULL,
  chromosome = NULL,
  pairedEndReads = TRUE,
  min.mapq = 10,
  min.baseq = 30,
  num.iterations = 2,
  translateBases = TRUE,
  concordance = 0.9,
  fillMissAllele = NULL,
  splitPhasedReads = FALSE,
  compareSingleCells = FALSE,
  exportVCF = NULL,
  bsGenome = NULL,
  ref.fasta = NULL,
  assume.biallelic = FALSE
)
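
A call might look like the following sketch. The folder and file names are hypothetical placeholders, not files shipped with the package. The call is guarded so that it runs only when StrandPhaseR is installed and the input folder exists.

```r
# Hypothetical paths; replace them with your own data.
args <- list(
  inputfolder    = "bams/",                # assumed folder of Strand-seq BAM files
  outputfolder   = "./StrandPhaseR_analysis",
  positions      = "chr1_snvs.txt",        # assumed file in "chrName SNVpos" format
  WCregions      = "chr1_WCregions.txt",   # assumed file in "chrName:Start:End:FileName" format
  chromosome     = "chr1",
  pairedEndReads = TRUE,
  min.mapq       = 10,
  min.baseq      = 30
)

# Attempt the call only when the package and the input folder are available.
if (requireNamespace("StrandPhaseR", quietly = TRUE) && dir.exists(args$inputfolder)) {
  do.call(StrandPhaseR::phaseChromosome, args)
}
```

Arguments left out of the list keep the defaults shown in the usage above.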

Arguments

inputfolder

Path to the folder containing the BAM files to process.

outputfolder

Output directory. If non-existent it will be created.

positions

Name of a file listing the SNV positions for the given chromosome (format: chrName SNVpos).
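
As an illustration of the "chrName SNVpos" layout, the sketch below writes a small file and reads it back. It assumes whitespace-separated columns and no header line; the coordinates and column names are made up for the example.

```r
# Write a tiny example file in the assumed "chrName SNVpos" layout.
pos_file <- tempfile(fileext = ".txt")
writeLines(c("chr1 10583", "chr1 13302", "chr1 16378"), pos_file)

# Read it back as a two-column table (no header assumed).
snvs <- read.table(pos_file, header = FALSE,
                   col.names = c("chrName", "SNVpos"),
                   stringsAsFactors = FALSE)
print(snvs)
```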

WCregions

Name of a file listing all WC regions for the given chromosome (format: chrName:Start:End:FileName).
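
The "chrName:Start:End:FileName" layout can be split on the colon separator. The sketch below uses hypothetical entries and assumes that file names contain no colon of their own.

```r
# Hypothetical entries in the "chrName:Start:End:FileName" layout.
wc_lines <- c("chr1:1:5000000:cell01.bam",
              "chr1:7500000:12000000:cell02.bam")

# Split each entry on ":" and bind the pieces into a character matrix.
parts <- do.call(rbind, strsplit(wc_lines, ":", fixed = TRUE))

# Convert the matrix into a data frame with numeric coordinates.
wc <- data.frame(
  chrName  = parts[, 1],
  start    = as.numeric(parts[, 2]),
  end      = as.numeric(parts[, 3]),
  fileName = parts[, 4],
  stringsAsFactors = FALSE
)
print(wc)
```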

chromosome

If only a subset of the chromosomes should be processed, specify them here.

pairedEndReads

Set to TRUE if you have paired-end reads in your file.

min.mapq

Minimum mapping quality when importing from BAM files.

min.baseq

Minimum base quality to consider a base for phasing.

num.iterations

Number of iterations used to sort the Watson and Crick matrices.

translateBases

If TRUE, translates integer-coded bases (1, 2, 3, 4) into letters (A, C, G, T).
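
The translation amounts to a lookup from code to letter. The sketch below is an illustration in base R rather than the package's own code, and it assumes that 0 marks an uncovered position.

```r
# Lookup vector: position i holds the letter for integer code i.
base_letters <- c("A", "C", "G", "T")

# Example integer-coded bases; 0 stands in for "no coverage" (an assumption).
codes <- c(1, 3, 0, 4, 2)

# Codes outside 1..4 become NA instead of being silently dropped.
letters_out <- base_letters[match(codes, 1:4)]
print(letters_out)
```

Going through `match()` keeps the output the same length as the input, which plain indexing with a 0 would not.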

concordance

Required level of agreement between single-cell and consensus haplotypes.
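
One plausible reading of this threshold is the fraction of covered positions at which a cell agrees with the consensus. The sketch below illustrates that reading on toy data; the package's actual computation may differ.

```r
# Toy haplotypes as base letters; NA marks positions the cell does not cover.
consensus <- c("A", "C", "G", "T", "A", "C")
cell      <- c("A", "C", NA,  "T", "G", "C")

# Compare only positions covered in both vectors.
covered   <- !is.na(cell) & !is.na(consensus)
agreement <- mean(cell[covered] == consensus[covered])

# Keep the cell only if agreement reaches the threshold (0.9 is the documented default).
keep_cell <- agreement >= 0.9
```

Here the cell matches the consensus at 4 of 5 covered positions, so it falls below the default threshold.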

fillMissAllele

A path to a single BAM or VCF file for a given sample, used to fill in missing alleles that are not covered in the Strand-seq data.

splitPhasedReads

Set to TRUE if you want to split reads per haplotype.

compareSingleCells

Set to TRUE if you want to compare haplotypes at the single-cell level.

exportVCF

Ideally a sample ID. If defined, the phased haplotypes are exported to a separate VCF file.

bsGenome

A BSgenome object containing the reference genome used to infer reference alleles.

ref.fasta

A user-defined reference FASTA file from which to extract the reference allele for all SNV positions.

assume.biallelic

If set to TRUE, parameter 'snv.positions' is expected to contain biallelic loci (0/1, 1/0), and gaps in haplotypes will be filled accordingly.

Details

1. Extract variable positions in WC regions.
2. Fill two matrices separately for SNVs found in Watson and Crick reads.
3. Sort the matrices so that each column in each matrix has the lowest number of conflicting bases.
4. Exclude rows/cells which cannot be reliably assigned to only one matrix consensus.
5. For successfully phased rows/cells, export W and C reads as separate haplotype-specific GRanges objects.
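
The sorting step depends on some count of conflicting bases per column. The toy sketch below shows one way such a count could be defined: the number of covered entries that differ from the majority base. The matrix layout, the use of 0 for missing data, and the helper `count_conflicts` are assumptions made for illustration, not the package's scoring.

```r
# Rows are cells, columns are SNV positions; 0 means "not covered" (assumption).
m <- matrix(c(1, 1, 0,
              1, 3, 2,
              1, 1, 2,
              0, 1, 4),
            nrow = 4, byrow = TRUE)

# Conflicts in one column: covered entries that differ from the majority base.
count_conflicts <- function(column) {
  covered <- column[column > 0]
  if (length(covered) == 0) return(0)
  length(covered) - max(table(covered))
}

# Apply the helper to every column of the matrix.
conflicts <- apply(m, 2, count_conflicts)
print(conflicts)
```

A sorting procedure could then move rows between the two matrices and keep moves that lower the total of these counts.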

Author(s)

David Porubsky


daewoooo/StrandPhaseR documentation built on April 7, 2024, 7:13 p.m.