pseudoRef: Make a pseudo reference genome.


View source: R/pseudoRef.R

Description

pseudoRef makes a pseudo reference genome by substituting the SNPs in snpdt into the reference FASTA sequence, one output per sample.

Usage

pseudoRef(fa, snpdt, sidx = 5:ncol(snpdt), arules = NULL, outdir)

Arguments

fa

Path for the reference fasta file. [string or DNAStringSet/DNAString object]

snpdt

A data.table object in which heterozygous SNPs are coded with IUPAC ambiguity codes. [data.table, 4 required columns: chr, pos, ref, alt, followed by the sample columns (sample1, ..., sampleN)]

sidx

A vector of column indices identifying the sample columns in snpdt. [vector, default=5:ncol(snpdt)].

arules

Additional nucleotide substitution rules defined by users. [data.frame, 2 required columns: from, to, default=NULL] For example, arules <- data.frame(from=c("M", "Y", "R", "K"), to=c("C", "C", "G", "T")).

outdir

Output directory. Sample specific sub-folders will be created. [string]
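
The argument descriptions above can be made concrete with a small sketch. It builds a toy table in the snpdt layout, derives the default sidx, and applies the arules mapping from the example above to the sample columns. The table values and the helper apply_rules are hypothetical and exist only for illustration; this is not pseudoRef's internal implementation, and it is written in base R so that it runs without the package installed.

```r
# Toy SNP table in the layout described above (made-up values, not real data).
# Columns 1-4 are chr, pos, ref, alt; columns 5 onwards hold one
# IUPAC-coded genotype per sample.
snps <- data.frame(
  chr = c("1", "1", "2"),
  pos = c(101L, 250L, 77L),
  ref = c("A", "C", "G"),
  alt = c("C", "T", "A"),
  sample1 = c("M", "C", "R"),  # M = A/C heterozygote, R = A/G heterozygote
  sample2 = c("A", "Y", "A"),  # Y = C/T heterozygote
  stringsAsFactors = FALSE
)

# Default sample index: every column from the fifth onwards.
sidx <- 5:ncol(snps)

# User-defined substitution rules, copied from the arules description.
arules <- data.frame(from = c("M", "Y", "R", "K"),
                     to   = c("C", "C", "G", "T"),
                     stringsAsFactors = FALSE)

# Hypothetical helper (not part of pseudoRef): replace codes listed in
# rules$from with the matching rules$to and leave all other codes unchanged.
apply_rules <- function(codes, rules) {
  hit <- match(codes, rules$from)
  ifelse(is.na(hit), codes, rules$to[hit])
}

resolved <- snps
resolved[sidx] <- lapply(snps[sidx], apply_rules, rules = arules)
print(resolved)

# pseudoRef() expects a data.table; convert if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  snpdt <- data.table::as.data.table(snps)
}
```

With these rules, every ambiguity code listed in from is resolved to a single nucleotide, while homozygous calls pass through untouched.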

Value

A list of summary statistics of substituted nucleotides. [list].

Examples

# First, use BCFtools to convert the VCF into an IUPAC-coded table:

# bcftools view JRI20_filtered_snps_annot.bcf.gz -m2 -M2 -v snps -Oz -o JRI20_bi_snps_annot.vcf.gz
# bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%IUPACGT]\n' JRI20_bi_snps_annot.vcf.gz > JRI20_bi_snps_annot.txt
# bcftools query -f 'chr\tpos\tref\talt[\t%SAMPLE]\n' JRI20_bi_snps_annot.vcf.gz > JRI20_bi_snps_annot.header

# fa, snpdt and outdir must already be defined before this call:
# fa is the reference FASTA path (or a DNAStringSet/DNAString object),
# snpdt is the data.table built from the files above, and
# outdir is the output directory.
arules <- data.frame(from=c("M", "Y", "R", "K"), to=c("C", "C", "G", "T"))
res <- pseudoRef(fa, snpdt, sidx=5:24, arules=arules, outdir=outdir)
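
The example above leaves open how the two bcftools outputs become snpdt. The following is a minimal sketch of one way to do it, assuming the column names are taken from the first line of the .header output and the genotypes from the .txt output. The inline strings are made-up stand-ins for those files, and base R is used so the sketch runs without extra packages; for real files, data.table::fread would be the more usual choice.

```r
# Inline stand-ins for the two bcftools outputs (toy values, not real data).
header_txt <- "chr\tpos\tref\talt\tsampleA\tsampleB"
body_txt <- c("1\t101\tA\tC\tM\tA",
              "1\t250\tC\tT\tC\tY")

# Column names come from the first line of the header output.
cols <- strsplit(header_txt, "\t", fixed = TRUE)[[1]]

# Read every column as character except pos. Without explicit classes,
# read.table would parse a column containing only "T" as logical TRUE.
snps <- read.table(
  text = body_txt, sep = "\t", col.names = cols,
  colClasses = c("character", "integer",
                 rep("character", length(cols) - 2)),
  stringsAsFactors = FALSE
)
str(snps)

# pseudoRef() expects a data.table; convert if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  snpdt <- data.table::as.data.table(snps)
}
```

Setting the column classes explicitly is the main design choice here: nucleotide columns should stay character so that "T" alleles are not silently converted.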

yangjl/pseudoRef documentation built on March 30, 2020, 7:47 p.m.