vcfR_to_fasta: Conversion of vcf to fasta Format


View source: R/vcfR_to_fasta.R

Description

This function reads VCF files with the vcfR package and converts them to the FASTA files required by LDJump. We recommend providing a reference sequence (for example, from a genome browser) together with the starting position. The default parameter values are those of the vcfR package.

Usage

vcfR_to_fasta(seqName, refName = NULL, ext.ind = T, cons = F,
              ext.haps = T, start = NULL, ref = NULL, fa_start = NULL,
              fa_end = NULL, attr_name = NULL)

Arguments

seqName

A character string giving the full path and name of the sequence file. The file extension must be included in order to run LDJump (seqName = "fileName.vcf").

refName

An optional full path, including file name and extension (".vcf"), to the reference sequence for the region of interest, downloaded from e.g. http://phase3browser.1000genomes.org/index.html. Use only when format == "vcf".

ext.ind

See package vcfR for details (vcfR2DNAbin, extract.indels)

cons

See package vcfR for details (vcfR2DNAbin, consensus)

ext.haps

See package vcfR for details (vcfR2DNAbin, extract.haps)

start

An optional integer value giving the starting position of the sequences in bp. Use only when format == "vcf".

ref

A character string giving the name of the reference sequence file. If the working directory is not set to the location of the file, the complete path to the file has to be provided, e.g. ref = "/home/LDJump/refseq.fa". The reference sequence is required because it is used together with the vcfR package to convert each VCF segment into a FASTA file.

fa_start

An integer value used to subset the reference sequence when converting VCF segments to FASTA. It does not have to be provided in the function call; it is initialized and computed inside the function vcf_statistics.

fa_end

An integer value used to subset the reference sequence when converting VCF segments to FASTA. It does not have to be provided in the function call; it is initialized and computed inside the function vcf_statistics.

attr_name

A character string giving the chromosome number of the reference file. For example, a reference file with the FASTA header ">21 dna:chromosome:GRCh37:21:41000000:41010000:1" contains a segment of chromosome 21 ranging from position 41000000 to 41010000. In vcf_statistics, this header is used to retrieve the chromosome number "21" for the conversion step.
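The roles of attr_name, fa_start and fa_end can be illustrated with a minimal base R sketch. It is not the code used inside vcf_statistics: the sequence lines, the variable names region_start, region_end and segment, and the choice to treat fa_start and fa_end as 1-based offsets into the reference string are all assumptions made for illustration. Only the FASTA header is taken from the description above.

```r
# Hypothetical reference FASTA content. The header is the one quoted in the
# attr_name description; the sequence lines are made up for illustration.
fasta_lines <- c(
  ">21 dna:chromosome:GRCh37:21:41000000:41010000:1",
  "ACGTACGTAC",
  "GGCCTTAAGG"
)

header <- fasta_lines[1]

# Chromosome label: the text between ">" and the first space.
attr_name <- sub("^>([^ ]+).*$", "\\1", header)

# Colon-separated header fields; fields 5 and 6 hold the region bounds.
fields <- strsplit(header, ":", fixed = TRUE)[[1]]
region_start <- as.integer(fields[5])
region_end <- as.integer(fields[6])

# Join the sequence lines and cut out a window, analogous to how
# fa_start and fa_end subset the reference (assumed 1-based offsets).
ref_seq <- paste(fasta_lines[-1], collapse = "")
fa_start <- 5L
fa_end <- 12L
segment <- substr(ref_seq, fa_start, fa_end)

print(attr_name)
print(c(region_start, region_end))
print(segment)
```

Whether the package interprets fa_start and fa_end as offsets or as absolute genomic coordinates is not stated in this help page, so the subsetting step should be read as a sketch of the idea only.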

Value

The function prints a message reporting that the file has been converted.

Author(s)

Philipp Hermann philipp.hermann@jku.at, Andreas Futschik, Fardokhtsadat Mohammadi fardokht.fm@gmail.com

References

Knaus BJ and Grünwald NJ (2017). VCFR: a package to manipulate and visualize variant call format data in R. Molecular Ecology Resources, 17(1), pp. 44-53. http://dx.doi.org/10.1111/1755-0998.12549

See Also

LDJump, summary_statistics, getPhi, get_smuce, vcfR2DNAbin

Examples

##### Do not run these examples                                #####
##### vcfR_to_fasta (seqName, refName, ext.ind = T, cons = F,  #####
#####                ext.haps = T, start = 1)                  #####
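Because ext.ind, cons and ext.haps refer to the vcfR2DNAbin options extract.indels, consensus and extract.haps, the following sketch shows those options in a direct vcfR call. It does not call vcfR_to_fasta itself and is not taken from the LDJump sources. It assumes that the vcfR and ape packages are installed and uses the vcfR_test example data shipped with vcfR instead of a real LDJump input file; the object names dna and out_file are chosen here for illustration.

```r
library(vcfR)  # provides vcfR2DNAbin() and the vcfR_test example data
library(ape)   # provides write.dna() for FASTA output

# Small example VCF object bundled with vcfR (not an LDJump data set).
data(vcfR_test)

# Option values matching the vcfR_to_fasta defaults:
# ext.ind = T -> extract.indels, cons = F -> consensus,
# ext.haps = T -> extract.haps
dna <- vcfR2DNAbin(vcfR_test,
                   extract.indels = TRUE,
                   consensus = FALSE,
                   extract.haps = TRUE,
                   verbose = FALSE)

# Write the resulting sequences to a temporary FASTA file.
out_file <- tempfile(fileext = ".fa")
write.dna(dna, file = out_file, format = "fasta")
```

Without a reference sequence, vcfR2DNAbin returns only the variable sites; vcfR_to_fasta additionally takes a reference (ref) and a starting position (start), which are not reproduced in this sketch.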

PhHermann/LDJump documentation built on Nov. 16, 2019, 12:53 p.m.