VCFtoRA: Convert VCF file into reference/alternative (RA) file.


VCFtoRA R Documentation

Convert VCF file into reference/alternative (RA) file.

Description

Function for converting a VCF file into RA format.

Usage

VCFtoRA(infilename, direct = "./", makePed = F)

Arguments

infilename

String giving the filename of the VCF file to be converted to RA format.

direct

String giving the directory (absolute, or relative to the working directory) where the RA file is to be written. Defaults to the current working directory.

makePed

A logical value. If TRUE, a pedigree file is initialized. Defaults to FALSE.

Details

The VCF file must contain allelic depth information. Currently, the function can use one of the following fields (or groups of fields) in a VCF file:

  • AD field

  • AO and RO fields

  • DP4 field

Information regarding VCF files and their format can be found at the samtools GitHub page.
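To illustrate how reference and alternative read counts could be derived from each of the fields listed above, here is a minimal base R sketch. The helper allele_depths() is hypothetical and is not the code used inside VCFtoRA. It assumes AD holds "ref,alt" counts, RO and AO hold the reference and alternative counts respectively, and DP4 is available per sample as "ref-forward,ref-reverse,alt-forward,alt-reverse".

```r
# Hypothetical helper (not part of GUSbase): derive reference/alternative
# read counts for one sample from the FORMAT keys and the sample string.
allele_depths <- function(format, sample) {
  keys <- strsplit(format, ":", fixed = TRUE)[[1]]
  vals <- strsplit(sample, ":", fixed = TRUE)[[1]]
  # Trailing fields may be dropped in a VCF, so name only the values present
  vals <- setNames(vals, keys[seq_along(vals)])

  # Split a comma-separated field into integers; "." (missing) becomes 0
  counts <- function(x) {
    n <- suppressWarnings(as.integer(strsplit(x, ",", fixed = TRUE)[[1]]))
    n[is.na(n)] <- 0L
    n
  }
  has <- function(k) all(k %in% names(vals))

  if (has("AD")) {
    ad <- counts(vals[["AD"]])
    out <- c(ad[1], ad[2])
  } else if (has(c("RO", "AO"))) {
    out <- c(counts(vals[["RO"]])[1], counts(vals[["AO"]])[1])
  } else if (has("DP4")) {
    d <- counts(vals[["DP4"]])
    out <- c(d[1] + d[2], d[3] + d[4])   # sum forward and reverse strands
  } else {
    out <- c(0L, 0L)                     # no depth field: treat as no reads
  }
  out[is.na(out)] <- 0L
  setNames(out, c("ref", "alt"))
}

allele_depths("GT:AD", "0/1:5,3")        # ref = 5, alt = 3
allele_depths("GT:DP4", "0/1:1,2,3,4")   # ref = 3, alt = 7
```

Checking AD first, then RO/AO, then DP4 is an arbitrary choice made for this sketch; the source does not state which field takes precedence when several are present.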

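RA format is a tab-delimited file with the columns CHROM, POS and SAMPLES, where SAMPLES consists of one column per sample, headed by its sample ID. Each entry gives the reference and alternative allele read counts for that sample, separated by a comma. A sample ID typically (but not necessarily) consists of a colon-delimited sampleID, flowcellID, lane and seqlibID. For example: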

CHROM POS 999220:C4TWKACXX:7:56 999204:C4TWKACXX:7:56
1 415 5,0 0,3
1 443 1,0 4,4
1 448 0,0 0,2

Note: Indels are removed, SNPs with multiple alternative alleles are removed, and missing genotypes (./.) are translated into 0,0.
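As an illustration of the layout described above, the following base R sketch writes the example table to a temporary file and splits it into reference and alternative count matrices. It is not a substitute for readRA; the object names (ra_path, ref, alt, id_parts) are made up for this example.

```r
# Write the example RA table to a temporary tab-delimited file
ra_lines <- c(
  "CHROM\tPOS\t999220:C4TWKACXX:7:56\t999204:C4TWKACXX:7:56",
  "1\t415\t5,0\t0,3",
  "1\t443\t1,0\t4,4",
  "1\t448\t0,0\t0,2"
)
ra_path <- tempfile(fileext = ".ra.tab")
writeLines(ra_lines, ra_path)

# check.names = FALSE keeps the colon-delimited sample IDs unchanged
ra <- read.table(ra_path, header = TRUE, sep = "\t",
                 check.names = FALSE, colClasses = "character")
counts <- as.matrix(ra[, -(1:2), drop = FALSE])

# Split each "ref,alt" entry into two integer matrices (SNPs x samples)
ref <- alt <- counts
ref[] <- sub(",.*$", "", counts)   # text before the comma
alt[] <- sub("^.*,", "", counts)   # text after the comma
storage.mode(ref) <- "integer"
storage.mode(alt) <- "integer"

# Break the sample IDs into their colon-delimited components
id_parts <- do.call(rbind, strsplit(colnames(counts), ":", fixed = TRUE))
colnames(id_parts) <- c("sampleID", "flowcellID", "lane", "seqlibID")
```

The last step assumes every sample ID has exactly four colon-delimited parts, which holds for this example but is not required by the format.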

The pedigree file is a csv file with the following columns:

  • SampleID: A unique character string of the sample ID. These must correspond to those found in the VCF file.

  • IndividualID: A character string giving the ID of the individual to which the sample corresponds. Note that several samples can come from the same individual.

  • Mother: The ID of the mother as given in the IndividualID. Note, if the mother is unknown then this should be left blank.

  • Father: The ID of the father as given in the IndividualID. Note, if the father is unknown then this should be left blank.

  • Family: The name of the family for a group of progeny with the same parents. This column is optional, but if given, the name must be the same for all progeny in the family.
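A filled-in pedigree file with the columns above might look like the following base R sketch. The sample and individual IDs are invented for illustration, and the exact contents of the file that VCFtoRA initializes are not shown here.

```r
# Hypothetical pedigree: two parents and three progeny samples, where
# samples S3 and S4 come from the same individual (103)
ped <- data.frame(
  SampleID     = c("S1", "S2", "S3", "S4", "S5"),
  IndividualID = c("101", "102", "103", "103", "104"),
  Mother       = c("", "", "101", "101", "101"),   # blank = unknown
  Father       = c("", "", "102", "102", "102"),
  Family       = c("", "", "F1", "F1", "F1"),
  stringsAsFactors = FALSE
)
ped_path <- tempfile(fileext = ".csv")
write.csv(ped, ped_path, row.names = FALSE, quote = FALSE)

# Read it back as text so that IDs are not converted to numbers
ped_in <- read.csv(ped_path, colClasses = "character")

# Basic consistency checks implied by the column descriptions
stopifnot(!anyDuplicated(ped_in$SampleID))
parents <- setdiff(c(ped_in$Mother, ped_in$Father), "")
stopifnot(all(parents %in% ped_in$IndividualID))
```

Reading every column as character keeps IDs such as "101" as strings, so they can be matched directly against IndividualID.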

Value

A string giving the complete file path and name of the RA file created by the function. If makePed is TRUE and the pedigree file does not already exist, a pedigree file is also initialized in the same folder as the RA file.

Author(s)

Timothy P. Bilton. Adapted from a Python script written by Rudiger Brauning and Rachael Ashby.

See Also

readRA

Examples

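library(GUSbase)
file <- simDS()               # example dataset; its $vcf element is the VCF filename
RAfile <- VCFtoRA(file$vcf)   # convert to RA format; returns the path of the RA file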

tpbilton/GUSbase documentation built on March 8, 2024, 1:35 p.m.