format_sequencing: format_sequencing

View source: R/estimate_ethnicity.R

Description

format_sequencing

Usage

format_sequencing(cohort_name, input_vcfs, output_directory, ref1kg_vcfs,
  ref1kg_maf, recode, vcf_half_call, bin_path)
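
As a rough illustration of how the arguments fit together, the sketch below builds an argument list and only calls the function when CARoT is installed and every path exists. All paths, the cohort name, the 0.05 MAF value, and the element names inside `bin_path` are assumptions made for this example; they are not documented defaults.

```r
# Placeholder values only: every path, the cohort name, the MAF threshold and
# the names used inside `bin_path` are assumptions, not package defaults.
args <- list(
  cohort_name = "my_cohort",
  input_vcfs = "/path/to/input_vcfs",
  output_directory = "/path/to/output",
  ref1kg_vcfs = "/path/to/1kg_vcfs",
  ref1kg_maf = 0.05,
  recode = "all",
  vcf_half_call = "missing",
  bin_path = list(
    vcftools = "/path/to/bin/vcftools",
    bcftools = "/path/to/bin/bcftools",
    bgzip = "/path/to/bin/bgzip",
    tabix = "/path/to/bin/tabix",
    plink1.9 = "/path/to/bin/plink1.9"
  )
)

# Run only when CARoT is installed and all placeholder paths have been
# replaced with files or directories that actually exist.
paths <- c(
  args$input_vcfs, args$output_directory, args$ref1kg_vcfs,
  unlist(args$bin_path)
)
ready <- requireNamespace("CARoT", quietly = TRUE) && all(file.exists(paths))
if (ready) {
  # getFromNamespace() is used because this page does not say whether
  # format_sequencing() is exported.
  do.call(utils::getFromNamespace("format_sequencing", "CARoT"), args)
}
```

With the placeholder paths left as they are, `ready` is `FALSE` and nothing is executed.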

Arguments

cohort_name

A character. A name to describe the studied population compared to 1,000 Genomes.

input_vcfs

A character. A path to one or several VCF files.

output_directory

A character. The path where the data and figures are written.

ref1kg_vcfs

A character. A path to the reference VCF files (i.e., 1,000 Genomes sequencing data).

ref1kg_maf

A numeric. The minor allele frequency (MAF) threshold for SNPs in 1,000 Genomes.

recode

A character. Which VCFs should be filtered and recoded, either "all" or "input".

vcf_half_call

A character. The mode used to handle half-calls:

+ 'haploid'/'h': Treat half-calls as haploid/homozygous (the PLINK 1 file format does not distinguish between the two). This maximizes similarity between the VCF and BCF2 parsers.
+ 'missing'/'m': Treat half-calls as missing (default).
+ 'reference'/'r': Treat the missing part as reference.

bin_path

A list(character). A list giving the binary paths of vcftools, bcftools, bgzip, tabix and plink1.9.
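
To make the three `vcf_half_call` modes concrete, the sketch below applies them to plain genotype strings such as `"0/."`. The helper `resolve_half_call()` is hypothetical and written only for this illustration. It is not part of CARoT or PLINK and does not reproduce how either parses VCFs.

```r
# Hypothetical helper, for illustration only: applies the documented
# half-call modes to a single diploid GT string.
resolve_half_call <- function(gt, mode = c("missing", "haploid", "reference")) {
  # match.arg() allows partial matching, so "m", "h" and "r" are accepted too.
  mode <- match.arg(mode)
  alleles <- strsplit(gt, "[/|]")[[1]]
  is_missing <- alleles == "."
  # A half-call has exactly one missing allele out of two;
  # any other genotype is returned unchanged.
  if (length(alleles) != 2 || sum(is_missing) != 1) {
    return(gt)
  }
  known <- alleles[!is_missing]
  if (mode == "missing") {
    # Treat the whole genotype as missing.
    "./."
  } else if (mode == "haploid") {
    # Treat the known allele as homozygous.
    paste(known, known, sep = "/")
  } else {
    # Replace the missing allele with the reference allele (0).
    alleles[is_missing] <- "0"
    paste(alleles, collapse = "/")
  }
}

resolve_half_call("./1", "missing")
resolve_half_call("./1", "haploid")
resolve_half_call("./1", "reference")
```

For simplicity the output always uses `/` as the separator, so any phasing information in the input is dropped.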


mcanouil/CARoT documentation built on Oct. 17, 2019, 4:36 p.m.