format_sequencing: format_sequencing

View source: R/format_sequencing.R

format_sequencing R Documentation

format_sequencing

Description

format_sequencing

Usage

format_sequencing(
  cohort_name,
  input_vcfs,
  output_directory,
  ref1kg_vcfs,
  ref1kg_maf,
  recode,
  vcf_half_call,
  bin_path
)

Arguments

cohort_name

A character. A name to describe the studied population compared to 1,000 Genomes.

input_vcfs

A character. A path to one or several VCF files.

output_directory

A character. The path where the data and figures are written.

ref1kg_vcfs

A character. A path to the reference VCF files (i.e., 1,000 Genomes sequencing data).

ref1kg_maf

A numeric. The minor allele frequency (MAF) threshold for SNPs in 1,000 Genomes.

recode

A character. Which VCFs should be filtered and recoded, either "all" or "input".

vcf_half_call

A character. The mode used to handle half-calls.

+ 'haploid'/'h': Treat half-calls as haploid/homozygous (the PLINK 1 file format does not distinguish between the two). This maximizes similarity between the VCF and BCF2 parsers.
+ 'missing'/'m': Treat half-calls as missing (default).
+ 'reference'/'r': Treat the missing part as reference.
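
To make the three half-call modes concrete, the toy function below rewrites a single diploid genotype string such as "./1" under each mode. It is only an illustration of the semantics described above: `resolve_half_call` is a hypothetical helper, not part of this package or of PLINK, and it always returns an unphased genotype.

```r
# Hypothetical helper illustrating the three half-call modes on one genotype.
resolve_half_call <- function(gt, mode = c("missing", "haploid", "reference")) {
  mode <- match.arg(mode)
  # Split on the unphased ("/") or phased ("|") separator.
  alleles <- strsplit(gt, "[/|]")[[1]]
  is_missing <- alleles == "."
  # Only half-calls (exactly one of two alleles missing) are rewritten.
  if (length(alleles) != 2 || sum(is_missing) != 1) {
    return(gt)
  }
  called <- alleles[!is_missing]
  switch(
    mode,
    missing = "./.",                                        # whole genotype set to missing
    haploid = paste(called, called, sep = "/"),             # treated as homozygous
    reference = paste(sort(c("0", called)), collapse = "/") # missing part becomes REF ("0")
  )
}

resolve_half_call("./1", "missing")
resolve_half_call("./1", "haploid")
resolve_half_call("./1", "reference")
```

Fully called genotypes (e.g. "0/1") and fully missing ones ("./.") are returned unchanged, since they are not half-calls.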

bin_path

A list(character). A list giving the paths to the vcftools, bcftools, bgzip, tabix and plink binaries.
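
Examples

The call below is a minimal sketch of how the arguments in Usage might be filled in. Every path, the cohort name, the MAF value of 0.05 and the element names of `bin_path` are assumptions made for illustration. It also assumes the function is exported from the rain package. The call is guarded so that nothing runs unless the package and the binaries are actually available.

```r
# Hypothetical binary locations; the element names are assumed, not documented.
bin_path <- list(
  vcftools = "/usr/bin/vcftools",
  bcftools = "/usr/bin/bcftools",
  bgzip = "/usr/bin/bgzip",
  tabix = "/usr/bin/tabix",
  plink = "/usr/bin/plink"
)

# Hypothetical inputs; replace with paths that exist on your system.
args <- list(
  cohort_name = "my_cohort",
  input_vcfs = "/path/to/cohort/vcfs",
  output_directory = file.path(tempdir(), "format_sequencing"),
  ref1kg_vcfs = "/path/to/1kg/vcfs",
  ref1kg_maf = 0.05,           # assumed threshold, for illustration only
  recode = "all",              # or "input"
  vcf_half_call = "missing",   # documented default mode
  bin_path = bin_path
)

# Only run when the package is installed and all binaries are present.
can_run <- requireNamespace("rain", quietly = TRUE) &&
  all(file.exists(unlist(bin_path)))
if (can_run) {
  dir.create(args$output_directory, recursive = TRUE, showWarnings = FALSE)
  do.call(rain::format_sequencing, args)
}
```

Building the arguments as a named list keeps the call easy to inspect before any external tool is invoked.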


mcanouil/rain documentation built on Nov. 28, 2022, 10:40 a.m.