format_genotypes: Format genotype calls

format_genotypes    R Documentation

Format genotype calls

Description

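Format genotype calls into a dosage matrix, filter variants by missing rate, R-square, MAF, and HWE p value, and optionally write a VCF file.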

Usage

format_genotypes(
  genotypes,
  vcf = FALSE,
  vcfName,
  GP_cutoff = 0.9,
  outlier_cutoff = "max",
  missing_cutoff = 0.1,
  R2_cutoff_up = 1.1,
  R2_cutoff_down = 0.75,
  MAF_cutoff = 0.01,
  HWE_cutoff = 1e-06,
  pop = "ALL",
  type,
  plotAF = FALSE,
  platform = "EPIC"
)

Arguments

genotypes

Genotype calls.

vcf

If TRUE, write a VCF file to the current directory.

vcfName

Name of the VCF file. Only used when vcf = TRUE.

GP_cutoff

When calculating the missing rate, genotypes whose highest genotype probability is below GP_cutoff are treated as missing.
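
As an illustration of the rule above, here is a minimal sketch in base R. The probability matrix, its layout (one row per sample, one column per genotype), and the variable names are assumptions made for the example; they are not the package's internal data structures.

```r
# Hypothetical genotype probabilities for four samples at one variant.
# Columns are P(AA), P(AB), P(BB); the values are made up for illustration.
gp <- matrix(
  c(0.98, 0.01, 0.01,
    0.50, 0.45, 0.05,
    0.02, 0.95, 0.03,
    0.30, 0.10, 0.60),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), c("AA", "AB", "BB"))
)
GP_cutoff <- 0.9

# Highest genotype probability for each sample
max_gp <- apply(gp, 1, max)

# A call counts as missing when its best probability is below the cutoff
is_missing <- max_gp < GP_cutoff

# Missing rate of this variant across samples
missing_rate <- mean(is_missing)
```

In this toy data, two of the four calls fall below the cutoff, so the variant would later be compared against missing_cutoff with a missing rate of one half.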

outlier_cutoff

"max" or a number between 0 and 1. If outlier_cutoff = "max", genotypes whose outlier probability is larger than all three genotype probabilities are set to missing. If outlier_cutoff is a number, genotypes with outlier probability > outlier_cutoff are set to missing.
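
The two modes can be sketched as follows. The helper flag_outliers() is hypothetical and written only for this example; the input layout (a samples-by-three probability matrix plus a vector of outlier probabilities) is likewise an assumption, not the package's actual interface.

```r
# Hypothetical helper (not part of the package): flag genotypes to set as missing.
# gp:        samples x 3 matrix of genotype probabilities
# outlier_p: outlier probability for each sample
flag_outliers <- function(gp, outlier_p, outlier_cutoff = "max") {
  if (identical(outlier_cutoff, "max")) {
    # "max" mode: outlier probability must exceed every genotype probability
    outlier_p > apply(gp, 1, max)
  } else {
    # numeric mode: compare against a fixed threshold in [0, 1]
    stopifnot(is.numeric(outlier_cutoff),
              outlier_cutoff >= 0, outlier_cutoff <= 1)
    outlier_p > outlier_cutoff
  }
}

# Made-up probabilities for two samples
gp <- matrix(c(0.70, 0.10, 0.05,
               0.20, 0.15, 0.10), ncol = 3, byrow = TRUE)
outlier_p <- c(0.15, 0.55)

flag_max <- flag_outliers(gp, outlier_p, "max")
flag_num <- flag_outliers(gp, outlier_p, 0.1)
```

With these values, "max" mode flags only the second sample, while a numeric cutoff of 0.1 flags both, so a small numeric cutoff is the stricter choice.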

missing_cutoff

Missing rate cutoff to filter variants. Note that for VCF output, variants with missing rate above the cutoff will be marked in the FILTER column. For the returned dosage matrix, variants with missing rate above the cutoff will be removed.
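
The difference between the two outputs can be sketched like this. The dosage matrix orientation (variants in rows, samples in columns) and the label "MISSING" are assumptions for illustration; the actual FILTER strings written by the package may differ.

```r
# Made-up dosage matrix: 3 variants (rows) x 4 samples (columns), NA = missing call
dosage <- matrix(
  c( 0,  1, 2, NA,
    NA, NA, 1,  0,
     2,  2, 1,  1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("var1", "var2", "var3"), paste0("s", 1:4))
)
missing_cutoff <- 0.1

# Fraction of missing calls for each variant
missing_rate <- rowMeans(is.na(dosage))

# VCF-style handling: every variant is kept, failing ones are only labelled
filter_label <- ifelse(missing_rate > missing_cutoff, "MISSING", "PASS")

# Dosage-matrix handling: failing variants are dropped
kept <- dosage[missing_rate <= missing_cutoff, , drop = FALSE]
```

Here var1 and var2 exceed the cutoff, so they would be labelled in the VCF but absent from the returned matrix, which keeps only var3.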

R2_cutoff_up, R2_cutoff_down

R-square cutoffs to filter variants: variants with R-square > R2_cutoff_up or < R2_cutoff_down are filtered. Note that for VCF output, variants with R-square outside this range will be marked in the FILTER column. For the returned dosage matrix, variants with R-square outside this range will be removed.
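
A short sketch of the two-sided range check, using made-up R-square values. How the package computes R-square is not shown here; the values and variant names are assumptions for illustration only.

```r
# Hypothetical per-variant R-square values
r2 <- c(var1 = 0.60, var2 = 0.92, var3 = 1.15, var4 = 1.00)
R2_cutoff_up <- 1.1
R2_cutoff_down <- 0.75

# A variant passes when its R-square lies within [R2_cutoff_down, R2_cutoff_up]
in_range <- r2 >= R2_cutoff_down & r2 <= R2_cutoff_up

# Names of the variants that pass
passing <- names(r2)[in_range]
```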

MAF_cutoff

MAF cutoff to filter variants. Note that for VCF output, variants with MAF below the cutoff will be marked in the FILTER column. For the returned dosage matrix, variants with MAF below the cutoff will be removed.
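
For reference, a minor allele frequency can be derived from dosages as sketched below. This assumes dosages count copies of one allele (0, 1, or 2) and that missing calls are ignored; the package's own calculation may differ.

```r
# Made-up dosages for one variant across eight samples (NA = missing call)
dosage <- c(0, 1, 2, 1, 0, 0, NA, 1)
MAF_cutoff <- 0.01

# Allele frequency: counted allele copies over twice the number of called samples
af <- sum(dosage, na.rm = TRUE) / (2 * sum(!is.na(dosage)))

# Minor allele frequency is the smaller of the two allele frequencies
maf <- min(af, 1 - af)

# The variant passes when its MAF is not below the cutoff
passes_maf <- maf >= MAF_cutoff
```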

HWE_cutoff

HWE p value cutoff to filter variants. Note that for VCF output, variants with HWE p value below the cutoff will be marked in the FILTER column. For the returned dosage matrix, variants with HWE p value below the cutoff will be removed.
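
To show what an HWE p value measures, here is a sketch using a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function hwe_chisq_p() is hypothetical, and the choice of a chi-square test is an assumption; the package may use a different test (for example, an exact test). The sketch does not handle monomorphic variants, where expected counts are zero.

```r
# Hypothetical helper: chi-square HWE p value from genotype counts
hwe_chisq_p <- function(n_AA, n_AB, n_BB) {
  n <- n_AA + n_AB + n_BB
  # Frequency of allele A estimated from the counts
  p <- (2 * n_AA + n_AB) / (2 * n)
  # Genotype counts expected under Hardy-Weinberg equilibrium
  expected <- n * c(p^2, 2 * p * (1 - p), (1 - p)^2)
  observed <- c(n_AA, n_AB, n_BB)
  stat <- sum((observed - expected)^2 / expected)
  pchisq(stat, df = 1, lower.tail = FALSE)
}

HWE_cutoff <- 1e-06

# Counts that match HWE exactly, and counts with no heterozygotes at all
p_in_hwe  <- hwe_chisq_p(25, 50, 25)
p_off_hwe <- hwe_chisq_p(50, 0, 50)

# Only the second variant would fall below the cutoff
fails <- c(p_in_hwe, p_off_hwe) < HWE_cutoff
```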

pop

Population used to extract allele frequencies (AFs). One of EAS, AMR, AFR, EUR, SAS, or ALL.

type

Probe type. One of snp_probe, typeI_probe, or typeII_probe.

plotAF

If TRUE, plot the distribution of AFs in the 1000 Genomes Project (1KGP) and in the input data.

platform

Array platform. Either "EPIC" or "450K".

Value

A matrix of genotype calls. Variants whose R-square, HWE p value, MAF, or missing rate fails the corresponding cutoff are removed.


Yi-Jiang/MethyGenotyper documentation built on Sept. 4, 2024, 12:47 p.m.