impute.str: Impute STR genotypes from surrounding SNPs and extract...


View source: R/imput_str.R

Description

This function imputes the genotypes of individuals at an STR marker locus from the surrounding SNPs (snp.f), using a reference panel (ref.f) and a genetic map (map.f).

Usage

impute.str(
  snp.f,
  ref.f,
  map.f,
  marker,
  nthreads = 1,
  niterations = 10,
  maxlr = 1e+06,
  lowmem = "false",
  window = 50000,
  overlap = 3000,
  cluster = 0.005,
  ne = 1e+06,
  err = 1e-04,
  seed = -99999,
  modelscale = 0.8
)

Arguments

snp.f

The name (including full path) of a file containing SNPs. Must be in standard VCF format.

ref.f

The name (including full path) of a file containing SNP-STR haplotypes to be used as the reference panel for imputation. Note that the reference panel must be phased and have no missing values.

map.f

The name (including full path) of a genetic map file in PLINK format.

marker

Name of the STR locus to be imputed.

nthreads

(positive integer) Number of threads for BEAGLE, default=1.

niterations

(positive integer) Number of phasing iterations in BEAGLE, default=10.

maxlr

(number ≥ 1) The maximum likelihood ratio at a genotype, default=1000000.

lowmem

(true/false) Whether a memory efficient algorithm should be used, default=false.

window

(positive integer) The number of markers to include in each sliding window, default=50000.

overlap

(positive integer) The number of markers of overlap between sliding windows, default=3000.

cluster

(non-negative number) The maximum cM distance between individual markers that are combined into an aggregate marker when imputing ungenotyped markers, default=0.005.

ne

(integer) The effective population size when imputing ungenotyped markers, default=1000000.

err

(non-negative number) The allele miscall rate, default=0.0001.

seed

(integer) The seed for the random number generator, default=-99999.

modelscale

(positive number) The model scale parameter when sampling haplotypes for unrelated individuals, default=0.8.
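
The constraints listed above can be checked before a long imputation run is started. The following is a minimal sketch only: check_impute_args is a hypothetical helper that is not part of the package, and it tests nothing beyond the ranges stated in this section.

```r
# Hypothetical helper (not part of the package): collects violations of the
# numeric constraints documented in the Arguments section.
check_impute_args <- function(nthreads = 1, niterations = 10, maxlr = 1e6,
                              window = 50000, overlap = 3000,
                              cluster = 0.005, err = 1e-4, modelscale = 0.8) {
  # A single, non-missing numeric value
  is_num <- function(x) is.numeric(x) && length(x) == 1 && !is.na(x)
  # A whole number greater than or equal to 1
  is_pos_int <- function(x) is_num(x) && x >= 1 && x == round(x)

  problems <- character(0)
  if (!is_pos_int(nthreads))    problems <- c(problems, "nthreads must be a positive integer")
  if (!is_pos_int(niterations)) problems <- c(problems, "niterations must be a positive integer")
  if (!is_pos_int(window))      problems <- c(problems, "window must be a positive integer")
  if (!is_pos_int(overlap))     problems <- c(problems, "overlap must be a positive integer")
  if (!(is_num(maxlr) && maxlr >= 1))          problems <- c(problems, "maxlr must be >= 1")
  if (!(is_num(cluster) && cluster >= 0))      problems <- c(problems, "cluster must be non-negative")
  if (!(is_num(err) && err >= 0))              problems <- c(problems, "err must be non-negative")
  if (!(is_num(modelscale) && modelscale > 0)) problems <- c(problems, "modelscale must be positive")
  problems
}

# The documented defaults satisfy every constraint, so nothing is reported.
check_impute_args()
# An out-of-range value yields one message per violated constraint.
check_impute_args(nthreads = 0, err = -1)
```

An empty character vector means that the values are within the documented ranges; it does not guarantee that BEAGLE will accept every combination.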

Details

This function imputes the genotypes of individuals at an STR marker locus from the surrounding SNPs (snp.f), using a reference panel (ref.f) and a genetic map (map.f). The reference panel must be phased and have no missing values. The imputed and phased SNP-STR VCF file is stored in (base.dir)/imputed_str, where base.dir was set by running setup at the beginning of the pipeline. From the imputed SNP-STR VCF, the function extracts the imputed genotypes and genotype probabilities at the STR marker locus and saves them in the same (base.dir)/imputed_str folder. The output file containing the imputed genotype probabilities is named imp_str_(marker).GP.FORMAT.
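
Because the function returns nothing, the results have to be read back from disk. The sketch below shows one way to build the output path and parse the genotype probabilities. The values of base.dir and marker are placeholders, and the file layout (tab-delimited, with CHROM and POS columns followed by one column of comma-separated probabilities per individual) is an assumption based on the usual vcftools FORMAT extraction output, not something this page confirms. The toy file is written only so that the example runs on its own.

```r
# Placeholder values for illustration only.
base.dir <- tempdir()   # stand-in for the directory configured by setup
marker   <- "STR1"      # hypothetical STR locus name

# Output location as described in Details.
out.dir <- file.path(base.dir, "imputed_str")
gp.file <- file.path(out.dir, paste0("imp_str_", marker, ".GP.FORMAT"))

# Toy file in the ASSUMED layout, so the parsing step can be run standalone.
dir.create(out.dir, showWarnings = FALSE, recursive = TRUE)
writeLines(c("CHROM\tPOS\tIND1\tIND2",
             "1\t1000\t0.9,0.1,0\t0.05,0.9,0.05"), gp.file)

# Read the table, then split each individual's comma-separated string
# into a numeric vector of genotype probabilities.
gp <- read.delim(gp.file, stringsAsFactors = FALSE, check.names = FALSE)
probs <- lapply(strsplit(unlist(gp[1, -(1:2)]), ","), as.numeric)
names(probs) <- names(gp)[-(1:2)]

probs$IND1
```

For a real run, replace the placeholders with the base.dir passed to setup and the marker passed to impute.str, and omit the toy-file step.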

Value

None

See Also

BEAGLE manual. vcftools manual.


jk2236/RecordMatching documentation built on Dec. 21, 2021, 12:10 a.m.