View source: R/calibrate_alleles.R
calibrate_alleles | R Documentation |
Calibrate REF and ALT alleles based on counts. The REF allele is designated as the allele with more counts in the dataset. The function will generate REF and ALT columns.
Reference genome: for users working with a reference genome, the reference allele terminology is different and is not based on counts...
Used internally in radiator and might be of interest for users.
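The count-based designation described above can be illustrated with a minimal base R sketch. This is not radiator's implementation: the toy genotypes, the assumption that a 6-digit genotype holds two 3-digit allele codes, and the use of "000000" for missing data are all illustrative assumptions, and ties between alleles are not handled here.

```r
# Hypothetical toy data for a single marker, genotypes in 6-digit format
# (assumed: two 3-digit allele codes per genotype, "000000" = missing).
gt <- c("001001", "001003", "003003", "001001", "000000")

# Split each genotype into its two alleles and drop the missing code.
alleles <- c(substr(gt, 1, 3), substr(gt, 4, 6))
alleles <- alleles[alleles != "000"]

# Count the alleles: the most frequent one is treated as REF, the other as ALT.
counts <- sort(table(alleles), decreasing = TRUE)
ref <- names(counts)[1]
alt <- names(counts)[2]

print(counts)
print(c(REF = ref, ALT = alt))
```

In this toy marker, allele 001 is observed 5 times and allele 003 is observed 3 times, so 001 would be the REF allele under a count-based rule.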
calibrate_alleles(
  data,
  biallelic = NULL,
  parallel.core = parallel::detectCores() - 1,
  verbose = FALSE,
  ...
)
data |
A tidy genomic data set in the global environment. See details for more info. |
biallelic |
(optional) If |
parallel.core |
(optional) The number of cores used for parallel
execution. This is no longer used. The code is as fast as it can be. Using
more cores will reduce the speed.
Default: |
verbose |
(optional, logical) |
... |
(optional) To pass further arguments for fine-tuning the tidying (details below). |
Input data: A minimum of 4 columns is required (the rest are considered metadata):
MARKERS
POP_ID
INDIVIDUALS
GT and/or GT_VCF_NUC and/or GT_VCF
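As a rough illustration of the column requirements listed above, the following base R sketch builds a small tidy data frame and checks that the required columns are present. The marker, population and individual names and the genotype values are made up for the example, and the check is a hypothetical helper, not something radiator exposes.

```r
# Hypothetical toy tidy data set: one row per marker x individual.
tidy_data <- data.frame(
  MARKERS = rep(c("locus_1", "locus_2"), each = 3),
  POP_ID = rep(c("pop_a", "pop_a", "pop_b"), times = 2),
  INDIVIDUALS = rep(c("ind_1", "ind_2", "ind_3"), times = 2),
  GT = c("001001", "001003", "003003", "002002", "002004", "000000"),
  stringsAsFactors = FALSE
)

# The three identifier columns are always needed...
required <- c("MARKERS", "POP_ID", "INDIVIDUALS")
# ...plus at least one of the genotype columns.
genotype_cols <- c("GT", "GT_VCF_NUC", "GT_VCF")

has_required <- all(required %in% names(tidy_data))
has_genotype <- any(genotype_cols %in% names(tidy_data))

print(has_required && has_genotype)
```

Any additional columns in such a data frame would be treated as metadata.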
How to get a tidy data frame? See radiator tidy_genomic_data.
Depending on whether the input file is biallelic or multiallelic, the function will output, in addition to the REF and ALT columns, several genotype codings:
GT: the genotype in 6 digits format with integers.
GT_VCF: the genotype in VCF format with integers.
GT_VCF_NUC: the genotype in VCF format with letters corresponding to nucleotides.
GT_BIN: biallelic coding similar to PLINK; the coding 0, 1, 2 corresponds to the number of ALT alleles in the genotype, and NA is used for missing genotypes.
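To make the relationship between the GT_VCF and GT_BIN codings concrete, here is a minimal base R sketch that counts ALT alleles in biallelic VCF-style genotypes. It assumes unphased genotypes separated by "/", with 0 for REF, 1 for ALT and "./." for missing; count_alt is a hypothetical helper written for this example, not a radiator function.

```r
# Hypothetical biallelic genotypes in VCF-style integer coding.
gt_vcf <- c("0/0", "0/1", "1/1", "./.", "1/0")

# Count the ALT alleles (coded "1") in one genotype; missing gives NA.
count_alt <- function(gt) {
  if (gt == "./.") {
    return(NA_integer_)
  }
  alleles <- strsplit(gt, "/", fixed = TRUE)[[1]]
  sum(alleles == "1")
}

gt_bin <- vapply(gt_vcf, count_alt, integer(1), USE.NAMES = FALSE)
print(gt_bin)
```

A homozygous REF genotype maps to 0, a heterozygote to 1 and a homozygous ALT genotype to 2, which matches the PLINK-like coding described above.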
Thierry Gosselin thierrygosselin@icloud.com