View source: R/filter_genotypes.R
filter_genos | R Documentation |
Filters a hapmap or genotype matrix based on user-defined limits of SNP missingness, SNP minor allele frequency, SNP heterozygosity, entry missingness, and entry heterozygosity.
filter_genos(
genos,
min.maf = 0,
max.mar.missing = 1,
max.entry.missing = 1,
max.mar.het = 1,
max.entry.het = 1,
print.plot = FALSE,
verbose = TRUE
)
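The statistics these thresholds refer to can be illustrated in base R. This is only a minimal sketch on a toy matrix in 1 / 0 / -1 coding, not the function's actual implementation. The object names (`geno`, `keep_snp`, `filtered`) are hypothetical, and it assumes heterozygosity and allele frequency are computed over non-missing calls only.

```r
# Toy marker-by-entry matrix in 1 / 0 / -1 coding, NA = missing (illustrative data).
geno <- matrix(
  c( 1,  1,  0, -1,
     1,  1,  1,  1,
    NA,  0,  0, NA,
    -1,  1, NA,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("snp", 1:4), paste0("entry", 1:4))
)

# Per-marker (row) statistics.
mar_missing <- rowMeans(is.na(geno))                 # proportion of missing calls
mar_het     <- rowMeans(geno == 0, na.rm = TRUE)     # proportion of heterozygous calls
p_first     <- rowMeans(geno + 1, na.rm = TRUE) / 2  # frequency of the first allele
maf         <- pmin(p_first, 1 - p_first)            # minor allele frequency

# Per-entry (column) statistics.
entry_missing <- colMeans(is.na(geno))
entry_het     <- colMeans(geno == 0, na.rm = TRUE)

# Example thresholds, named after the arguments in the usage above
# (the values are arbitrary and chosen only for this toy data).
min.maf <- 0.05
max.mar.missing <- 0.4
max.mar.het <- 0.5

keep_snp <- maf >= min.maf &
  mar_missing <= max.mar.missing &
  mar_het <= max.mar.het
filtered <- geno[keep_snp, , drop = FALSE]
```

With the default arguments shown in the usage (`min.maf = 0` and every maximum set to 1), comparisons of this kind would retain every marker, so the defaults amount to no filtering.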
min.maf |
The minimum minor allele frequency cutoff to keep a SNP. |
x |
A hapmap or genotype matrix. Heuristics are used to determine the format.
Both may be encoded using TASSEL or rrBLUP standards. See |
min.snp.missing |
The maximum missingness proportion |
encoding |
The desired output encoding. Either |
The TASSEL format is as follows: the first row contains the column names. The first 4 columns are marker name, alleles, chromosome, and position, respectively. The next 7 columns hold additional information for TASSEL. The remaining columns are samples. Genotypes are encoded in diploid format (e.g. AA, AC, CC), with "NN" denoting missing data.
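As an illustration of this layout, the sketch below builds a two-marker hapmap as a `data.frame`. The marker names, positions, and genotype calls are invented for the example. The column names follow the conventional TASSEL hapmap header, which this page does not spell out, so treat them as an assumption.

```r
# Toy hapmap in the TASSEL layout described above (illustrative values only).
hapmap <- data.frame(
  `rs#`       = c("snp1", "snp2"),   # marker name
  alleles     = c("A/C", "G/T"),
  chrom       = c("1A", "1A"),
  pos         = c(1050L, 20480L),
  strand      = "+",                 # the next 7 columns are TASSEL extras
  `assembly#` = NA,
  center      = NA,
  protLSID    = NA,
  assayLSID   = NA,
  panelLSID   = NA,
  QCcode      = NA,
  entry1      = c("AA", "GT"),       # sample columns start here
  entry2      = c("AC", "NN"),
  entry3      = c("CC", "TT"),
  check.names = FALSE,               # keep "rs#" and "assembly#" as written
  stringsAsFactors = FALSE
)

meta  <- hapmap[, 1:11]                  # 4 identifying + 7 TASSEL columns
calls <- as.matrix(hapmap[, -(1:11)])    # diploid calls, markers in rows
rownames(calls) <- hapmap[["rs#"]]

# Proportion of missing ("NN") calls per marker.
rowMeans(calls == "NN")
```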
The rrBLUP format is as follows: the first row contains the column names. The first 4 columns are marker name, alleles, chromosome, and position, respectively. The next 7 columns hold additional information for TASSEL. The remaining columns are samples. Genotypes are encoded as 1, 0, or -1, where 1 is homozygous for the first allele, 0 is heterozygous, and -1 is homozygous for the second allele. Missing data is denoted with NA.
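The relationship between the two encodings can be sketched with a small helper. `recode_marker` is a hypothetical function written for this example, not part of the package. It assumes the alleles column uses a "first/second" string such as "A/C" and that heterozygotes may be written in either order.

```r
# Hypothetical helper: recode the diploid calls of one marker
# (e.g. "AA", "AC", "CC") into 1 / 0 / -1, given its allele string (e.g. "A/C").
recode_marker <- function(calls, alleles) {
  a <- strsplit(alleles, "/", fixed = TRUE)[[1]]
  first  <- paste0(a[1], a[1])                          # homozygous, first allele
  second <- paste0(a[2], a[2])                          # homozygous, second allele
  het    <- c(paste0(a[1], a[2]), paste0(a[2], a[1]))   # either allele order

  out <- rep(NA_real_, length(calls))   # "NN" and unrecognised calls stay NA
  out[calls == first]  <- 1
  out[calls == second] <- -1
  out[calls %in% het]  <- 0
  out
}

recode_marker(c("AA", "AC", "CC", "NN", "CA"), "A/C")
#> [1]  1  0 -1 NA  0
```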
A data.frame of a hapmap encoded in the designated format.