Description Usage Arguments Value See Also Examples
getGenotypes() performs pre-analysis data processing of PLINK-formatted unphased genotype data,
including removal of SNPs and isolates with high proportions of missing data and SNPs with low minor
allele frequencies. It also calculates SNP allele frequencies from either the
input dataset or a specified reference dataset.
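The filtering steps described above can be illustrated on a toy genotype matrix. This is a minimal base R sketch, not the package's implementation: the matrix `geno`, its 0/1 allele coding with NA for missing calls, the loosened thresholds, and the order in which the filters are applied are all assumptions made for illustration.

```r
# Toy genotype matrix: rows are SNPs, columns are isolates.
# Hypothetical coding: 0 = reference allele, 1 = alternative allele, NA = missing.
geno <- matrix(c(0, 1,  0,  1,
                 0, 0,  0,  0,
                 1, NA, NA, 0,
                 1, 1,  0,  NA),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("snp", 1:4), paste0("iso", 1:4)))

# Thresholds, loosened so the tiny example keeps some data.
maf_min             <- 0.01
snp_max_missing     <- 0.3
isolate_max_missing <- 0.3

# 1. Drop SNPs with too large a proportion of missing calls.
snp_missing <- rowMeans(is.na(geno))
geno <- geno[snp_missing <= snp_max_missing, , drop = FALSE]

# 2. Drop isolates with too large a proportion of missing calls.
isolate_missing <- colMeans(is.na(geno))
geno <- geno[, isolate_missing <= isolate_max_missing, drop = FALSE]

# 3. Drop SNPs whose minor allele frequency falls below the threshold.
alt_freq   <- rowMeans(geno, na.rm = TRUE)
minor_freq <- pmin(alt_freq, 1 - alt_freq)
geno <- geno[minor_freq >= maf_min, , drop = FALSE]

print(geno)
```

In this toy case snp3 is removed for missingness, iso4 is then removed for missingness, and snp2 is removed because it is monomorphic among the remaining isolates.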
getGenotypes(ped.map, reference.ped.map = NULL, maf = 0.01,
  isolate.max.missing = 0.1, snp.max.missing = 0.1, chromosomes = NULL,
  input.map.distance = "cM", reference.map.distance = "cM")
ped.map |
A list with 2 objects:
|
reference.ped.map |
An optional list containing reference data used to calculate SNP allele
frequencies. The list has 2 objects in the same format as ped.map. |
maf |
A numeric value denoting the smallest minor allele frequency allowed in the analysis. The default value is 0.01. |
isolate.max.missing |
A numeric value denoting the maximum proportion of missing data allowed for each isolate. The default value is 0.1. |
snp.max.missing |
A numeric value denoting the maximum proportion of missing data allowed for each SNP. The default value is 0.1. |
chromosomes |
A vector containing a subset of chromosomes to perform formatting on. The
default value is NULL. |
input.map.distance |
A character string of either "M" or "cM" denoting whether the genetic map distances in the input MAP data frame are in Morgans (M) or centi-Morgans (cM). The default is cM. |
reference.map.distance |
A character string of either "M" or "cM" denoting whether the genetic map distances in the reference MAP data frame are in Morgans (M) or centi-Morgans (cM). The default is cM. |
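Since 1 Morgan equals 100 centimorgans, the two distance arguments amount to a choice of scale. The helper below, `to_morgans()`, is a hypothetical function written for this illustration and is not part of the package; it only shows the unit conversion that a "cM" setting implies when distances are to be expressed in Morgans.

```r
# Hypothetical helper: express genetic map positions in Morgans.
# `unit` mirrors the "M" / "cM" choices accepted by the distance arguments.
to_morgans <- function(pos, unit = c("cM", "M")) {
  unit <- match.arg(unit)
  if (unit == "cM") pos / 100 else pos
}

# Positions recorded in centimorgans are divided by 100.
to_morgans(c(0, 12.5, 150), unit = "cM")

# Positions already in Morgans are returned unchanged.
to_morgans(c(0, 0.125, 1.5), unit = "M")
```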
A list of two objects named pedigree and genotypes:
A pedigree containing the isolates that remain after filtering.
The pedigree is the first six columns of the PED file and these columns are headed fid, iid, pid, mid, moi and aff respectively.
A data frame with the first five columns:
Chromosome (type "character", "numeric" or "integer")
SNP identifiers (type "character")
Genetic map distance (Morgans, M) (type "numeric")
Base-pair position (type "integer")
Population allele frequency (type "numeric")
where each row describes a single SNP. These columns are headed chr, snp_id, pos_M, pos_bp and freq respectively.
Columns 6 onwards contain the genotype data for each isolate, where a single column corresponds to a single isolate. These columns are
labeled with merged family IDs and isolate IDs separated by a slash symbol (/).
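The column labelling described above can be reproduced with `paste()`. This is a small sketch on made-up identifiers; the `pedigree` data frame here is a hypothetical stand-in with only the fid and iid columns, not output from the function.

```r
# Hypothetical pedigree with family IDs (fid) and isolate IDs (iid) only.
pedigree <- data.frame(fid = c("F1", "F1", "F2"),
                       iid = c("A", "B", "C"),
                       stringsAsFactors = FALSE)

# Merge family and isolate IDs with a slash, as in the genotype column labels.
isolate_labels <- paste(pedigree$fid, pedigree$iid, sep = "/")
print(isolate_labels)

# Split the labels back into their two parts (assumes IDs contain no slash).
parts <- do.call(rbind, strsplit(isolate_labels, "/", fixed = TRUE))
colnames(parts) <- c("fid", "iid")
print(parts)
```

Splitting on the slash is only safe when neither ID contains a slash itself, which is worth checking before relying on the labels as keys.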
getIBDparameters and getIBDsegments.
# take a look at the data
str(png_pedmap)
# reformat and filter to call genotypes
my_genotypes <- getGenotypes(ped.map = png_pedmap,
                             reference.ped.map = NULL,
                             maf = 0.01,
                             isolate.max.missing = 0.1,
                             snp.max.missing = 0.1,
                             chromosomes = NULL,
                             input.map.distance = "cM",
                             reference.map.distance = "cM")