View source: R/standardization.R
get_centers | R Documentation |
This function estimates the cluster centers for each genotype dosage class based on the 'theta' values (e.g., allelic ratios or normalized signal intensities). It supports imputing missing clusters and optionally removing outliers.
get_centers(
  ratio_geno,
  ploidy,
  n.clusters.thr = NULL,
  type = c("intensities", "counts"),
  rm_outlier = TRUE,
  cluster_median = TRUE
)
ratio_geno |
A data.frame containing the following columns:
- 'MarkerName': Identifier for each marker.
- 'SampleName': Identifier for each sample.
- 'theta': Numeric variable representing allelic ratio or signal intensity.
- 'geno': Integer dosage (e.g., 0, 1, 2 for diploids). |
ploidy |
Integer specifying the organism ploidy (e.g., 2 for diploid). |
n.clusters.thr |
Integer specifying the minimum number of genotype clusters required for a marker to be retained. If fewer clusters are found, missing ones can be imputed depending on the 'type'. Defaults to 'ploidy + 1' if 'NULL'. |
type |
Character string indicating the data source type:
- '"intensities"': For array-based allele intensities.
- '"counts"': For sequencing read counts.
Default is '"intensities"'. |
rm_outlier |
Logical; if 'TRUE', outlier samples within genotype clusters will be identified and removed prior to center calculation (default: 'TRUE'). |
cluster_median |
Logical; if 'TRUE', cluster centers are calculated using the median of 'theta' values. If 'FALSE', the mean is used (default: 'TRUE'). |
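As a rough illustration of the input layout and of what 'cluster_median' changes, the sketch below builds a toy 'ratio_geno' data.frame and computes one center per dosage class with base R. The data values are invented, and the 'tapply' calls are only an assumption about the general idea; they are not the function's actual implementation and skip outlier removal and imputation.

```r
# Toy input with the four columns described for `ratio_geno`.
# All values are invented for illustration only.
ratio_geno <- data.frame(
  MarkerName = rep("marker_1", 9),
  SampleName = paste0("sample_", 1:9),
  theta      = c(0.02, 0.05, 0.03, 0.48, 0.52, 0.50, 0.95, 0.97, 0.60),
  geno       = c(0L, 0L, 0L, 1L, 1L, 1L, 2L, 2L, 2L)
)

# One center per dosage class: the median (analogous to cluster_median = TRUE)
# versus the mean (analogous to cluster_median = FALSE).
centers_median <- tapply(ratio_geno$theta, ratio_geno$geno, median)
centers_mean   <- tapply(ratio_geno$theta, ratio_geno$geno, mean)

print(centers_median)
print(centers_mean)
```

In this toy data the dosage-2 class contains one stray value (0.60): the median stays at 0.95 while the mean is pulled down to 0.84, which is the usual reason to prefer a median when outliers are present.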
A named list with the following elements:
- 'rm': Integer flag: '0' (retained), '1' (no clusters found), or '2' (too few clusters).
- 'centers_theta': A numeric vector of cluster center positions on the theta scale.
- 'MarkerName': Marker identifier.
- 'n.clusters': Number of clusters (including imputed ones if applicable).
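The meaning of the 'rm' flag and the 'ploidy + 1' default for 'n.clusters.thr' can be sketched with a small helper. 'flag_marker' is a hypothetical name that is not part of the documented package, and the logic is an assumption based only on the descriptions above: it ignores imputation, so a marker the real function might rescue by imputing clusters is simply flagged here.

```r
# Hypothetical helper (not from the documented package): derive an rm-style
# flag from the dosage calls of a single marker.
flag_marker <- function(geno, ploidy, n.clusters.thr = NULL) {
  # Documented default: require ploidy + 1 clusters when no threshold is given.
  if (is.null(n.clusters.thr)) n.clusters.thr <- ploidy + 1

  # Count the distinct dosage classes that were actually observed.
  n_found <- length(unique(geno[!is.na(geno)]))

  if (n_found == 0) return(1L)              # no clusters found
  if (n_found < n.clusters.thr) return(2L)  # too few clusters
  0L                                        # retained
}

flag_all  <- flag_marker(c(0, 1, 2, 1), ploidy = 2)  # three classes observed
flag_few  <- flag_marker(c(0, 1, NA), ploidy = 2)    # two classes observed
flag_none <- flag_marker(c(NA, NA), ploidy = 2)      # no classes observed
print(c(flag_all, flag_few, flag_none))
```

For a diploid, the default threshold is 3, so the three calls yield flags 0, 2, and 1 respectively.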