Calculation of column-wise, row-wise, and matrix-wise correlations between two matrices, the "true" genotypes and the imputed genotypes.
imputation_accuracy(true, impute, standardized = TRUE, center = NULL,
  scale = NULL, p = NULL, tol = 0.1, ...)

## S3 method for class 'character'
imputation_accuracy(true, impute, standardized = TRUE,
  center = NULL, scale = NULL, p = NULL, tol = 0.1, ..., ncol = NULL,
  nlines = NULL, na = 9, adaptive = TRUE, excludeIDs = NULL,
  excludeSNPs = NULL)

## S3 method for class 'matrix'
imputation_accuracy(true, impute, standardized = TRUE,
  center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
  excludeIDs = NULL, excludeSNPs = NULL, transpose = FALSE)

## S3 method for class 'haps'
imputation_accuracy(true, impute, standardized = TRUE,
  center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
  excludeIDs = NULL, excludeSNPs = NULL)

## S3 method for class 'vcfR'
imputation_accuracy(true, impute, standardized = TRUE,
  center = NULL, scale = NULL, p = NULL, tol = 0.1, excludeIDs = NULL,
  excludeSNPs = NULL, ...)
true: True genotype matrix, or filename (AlphaImpute format only).
impute: Imputed genotype matrix, or filename (AlphaImpute format only).
standardized: Logical, whether to center and scale genotypes by dataset in
center: Numeric vector of
scale: Numeric vector of
p: Shortcut for
tol: Numeric, tolerance for imputation error when counting correctly imputed genotypes.
...: Arguments passed between different methods (mostly
ncol: Integer, number of SNP columns in files. When
nlines: Integer, number of lines in
na: Value of missing genotypes.
adaptive: Use adaptive method (default) that stores
excludeIDs: Integer vector, exclude these individuals from correlations. Does not affect calculation of column means and standard deviations.
excludeSNPs: Integer or logical vector, exclude these columns from correlations. Does not affect calculation of column means and standard deviations.
transpose: Logical, if SNPs are per row, set to
The character class method works on files only; its arguments true and impute refer to the filenames.
The method assumes that the first column in both files is an integer ID column, which is therefore excluded from the calculations.
Genotypes equal to na are considered missing (i.e. NA) and are not included in the calculations.
The matrix class method performs the same calculations, but on matrices stored in memory.
Class methods for format-specific objects ('haps', 'oxford', or 'vcfR') extract SNP genotype matrices using extract.snps.
Correlations are only calculated for those rows that are found in both matrices / files, based on the first column (ID column).
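The row matching and missing-value handling described above can be illustrated in base R. This is only a sketch of the idea, not the package's code: the toy matrices and the names true_raw, imputed_raw and na_code are made up for the example.

```r
# Hypothetical toy data: first column is the integer ID, the rest are SNP genotypes.
true_raw <- matrix(c(1, 0, 1, 2,
                     2, 2, 9, 0,
                     3, 1, 9, 1),
                   nrow = 3, byrow = TRUE)
imputed_raw <- matrix(c(3, 1, 1, 2,
                        1, 0, 1, 2,
                        4, 0, 0, 0),
                      nrow = 3, byrow = TRUE)

na_code <- 9  # plays the role of the `na` argument

# IDs present in both inputs, in the order of the true data.
shared_ids <- intersect(true_raw[, 1], imputed_raw[, 1])

# Drop the ID column and line up the rows by ID.
true_geno <- true_raw[match(shared_ids, true_raw[, 1]), -1, drop = FALSE]
imputed_geno <- imputed_raw[match(shared_ids, imputed_raw[, 1]), -1, drop = FALSE]

# Recode the missing-value code as NA so it is left out of later statistics.
true_geno[true_geno == na_code] <- NA
imputed_geno[imputed_geno == na_code] <- NA
```

Only IDs 1 and 3 occur in both inputs here, so the rows for IDs 2 and 4 play no part in any comparison.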
List with the following elements:
matcor: Matrix-wise correlation between the true and imputed matrices.
snps: Data frame with all SNP-wise statistics; has $m$ or $m - |excludeSNPs|$ rows.
animals: Data frame with all animal-wise statistics; has $n$ or $n - |excludeIDs|$ rows.
The data frames keep all rows when used on files; when used on matrices, the rows corresponding to excluded IDs or SNPs are dropped.
The data frames snps and animals consist of the following columns:
rowID: Row ID ($animals only!).
means: Value subtracted from each column ($snps only!).
sds: Value used to scale each column (i.e. standard deviations) ($snps only!).
cors: Pearson correlation between true and imputed genotypes.
correct: Number of entries of equal value (within tol).
true.na: Number of entries that were missing in true but not in impute.
imp.na: As true.na, but vice versa.
both.na: Number of entries that were missing in both files.
correct.pct: correct divided by the total number of entries, excluding entries missing in true.
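As a rough illustration of the counting columns, the following base R sketch computes them for a single SNP. It reflects one reading of the definitions above: treating "within tol" as an absolute difference of at most tol, and using the number of non-missing true entries as the denominator of correct.pct, are assumptions. The vectors are invented.

```r
# Hypothetical genotypes for one SNP across six animals; NA marks a missing value.
true_snp    <- c(0, 1,    2, NA, 1,  NA)
imputed_snp <- c(0, 1.05, 1, 2,  NA, NA)
tol <- 0.1

both_observed <- !is.na(true_snp) & !is.na(imputed_snp)

# Entries that agree within the tolerance (assumed: |true - imputed| <= tol).
correct <- sum(abs(true_snp[both_observed] - imputed_snp[both_observed]) <= tol)

# Missingness patterns.
true_na <- sum(is.na(true_snp) & !is.na(imputed_snp))
imp_na  <- sum(!is.na(true_snp) & is.na(imputed_snp))
both_na <- sum(is.na(true_snp) & is.na(imputed_snp))

# Assumed denominator: all entries that are not missing in the true data.
correct_pct <- correct / sum(!is.na(true_snp))
```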
Standardization is performed by subtracting the mean and then
dividing by the standard deviation; conceptually the same as in
scale.
The mean and standard deviation are calculated from the true matrix,
before removing samples (excludeIDs) or SNPs (excludeSNPs).
Alternative means and scales may be provided through the arguments
center and scale, or p.
Note: SNPs for which scale or p is 0 or NA
will not contribute to the correlations, but they will count towards
correct.pct. To exclude them entirely, use excludeSNPs.
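The standardization can be sketched in base R as follows. This is a conceptual illustration under the description above, not the package's implementation; the matrices and variable names are hypothetical. The key point is that the means and standard deviations of the true matrix are applied to both matrices.

```r
# Hypothetical genotype matrices: 4 animals (rows) by 3 SNPs (columns).
true_geno <- matrix(c(0, 1, 2,
                      1, 1, 0,
                      2, 0, 1,
                      1, 2, 2), nrow = 4, byrow = TRUE)
imputed_geno <- matrix(c(0, 1, 2,
                         1, 0, 0,
                         2, 0, 1,
                         1, 2, 1), nrow = 4, byrow = TRUE)

# Column means and standard deviations come from the true matrix only.
col_means <- colMeans(true_geno, na.rm = TRUE)
col_sds <- apply(true_geno, 2, sd, na.rm = TRUE)

# Apply the same centering and scaling to both matrices.
standardize <- function(x) sweep(sweep(x, 2, col_means, "-"), 2, col_sds, "/")
true_std <- standardize(true_geno)
imputed_std <- standardize(imputed_geno)

# Matrix-wise, SNP-wise (column) and animal-wise (row) Pearson correlations.
matcor <- cor(as.vector(true_std), as.vector(imputed_std))
snp_cors <- vapply(seq_len(ncol(true_std)),
                   function(j) cor(true_std[, j], imputed_std[, j]),
                   numeric(1))
animal_cors <- vapply(seq_len(nrow(true_std)),
                      function(i) cor(true_std[i, ], imputed_std[i, ]),
                      numeric(1))
```

In this sketch true_std has the same values as scale(true_geno). A column's Pearson correlation is unchanged by shifting and rescaling that column, so standardization mainly matters for the matrix-wise and animal-wise correlations.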
The adaptive file-based method (the default) stores the "true" matrix in memory with a low-precision real type,
and rows in the "imputed" matrix are read and matched by ID.
If there are no extra rows in either matrix and the order of IDs is the same,
consider setting adaptive=FALSE, as this has a memory usage of O(m), compared to O(nm) for the adaptive method, where 'm' is the number of SNPs and 'n' is the number of animals.
Somewhat surprisingly, however, the non-adaptive method is slightly slower.
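To show why reading both files in the same ID order needs only O(m) memory, here is a minimal base R sketch that streams two files one row at a time and keeps only per-SNP accumulators. It is an illustration of the idea, not how the package reads files; the file contents, the space-separated layout and all variable names are assumptions.

```r
# Write two small hypothetical files: ID column followed by three SNP genotypes.
true_file <- tempfile()
imputed_file <- tempfile()
writeLines(c("1 0 1 2", "2 1 1 0", "3 2 0 1"), true_file)
writeLines(c("1 0 1 2", "2 1 0 0", "3 2 0 9"), imputed_file)

na_code <- 9
tol <- 0.1
n_snps <- 3

# Per-SNP accumulators are the only state kept between rows: O(m) memory.
correct <- numeric(n_snps)
compared <- numeric(n_snps)

true_con <- file(true_file, open = "r")
imputed_con <- file(imputed_file, open = "r")
repeat {
  true_line <- readLines(true_con, n = 1)
  imputed_line <- readLines(imputed_con, n = 1)
  if (length(true_line) == 0 || length(imputed_line) == 0) break

  true_row <- scan(text = true_line, quiet = TRUE)
  imputed_row <- scan(text = imputed_line, quiet = TRUE)
  # Streaming in lockstep only works when both files list IDs in the same order.
  stopifnot(true_row[1] == imputed_row[1])

  true_snps <- true_row[-1]
  imputed_snps <- imputed_row[-1]
  true_snps[true_snps == na_code] <- NA
  imputed_snps[imputed_snps == na_code] <- NA

  observed <- !is.na(true_snps) & !is.na(imputed_snps)
  compared <- compared + observed
  correct <- correct + (observed & abs(true_snps - imputed_snps) <= tol)
}
close(true_con)
close(imputed_con)
```

Matching rows by ID when the order differs would instead require holding one of the matrices in memory, which is where the O(nm) cost of the adaptive method comes from.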
write.snps for writing SNPs to a file.