View source: R/imputation_accuracy.R
Calculation of column-wise, row-wise, and matrix-wise correlations between two matrices: the "true" genotypes and the imputed genotypes.
imputation_accuracy(true, impute, standardized = TRUE, center = NULL,
scale = NULL, p = NULL, tol = 0.1, ...)
## S3 method for class 'character'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ..., ncol = NULL,
nlines = NULL, na = 9, adaptive = TRUE, excludeIDs = NULL,
excludeSNPs = NULL)
## S3 method for class 'matrix'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
excludeIDs = NULL, excludeSNPs = NULL, transpose = FALSE)
## S3 method for class 'haps'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
excludeIDs = NULL, excludeSNPs = NULL)
## S3 method for class 'vcfR'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, excludeIDs = NULL,
excludeSNPs = NULL, ...)

true 
True genotype matrix, or filename (AlphaImpute format only). 
impute 
Imputed genotype matrix, or filename (AlphaImpute format only). 
standardized 
Logical, whether to center and scale genotypes by dataset in 
center 
Numeric vector of 
scale 
Numeric vector of 
p 
Shortcut for 
tol 
Numeric, tolerance for imputation error when counting correctly imputed genotypes. 
... 
Arguments passed between different methods (mostly 
ncol 
Integer, number of SNP columns in files. When 
nlines 
Integer, number of lines in 
na 
Value of missing genotypes. 
adaptive 
Use adaptive method (default) that stores 
excludeIDs 
Integer vector, exclude these individuals from correlations. Does not affect calculation of column means and standard deviations. 
excludeSNPs 
Integer or logical vector, exclude these columns from correlations. Does not affect calculation of column means and standard deviations. 
transpose 
Logical, if SNPs are per row, set to 
The character class method uses files only, and the arguments true and impute refer to the filenames.
The method assumes that the first column in both files is an integer ID column, which is therefore excluded from calculations.
Genotypes equal to na are considered missing (i.e. NA) and are not included in the calculations.
The matrix class method performs the same calculations, but on matrices stored in memory. Class methods for format-specific objects ('haps', 'oxford', or 'vcfR') extract SNP genotype matrices using extract.snps.
Correlations are only calculated on those rows that are found in both matrices / files, based on the first column (ID column).
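The ID matching and the three kinds of correlation described above can be sketched in base R. This is only an illustration on toy data, not the package's implementation: the matrices and the variable names (common, t, i, snp_cor, animal_cor, mat_cor) are made up for the example, and standardization is left out for brevity.

```r
# Toy data (hypothetical): first column is the ID, the rest are SNP genotypes.
true <- cbind(c(1, 2, 3, 4),
              matrix(c(0, 1, 2, 1,
                       2, 2, 0, 1,
                       1, 0, 1, 2), nrow = 4))
impute <- rbind(c(4, 1, 1, 2),   # same animals in another order,
                c(2, 1, 2, 1),   # one imputation error for ID 2,
                c(1, 0, 2, 1),
                c(5, 2, 2, 2))   # and one ID that is absent from 'true'

# Keep only rows whose ID occurs in both matrices, aligned in the same order.
common <- intersect(true[, 1], impute[, 1])
t <- true[match(common, true[, 1]), -1, drop = FALSE]
i <- impute[match(common, impute[, 1]), -1, drop = FALSE]

# Column-wise (per SNP), row-wise (per animal), and matrix-wise correlations.
snp_cor    <- sapply(seq_len(ncol(t)), function(j) cor(t[, j], i[, j]))
animal_cor <- sapply(seq_len(nrow(t)), function(k) cor(t[k, ], i[k, ]))
mat_cor    <- cor(as.vector(t), as.vector(i))
```

IDs 3 and 5 each occur in only one matrix, so they drop out before any correlation is computed.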
List with following elements:
matcor
Matrixwise correlation between true and imputed matrix.
snps
Data frame with all SNP-wise statistics; has $m$ or $m - excludeSNPs$ rows.
animals
Data frame with all animal-wise statistics; has $n$ or $n - excludeIDs$ rows.
The data frames keep all rows when used on files; when used on matrices, the rows corresponding to excluded IDs or SNPs are dropped.
The data frames with statistics, snps and animals, consist of the following columns:
rowID
Row ID ($animals only!).
means
Value subtracted from each column ($snps only!).
sds
Value used to scale each column (i.e. standard deviations) ($snps only!).
cors
Pearson correlation between true and imputed genotype.
correct
Number of entries of equal value (within tol).
true.na
Number of entries that were missing in true but not in impute.
imp.na
As true.na, but vice versa.
both.na
Number of entries that were missing in both files.
correct.pct
correct divided by the total number of entries, excluding entries missing in true.
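As a rough illustration of how the counting columns relate to each other, the following base R sketch computes them for a single SNP column. It reflects one reading of the definitions above and is not taken from the package source; in particular, treating "within tol" as an absolute difference of at most tol, and using the number of non-missing entries in true as the denominator of correct.pct, are assumptions. The vectors true_col and impute_col are made-up data.

```r
# Hypothetical genotypes for one SNP across six animals.
true_col   <- c(0,    1,   2, NA, 1,  NA)
impute_col <- c(0.05, 1.4, 2, 1,  NA, NA)
tol <- 0.1

both_present <- !is.na(true_col) & !is.na(impute_col)

# Entries counted as correctly imputed (assumed: |true - imputed| <= tol).
correct <- sum(abs(true_col[both_present] - impute_col[both_present]) <= tol)

# Missingness counts.
true.na <- sum(is.na(true_col) & !is.na(impute_col))   # missing in true only
imp.na  <- sum(!is.na(true_col) & is.na(impute_col))   # missing in impute only
both.na <- sum(is.na(true_col) & is.na(impute_col))    # missing in both

# Share of correct entries among entries not missing in true (assumed denominator).
correct.pct <- correct / sum(!is.na(true_col))
```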
Standardization is performed by subtracting the mean, followed by division by the standard deviation; conceptually the same as in scale.
Mean and standard deviation are calculated based on the true matrix, before removing samples (excludeIDs) or SNPs (excludeSNPs).
Alternate means and scales may be provided by the arguments center and scale, or p.
Note: If either scale or p is 0 or NA for a SNP, that SNP will not contribute to the correlation, but it will count towards correct.pct. To exclude it entirely, use excludeSNPs.
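The standardization step can be sketched with base R's scale. This is a minimal illustration under stated assumptions, not the package's code: the toy matrices and the names center, scl, and usable are made up, and dropping columns with a zero or NA standard deviation is simply one way to mimic SNPs that "will not contribute to the correlation".

```r
# Hypothetical genotype matrices (4 animals x 3 SNPs); SNP 2 is monomorphic in 'true'.
true   <- matrix(c(0, 1, 2, 1,  2, 2, 2, 2,  1, 0, 1, 2), nrow = 4)
impute <- matrix(c(0, 1, 2, 2,  2, 2, 2, 1,  1, 0, 1, 2), nrow = 4)

# Means and standard deviations come from the true matrix only.
center <- colMeans(true)
scl    <- apply(true, 2, sd)

# SNPs with a zero or missing scale cannot be standardized.
usable <- !is.na(scl) & scl > 0

# Apply the same centering and scaling to both matrices.
true_std   <- scale(true[, usable, drop = FALSE],
                    center = center[usable], scale = scl[usable])
impute_std <- scale(impute[, usable, drop = FALSE],
                    center = center[usable], scale = scl[usable])

# Matrix-wise correlation on the standardized genotypes.
matcor <- cor(as.vector(true_std), as.vector(impute_std))
```

Because both matrices are standardized with values derived from true, an imputed column is not forced to have mean 0 and standard deviation 1.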
This method stores the "true" matrix in memory with a low-precision real type, and rows in the "imputed" matrix are read and matched by ID.
If there are no extra rows in either matrix and the order of IDs is the same, consider setting adaptive=FALSE, as this has a memory usage of O(m), compared to O(nm) for the adaptive method, where 'm' is the number of SNPs and 'n' the number of animals.
The non-adaptive method is, however, and very surprisingly, slightly slower.
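To see why reading both files in lockstep needs only O(m) memory, consider the following sketch. It keeps running sums and holds just one line from each file at a time. It is not the package's implementation, which this page does not show: the in-memory "files" (true_lines, impute_lines), the space-separated layout, and the absence of standardization and missing-value handling are all simplifying assumptions.

```r
# Hypothetical file contents: ID followed by genotypes, same ID order in both.
true_lines   <- c("1 0 1 2", "2 1 1 0", "3 2 0 1")
impute_lines <- c("1 0 1 2", "2 1 2 0", "3 2 0 1")
true_con   <- textConnection(true_lines)
impute_con <- textConnection(impute_lines)

# Running sums needed for a Pearson correlation.
n <- 0; sx <- 0; sy <- 0; sxx <- 0; syy <- 0; sxy <- 0

repeat {
  t_line <- readLines(true_con, n = 1)
  i_line <- readLines(impute_con, n = 1)
  if (length(t_line) == 0 || length(i_line) == 0) break

  x <- as.numeric(strsplit(t_line, " +")[[1]])
  y <- as.numeric(strsplit(i_line, " +")[[1]])
  stopifnot(x[1] == y[1])        # lockstep reading requires identical ID order
  x <- x[-1]; y <- y[-1]         # drop the ID column

  n   <- n + length(x)
  sx  <- sx + sum(x);    sy  <- sy + sum(y)
  sxx <- sxx + sum(x^2); syy <- syy + sum(y^2)
  sxy <- sxy + sum(x * y)
}
close(true_con); close(impute_con)

# Matrix-wise correlation from the accumulated sums.
matcor <- (n * sxy - sx * sy) /
  sqrt((n * sxx - sx^2) * (n * syy - sy^2))
```

When the files contain extra rows or differ in ID order, this lockstep approach no longer works, which is the situation the adaptive method addresses by keeping the "true" matrix in memory.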
write.snps for writing SNPs to a file.