Calculation of column-wise, row-wise, and matrix-wise correlations between two matrices, the "true" genotypes and the imputed genotypes.
imputation_accuracy(true, impute, standardized = TRUE, center = NULL,
scale = NULL, p = NULL, tol = 0.1, ...)
## S3 method for class 'character'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ..., ncol = NULL,
nlines = NULL, na = 9, adaptive = TRUE, excludeIDs = NULL,
excludeSNPs = NULL)
## S3 method for class 'matrix'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
excludeIDs = NULL, excludeSNPs = NULL, transpose = FALSE)
## S3 method for class 'haps'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, ...,
excludeIDs = NULL, excludeSNPs = NULL)
## S3 method for class 'vcfR'
imputation_accuracy(true, impute, standardized = TRUE,
center = NULL, scale = NULL, p = NULL, tol = 0.1, excludeIDs = NULL,
excludeSNPs = NULL, ...)
true
True genotype matrix, or filename (AlphaImpute format only).
impute
Imputed genotype matrix, or filename (AlphaImpute format only).
standardized
Logical, whether to center and scale genotypes by dataset in
center
Numeric vector of
scale
Numeric vector of
p
Shortcut for
tol
Numeric, tolerance for imputation error when counting correctly imputed genotypes.
...
Arguments passed between different methods (mostly
ncol
Integer, number of SNP columns in files. When
nlines
Integer, number of lines in
na
Value of missing genotypes.
adaptive
Use adaptive method (default) that stores
excludeIDs
Integer vector, exclude these individuals from correlations. Does not affect calculation of column means and standard deviations.
excludeSNPs
Integer or logical vector, exclude these columns from correlations. Does not affect calculation of column means and standard deviations.
transpose
Logical, if SNPs are per row, set to
The character class method uses files only, and the arguments true and impute refer to the filenames.
The method assumes that the first column in both files is an integer ID column, which is therefore excluded from the calculations.
Genotypes equal to na are considered missing (i.e. NA) and are not included in the calculations.
The matrix class method performs the same calculations, but on matrices stored in memory. Class methods for format-specific objects ('haps', 'oxford', or 'vcfR') extract SNP genotype matrices using extract.snps.
Correlations are calculated only on rows that are found in both matrices / files, based on the first column (ID column).
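The ID matching and missing-value handling described above can be sketched in base R. This is only an illustration, not the package's implementation: the toy matrices, the variable names, and the use of na = 9 are assumptions, and the correlation is computed on unstandardized genotypes for brevity.

```r
# Hypothetical toy data: first column is an integer ID, the rest are SNPs.
# matrix() fills column by column, so each row of numbers below is one SNP.
true_mat <- cbind(id = c(1, 2, 3, 4),
                  matrix(c(0, 1, 2, 1,
                           2, 2, 0, 1,
                           1, 0, 9, 2), nrow = 4))
imp_mat  <- cbind(id = c(4, 2, 5, 3),
                  matrix(c(1, 1, 0, 2,
                           1, 2, 1, 0,
                           2, 0, 1, 1), nrow = 4))
na_code <- 9  # assumed missing-genotype code, mirroring the default na = 9

# Keep only individuals present in both matrices, aligned by ID.
common <- intersect(true_mat[, "id"], imp_mat[, "id"])
t_sub <- true_mat[match(common, true_mat[, "id"]), -1, drop = FALSE]
i_sub <- imp_mat[match(common, imp_mat[, "id"]), -1, drop = FALSE]

# Recode the missing-genotype value to NA so it is left out of calculations.
t_sub[t_sub == na_code] <- NA
i_sub[i_sub == na_code] <- NA

# Matrix-wise correlation over all entries observed in both matrices.
matcor <- cor(as.vector(t_sub), as.vector(i_sub), use = "complete.obs")
```

Individuals 1 and 5 appear in only one of the two matrices, so they do not contribute to the correlation.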
List with following elements:
matcor
Matrix-wise correlation between true and imputed matrix.
snps
Data frame with all SNP-wise statistics; has $m$ or $m - |excludeSNPs|$ rows.
animals
Data frame with all animal-wise statistics; has $n$ or $n - |excludeIDs|$ rows.
The data frames keep all rows when used on files; when used on matrices, the rows of the excluded IDs or SNPs are dropped.
The data frames snps and animals contain the following columns:
rowID
Row ID ($animals only!).
means
Value subtracted from each column ($snps only!).
sds
Value used to scale each column (i.e. standard deviations) ($snps only!).
cors
Pearson correlation between true and imputed genotype.
correct
Number of entries of equal value (within tol).
true.na
Number of entries that were missing in true but not in impute.
imp.na
As true.na, but vice versa.
both.na
Number of entries that were missing in both files.
correct.pct
correct divided by the total number of entries, excluding entries missing in true.
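As a rough illustration of how the counting columns relate to each other, the following base R sketch computes them per SNP for two small matrices. The data and variable names are hypothetical, and two details are assumptions rather than documented behaviour: "within tol" is taken to mean an absolute difference of at most tol, and the denominator of correct.pct is taken to be the number of entries not missing in true.

```r
# Hypothetical genotypes: rows are animals, columns are SNPs.
true_mat <- matrix(c(0, 1, 2, NA,
                     2, NA, 1, 0), ncol = 2)
imp_mat  <- matrix(c(0.05, 1, NA, NA,
                     2, 1, 1.5, 0), ncol = 2)
tol <- 0.1

# An entry counts as correct when both values are observed and
# differ by no more than tol (assumed interpretation of "within tol").
both_obs <- !is.na(true_mat) & !is.na(imp_mat)
agree <- both_obs & abs(true_mat - imp_mat) <= tol

snps <- data.frame(
  correct = colSums(agree),
  true.na = colSums(is.na(true_mat) & !is.na(imp_mat)),
  imp.na  = colSums(!is.na(true_mat) & is.na(imp_mat)),
  both.na = colSums(is.na(true_mat) & is.na(imp_mat))
)

# Share of correct entries among those observed in the true matrix.
snps$correct.pct <- snps$correct / colSums(!is.na(true_mat))
```

The same counts taken over rows instead of columns (rowSums in place of colSums) would give the animal-wise equivalents.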
Standardization is performed by subtracting the mean and then dividing by the standard deviation; conceptually the same as in scale.
Means and standard deviations are calculated from the true matrix, before removing samples (excludeIDs) or SNPs (excludeSNPs).
Alternative means and scales may be provided through the arguments center and scale, or p.
Note: If either scale or p is 0 or NA for a SNP, that SNP will not contribute to the correlation, but it will count towards correct.pct. To exclude it entirely, use excludeSNPs.
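A minimal base R sketch of this standardization, assuming column means and sample standard deviations (as computed by sd and scale) taken from the true matrix and applied to both matrices. The data and variable names are hypothetical; the third SNP is monomorphic, so its scale is 0 and it is left out of the correlation.

```r
# Hypothetical genotypes: rows are animals, columns are SNPs.
true_mat <- matrix(c(0, 1, 2, 1,
                     2, 2, 0, 0,
                     1, 1, 1, 1), ncol = 3)  # third SNP has sd = 0
imp_mat  <- matrix(c(0, 1, 1, 1,
                     2, 1, 0, 0,
                     1, 1, 1, 2), ncol = 3)

# Centers and scales come from the true matrix only.
means <- colMeans(true_mat)
sds   <- apply(true_mat, 2, sd)

# SNPs with a zero or missing scale cannot contribute to the correlation.
usable <- !is.na(sds) & sds > 0

# Apply the same centers and scales to both matrices.
true_std <- sweep(sweep(true_mat, 2, means, "-"), 2, sds, "/")
imp_std  <- sweep(sweep(imp_mat,  2, means, "-"), 2, sds, "/")

# Matrix-wise correlation over the usable SNPs.
matcor <- cor(as.vector(true_std[, usable]), as.vector(imp_std[, usable]))
```

For the usable columns, true_std agrees with scale(true_mat); the imputed matrix is deliberately scaled with the true matrix's means and standard deviations rather than its own.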
This method stores the "true" matrix in memory with a low-precision real type, and rows in the "imputed" matrix are read and matched by ID.
If there are no extra rows in either matrix and the order of IDs is the same, consider setting adaptive=FALSE, as this has a memory usage of O(m), compared to O(nm) for the adaptive method, where m is the number of SNPs and n the number of animals.
The non-adaptive method is, however, and very surprisingly, slightly slower.
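To illustrate why O(m) memory can suffice when both files list the same IDs in the same order, here is a base R sketch that reads two inputs in lockstep and keeps only running sums per SNP. It is not the package's implementation: the in-memory character vectors stand in for file lines, all names are hypothetical, and missing genotypes are ignored for brevity.

```r
# Hypothetical file contents: one line per animal, ID first, then m SNPs.
true_lines <- c("1 0 1 2", "2 1 1 0", "3 2 0 1", "4 1 2 2")
imp_lines  <- c("1 0 1 2", "2 1 2 0", "3 2 0 1", "4 0 2 2")
m <- 3

# Five running sums per SNP: memory is O(m) regardless of the number of lines.
acc <- matrix(0, nrow = 5, ncol = m,
              dimnames = list(c("sx", "sy", "sxx", "syy", "sxy"), NULL))
n <- 0

for (k in seq_along(true_lines)) {
  x <- scan(text = true_lines[k], quiet = TRUE)
  y <- scan(text = imp_lines[k], quiet = TRUE)
  stopifnot(x[1] == y[1])  # lockstep reading requires identical ID order
  x <- x[-1]
  y <- y[-1]
  acc["sx", ]  <- acc["sx", ]  + x
  acc["sy", ]  <- acc["sy", ]  + y
  acc["sxx", ] <- acc["sxx", ] + x^2
  acc["syy", ] <- acc["syy", ] + y^2
  acc["sxy", ] <- acc["sxy", ] + x * y
  n <- n + 1
}

# Pearson correlation per SNP from the accumulated sums.
cov_xy <- acc["sxy", ] - acc["sx", ] * acc["sy", ] / n
var_x  <- acc["sxx", ] - acc["sx", ]^2 / n
var_y  <- acc["syy", ] - acc["sy", ]^2 / n
cors   <- cov_xy / sqrt(var_x * var_y)
```

Matching rows by ID when the order differs or rows are missing, by contrast, requires one matrix to be held in memory, which is where the O(nm) cost of the adaptive method comes from.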
write.snps for writing SNPs to a file.