amPairwise: Pairwise matching of multilocus genotypes

amPairwise    R Documentation

Pairwise matching of multilocus genotypes

Description

Functions to perform a pairwise matching analysis of a multilocus genotype dataset and to review the output as formatted text or HTML. For each genotype in the focal dataset, all genotypes in the comparison dataset that match at or above a threshold matching score are returned. The matching score, also known as the s-hat criterion (see the supplementary documentation), is calculated using amMatrix.

Usage

	amPairwise(
		amDatasetFocal, 
		amDatasetComparison = amDatasetFocal, 
		alleleMismatch = NULL, 
		matchThreshold = NULL, 
		missingMethod = 2
		)

	amHTML.amPairwise(
		x, 
		htmlFile = NULL, 
		htmlCSS = amCSSForHTML()
		)

	amCSV.amPairwise(
		x, 
		csvFile
		)

	## S3 method for class 'amPairwise'
	summary(object, html = NULL, csv = NULL, ...)

Arguments

amDatasetFocal

An amDataset object containing focal genotypes.

amDatasetComparison

Optional.
An amDataset object containing comparison genotypes.
If not supplied, the focal dataset is also the comparison dataset (i.e., all focal dataset members are compared against one another).

alleleMismatch

Maximum number of mismatching alleles tolerated when identifying individuals; also known as the m-hat parameter.
If specified, matchThreshold should be omitted.

matchThreshold

Return comparison genotypes that match the focal genotype at or above this similarity score; also known as the s-hat parameter.

missingMethod

Method used to determine the similarity of multilocus genotypes when data are missing.
The default (missingMethod = 2) is preferable in all cases.
See amMatrix.

object, x

An amPairwise object.

htmlFile

HTML filepath to create.
If htmlFile = NULL, a file is created in the operating system temporary directory and is then opened in the default browser.

htmlCSS

A string containing a valid cascading style sheet.
A default style sheet is provided in amCSSForHTML.
See amCSSForHTML for details of how to tweak this CSS.

html

If html = NULL or html = FALSE, formatted textual output is displayed on the console.
If html = TRUE, the summary method produces an HTML file and opens it in the default browser.
html can also be a path to a file to which the HTML output will be written.

csvFile, csv

CSV filepath to create, containing a data frame representation of the pairwise matching results.

...

Additional arguments to summary.

Details

Pairwise matching of genotypes is a useful means to assess data quality and inspect for genotyping errors.

matchThreshold represents the similarity between two multilocus genotypes. It can be thought of as a percentage similarity (the complement of a normalized Hamming distance between two vectors) that has been corrected where data are missing, such that missing data represent neither a match nor a mismatch but a "partial" match. See amMatrix for more discussion of this metric.
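The idea of a similarity score with "partial" matches can be illustrated with a minimal base R sketch. This is not the allelematch implementation (see amMatrix for that). The function name similarityScore and the partial-match weight of 0.5 are assumptions made for illustration only, and the missing code "-99" is borrowed from the Examples.

```r
# Illustrative sketch only; NOT the code used by amMatrix or amPairwise.
# Assumption: a comparison involving a missing value counts as half a match.
similarityScore <- function(g1, g2, missingCode = "-99", partialWeight = 0.5) {
	stopifnot(length(g1) == length(g2))
	# Positions where either genotype has missing data
	isMissing <- g1 == missingCode | g2 == missingCode
	# Positions where both alleles are present and identical
	isExact <- g1 == g2 & !isMissing
	# Exact matches count 1, partial matches count partialWeight, mismatches 0
	(sum(isExact) + partialWeight * sum(isMissing)) / length(g1)
}

# Two hypothetical genotypes with six allele columns:
# four exact matches, one mismatch, and one missing value
g1 <- c("101", "103", "200", "204", "150", "152")
g2 <- c("101", "103", "200", "206", "-99", "152")
similarityScore(g1, g2)  # (4 + 0.5) / 6 = 0.75
```

Under these assumptions the pair would be returned at matchThreshold = 0.75 or lower, and excluded at any higher threshold.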

Value

amPairwise returns an amPairwise object. The remaining functions are used for their side effects: the analysis summary is written to an HTML file, to the console, or to a CSV file.

Note

As matchThreshold is lowered, the size of the output increases rapidly. Typically analyses will not be very useful or manageable with thresholds below 0.7.
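To get a feel for what a threshold such as 0.7 means in terms of alleles, the score can be read as the fraction of matching allele columns. The sketch below assumes no missing data and the relationship threshold = 1 - mismatches / alleleColumns. That relationship, the helper names, and the figure of 20 allele columns are assumptions for illustration and are not taken from this page.

```r
# Illustrative sketch only. Assumes no missing data, so that
# similarity = 1 - (mismatching alleles / number of allele columns).
mismatchToThreshold <- function(alleleMismatch, alleleColumns) {
	1 - alleleMismatch / alleleColumns
}

thresholdToMismatch <- function(matchThreshold, alleleColumns) {
	# Small tolerance guards against floating point error before flooring
	floor((1 - matchThreshold) * alleleColumns + 1e-9)
}

# Hypothetical dataset with 10 loci, i.e. 20 allele columns
mismatchToThreshold(2, 20)    # 0.9
thresholdToMismatch(0.7, 20)  # 6
```

Under these assumptions, a threshold of 0.7 on 20 allele columns would already tolerate six mismatching alleles, which helps explain why the output grows so quickly below that level.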

There is an additional side effect of html = TRUE (or of htmlFile = NULL). If required, the operating system temporary directory where AlleleMatch temporary HTML files are stored is cleaned up. Files that match the pattern am*.html and are older than 24 hours are deleted from this temporary directory.
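To check in advance which files such a cleanup might affect, something like the following base R sketch could be used. It is not the package's own cleanup code. The helper name findStaleHtml is hypothetical, and it assumes the temporary directory in question is the one returned by tempdir().

```r
# Illustrative sketch only; NOT the cleanup routine used by allelematch.
# Lists files matching am*.html that are older than maxAgeHours and,
# if delete = TRUE, removes them.
findStaleHtml <- function(dir = tempdir(), maxAgeHours = 24, delete = FALSE) {
	# glob2rx() converts the wildcard pattern into a regular expression
	files <- list.files(dir, pattern = glob2rx("am*.html"), full.names = TRUE)
	# Age of each file in hours, based on its modification time
	ageHours <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "hours"))
	stale <- files[ageHours > maxAgeHours]
	if (delete) unlink(stale)
	stale
}

# List candidates without deleting anything
findStaleHtml()
```

Leaving delete = FALSE makes the call a dry run, so any file worth keeping can be copied elsewhere first.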

Author(s)

Paul Galpern (pgalpern@gmail.com)

References

For a complete vignette, see the Data S1 supplementary documentation and tutorials (PDF) available at <doi:10.1111/j.1755-0998.2012.03137.x>.

See Also

amDataset, amMatrix, amUnique

Examples

	## Not run: 
	data("amExample5")

	## Produce amDataset object
	myDataset <- 
		amDataset(
			amExample5, 
			missingCode = "-99", 
			indexColumn = 1, 
			metaDataColumn = 2, 
			ignoreColumn = "gender"
			)

	## Typical usage
	myPairwise <- 
		amPairwise(
			myDataset, 
			alleleMismatch = 2
			)

	## Display analysis as HTML in default browser
	summary(
		myPairwise, 
		html = TRUE
		)

	## Save analysis to HTML file
	summary(
		myPairwise, 
		html = "myPairwise.htm"
		)

	## Save analysis to CSV file
	summary(
		myPairwise, 
		csv = "myPairwise.csv"
		)

	## Display analysis as formatted text on the console
	summary(myPairwise)

	## Compare one dataset against a second
	## Both must have same number of allele columns
	## Here we create two datasets artificially from one for illustration purposes
	myDatasetA <- 
		amDataset(
			amExample5[sample(nrow(amExample5))[1:25], ], 
			missingCode = "-99", 
			indexColumn = 1, 
			ignoreColumn = 2
			)
	myDatasetB <- 
		amDataset(
			amExample5[sample(nrow(amExample5))[1:100], ], 
			missingCode = "-99", 
			indexColumn = 1, 
			ignoreColumn = 2
			)
	myPairwise2 <- 
		amPairwise(
			myDatasetA, 
			myDatasetB, 
			alleleMismatch = 3
			)
	summary(
		myPairwise2, 
		html = TRUE
		)
	
	## End(Not run)

allelematch documentation built on Aug. 24, 2023, 5:06 p.m.