RankSNPsLRT: Rank the SNPs based on the likelihood ratio test.

View source: R/OriGen-internal.R

Description

This function ranks SNPs using a likelihood ratio test (LRT) that compares two models of the data: one in which the allele counts are grouped by sample site as supplied, and one in which all sites are pooled into a single large sample. For convenience, it also calculates the informativeness for assignment of each SNP. To convert data into the required format, see ConvertPEDData.
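The comparison described above can be sketched for a single SNP as follows. This is a minimal illustration, assuming a binomial model for the allele counts at each site and the usual statistic of twice the difference in maximized log-likelihoods; `lrt_one_snp` is a hypothetical helper and is not part of OriGen, whose internal computation may differ in detail.

```r
# Hypothetical helper (not part of OriGen): LRT statistic for a single SNP.
# counts: 2 x SampleSites matrix of allele counts (row 1 = major, row 2 = minor).
lrt_one_snp <- function(counts) {
  # Binomial log-likelihood evaluated at the MLE, treating 0 * log(0) as 0.
  loglik <- function(major, minor) {
    n <- major + minor
    term <- function(k) ifelse(k > 0, k * log(k / n), 0)
    sum(term(major) + term(minor))
  }
  grouped <- loglik(counts[1, ], counts[2, ])            # one frequency per site
  pooled  <- loglik(sum(counts[1, ]), sum(counts[2, ]))  # one shared frequency
  2 * (grouped - pooled)
}

same_freq  <- matrix(c(10, 10, 20, 20), nrow = 2)  # both sites at frequency 0.5
fixed_diff <- matrix(c(20, 0, 0, 20), nrow = 2)    # sites fixed for different alleles
lrt_same <- lrt_one_snp(same_freq)
lrt_diff <- lrt_one_snp(fixed_diff)
```

Under these assumptions, a SNP whose allele frequency is identical at every site scores zero, and the statistic grows as the sites become more differentiated.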

Usage

RankSNPsLRT(DataArray)

Arguments

DataArray

An array giving the major and minor allele counts for each SNP, grouped by sample site. The major allele is defined as the most frequently occurring allele in the dataset. The dimensions of this array are [2,SampleSites,NumberSNPs].
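As a small illustration of this layout, the sketch below builds an array of the stated dimensions from two count matrices. The counts are made up, and the convention that the first slice holds major-allele counts and the second holds minor-allele counts is an assumption based on the description above.

```r
# Toy allele counts for 3 sample sites and 2 SNPs (made-up numbers).
# Each matrix is sites x SNPs; columns are filled first.
major <- matrix(c(12,  8, 15,
                  20,  5,  9), nrow = 3)
minor <- matrix(c( 4, 10,  3,
                   2, 13,  7), nrow = 3)

# Assumed convention: slice 1 = major-allele counts, slice 2 = minor-allele counts.
DataArray <- array(0, dim = c(2, nrow(major), ncol(major)))
DataArray[1, , ] <- major
DataArray[2, , ] <- minor

dim(DataArray)  # 2 x SampleSites x NumberSNPs
```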

Value

List with the following components:

RankedSNPs

An integer-valued vector of SNP positions sorted by their LRT-based ranking. In other words, the first entry is the position of the SNP with the lowest ranking; it is NOT the rank of the first SNP. This vector can be used to reduce the number of SNPs used for assignment if the analysis takes too long.

LRT

A real-valued array giving the likelihood ratio test statistic for each SNP.

Informativeness

A real-valued array giving the informativeness for assignment (Rosenberg) for each SNP.

SampleSites

This shows the integer number of sample sites found.

NumberSNPs

This shows the integer number of SNPs found.
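For readers unfamiliar with the Informativeness component, the sketch below computes the informativeness for assignment of a single SNP in the form commonly attributed to Rosenberg and colleagues: the entropy of the site-averaged allele frequencies minus the average within-site entropy. `informativeness_one_snp` is a hypothetical helper, not an OriGen function, and the package may handle details such as empty sites differently.

```r
# Hypothetical helper (not part of OriGen): informativeness for assignment
# of one SNP. counts: 2 x SampleSites matrix of allele counts.
# Assumes every site has at least one observed allele.
informativeness_one_snp <- function(counts) {
  p <- sweep(counts, 2, colSums(counts), "/")  # allele frequencies per site
  K <- ncol(p)                                 # number of sample sites
  p_bar <- rowMeans(p)                         # frequencies averaged over sites
  xlogx <- function(x) ifelse(x > 0, x * log(x), 0)  # treats 0 * log(0) as 0
  sum(-xlogx(p_bar)) + sum(xlogx(p)) / K
}

same_freq  <- matrix(c(10, 10, 20, 20), nrow = 2)  # both sites at frequency 0.5
fixed_diff <- matrix(c(20, 0, 0, 20), nrow = 2)    # sites fixed for different alleles
in_same <- informativeness_one_snp(same_freq)
in_diff <- informativeness_one_snp(fixed_diff)
```

In this form the measure is zero when all sites share the same allele frequencies and reaches log(2) for a biallelic SNP that perfectly separates two sites.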

Author(s)

John Michael Ranola, John Novembre, and Kenneth Lange

References

Ranola J, Novembre J, Lange K (2014) Fast Spatial Ancestry via Flexible Allele Frequency Surfaces. Bioinformatics, in press.

See Also

ConvertPEDData, for converting PLINK PED files into a format appropriate for analysis.

Examples

library(OriGen)

#Data generation: random allele counts with dimension [2,SampleSites,NumberSNPs]
SampleSites=25
NumberSNPs=10
TestData=array(sample(2*(1:30),2*SampleSites*NumberSNPs,replace=TRUE),
	dim=c(2,SampleSites,NumberSNPs))

#Random site coordinates (not used by RankSNPsLRT).
#Europe spans roughly -9 to 38 in longitude and 34 to 60 in latitude.
TestCoordinates=array(0,dim=c(SampleSites,2))
TestCoordinates[,1]=runif(SampleSites,-9,38)
TestCoordinates[,2]=runif(SampleSites,34,60)

#This code simulates the number of major alleles the unknown individuals have
#(also not used by RankSNPsLRT).
NumberUnknowns=2
TestUnknowns=array(sample(0:2,NumberUnknowns*NumberSNPs,replace=TRUE),
	dim=c(NumberUnknowns,NumberSNPs))

#Rank the SNPs and print the resulting list
trials7=RankSNPsLRT(TestData)
trials7
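
One way the returned list might be used to reduce the SNP set is sketched below. To stay self-contained it uses a made-up stand-in for the result rather than calling RankSNPsLRT, and it orders by the LRT component directly (a larger statistic means stronger evidence of differences among sites) instead of relying on the direction of RankedSNPs. The names `result`, `keep`, `top`, and `ReducedData` are illustrative only.

```r
# Stand-in for the list returned by RankSNPsLRT (made-up values).
result <- list(LRT = c(3.2, 41.7, 0.9, 18.4, 27.5))
DataArray <- array(1:30, dim = c(2, 3, 5))  # toy counts: 3 sites, 5 SNPs

keep <- 2  # number of SNPs to retain
# Positions of the SNPs with the largest LRT statistics.
top <- order(result$LRT, decreasing = TRUE)[seq_len(keep)]
# drop = FALSE preserves the [2,SampleSites,keep] shape even when keep is 1.
ReducedData <- DataArray[, , top, drop = FALSE]
top
```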

OriGen documentation built on Sept. 19, 2020, 3:01 p.m.