RankSNPsLRT: Rank the SNPs based on the likelihood ratio test.

View source: R/OriGen-internal.R

Description

This function ranks SNPs using a likelihood ratio test (LRT) that compares two models of the data: one in which allele counts are grouped by sample site, as supplied, and one in which all sites are pooled into a single large sample. To convert data into the required format, see ConvertPEDData.
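
The comparison described above can be sketched for a single SNP as follows. This is a minimal illustration that assumes a binomial model with one major-allele frequency per site (grouped) versus one shared frequency (pooled). The helper name lrt_one_snp is hypothetical, and the sketch is not taken from the OriGen source, so the package's statistic may differ in detail.

```r
# Hypothetical helper, not part of OriGen: binomial LRT for one SNP.
# counts is a 2 x SampleSites matrix; row 1 = major allele counts,
# row 2 = minor allele counts (row order is an assumption).
lrt_one_snp <- function(counts) {
  # Log-likelihood at the maximum likelihood estimate, treating 0 * log(0) as 0.
  # The binomial coefficients are omitted because they cancel in the ratio.
  loglik <- function(major, minor) {
    n <- major + minor
    term <- function(k) ifelse(k > 0, k * log(k / n), 0)
    sum(term(major) + term(minor))
  }
  grouped <- loglik(counts[1, ], counts[2, ])            # one frequency per site
  pooled  <- loglik(sum(counts[1, ]), sum(counts[2, ]))  # one shared frequency
  2 * (grouped - pooled)
}

same_freq   <- matrix(c(10, 10, 20, 20), nrow = 2)  # both sites at frequency 0.5
split_sites <- matrix(c(20, 0, 0, 20), nrow = 2)    # sites fixed for opposite alleles
lrt_one_snp(same_freq)    # close to zero: grouping adds no information
lrt_one_snp(split_sites)  # large: sites are strongly differentiated
```

Under these assumptions, apply(DataArray, 3, lrt_one_snp) would give one statistic per SNP for an array laid out as [2,SampleSites,NumberSNPs]; larger values indicate SNPs whose allele frequencies differ more between sites.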

Usage

RankSNPsLRT(DataArray)

Arguments

DataArray

An array giving the counts of major and minor alleles (the major allele being the one that occurs most often in the dataset) for each SNP, grouped by sample site. The dimensions of this array are [2,SampleSites,NumberSNPs].
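
To make the layout concrete, here is a small hand-built array. The sizes and counts are invented for illustration, and the sketch assumes that index 1 of the first dimension holds the major allele count and index 2 the minor allele count.

```r
# Toy sizes chosen for illustration only.
SampleSites <- 3
NumberSNPs <- 2

# R fills arrays in column-major order, so the values are listed as
# (major, minor) pairs, site by site, then SNP by SNP.
DataArray <- array(
  c(12, 8,   5, 15,  14, 6,    # SNP 1: sites 1 to 3
    20, 0,  18,  2,  16, 4),   # SNP 2: sites 1 to 3
  dim = c(2, SampleSites, NumberSNPs)
)

DataArray[1, 2, 1]         # assumed major allele count at site 2 for SNP 1
DataArray[, , 2]           # 2 x SampleSites matrix of counts for SNP 2
colSums(DataArray[, , 1])  # total alleles sampled at each site for SNP 1
```

Indexing by the third dimension returns the per-site count matrix for one SNP, which is the unit the ranking operates on.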

Value

List with the following components:

DataArray

An array giving the counts of major and minor alleles (the major allele being the one that occurs most often in the dataset) for each SNP, grouped by sample site. The dimensions of this array are [2,SampleSites,NumberSNPs].

SampleSites

The number of sample sites found, as an integer.

NumberSNPs

The number of SNPs found, as an integer.

Rankings

An integer-valued vector giving the LRT-based ranking of each SNP. This can be used to reduce the number of SNPs used for assignment if the analysis takes too long.

LRT

A real-valued array giving the likelihood ratio test statistic and the informativeness for assignment (Rosenberg) for each SNP. The dimensions of this array are [2,NumberSNPs].
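
One way to use these components to reduce a dataset is sketched below. It builds a mock result list with the documented shapes instead of calling the package, and it assumes that row 1 of LRT holds the test statistic and row 2 the informativeness; that row order, and all the values, are assumptions made for illustration. Because the orientation of Rankings is not spelled out here, the sketch orders the SNPs directly from the statistic.

```r
# Mock result with the documented structure; all values are invented.
DataArray <- array(1:24, dim = c(2, 3, 4))   # 3 sample sites, 4 SNPs
result <- list(
  DataArray = DataArray,
  LRT = rbind(c(3.1, 12.7, 0.4, 8.9),        # row 1: assumed LRT statistic
              c(0.01, 0.09, 0.00, 0.05))     # row 2: assumed informativeness
)

# Keep the keep_n SNPs with the largest LRT statistic.
keep_n <- 2
top <- order(result$LRT[1, ], decreasing = TRUE)[seq_len(keep_n)]

# drop = FALSE preserves the three-dimensional layout even when keep_n is 1.
reduced <- result$DataArray[, , top, drop = FALSE]
dim(reduced)
```

The reduced array keeps the [2,SampleSites,NumberSNPs] layout, so it can be passed on wherever the full array would be used.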

Author(s)

John Michael Ranola, John Novembre, and Kenneth Lange

References

Ranola J, Novembre J, Lange K (2014) Fast Spatial Ancestry via Flexible Allele Frequency Surfaces. Bioinformatics, in press.

See Also

ConvertPEDData for converting PLINK PED files into a format appropriate for analysis.

Examples

library(OriGen)

#Data generation
SampleSites=25
NumberSNPs=10
TestData=array(sample(2*(1:30),2*SampleSites*NumberSNPs,replace=TRUE),
	dim=c(2,SampleSites,NumberSNPs))
#Europe spans roughly -9 to 38 degrees longitude and 34 to 60 degrees latitude
TestCoordinates=array(0,dim=c(SampleSites,2))
TestCoordinates[,1]=runif(SampleSites,-9,38)
TestCoordinates[,2]=runif(SampleSites,34,60)

#This code simulates the number of major alleles the unknown individuals have.
NumberUnknowns=2
TestUnknowns=array(sample(0:2,NumberUnknowns*NumberSNPs,replace=TRUE),
	dim=c(NumberUnknowns,NumberSNPs))

#Rank the SNPs
trials7=RankSNPsLRT(TestData)
trials7

OriGen documentation built on May 2, 2019, 6:39 a.m.