This function converts two sets of Plink PED/MAP files (one for individuals with known locations and one for individuals with unknown locations) into the data format required by OriGen.
ConvertUnknownPEDData(PlinkFileName, LocationFileName, PlinkUnknownFileName)
PlinkFileName
Base name of the Plink PED file (i.e. without ".ped" or ".map") containing the individuals with known locations.
LocationFileName
Space- or tab-delimited text file with the Longitude and Latitude coordinates of each individual in the 4th and 5th columns respectively. Rows should correspond to the individuals in the Plink file, and the file should have a header row.
PlinkUnknownFileName
Base name of the Plink PED file (i.e. without ".ped" or ".map") containing the individuals with unknown locations.
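The layout described for LocationFileName can be illustrated with a small sketch. The column names and values below are made up; only the header row and the position of longitude and latitude in columns 4 and 5 come from the description above. Reading the file with read.table is just one way to inspect it and is not necessarily what ConvertUnknownPEDData does internally.

```r
# Hypothetical location file: header row, one row per individual in the PED
# file, longitude in column 4 and latitude in column 5.
# Column names and values are illustrative assumptions.
loc_lines <- c(
  "FID IID Site Longitude Latitude",
  "F1 I1 A -118.24 34.05",
  "F2 I2 A -118.24 34.05",
  "F3 I3 B 2.35 48.86"
)
loc_file <- tempfile(fileext = ".txt")
writeLines(loc_lines, loc_file)

# read.table splits on any whitespace by default, so spaces and tabs both work.
loc <- read.table(loc_file, header = TRUE, stringsAsFactors = FALSE)
coords <- loc[, 4:5]

# Assumption: individuals sharing identical coordinates form one sample site.
n_sites <- nrow(unique(coords))
```

Checking a location file this way before conversion makes it easy to confirm that the row count matches the number of individuals in the PED file.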
List with the following components:
DataArray
An array giving the number of major/minor alleles (the major allele being the one occurring most often in the dataset) for each SNP, grouped by sample site. The dimension of this array is [2,SampleSites,NumberSNPs].
SampleCoordinates
An array giving the longitude and latitude of each of the sample sites found. The dimension of this array is [SampleSites,2], where the second dimension represents longitude and latitude respectively.
PlinkFileName
The inputted PlinkFileName with ".ped" appended.
LocationFile
The inputted LocationFileName.
SampleSites
The integer number of sample sites found.
NumberSNPs
The integer number of SNPs found.
UnknownPEDFile
The inputted PED file for the unknown individuals.
NumberUnknowns
An integer giving the number of unknown individuals found in the UnknownPEDFile.
UnknownData
An array containing the genetic data of the unknown individuals. The dimension of this array is [NumberUnknowns,NumberSNPs].
Membership
An integer vector giving the group number of each individual in the inputted known group. The length of this vector is NumberKnown.
NumberKnown
An integer giving the number of known individuals found in the PlinkFileName.
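To make the documented dimensions concrete, here is a minimal sketch that builds a mock object with the same component names and derives per-site allele frequencies from it. All values are invented, and the assumption that the first slice of DataArray holds one allele class and the second slice the other is ours; the actual ordering should be checked against real output with str().

```r
# Mock components mimicking the documented shapes (all values are made up).
SampleSites <- 2L
NumberSNPs <- 3L
DataArray <- array(0L, dim = c(2, SampleSites, NumberSNPs))
# Counts for one allele class, rows = sites, columns = SNPs (assumed layout).
DataArray[1, , ] <- matrix(c(4L, 1L, 3L, 2L, 0L, 4L), nrow = SampleSites)
# Counts for the other allele class.
DataArray[2, , ] <- matrix(c(0L, 3L, 1L, 2L, 4L, 0L), nrow = SampleSites)

# Two known individuals at each of the two sites.
Membership <- c(1L, 1L, 2L, 2L)
NumberKnown <- length(Membership)

# Frequency of the first allele class at each site for each SNP.
freq <- DataArray[1, , ] / (DataArray[1, , ] + DataArray[2, , ])

# Number of known individuals assigned to each site.
site_sizes <- tabulate(Membership, nbins = SampleSites)
```

Here freq is a SampleSites by NumberSNPs matrix, which is the same orientation as the last two dimensions of DataArray.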
John Michael Ranola, John Novembre, and Kenneth Lange
Ranola J, Novembre J, Lange K (2014) Fast Spatial Ancestry via Flexible Allele Frequency Surfaces. Bioinformatics, in press.
ConvertUnknownPEDData for converting two Plink PED files (known and unknown) into a format appropriate for analysis,
FitOriGenModelFindUnknowns for fitting allele surfaces to the converted data and finding the locations of the given unknown individuals,
PlotUnknownHeatMap for a quick way to plot the resulting unknown heat map surfaces from FitOriGenModelFindUnknowns.
#Note that Plink files "10SNPs.ped", "10SNPs.map" and also "Locations.txt"
#are included in the data folder of the OriGen package with ".txt" appended to the Plink files.
#Please remove ".txt" and navigate to the appropriate location
#before testing the following commands.
#Note that this was done to allow inclusion of the test data in the package.
## Not run:
trials3=ConvertUnknownPEDData("10SNPs","Locations.txt","10SNPs")
str(trials3)
MaxGridLength=30
RhoParameter=10
trials4=FitOriGenModelFindUnknowns(trials3$DataArray,trials3$SampleCoordinates,
  trials3$UnknownData[1:2,],MaxGridLength,RhoParameter)
PlotUnknownHeatMap(trials4,UnknownNumber=1,MaskWater=TRUE)
## End(Not run)