structure2bedassle | R Documentation |
structure2bedassle
Converts a STRUCTURE dataset to BEDASSLE format
structure2bedassle( infile, onerowperind, start.loci, start.samples = 1, missing.datum, prefix, save.freqs = TRUE )
infile |
The name and path of the file in STRUCTURE format
to be converted to the format used in a |
onerowperind |
Indicates whether the file format has
one row per individual ( |
start.loci |
The index of the first column in the dataset that contains genotype data. |
start.samples |
The index of the first row in the dataset that contains genotype data (e.g., after any headers). Default value is 1. |
missing.datum |
The character or value used to denote missing data in the STRUCTURE dataset (often 0 or -9). |
prefix |
A character |
save.freqs |
A logical value indicating whether or not to save the allele frequency data matrix generated by this function as an R object. |
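The arguments above can be illustrated with a minimal sketch of a call. Everything about the toy file is an assumption made for illustration: the whitespace-separated layout, the population label in column 2, the 1/2 allele coding, and the use of -9 for missing data. Passing a logical to onerowperind and a file path stem to prefix are also assumptions, since those argument descriptions are incomplete here. The call itself is guarded so the snippet runs even when structure2bedassle is not loaded.

```r
# Toy STRUCTURE file (hypothetical): one row per individual, two columns per locus.
# Column 1 holds individual names, column 2 an illustrative population label,
# so genotype data start in column 3. Missing data are coded as -9.
toy <- c(
  "ind1 popA 1 1 2 2 1 2",
  "ind2 popA 1 2 2 2 -9 -9",
  "ind3 popB 2 2 1 2 1 1"
)
infile <- file.path(tempdir(), "toy_structure.txt")
writeLines(toy, infile)

# Only attempt the call if the function is available in this session.
if (exists("structure2bedassle")) {
  pwp <- structure2bedassle(
    infile = infile,
    onerowperind = TRUE,        # assumed: TRUE means one row per individual
    start.loci = 3,             # genotypes begin in column 3
    start.samples = 1,          # no header lines before the data
    missing.datum = -9,
    prefix = file.path(tempdir(), "toy"),  # assumed: prepended to output file names
    save.freqs = FALSE
  )
}
```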
This function takes a population genetics dataset in
STRUCTURE format and converts it to an allele frequency
data table, then calculates pairwise pi between all samples.
The matrix of pairwise pi can be used as the genDist
argument in run.bedassle.
The STRUCTURE file can have one row per individual
and two columns per locus, or two rows per individual
and one column per locus. It can only contain bi-allelic SNPs.
Missing data are acceptable, but must be indicated with
a single value throughout the dataset.
This function takes a STRUCTURE format data file and
converts it to a BEDASSLE format data file.
This function can only be applied to diploid organisms.
The STRUCTURE data file must be a plain text file.
If there are extraneous lines of text or column headers
before the data start, those extra lines should be deleted
by hand or taken into account via the start.samples
argument.
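As a rough illustration of how start.samples and start.loci relate to rows and columns of the file, the sketch below reads a toy dataset with one header line in base R. It does not reproduce the function's internal code; the file contents, the file name, and the use of read.table are assumptions made for the example.

```r
# Hypothetical file with one header line before the data start.
lines <- c(
  "name pop loc1 loc1 loc2 loc2",
  "ind1 popA 1 1 2 2",
  "ind2 popB 1 2 -9 -9"
)
path <- file.path(tempdir(), "with_header.txt")
writeLines(lines, path)

start.samples <- 2  # first row that contains genotype data
start.loci <- 3     # first column that contains genotype data

# Skip everything before the first data row, then keep only genotype columns.
raw <- read.table(path, header = FALSE, skip = start.samples - 1,
                  stringsAsFactors = FALSE)
genos <- as.matrix(raw[, start.loci:ncol(raw)])

# With one row per individual there are two columns per locus.
n.loci <- ncol(genos) / 2
```

Here the header line is skipped because start.samples is 2, and the name and population columns are dropped because start.loci is 3.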
The STRUCTURE dataset can either be in the ONEROWPERIND=1 file format, with one row per individual and two columns per locus, or the ONEROWPERIND=0 format, with two rows per individual and one column per locus. The first column of the STRUCTURE dataset should contain individual names. Any number of columns of non-genotype information may come before the first column of genotype data, but there can be no extraneous columns at the end of the dataset, after the genotype data.
The genotype data must be bi-allelic single nucleotide polymorphisms (SNPs). Applying this function to datasets with more than two alleles per locus may result in cryptic failure.
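To make the conversion concrete, here is a small base R sketch that turns bi-allelic genotypes in the one-row-per-individual layout into per-sample allele frequencies and then into a pairwise pi matrix. This is not the function's own code. It assumes a commonly used definition of pairwise pi, the average over loci of f_i(1 - f_j) + f_j(1 - f_i), and it assumes that loci with missing data in either sample are dropped from that pair's average. The helper pairwise.pi and the toy genotypes are hypothetical.

```r
# Toy genotypes: one row per individual, two columns per locus; -9 is missing.
genos <- matrix(c(1, 1,  2,  2,
                  1, 2,  2,  2,
                  2, 2, -9, -9),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("ind1", "ind2", "ind3"), NULL))
missing.datum <- -9
genos[genos == missing.datum] <- NA
n.loci <- ncol(genos) / 2

# Frequency of one arbitrarily chosen allele per sample and locus.
freqs <- sapply(seq_len(n.loci), function(l) {
  cols <- genos[, c(2 * l - 1, 2 * l), drop = FALSE]
  counted <- min(cols, na.rm = TRUE)  # counted allele: smallest code at the locus
  rowMeans(cols == counted)           # NA when an allele copy is missing
})

# Hypothetical helper: assumed definition of pairwise pi between samples.
pairwise.pi <- function(freqs) {
  n <- nrow(freqs)
  pwp <- matrix(0, n, n, dimnames = list(rownames(freqs), rownames(freqs)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d <- freqs[i, ] * (1 - freqs[j, ]) + freqs[j, ] * (1 - freqs[i, ])
      pwp[i, j] <- mean(d, na.rm = TRUE)  # skip loci missing in either sample
    }
  }
  pwp
}

pwp <- pairwise.pi(freqs)
```

Because allele identity is inferred from the codes present at each locus, a third allele at any locus would be silently lumped with the uncounted allele in this sketch, which is one way a multi-allelic dataset could produce wrong results without an error.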
This function returns a matrix of pairwise pi
that can be used as the genDist
argument in a BEDASSLE analysis (run.bedassle).
It also saves this matrix as a text file ("yourprefix_pwp.txt")
so that it can be used in future analyses. If save.freqs
is TRUE, the allele frequency data matrix generated from the
STRUCTURE data file is saved as an R data (.RData) object.