HGDP.bedassle.data | R Documentation |
The allelic counts, sample sizes, geographic distances, ecological distances, and population metadata from the 38 human populations used in example BEDASSLE analyses, subsetted from the Human Genome Diversity Panel (HGDP) dataset.
data(HGDP.bedassle.data)
The format is: List of 7
int [1:38, 1:1000] 12 16 5 17 4 14 20 5 34 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:38] "Adygei" "Basque" "Italian" "French" ...
.. ..$ : chr [1:1000] "rs13287637" "rs17792496" "rs1968588" ...
int [1:38, 1:1000] 34 48 24 56 30 50 56 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:38] "Adygei" "Basque" "Italian" "French" ...
.. ..$ : chr [1:1000] "rs13287637" "rs17792496" "rs1968588" ...
num [1:38, 1:38] 0 1.187 0.867 1.101 1.247 ...
num [1:38, 1:38] 0 0 0 0 0 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:38] "1" "2" "3" "4" ...
.. ..$ : chr [1:38] "1" "2" "3" "4" ...
int 38
int 1000
'data.frame': 38 obs. of 3 variables:
chr [1:38] "Adygei" "Basque" "Italian" ...
chr [1:38] "44" "43" "46" "46" ...
chr [1:38] "39" "0" "10" "2" ...
A matrix of allelic count data, for which nrow = the number of populations and ncol = the number of bi-allelic loci sampled. Each cell gives the number of times allele ‘1’ is observed in that population at that locus. The choice of which allele is allele ‘1’ is arbitrary, but must be consistent across all populations at a locus.
A matrix of sample sizes, for which nrow = the number of populations and ncol = the number of bi-allelic loci sampled (i.e., the dimensions of sample.sizes must match those of counts). Each cell gives the number of chromosomes successfully genotyped at that locus in that population.
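As a minimal sketch of how the two matrices relate, the toy example below builds a made-up 3 x 4 counts matrix and a matching sample-size matrix, checks that their dimensions agree, and divides one by the other to get sample allele frequencies. The population and locus names and all values are hypothetical and are not taken from the HGDP data.

```r
# Toy data: 3 populations x 4 loci (all values are invented for illustration).
# matrix() fills column by column, so each row of numbers below is one locus.
counts <- matrix(c(12, 16,  5,
                    3,  8, 10,
                    0,  4,  6,
                    7,  7,  2),
                 nrow = 3, ncol = 4,
                 dimnames = list(c("popA", "popB", "popC"),
                                 paste0("locus", 1:4)))
sample.sizes <- matrix(c(34, 48, 24,
                         20, 20, 20,
                         10, 12, 14,
                         14, 16,  8),
                       nrow = 3, ncol = 4,
                       dimnames = dimnames(counts))

# The two matrices must have the same dimensions, and an allele cannot be
# observed more often than the number of chromosomes genotyped.
stopifnot(identical(dim(counts), dim(sample.sizes)))
stopifnot(all(counts <= sample.sizes))

# Element-wise sample frequency of allele '1' per population and locus
freqs <- counts / sample.sizes
```

Because allele ‘1’ is labelled arbitrarily, replacing a whole column of counts with sample.sizes minus counts describes the same data, as long as the swap is applied to every population at that locus.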
Pairwise geographic distance (D_{i,j}). This may be Euclidean, or, if the geographic scale of sampling merits it, great-circle distance. In the case of this dataset, it is great-circle distance.
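For readers who want to build a comparable matrix from coordinates, here is a hedged sketch of a pairwise great-circle distance calculation using the haversine formula. The function name great.circle.dist and the Earth radius of 6371 km are assumptions made for this illustration. The units or scaling of the distances stored in this dataset are not stated here, so the output is not expected to match them numerically.

```r
# Pairwise great-circle distance via the haversine formula.
# lat, lon: numeric vectors in decimal degrees.
# radius: sphere radius; 6371 (km) is an assumed mean Earth radius.
great.circle.dist <- function(lat, lon, radius = 6371) {
  to.rad <- pi / 180
  phi <- lat * to.rad
  lambda <- lon * to.rad
  dphi <- outer(phi, phi, "-")          # pairwise latitude differences
  dlambda <- outer(lambda, lambda, "-") # pairwise longitude differences
  a <- sin(dphi / 2)^2 +
    outer(cos(phi), cos(phi)) * sin(dlambda / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Coordinates of the first four populations shown in the metadata preview
lat <- c(44, 43, 46, 46)
lon <- c(39, 0, 10, 2)
D <- great.circle.dist(lat, lon)
```

The result is a symmetric matrix with zeros on the diagonal, which is the shape expected of a pairwise distance matrix.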
Pairwise ecological distance(s) (E_{i,j}), which may be continuous (e.g., difference in elevation) or binary (same or opposite side of some hypothesized barrier to gene flow). In this case, the ecological distance is binary, representing whether a pair of populations occurs on the same side, or on opposite sides, of the Himalayas.
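The two kinds of ecological distance described above can be sketched as follows. The population names, side labels, and elevations are hypothetical. The 0/1 coding (0 = same side, 1 = opposite sides) is an assumption chosen so that the diagonal is zero, in line with the preview of the stored matrix.

```r
# Hypothetical indicator: the side of the barrier each population is on
side <- c(popA = "west", popB = "west", popC = "east", popD = "east")

# Binary distance: 1 if a pair is on opposite sides, 0 if on the same side
E.binary <- outer(side, side, FUN = "!=") * 1

# Continuous distance: absolute difference in (made-up) elevation, in metres
elevation <- c(popA = 200, popB = 1500, popC = 50, popD = 3000)
E.continuous <- abs(outer(elevation, elevation, FUN = "-"))
```

Both matrices are symmetric with a zero diagonal, and both inherit their row and column names from the named input vectors.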
The number of populations in the analysis. This should be equal to nrow(counts). In this dataset, there are 38 populations sampled.
The number of loci in the analysis. This should be equal to ncol(counts). In this dataset, there are 1000 loci sampled.
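The consistency requirements stated above can be collected into one check. The helper below, check.dims, is a hypothetical function written for this illustration and is not part of BEDASSLE. Its argument names are placeholders, and it is exercised on toy matrices rather than on the dataset itself.

```r
# Hypothetical helper: verify that the pieces of an analysis agree in size.
# k = number of populations, loci = number of loci.
check.dims <- function(counts, sample.sizes, D, E, k, loci) {
  stopifnot(
    nrow(counts) == k,
    ncol(counts) == loci,
    identical(dim(counts), dim(sample.sizes)),
    all(dim(D) == c(k, k)),  # one distance per pair of populations
    all(dim(E) == c(k, k))
  )
  invisible(TRUE)
}

# Toy inputs: 3 populations, 5 loci
k <- 3
loci <- 5
toy.counts <- matrix(1, nrow = k, ncol = loci)
toy.sizes  <- matrix(2, nrow = k, ncol = loci)
toy.D <- matrix(0, nrow = k, ncol = k)
toy.E <- matrix(0, nrow = k, ncol = k)
ok <- check.dims(toy.counts, toy.sizes, toy.D, toy.E, k, loci)
```

For this dataset, the same check would be expected to pass with k = 38 and loci = 1000.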
This data frame contains the metadata on the populations included in the analysis, including:
Population name
Latitude
Longitude
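The format listing shows the latitude and longitude columns stored as character vectors, so they would need converting before any numeric use. A minimal sketch on a toy data frame follows. The column names Population, Latitude, and Longitude are assumed here for illustration, and the rows copy the first three entries of the preview.

```r
# Toy metadata mirroring the previewed structure (column names are assumed)
meta <- data.frame(
  Population = c("Adygei", "Basque", "Italian"),
  Latitude   = c("44", "43", "46"),
  Longitude  = c("39", "0", "10"),
  stringsAsFactors = FALSE
)

# Convert the coordinate columns from character to numeric
meta$Latitude  <- as.numeric(meta$Latitude)
meta$Longitude <- as.numeric(meta$Longitude)
```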
Conrad et al. A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nature Genetics 2008.
Li et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 2008.
Bradburd, G.S., Ralph, P.L., and Coop, G.M. Disentangling the effects of geographic and ecological isolation on genetic differentiation. Evolution 2013.
## see MCMC, MCMC_BB, calculate.pairwise.Fst,
## calculate.all.pairwise.Fst, and Covariance for usage.