View source: R/data_conversion.R
Takes all relevant information created in previous steps of the data conversion pipeline and combines it into a single list, which serves as input for further calculations.
list_diploid_params(
  AC_list,
  I_list,
  PO,
  coll_N,
  RU_vec,
  RU_starts,
  alle_freq_prior = list(const_scaled = 1)
)
AC_list | a list of allele count matrices; output from
I_list | a list of genotype vectors; output from
PO | a vector of collection (population of origin) indices for every individual in the sample, in order identical to
coll_N | a vector of the total number of individuals in each collection, in order of appearance in the dataset
RU_vec | a vector of collection indices, sorted by reporting unit
RU_starts | a vector of indices, designating the first collection for each reporting unit in RU_vec
alle_freq_prior | a one-element named list specifying the prior to be used when generating Dirichlet parameters for genotype likelihood calculations. The name of the list item determines the type of prior used, with options
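To make the relationship between PO, coll_N, RU_vec, and RU_starts concrete, here is a minimal base-R sketch on a made-up data frame. The data frame toy and its values are hypothetical, and starting RU_starts at 0 simply mirrors the convention used in this page's own example code; it is not a statement about how the package derives these inputs internally.

```r
# Hypothetical toy sample: five individuals, three collections, two reporting units
toy <- data.frame(
  repunit    = c("South", "North", "North", "South", "North"),
  collection = c("A", "B", "C", "A", "B"),
  stringsAsFactors = FALSE
)

# Collection index for every individual (A = 1, B = 2, C = 3)
coll_levels <- sort(unique(toy$collection))
PO <- as.integer(factor(toy$collection, levels = coll_levels))

# Number of individuals in each collection
coll_N <- as.vector(table(PO))

# One row per (repunit, collection) pair, sorted by reporting unit
colls_by_ru <- unique(toy[, c("repunit", "collection")])
colls_by_ru <- colls_by_ru[order(colls_by_ru$repunit), ]

# Collection indices sorted by reporting unit: North holds B and C, South holds A
RU_vec <- as.integer(factor(colls_by_ru$collection, levels = coll_levels))

# Offsets into RU_vec marking where each reporting unit begins (starting at 0)
RU_starts <- c(0, cumsum(as.vector(table(colls_by_ru$repunit))))

print(PO)         # 1 2 3 1 2
print(coll_N)     # 2 2 1
print(RU_vec)     # 2 3 1
print(RU_starts)  # 0 2 3
```

In this toy case the first reporting unit (North) occupies positions 1 to 2 of RU_vec and the second (South) occupies position 3, which is what the offsets 0, 2, 3 encode.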
Genotypes represented in I_list are converted into a single long vector, ordered by locus, individual, and gene copy, with NA values represented as 0s. Similarly, AC_list is unlisted to AC, ordered by locus, collection, and allele. DP is a list of Dirichlet priors for likelihood calculations, created by adding the values calculated from alle_freq_prior to each allele. sum_AC and sum_DP are the summed allele values for each locus of their parent vectors, ordered by locus and collection.
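The flattening described above can be illustrated with a small sketch on made-up data. Several things here are assumptions rather than facts taken from the package: the toy objects I_toy and AC_toy, the matrix layouts (gene copies or alleles in rows, individuals or collections in columns), and the reading of const_scaled = 1 as "add 1 divided by the number of alleles at the locus to every allele count".

```r
# Hypothetical genotypes: 2 loci, 3 individuals.
# Assumed layout: one row per gene copy, one column per individual.
I_toy <- list(
  loc1 = matrix(c(1L, 2L,  2L, 2L,  NA, NA), nrow = 2),
  loc2 = matrix(c(1L, 1L,  3L, 1L,  2L, 3L), nrow = 2)
)

# Flatten so gene copy varies fastest, then individual, then locus;
# missing genotypes become 0
I_vec <- unlist(lapply(I_toy, as.vector), use.names = FALSE)
I_vec[is.na(I_vec)] <- 0L

# Hypothetical allele counts: 2 collections.
# Assumed layout: one row per allele, one column per collection.
AC_toy <- list(
  loc1 = matrix(c(3, 1,  0, 4), nrow = 2),
  loc2 = matrix(c(2, 1, 1,  0, 2, 2), nrow = 3)
)

# Flatten so allele varies fastest, then collection, then locus
AC_vec <- unlist(lapply(AC_toy, as.vector), use.names = FALSE)

# Assumed prior: spread a total weight of 1 evenly over the alleles of each locus
DP_toy <- lapply(AC_toy, function(m) m + 1 / nrow(m))
DP_vec <- unlist(lapply(DP_toy, as.vector), use.names = FALSE)

# Per-locus, per-collection totals of the counts and of the Dirichlet parameters
sum_AC <- unlist(lapply(AC_toy, colSums), use.names = FALSE)
sum_DP <- unlist(lapply(DP_toy, colSums), use.names = FALSE)

print(I_vec)   # 1 2 2 2 0 0 1 1 3 1 2 3
print(AC_vec)  # 3 1 0 4 2 1 1 0 2 2
print(sum_AC)  # 4 4 4 4
print(sum_DP)  # 5 5 5 5
```

Under the assumed prior, each sum_DP entry is the matching sum_AC entry plus 1, because the per-allele additions at a locus sum to the prior's total weight.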
list_diploid_params returns a list of the information necessary for the calculation of genotype likelihoods in MCMC: L, N, and C represent the number of loci, individual genotypes, and collections, respectively. A is a vector of the number of alleles at each locus, and CA is the cumulative sum of A. coll, coll_N, RU_vec, and RU_starts are copied directly from input. I, AC, sum_AC, DP, and sum_DP are vectorized versions of data previously represented as lists and matrices; indexing macros use L, N, C, A, and CA to access these vectors in later Rcpp-based calculations.
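As an illustration of why A and CA are enough to address the flattened vectors, the sketch below looks up one allele count in a toy AC vector. The helper ac_index is hypothetical and written in R with 1-based indices; it is not the package's Rcpp macro, and it assumes the allele-fastest, then collection, then locus ordering described in the details.

```r
# Toy dimensions (hypothetical): 2 loci with 2 and 3 alleles, 2 collections
A  <- c(2L, 3L)
C  <- 2L
CA <- cumsum(A)  # cumulative number of alleles: 2, 5

# Flattened allele counts, ordered by locus, then collection, then allele
AC_vec <- c(3, 1,  0, 4,   2, 1, 1,  0, 2, 2)

# Hypothetical helper: 1-based position of (locus, collection, allele) in AC_vec
ac_index <- function(locus, coll, allele) {
  alleles_before <- c(0L, CA)[locus]  # alleles at all earlier loci
  C * alleles_before +                # skip the blocks of earlier loci
    (coll - 1L) * A[locus] +          # skip earlier collections at this locus
    allele
}

# Count of allele 3 at locus 2 in collection 2
print(AC_vec[ac_index(2L, 2L, 3L)])  # 2
# Count of allele 1 at locus 1 in collection 2
print(AC_vec[ac_index(1L, 2L, 1L)])  # 0
```

The same offset arithmetic, with N in place of C and a fixed two gene copies per individual, would locate a genotype in the flattened I vector under the stated ordering.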
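# Run the allelic_list example first; the objects used below come from it
example(allelic_list)
# Collection index for every individual, and number of individuals per collection
PO <- as.integer(factor(ale_long$clean_short$collection))
coll_N <- as.vector(table(PO))
# Collections that actually occur, grouped by reporting unit
Colls_by_RU <- dplyr::count(ale_long$clean_short, repunit, collection) %>%
  dplyr::filter(n > 0) %>%
  dplyr::select(-n)
# Count the collections in each reporting unit
PC <- rep(0, length(unique(Colls_by_RU$repunit)))
for (i in seq_len(nrow(Colls_by_RU))) {
  PC[Colls_by_RU$repunit[i]] <- PC[Colls_by_RU$repunit[i]] + 1
}
# Offsets (starting at 0) of each reporting unit within RU_vec
RU_starts <- c(0, cumsum(PC))
RU_vec <- as.integer(Colls_by_RU$collection)
param_list <- list_diploid_params(ale_ac, ale_alle_list, PO, coll_N, RU_vec, RU_starts)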