mlg: Create counts, vectors, and matrices of multilocus genotypes.

View source: R/mlg.r

mlg    R Documentation

Description

Create counts, vectors, and matrices of multilocus genotypes.

Usage

mlg(gid, quiet = FALSE)

mlg.table(
  gid,
  strata = NULL,
  sublist = "ALL",
  exclude = NULL,
  blacklist = NULL,
  mlgsub = NULL,
  bar = TRUE,
  plot = TRUE,
  total = FALSE,
  color = FALSE,
  background = FALSE,
  quiet = FALSE
)

mlg.vector(gid, reset = FALSE)

mlg.crosspop(
  gid,
  strata = NULL,
  sublist = "ALL",
  exclude = NULL,
  blacklist = NULL,
  mlgsub = NULL,
  indexreturn = FALSE,
  df = FALSE,
  quiet = FALSE
)

mlg.id(gid)

Arguments

gid

an adegenet::genind, genclone, adegenet::genlight, or snpclone object.

quiet

Logical. If FALSE, progress of functions will be printed to the screen.

strata

a formula specifying the strata at which computation is to be performed.

sublist

a vector of population names or indices that the user wishes to keep. Defaults to "ALL".

exclude

a vector of population names or indices that the user wishes to discard. Defaults to NULL.

blacklist

DEPRECATED, use exclude.

mlgsub

a vector of multilocus genotype indices with which to subset mlg.table and mlg.crosspop. NOTE: The resulting table from mlg.table will only contain populations with those MLGs.

bar

DEPRECATED, use plot. Retained for compatibility.

plot

logical. If TRUE, a bar graph for each population will be displayed showing the relative abundance of each MLG within the population.

total

logical. If TRUE, a row containing the sum of all represented MLGs is appended to the matrix produced by mlg.table.

color

an option to display a single bar chart for mlg.table, colored by population (note that this becomes faceted if 'background = TRUE').

background

an option to display the total number of MLGs across populations per facet in the background of the plot.

reset

logical. For genclone objects, the MLGs are defined by the input data and do not change when information is added or removed (e.g. when loci are dropped). Setting 'reset = TRUE' will recalculate MLGs. Default is 'FALSE', returning the MLGs defined in the @mlg slot.

indexreturn

logical. If TRUE, a vector will be returned to index the columns of mlg.table.

df

logical. If TRUE, return a data frame containing the counts of the MLGs and the populations they are in. Useful for making graphs with ggplot2.

Details

A multilocus genotype is a unique combination of alleles across all loci. For details of how these are calculated, see vignette("mlg", package = "poppr"). In short, for genind and genclone objects, they are calculated by using a rank function on strings of alleles, which is sensitive to missing data. For genlight and snpclone objects, they are calculated with distance methods via bitwise.dist and mlg.filter, which means that they are insensitive to missing data. Three different types of MLGs can be defined in poppr:

  • original: the default definition of multilocus genotypes as detailed above

  • contracted: multilocus genotypes collapsed into multilocus lineages (mll) by genetic distance via mlg.filter

  • custom: user-defined multilocus genotypes. These are useful for information such as mycelial compatibility groups

All of the functions documented here will work on any of the MLG types defined in poppr.
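The idea behind the "original" definition can be illustrated with a minimal base R sketch. This is not poppr's implementation: the toy matrix `geno` and the names `geno_strings`, `string_ranks`, `mlg_vec`, and `n_mlg` are hypothetical, and the sketch only assumes that each individual's alleles are pasted into one string and ranked. It also shows why this approach is sensitive to missing data: an individual with a missing locus produces a different string and therefore a different MLG.

```r
# Toy genotypes: rows are individuals, columns are loci (hypothetical data).
geno <- matrix(
  c("1/1", "2/2", "3/3",
    "1/1", "2/2", "3/3",
    "1/2", "2/2", "3/3",
    "1/1", "2/2", NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:3))
)

# Collapse each individual's alleles into a single string (NA becomes "NA").
geno_strings <- apply(geno, 1, paste, collapse = "|")

# Rank the strings; identical strings receive the same (tied) rank.
string_ranks <- rank(geno_strings, ties.method = "min")

# Relabel the tied ranks as consecutive integers to get an MLG vector.
mlg_vec <- match(string_ranks, sort(unique(string_ranks)))

# Number of distinct multilocus genotypes in the toy data.
n_mlg <- length(unique(mlg_vec))
```

Here ind1 and ind2 share an MLG, while ind4 is assigned its own MLG solely because of the missing call at loc3.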

Value

mlg

an integer describing the number of multilocus genotypes observed.

mlg.table

a matrix with columns indicating unique multilocus genotypes and rows indicating populations. This table can be used with the function diversity_stats to calculate the Shannon-Weaver index (H), Stoddart and Taylor's index (aka inverse Simpson's index; G), Simpson's index (lambda), and evenness (E5).

mlg.vector

a numeric vector naming the multilocus genotype of each individual in the dataset.

mlg.crosspop

  • default: a list where each element contains a named integer vector representing the number of individuals represented from each population in that MLG

  • indexreturn = TRUE: a vector of integers defining the multilocus genotypes that have individuals crossing populations

  • df = TRUE: a long-form data frame with the columns MLG, Population, and Count. Useful for graphing with ggplot2

mlg.id

a list of multilocus genotypes with the associated individual names per MLG.
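To make the mlg.crosspop return values more concrete, here is a minimal base R sketch of the same bookkeeping on toy data. It does not call poppr and is not how mlg.crosspop is implemented; the vectors `mlg_vec` and `pops` and the names `counts`, `cross_idx`, and `long` are hypothetical stand-ins for an MLG vector, a population vector, and the index and data frame outputs described above.

```r
# Hypothetical MLG assignments and populations for eight individuals.
mlg_vec <- c(1, 1, 2, 3, 3, 3, 4, 2)
pops    <- c("A", "A", "A", "A", "B", "B", "B", "B")

# Count individuals per population and MLG (rows: populations, columns: MLGs).
counts <- table(pops, mlg_vec)

# An MLG crosses populations when it is present in more than one population.
crossing  <- colSums(counts > 0) > 1
cross_idx <- as.integer(colnames(counts)[crossing])

# Long-form summary restricted to crossing MLGs, with zero counts removed.
long <- as.data.frame(counts, responseName = "Count", stringsAsFactors = FALSE)
names(long) <- c("Population", "MLG", "Count")
long <- long[long$Count > 0 & long$MLG %in% cross_idx, ]
```

In this toy data, MLGs 2 and 3 occur in both populations, so `cross_idx` plays the role of the `indexreturn = TRUE` output and `long` resembles the `df = TRUE` output.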

Note

The resulting matrix of 'mlg.table' can be used for analysis with the vegan package.
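As a rough illustration of what can be computed from a matrix of this shape, the following base R sketch applies the usual textbook definitions of the indices named under mlg.table to a toy count table. It is an assumption that these formulas match the output of diversity_stats or vegan exactly; the matrix `tab`, the helper `diversity_sketch`, and the result `stats` are hypothetical names introduced only for this example.

```r
# Hypothetical MLG count table: rows are populations, columns are MLGs.
tab <- matrix(
  c(5, 3, 2, 0,
    1, 1, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("popA", "popB"), paste0("MLG.", 1:4))
)

# Textbook definitions, computed for one population (one row of counts).
diversity_sketch <- function(counts) {
  p <- counts[counts > 0] / sum(counts)  # MLG frequencies, zeros dropped
  H <- -sum(p * log(p))                  # Shannon-Weaver index
  G <- 1 / sum(p^2)                      # Stoddart and Taylor's index
  lambda <- 1 - sum(p^2)                 # Simpson's index
  E5 <- (G - 1) / (exp(H) - 1)           # evenness
  c(H = H, G = G, lambda = lambda, E5 = E5)
}

# One row of indices per population.
stats <- t(apply(tab, 1, diversity_sketch))
```

For popB, where all four MLGs are equally frequent, G equals the number of MLGs and evenness equals 1, which is a quick sanity check on the formulas.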

mlg.vector will recalculate the MLG vector for adegenet::genind objects and will return the contents of the mlg slot in genclone objects. This means that MLGs will be different for subsetted adegenet::genind objects.

Author(s)

Zhian N. Kamvar

See Also

vegan::diversity() diversity_stats popsub mll mlg.filter mll.custom

Examples


# Load the data set
data(Aeut)

# Investigate the number of multilocus genotypes.
amlg <- mlg(Aeut)
amlg # 119

# show the multilocus genotype vector 
avec <- mlg.vector(Aeut)
avec 

# Get a table
atab <- mlg.table(Aeut, color = TRUE)
atab

# See where multilocus genotypes cross populations
acrs <- mlg.crosspop(Aeut) # MLG.59: (2 inds) Athena Mt. Vernon

# See which individuals belong to each MLG
aid <- mlg.id(Aeut)
aid["59"] # individuals 159 and 57

## Not run: 

# For the mlg.table, you can also choose to display the number of MLGs across
# populations in the background

mlg.table(Aeut, background = TRUE)
mlg.table(Aeut, background = TRUE, color = TRUE)

# A simple example. 10 individuals, 5 genotypes.
mat1 <- matrix(ncol=5, 25:1)
mat1 <- rbind(mat1, mat1)
mat <- matrix(nrow=10, ncol=5, paste(mat1,mat1,sep="/"))
mat.gid <- df2genind(mat, sep="/")
mlg(mat.gid)
mlg.vector(mat.gid)
mlg.table(mat.gid)

# Now for a more complicated example.
# Data set of 1903 samples of the H3N2 flu virus genotyped at 125 SNP loci.
data(H3N2)
mlg(H3N2, quiet = FALSE)

H.vec <- mlg.vector(H3N2)

# Changing the population vector to indicate the country of each sample.
pop(H3N2) <- other(H3N2)$x$country
H.tab <- mlg.table(H3N2, plot = FALSE, total = TRUE)

# Show which genotypes exist across populations in the entire dataset.
res <- mlg.crosspop(H3N2, quiet = FALSE)

# Let's say we want to visualize the multilocus genotype distribution for the
# USA and Russia
mlg.table(H3N2, sublist = c("USA", "Russia"), plot = TRUE)

# An exercise in subsetting the output of mlg.table and mlg.vector.
# First, get the indices of each MLG duplicated across populations.
inds <- mlg.crosspop(H3N2, quiet = FALSE, indexreturn = TRUE)

# Since the table from mlg.table has one column per MLG, we can subset it
# using just the column indices.
H.sub <- H.tab[, inds]

# We can also do the same by using the mlgsub argument.
H.sub <- mlg.table(H3N2, mlgsub = inds)

# We can subset the original data set using the output of mlg.vector to
# analyze only the MLGs that are duplicated across populations. 
new.H <- H3N2[H.vec %in% inds, ]


## End(Not run)

poppr documentation built on May 29, 2024, 5:54 a.m.