knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "readme-figs/"
)

The goal of CohoBroodstock is to put a bunch of useful functions into one place to expedite Libby's coho broodstock management work.

Installing

If you don't already have the remotes package, then do:

install.packages("remotes")

You also have to install the related package from R-Forge before you can install CohoBroodstock. Building related requires compiling Fortran code with gfortran.

On Mac OSX, as of 2020-05-26 the installation procedure for CohoBroodstock goes like this:

Download and install the clang-7.0.0.pkg from https://cran.r-project.org/bin/macosx/tools/.
Download and install the gfortran-6.1.pkg from https://cran.r-project.org/bin/macosx/tools/.
The gfortran installer does not put the gfortran executable on your PATH. So, once gfortran is installed, you have to add it yourself. One easy way to do that, if /usr/local/bin is already in your PATH, is to create a symbolic link:

ln -s /usr/local/gfortran/bin/gfortran /usr/local/bin
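Before moving on, it may be worth confirming that R can actually find gfortran. The following is a minimal sketch using base R's `Sys.which()`; it only reports whether an executable named gfortran is on the PATH that R sees, and does not check the compiler version.

```r
# Look up gfortran on the PATH that this R session sees.
# Sys.which() returns an empty string when the executable is not found.
gfortran_path <- Sys.which("gfortran")

if (nzchar(gfortran_path)) {
  message("gfortran found at: ", gfortran_path)
} else {
  message("gfortran not found on PATH; revisit the symbolic link step.")
}
```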

Install the related package like this:

install.packages("related", repos="http://R-Forge.R-project.org")

Finally, install CohoBroodstock from GitHub:

remotes::install_github("eriqande/CohoBroodstock")

Preparing Spawning matrices

This is currently set up to use the related package to compute the pairwise relatedness values (Rxy).
Input is a typical two-column format:

The first column holds the individual IDs.
Every two columns after that make up one locus (one column per allele copy).
The file must have a row of column headers.
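As an illustration of that layout, here is a small sketch that builds a toy two-column genotype table and writes it to a temporary file. Every ID, locus name, and allele value is made up, and the tab delimiter and header naming are assumptions; compare against your own files (or the example files shipped in extdata) before relying on them.

```r
# Toy genotypes: 3 individuals, 2 loci, two allele columns per locus.
# All names and values are hypothetical and only illustrate the layout.
toy_geno <- data.frame(
  ID       = c("F_01", "M_01", "M_02"),
  Locus1_a = c(101, 101, 103),
  Locus1_b = c(103, 105, 103),
  Locus2_a = c(210, 212, 210),
  Locus2_b = c(212, 212, 214)
)

# One ID column plus two columns per locus
n_loci <- (ncol(toy_geno) - 1) / 2

# Write with a header row; tab-delimited output is an assumption
toy_path <- tempfile(fileext = ".txt")
write.table(toy_geno, toy_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Show the header row and the first individual
readLines(toy_path, n = 2)
```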

The steps are:

First, make sure to load the package (and you might as well load the tidyverse too):

library(tidyverse)
library(CohoBroodstock)

Next, read in the file and compute Rxy with computeRxy():

# get path to the example genotype file
# (typically you would pass it the path to your own file)
geno_file <- system.file("extdata", "WSH_W1718_v5_two_column_data.txt.gz", package = "CohoBroodstock")

# compute rxy. This returns a tibble
rxy <- computeRxy(geno_file)

Finally, prepare the spawning matrix from the output of the last command using spawning_matrix(), like this:

spawning_matrix(Rxy_tidy = rxy)

This creates, by default, two files named spawn_matrix.csv and spawn_matrix_full.csv in the current working directory.
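To check that the files were written, something like the following sketch can be run from the same working directory. It uses only base R; the file names are the defaults named above, and nothing is assumed about the column layout inside the files.

```r
# Default output file names, as described above
out_files <- c("spawn_matrix.csv", "spawn_matrix_full.csv")

# Which of them exist in the current working directory?
found <- file.exists(out_files)
names(found) <- out_files
print(found)

# If the main matrix is there, read it back and report its dimensions
if (found[["spawn_matrix.csv"]]) {
  spawn_mat <- read.csv("spawn_matrix.csv", check.names = FALSE)
  print(dim(spawn_mat))
}
```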

Actual vs Optimal vs Random Relatedness

This section compares the relatedness of the actual spawn pairs against optimal and random pairings. We will walk through it starting from the genotype data.

geno_path <- system.file("extdata/IGH_W1819_geno_aor.txt", package = "CohoBroodstock")

rxys <- computeRxy(geno_path)

Here is what the first few rows look like:

rxys[1:10, ]

Then we need to read in the actual spawn pairs. The file should have two columns, the first named Female and the second named Male:

pairs_file <- system.file("extdata/IGH--W1819--actual_spawn_pairs.csv", package = "CohoBroodstock")
actual_pairs <- read_csv(pairs_file)

The first few rows look like this:

actual_pairs[1:10, ]

Now, in order to use the function aor_pairs(), we need to reformat the rxys tibble a little. We keep only the individuals whose IDs start with "F" or "M", keep only the comparisons between a male and a female, and name the columns "Female", "Male", and "rxy". The clean_computeRxy_output() function does all of that.

rxy_clean <- clean_computeRxy_output(rxys)

The result looks like this:

rxy_clean[1:10,]
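To make the cleaning step concrete, here is a rough dplyr sketch of the same kind of filtering on a toy table. It is not the implementation of clean_computeRxy_output(); the input column names ind1 and ind2 and all of the IDs and values are hypothetical.

```r
library(dplyr)

# Toy pairwise relatedness table (hypothetical column names and values)
toy <- tibble(
  ind1 = c("F_01", "F_01", "M_01", "F_02", "X_01"),
  ind2 = c("M_01", "F_02", "F_02", "M_02", "F_01"),
  rxy  = c(0.10, 0.25, -0.05, 0.00, 0.30)
)

toy_clean <- toy %>%
  # keep only individuals whose IDs start with "F" or "M"
  filter(grepl("^[FM]", ind1), grepl("^[FM]", ind2)) %>%
  # keep only female-male comparisons (first letters differ)
  filter(substr(ind1, 1, 1) != substr(ind2, 1, 1)) %>%
  # put the female in the first column, whichever side she started on
  transmute(
    Female = if_else(startsWith(ind1, "F"), ind1, ind2),
    Male   = if_else(startsWith(ind1, "M"), ind1, ind2),
    rxy    = rxy
  )

toy_clean
```

The same-sex pair and the pair involving the "X_01" ID are dropped, and the pair that was listed male-first is flipped so the female ID lands in the Female column.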

Before we feed these values into aor_pairs(), we have to remove Female F_7FN from actual_pairs because it appears to be named incorrectly (I think so, at least: there is an F_07FN in the rxys file, and aor_pairs() throws an error about the mismatch).

actual_pairs_corrected <- actual_pairs %>%
  filter(Female != "F_7FN")
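One way to spot this kind of ID mismatch ahead of time is to compare the IDs in the two tables directly. The sketch below does that with base R's `setdiff()` on toy data; the toy tables are made up, and only the Female, Male, and rxy column names follow the text above.

```r
# Toy stand-ins for actual_pairs and rxy_clean (values are hypothetical)
toy_pairs <- data.frame(
  Female = c("F_01", "F_7FN"),
  Male   = c("M_01", "M_02")
)
toy_rxy <- data.frame(
  Female = c("F_01", "F_07FN"),
  Male   = c("M_01", "M_02"),
  rxy    = c(0.10, 0.00)
)

# IDs that appear in the spawn pairs but have no rxy values
missing_females <- setdiff(toy_pairs$Female, toy_rxy$Female)
missing_males   <- setdiff(toy_pairs$Male, toy_rxy$Male)

missing_females
missing_males
```

With the real data, the same two `setdiff()` calls on actual_pairs and rxy_clean should list any IDs that need fixing or removing.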

Then we pass the actual pairs and the full set of female-male rxy values to the aor_pairs() function.

set.seed(10)  # set a random number seed for reproducibility
AOR <- aor_pairs(actual_pairs_corrected, rxy_clean)

# have a look at it:
AOR

Then plot those values in a histogram:

cols <- c(Actual = "gold", Optimal = "limegreen", Random = "steelblue1")
ggplot(AOR, aes(x =  rxy, fill = `Spawn Pairs`)) +
  geom_histogram(position = "dodge", alpha = 0.75, binwidth = 0.03, color = "black", size = 0.2) +
  scale_fill_manual(values = cols)
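Alongside the histogram, a numeric summary per group can be handy. The sketch below assumes only the two columns that the plotting code uses, rxy and `Spawn Pairs`, and runs on toy values; any other columns that aor_pairs() may return are not assumed here.

```r
library(dplyr)

# Toy stand-in for AOR (values are made up)
toy_aor <- tibble(
  `Spawn Pairs` = rep(c("Actual", "Optimal", "Random"), each = 3),
  rxy = c(0.05, 0.10, 0.00, -0.05, 0.00, -0.10, 0.20, 0.10, 0.15)
)

# Mean and median relatedness, and number of pairs, for each group
aor_summary <- toy_aor %>%
  group_by(`Spawn Pairs`) %>%
  summarise(
    mean_rxy   = mean(rxy),
    median_rxy = median(rxy),
    n_pairs    = n()
  )

aor_summary
```

Replacing toy_aor with AOR should give the same kind of table for the real spawn pairs.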