merge_dart: Merge DArT files

merge_dart R Documentation

Merge DArT files

Description

This function merges 2 DArT files (filtered or not).

Usage

merge_dart(
  dart1,
  strata1,
  dart2,
  strata2,
  filter.monomorphic = TRUE,
  filter.common.markers = TRUE,
  filename = NULL,
  remove.non.immortalized.dart.markers = TRUE,
  parallel.core = parallel::detectCores() - 1,
  ...
)

Arguments

dart1

Full path of the first DArT file.

strata1

Full path of the first strata file for dart1.

dart2

Full path of the second DArT file.

strata2

Full path of the second strata file for dart2.

filter.monomorphic

(optional, logical) Default: filter.monomorphic = TRUE.

filter.common.markers

(optional, logical) Default: filter.common.markers = TRUE.

filename

Name of the merged DArT file. By default, the function names the merged data merge_dart, with the date and time appended. The date and time are also appended to any filename provided. The data is written in the working directory. Default: filename = NULL.

remove.non.immortalized.dart.markers

(logical) By default, the function removes markers starting with 1000, which DArT calls non-immortalized markers. Default: remove.non.immortalized.dart.markers = TRUE.

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.

...

(optional) To pass further arguments for fine-tuning the function.

Details

When found in the data, the function averages the following columns across markers: CALL_RATE, REP_AVG, AVG_COUNT_REF and AVG_COUNT_SNP. For DArT, these columns represent:

  • CALL_RATE: the proportion of samples for which the genotype was called.

  • REP_AVG: the proportion of technical replicate assay pairs for which the marker score is consistent.

  • AVG_COUNT_REF and AVG_COUNT_SNP: the mean coverage for the reference and alternate alleles, respectively.
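One plausible reading of this averaging step is that, when the same marker appears in both files, its metadata values are averaged. The sketch below illustrates that reading in base R only. The data frames, the MARKERS column name and the marker IDs are invented for illustration, and this is not radiator's actual implementation.

```r
# Toy marker metadata for two DArT files (all values are made up).
dart1.meta <- data.frame(
  MARKERS = c("m1", "m2", "m3"),
  CALL_RATE = c(0.90, 0.80, 1.00),
  REP_AVG = c(0.98, 0.96, 1.00)
)
dart2.meta <- data.frame(
  MARKERS = c("m2", "m3", "m4"),
  CALL_RATE = c(0.70, 0.90, 0.95),
  REP_AVG = c(1.00, 0.98, 0.97)
)

# Stack both tables, then take the mean of each statistic per marker.
# Markers found in a single file keep their original values.
stacked <- rbind(dart1.meta, dart2.meta)
averaged <- aggregate(
  cbind(CALL_RATE, REP_AVG) ~ MARKERS,
  data = stacked,
  FUN = mean
)
print(averaged)
```

Markers m2 and m3 are present in both toy tables, so their statistics are averaged, while m1 and m4 are carried over unchanged.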

By default, the function removes markers starting with 1000, which are not immortalized by DArT (see remove.non.immortalized.dart.markers).
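The prefix-based removal can be sketched in base R as follows. The MARKERS column name and the ID layout are assumptions made for this example; the column that radiator actually inspects may differ.

```r
# Toy marker table; the MARKERS column and ID format are hypothetical.
markers <- data.frame(
  MARKERS = c("100012345-12-A/G", "24567890-5-C/T",
              "100098765-40-G/A", "31000222-7-T/C"),
  CALL_RATE = c(0.91, 0.98, 0.85, 0.99),
  stringsAsFactors = FALSE
)

# Keep only markers whose ID does NOT start with "1000".
# A "1000" elsewhere in the ID (e.g. "31000222...") is not affected.
keep <- !startsWith(markers$MARKERS, "1000")
immortalized <- markers[keep, , drop = FALSE]
print(immortalized$MARKERS)
```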

When the argument filter.common.markers is kept at TRUE, the function produces an UpSet plot to visualize the number of markers common or not between populations. The plot is not saved automatically; this has to be done manually by the user.
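The counts behind such a plot, and one manual way of saving a figure, can be sketched in base R. The marker vectors and the output file name are invented, and a simple bar plot stands in for the UpSet plot, since how the function exposes its plot object is not documented here.

```r
# Hypothetical marker IDs genotyped in two populations.
pop1 <- c("m1", "m2", "m3", "m5")
pop2 <- c("m2", "m3", "m4")

# Number of markers shared by both populations or private to one.
counts <- c(
  common = length(intersect(pop1, pop2)),
  only.pop1 = length(setdiff(pop1, pop2)),
  only.pop2 = length(setdiff(pop2, pop1))
)
print(counts)

# Saving a figure manually: open a device, draw, close the device.
plot.file <- file.path(tempdir(), "marker_overlap.pdf")
grDevices::pdf(plot.file)
graphics::barplot(counts, ylab = "Number of markers")
invisible(grDevices::dev.off())
```

The same open, draw, close pattern applies to any plot drawn on the current graphics device.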

Value

The function returns a list in the global environment and writes 2 data frames to the working directory: the tidy dataset and the strata file of the 2 merged DArT files.
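The files written to the working directory carry a date and time stamp, as described under filename. A minimal sketch of such a naming scheme in base R follows. The helper name stamp_filename, the stamp format and the .tsv extension are all assumptions for illustration, not radiator's actual behaviour.

```r
# Hypothetical helper: append a date-time stamp to a base filename.
# Falls back to "merge_dart" when no filename is supplied.
stamp_filename <- function(filename = NULL, ext = ".tsv") {
  if (is.null(filename)) filename <- "merge_dart"
  stamp <- format(Sys.time(), "%Y%m%d@%H%M")  # assumed stamp format
  paste0(filename, "_", stamp, ext)
}

default.name <- stamp_filename()
custom.name <- stamp_filename("bluefin_merged")
print(c(default.name, custom.name))
```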

Author(s)

Thierry Gosselin thierrygosselin@icloud.com

Examples

## Not run: 
# The simplest way to run the function:
merged.data <- radiator::merge_dart(
  dart1 = "bluefin_larvae.tsv", strata1 = "strata1_bft_larvae.tsv",
  dart2 = "bluefin_adults.csv", strata2 = "strata2_bft_adults.tsv"
)

## End(Not run)
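The default parallel.core = parallel::detectCores() - 1 deserves some care: parallel::detectCores() can return NA when the core count cannot be determined, and subtracting 1 on a single-core machine gives 0. A defensive way to compute a value before passing it to parallel.core might look like this; the variable names are illustrative only.

```r
# Detect the number of cores; fall back to 1 if it is unknown (NA).
n.cores <- parallel::detectCores()
if (is.na(n.cores)) n.cores <- 1L

# Leave one core free when possible, but never go below 1.
safe.cores <- max(1L, n.cores - 1L)
print(safe.cores)
```

The resulting safe.cores can then be supplied as parallel.core = safe.cores in the call to merge_dart.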

thierrygosselin/radiator documentation built on May 5, 2024, 5:12 a.m.