calc_candidate_regions: Determine candidate regions of selection



Description

Determine candidate regions of selection.

Usage

calc_candidate_regions(
  scan,
  threshold = NA,
  pval = FALSE,
  ignore_sign = FALSE,
  window_size = 1e+06,
  overlap = 0,
  right = TRUE,
  min_n_mrk = 1,
  min_n_extr_mrk = 1,
  min_perc_extr_mrk = 0,
  join_neighbors = TRUE
)

Arguments

scan

a data frame containing scores (output of ihh2ihs, ines2rsb or ies2xpehh).

threshold

boundary score above which markers are defined as "extreme".

pval

logical. If TRUE use the (negative log-) p-value instead of the score.

ignore_sign

logical. If TRUE, take absolute values of score (default FALSE).

window_size

size of sliding windows. If set to 1, no windows are constructed and only the individual extremal markers are reported.

overlap

size of window overlap (default 0, i.e. no overlap).

right

logical, indicating if the windows should be closed on the right (and open on the left) or vice versa.

min_n_mrk

minimum number of markers per window.

min_n_extr_mrk

minimum number of markers with extreme value in a window.

min_perc_extr_mrk

minimum percentage of extremal markers among all markers.

join_neighbors

logical. If TRUE (default), merge neighboring windows with extreme values.
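The effect of the right argument on window boundaries can be illustrated with base R's cut(). This is only a sketch of the interval convention, not the code used inside the function: the marker positions are made up, and placing the first window boundary at 0 is an assumption.

```r
# Hypothetical marker positions (bp) on one chromosome; not taken from rehh.
pos <- c(1, 500000, 1000000, 1000001, 2000000)
window_size <- 1e6

# Assumed window boundaries starting at 0.
breaks <- seq(0, 2e6, by = window_size)

# right = TRUE: windows are (0, 1e6] and (1e6, 2e6],
# so a marker at exactly 1e6 belongs to the first window.
w_right <- cut(pos, breaks = breaks, right = TRUE, labels = FALSE)

# right = FALSE: windows are [0, 1e6) and [1e6, 2e6),
# so the same marker moves to the second window, and a marker at
# exactly 2e6 falls outside the last window (NA).
w_left <- cut(pos, breaks = breaks, right = FALSE, labels = FALSE)

print(data.frame(pos, w_right, w_left))
```

Only markers lying exactly on a boundary change windows between the two settings.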

Details

There is no generally agreed method for determining genomic regions which might have been under recent selection. Since selection tends to yield clusters of markers with outlier values, a common approach is to search for regions with an elevated number or fraction of outlier or extremal markers. This function allows setting three conditions a window must fulfill in order to qualify as a candidate region:

min_n_mrk: a minimum number of (any) markers.

min_n_extr_mrk: a minimum number of markers with extreme values.

min_perc_extr_mrk: a minimum percentage of extremal markers among all markers.

"Extreme" markers are defined by having a score above the specified threshold.
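The three window conditions can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the data frame and its column names (CHR, POSITION, SCORE) are hypothetical, windows are non-overlapping and right-closed, neighboring windows are not merged, and the percentage is assumed to be on a 0 to 100 scale.

```r
# Synthetic scan on one chromosome; column names are hypothetical.
scan <- data.frame(
  CHR = "1",
  POSITION = c(1e5, 2e5, 3e5, 1.2e6, 1.5e6, 2.1e6),
  SCORE = c(0.5, 4.2, 5.0, 1.0, 0.3, 4.5)
)

threshold <- 4
window_size <- 1e6
min_n_mrk <- 2
min_n_extr_mrk <- 1
min_perc_extr_mrk <- 50  # assumed to be a percentage (0 to 100)

# Assign each marker to a non-overlapping, right-closed window.
win <- ceiling(scan$POSITION / window_size)

# A marker is "extreme" if its score lies above the threshold.
extreme <- scan$SCORE > threshold

# Per-window counts and percentage of extreme markers.
n_mrk <- tapply(extreme, win, length)
n_extr <- tapply(extreme, win, sum)
perc_extr <- 100 * n_extr / n_mrk

# A window qualifies only if all three conditions hold.
keep <- as.vector(n_mrk >= min_n_mrk &
                  n_extr >= min_n_extr_mrk &
                  perc_extr >= min_perc_extr_mrk)

win_id <- as.numeric(names(n_mrk))
candidates <- data.frame(
  START = (win_id - 1) * window_size,
  END = win_id * window_size,
  N_MRK = as.vector(n_mrk),
  N_EXTR_MRK = as.vector(n_extr),
  PERC_EXTR_MRK = as.vector(perc_extr)
)[keep, ]

print(candidates)
```

In this toy data the second window has no extreme marker and the third has only one marker in total, so only the first window passes all three conditions.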

Value

A data frame with chromosomal regions, i.e. windows that fulfill the necessary conditions to qualify as candidate regions under selection. For each region, the following are reported: the overall number of markers, the mean and maximum of their scores, the number of markers with extremal values, the percentage of extremal markers among all markers, and the average score of the extremal markers.

See Also

calc_region_stats


rehh documentation built on Sept. 15, 2021, 5:06 p.m.