variable_dist_per_site: Calculate the distribution of a variable per site


Description

Calculates the distribution of values of a categorical variable per site from a table that contains one row per site per sample.
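To make the description concrete, the following is a minimal base-R sketch of tallying a categorical variable per site. It is not the package's implementation: the toy data, the "status" column and the object names are made up for illustration, and the use of counts (rather than proportions) is an assumption, since this page does not say which the function returns.

```r
# Toy input with one row per site per sample (all values are hypothetical)
toy <- data.frame(
  site_id = c("s1", "s1", "s1", "s2", "s2", "s2"),
  ref_id  = rep("contigA", 6),
  ref_pos = c(10, 10, 10, 25, 25, 25),
  sample  = c("a", "b", "c", "a", "b", "c"),
  status  = c("major", "minor", "major", "minor", "minor", "minor"),
  stringsAsFactors = FALSE
)

# Count, for every site, how many samples fall in each level of the variable
counts <- as.data.frame.matrix(table(toy$site_id, toy$status))
counts$site_id <- rownames(counts)
rownames(counts) <- NULL

# Re-attach the site coordinates so each site keeps ref_id and ref_pos
sites <- unique(toy[, c("site_id", "ref_id", "ref_pos")])
res <- merge(sites, counts, by = "site_id")
res
```

The result has one row per site, the three site columns, and one column per level of the categorical variable, which mirrors the shape described under Value.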

Usage

variable_dist_per_site(dat, variable, group = NULL)

Arguments

dat

A data frame or tibble containing columns "site_id", "ref_id" and "ref_pos". Each row must correspond to a site per sample.

variable

Name of the column in dat to evaluate. The column must hold a categorical variable.

group

Optional. If supplied, it must be the name of a column in dat that holds a grouping factor; the distribution is then calculated independently for each group.
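As a rough illustration of what "independently for each group" could mean, here is a base-R sketch that tallies the variable separately within each group. The data, the "status" and "Group" column names and the helper count_levels() are hypothetical, and counts are assumed; this does not reproduce the package's code.

```r
# Toy input: site, grouping factor, and a categorical variable (made-up values)
toy <- data.frame(
  site_id = c("s1", "s1", "s1", "s1", "s2", "s2"),
  Group   = c("g1", "g1", "g2", "g2", "g1", "g2"),
  status  = c("major", "minor", "minor", "minor", "major", "major"),
  stringsAsFactors = FALSE
)

# Fix the set of levels up front so every group yields the same columns
lvls <- sort(unique(toy$status))

# Hypothetical helper: tally levels per site within a single group
count_levels <- function(d) {
  tab <- table(d$site_id, factor(d$status, levels = lvls))
  out <- as.data.frame.matrix(tab)
  out <- cbind(site_id = rownames(out), Group = d$Group[1], out,
               stringsAsFactors = FALSE)
  rownames(out) <- NULL
  out
}

# Apply the tally to each group separately and stack the results
by_group <- do.call(rbind, lapply(split(toy, toy$Group), count_levels))
rownames(by_group) <- NULL
by_group
```

Each site now appears once per group, with the group recorded in its own column next to the per-level columns.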

Value

A tibble with columns "site_id", "ref_id", and "ref_pos", plus one column per level of 'variable' and, if 'group' was supplied, one column for 'group'.

Examples

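library(magrittr)  # provides the %>% pipe

# Read the sample map shipped with the package and rename ID to sample
map <- readr::read_tsv(system.file("toy_example/map.txt",
                                   package = "HMVAR"),
                       col_types = readr::cols(ID = readr::col_character(),
                                               Group = readr::col_character())) %>%
  dplyr::select(sample = ID,
                tidyselect::everything())

# Load the toy MIDAS SNP data for the samples in the map
Dat <- read_midas_data(midas_dir = system.file("toy_example/merged.snps/",
                                               package = "HMVAR"),
                       map = map,
                       cds_only = FALSE)

# Build a table with one row per site per sample, the layout that
# variable_dist_per_site() expects as its dat argument
dat <- match_freq_and_depth(freq = Dat$freq,
                            depth = Dat$depth,
                            info = Dat$info,
                            map = map) %>%
  determine_sample_dist()
dat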

surh/HMVAR documentation built on Aug. 18, 2021, 1:21 a.m.