determine_snp_dist: SNP distribution between sites


View source: R/mktest.r

Description

Determines how SNPs are distributed between sites. Requires the output of midas_merge.py and a mapping file that assigns samples to sites.

Usage

determine_snp_dist(
  info,
  freq,
  depth,
  map,
  depth_thres = 1,
  freq_thres = 0.5,
  clean = TRUE
)

Arguments

info

A data table corresponding to the 'snps_info.txt' file from MIDAS. Must have columns 'site_id' and 'sample'.

freq

A data table corresponding to the 'snps_freq.txt' file from MIDAS. Must have a 'site_id' column, and one more column per sample. Each value is the frequency of the minor allele for the corresponding site (row) in the corresponding sample (column).

depth

A data table corresponding to the 'snps_depth.txt' file from MIDAS. Must have a 'site_id' column, and one more column per sample. Each value is the sequencing depth for the corresponding site (row) in the corresponding sample (column).

map

A data table associating samples with groups (sites). Must have columns 'sample' and 'Group'.

depth_thres

Minimum number of reads (depth) at a site in a sample for that observation to be considered.

freq_thres

Frequency cutoff for minor vs. major allele. The value is the distance from 0 or 1 within which a site is assigned to the major or minor allele, respectively. It must be a value in [0,1].

clean

Whether to remove sites that had no valid distribution.
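
To make depth_thres and freq_thres concrete, the sketch below classifies a few made-up (site, sample) observations. It only illustrates the argument descriptions above and is not the package's implementation: the helper call_allele is hypothetical, and whether each comparison is strict or inclusive at the boundary is an assumption.

```r
# Hypothetical helper, not part of HMVAR: assign an allele call from a
# minor-allele frequency and a read depth, following the argument descriptions.
call_allele <- function(freq, depth, depth_thres = 1, freq_thres = 0.5) {
  stopifnot(freq_thres >= 0, freq_thres <= 1)
  call <- rep(NA_character_, length(freq))
  covered <- depth >= depth_thres                    # assumption: inclusive cutoff
  call[covered & freq < freq_thres] <- "major"       # within freq_thres of 0
  call[covered & freq > 1 - freq_thres] <- "minor"   # within freq_thres of 1
  call
}

# Made-up observations: minor-allele frequency and depth per (site, sample)
obs <- data.frame(freq  = c(0.02, 0.97, 0.50, 0.10),
                  depth = c(12, 8, 5, 0))
obs$allele <- call_allele(obs$freq, obs$depth, depth_thres = 1, freq_thres = 0.2)
obs
```

In this toy input the third observation stays unassigned because its frequency is too far from both 0 and 1, and the fourth because it has no reads.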

Details

Only samples present in the map and in both the depth and freq tables are considered. All other samples are removed (equivalent to an inner join).
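
The sample filtering described above can be sketched in base R as follows. The sample names, group names and values are invented for illustration, and the function itself may perform the join differently (for example with dplyr on reshaped tables).

```r
# Toy inputs; all names and values are made up for illustration.
map <- data.frame(sample = c("s1", "s2", "s4"),
                  Group  = c("gut", "oral", "gut"))
freq <- data.frame(site_id = c("1", "2"),
                   s1 = c(0.0, 1.0), s2 = c(0.9, 0.1), s3 = c(0.5, 0.5))
depth <- data.frame(site_id = c("1", "2"),
                    s1 = c(10, 4), s2 = c(7, 0), s3 = c(3, 3))

# Samples present in the map AND in both MIDAS tables
shared <- Reduce(intersect, list(map$sample,
                                 setdiff(names(freq), "site_id"),
                                 setdiff(names(depth), "site_id")))

# Drop everything else: s3 has no map entry, s4 has no MIDAS data
map   <- map[map$sample %in% shared, , drop = FALSE]
freq  <- freq[, c("site_id", shared), drop = FALSE]
depth <- depth[, c("site_id", shared), drop = FALSE]
shared
```

A practical consequence is that a typo in a sample name in the mapping file silently drops that sample, so it can be worth comparing the map against the column names of the MIDAS tables before running the analysis.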

Value

A data table that is the same as info but with an added 'distribution' column indicating the allele distribution between sites in the given samples.
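
As a rough illustration of what a per-site 'distribution' value can encode, the sketch below summarises one site's allele calls across two groups. The helper site_distribution and the labels it returns are hypothetical placeholders chosen for this example; the actual values stored in the 'distribution' column should be checked against the package source.

```r
# Hypothetical helper, not HMVAR code: summarise one site's allele calls
# across groups. The returned labels are illustrative placeholders.
site_distribution <- function(allele, group) {
  keep <- !is.na(allele)                 # ignore samples without a call
  allele <- allele[keep]
  group <- group[keep]
  if (length(unique(group)) < 2) return(NA_character_)  # no valid distribution
  per_group <- tapply(allele, group, function(a) length(unique(a)))
  if (any(per_group > 1)) return("polymorphic")   # some group carries both alleles
  if (length(unique(allele)) > 1) return("fixed") # groups carry different alleles
  "invariant"                                     # every sample has the same allele
}

groups <- c("A", "A", "B", "B")
site_distribution(c("major", "major", "minor", "minor"), groups)
site_distribution(c("major", "minor", "minor", "minor"), groups)
site_distribution(c("major", NA, NA, NA), groups)
```

Under these assumptions, the last call returns NA because only one group has a usable call, which is the kind of site that clean = TRUE is described as removing.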

Examples

library(HMVAR)
library(magrittr)  # provides the %>% pipe used below

# Get file paths
midas_dir <- system.file("toy_example/merged.snps/", package = "HMVAR")
map <- readr::read_tsv(system.file("toy_example/map.txt", package = "HMVAR"),
                       col_types = readr::cols(.default = readr::col_character())) %>%
  dplyr::select(sample = ID, Group)

# Read data
midas_data <- read_midas_data(midas_dir = midas_dir, map = map, cds_only = TRUE)

info <- determine_snp_effect(midas_data$info) %>%
  determine_snp_dist(freq = midas_data$freq,
                     depth = midas_data$depth, map = map,
                     depth_thres = 1, freq_thres = 0.5)
info

mktable <- info %>%
  split(.$gene_id) %>%
  purrr::map_dfr(mkvalues,
                 .id = "gene_id")
mktable

surh/HMVAR documentation built on Aug. 18, 2021, 1:21 a.m.