add_homopolymer_length_when_indels: Add homopolymer lengths to a master table

View source: R/create_master_table.R

add_homopolymer_length_when_indels    R Documentation

Add homopolymer lengths to a master table

Description

This function adds a column to the master table with the lengths of the homopolymers, taken from the reference FASTA, that overlap position POS + 1, where POS is the position of a variant. POS + 1 makes sense because minimap2 places homopolymer indels to the left of homopolymers and, for indels, the POS column of a VCF file gives the position immediately to the left of the indel. Consequently, the homopolymer lengths reported for SNPs are meaningless. The lengths reported for heterozygous-alternative variants should be treated as meaningless as well, because such variants may contain alleles that are SNPs.
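
The POS + 1 convention can be illustrated with a minimal base-R sketch. It does not use the package's code; the toy reference sequence, the deletion, and all variable names are assumptions made up for illustration. A one-base deletion inside a run of five T's is left-aligned, so the VCF record reads POS = 3, REF = "GT", ALT = "G": the base at POS is the anchor base outside the run, and the base at POS + 1 is the first base of the homopolymer.

```r
# Hypothetical toy reference and variant; not taken from the package.
ref <- "ACGTTTTTGC"
pos <- 3  # VCF POS of the left-aligned deletion GT -> G (anchor base is G)

# Run-length encode the reference to find runs of identical bases.
bases <- strsplit(ref, "")[[1]]
runs <- rle(bases)

# Expand back so that every reference position carries its run length.
run_len_per_base <- rep(runs$lengths, runs$lengths)

len_at_pos <- run_len_per_base[pos]             # anchor base, outside the run
len_at_pos_plus_1 <- run_len_per_base[pos + 1]  # first base of the T run

print(c(POS = len_at_pos, POS_plus_1 = len_at_pos_plus_1))
```

Looking up POS itself would report a run length of 1 here, while POS + 1 reports 5, which is the homopolymer the deletion actually falls in.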

Usage

add_homopolymer_length_when_indels(
  input_table,
  homopolymers,
  ouput_what = "length"
)

Arguments

input_table

A data.frame. The master table to which the new column is added.

homopolymers

A CompressedIRangesList object. It should store all homopolymers, with their nucleotide types and lengths, of the genome used as the reference to call the variants. It is generated by the function 'sarlacc::homopolymers'.

ouput_what

A string equal to "length" (default) or "nts". If "length", the lengths of the homopolymers are output (1 for non-homopolymers). If "nts", the nucleotide type is output (NA for non-homopolymers).

Value

A data.frame. The input master table with the homopolymer column added.
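
The two output modes described under ouput_what can be sketched in base R as follows. This is not the package's implementation: the helper lookup_homopolymer, its argument what, the toy reference, and the variant positions are all hypothetical, and the sketch assumes that a run of length 1 counts as a non-homopolymer. The real function takes a CompressedIRangesList from 'sarlacc::homopolymers' rather than a plain string.

```r
# Hypothetical toy reference; not taken from the package.
ref <- "ACGTTTTTGCAAAC"

# Run-length encode the reference and record where each run starts.
runs <- rle(strsplit(ref, "")[[1]])
run_start <- cumsum(runs$lengths) - runs$lengths + 1

# Hypothetical helper: report the run covering POS + 1 for each variant.
lookup_homopolymer <- function(pos, what = "length") {
  idx <- findInterval(pos + 1, run_start)  # index of the run containing POS + 1
  len <- runs$lengths[idx]
  if (what == "length") {
    len                                    # 1 for non-homopolymers
  } else {
    ifelse(len > 1, runs$values[idx], NA_character_)  # NA for non-homopolymers
  }
}

pos <- c(3, 1, 10)  # made-up variant positions
lengths_out <- lookup_homopolymer(pos, "length")
nts_out <- lookup_homopolymer(pos, "nts")

print(lengths_out)  # 5 1 3
print(nts_out)      # "T" NA "A"
```

The second position falls on a single C, so it yields 1 in "length" mode and NA in "nts" mode, matching the convention stated in the argument description.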


vladimirsouza/lrRNAseqBenchmark documentation built on March 23, 2023, 7:32 a.m.