correct_pos_idx_w_cigar: Adjust Position Indices in BAM Data with CIGAR Strings

View source: R/sequence_position_handling.R

correct_pos_idx_w_cigar    R Documentation

Adjust Position Indices in BAM Data with CIGAR Strings

Description

This function takes a data frame of BAM file data and adjusts its position indices according to each read's CIGAR string. A CIGAR string describes how a read aligns to the reference genome, including insertions, deletions, and padding. The function processes the CIGAR operations for insertions (I), deletions (D), hard clips (H), and matches/mismatches (M) and corrects the read positions accordingly, so that reads map accurately back to the reference sequence.
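To make the CIGAR operations concrete, here is a minimal base-R sketch of how a CIGAR string can be split into its operations. `parse_cigar` is a hypothetical helper written for illustration; it is not part of the dreams package and is not how `correct_pos_idx_w_cigar` is known to be implemented. Which operations consume read or reference bases follows the SAM format convention.

```r
# Hypothetical helper (not part of dreams): split a CIGAR string into
# one row per operation, e.g. "3M2I4M1D5M" -> 3M, 2I, 4M, 1D, 5M.
parse_cigar <- function(cigar) {
  # Each token is a run length followed by a single operation letter
  tokens <- regmatches(cigar, gregexpr("[0-9]+[MIDNSHP=X]", cigar))[[1]]
  data.frame(
    length = as.integer(sub("[MIDNSHP=X]$", "", tokens)),
    op     = sub("^[0-9]+", "", tokens),
    stringsAsFactors = FALSE
  )
}

ops <- parse_cigar("3M2I4M1D5M")

# Operations that advance along the reference vs. along the read (SAM convention)
ref_len  <- sum(ops$length[ops$op %in% c("M", "D", "N", "=", "X")])
read_len <- sum(ops$length[ops$op %in% c("M", "I", "S", "=", "X")])
```

In this example the deletion adds one reference base that the read does not cover, and the insertion adds two read bases that have no reference position, which is why the read indices need correcting.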

Usage

correct_pos_idx_w_cigar(df)

Arguments

df

A data frame containing BAM file data. It must include a column of CIGAR strings and a column of position indices (pos_idx).

Value

A modified data frame with position indices corrected for the CIGAR operations. It also includes additional columns giving the size and index of insertions and deletions, along with a logical column indicating whether a position lies within a deletion.
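As a rough illustration of the kind of per-position information described here, the sketch below expands a CIGAR string into one row per aligned position and flags positions inside a deletion. The function `expand_cigar` and its column names (`read_idx`, `ref_offset`, `is_in_deletion`) are assumptions made for this example; the actual columns returned by `correct_pos_idx_w_cigar` may be named and computed differently.

```r
# Hypothetical sketch (not the dreams implementation): one row per position
# implied by a CIGAR string, with read index, reference offset, and a
# logical flag for positions that fall inside a deletion.
expand_cigar <- function(cigar) {
  tokens <- regmatches(cigar, gregexpr("[0-9]+[MIDNSHP=X]", cigar))[[1]]
  lens <- as.integer(sub("[MIDNSHP=X]$", "", tokens))
  ops  <- sub("^[0-9]+", "", tokens)

  # Repeat each operation letter once per base it covers
  op_per_base <- rep(ops, times = lens)
  # Hard clips (H) and padding (P) consume neither read nor reference bases
  op_per_base <- op_per_base[!op_per_base %in% c("H", "P")]

  consumes_read <- op_per_base %in% c("M", "I", "S", "=", "X")
  consumes_ref  <- op_per_base %in% c("M", "D", "N", "=", "X")

  data.frame(
    op             = op_per_base,
    # 1-based index within the read; NA where the read has no base (deletion)
    read_idx       = ifelse(consumes_read, cumsum(consumes_read), NA_integer_),
    # 1-based offset along the reference; NA for inserted bases
    ref_offset     = ifelse(consumes_ref, cumsum(consumes_ref), NA_integer_),
    is_in_deletion = op_per_base == "D",
    stringsAsFactors = FALSE
  )
}

tab <- expand_cigar("2M1I2M2D1M")
```

Here the inserted base gets a read index but no reference offset, while the two deleted positions get reference offsets but no read index and are marked `is_in_deletion = TRUE`.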


JakobPedersenLab/dreams documentation built on Feb. 2, 2024, 3:14 p.m.