get_surr_bases: Retrieve surrounding base sequence of locus


Description

get_surr_bases retrieves, for each locus in mutation_ids, the base sequence of the [locus - k, locus + k] region from fasta_filename.

Usage

get_surr_bases(fasta_filename, mutation_ids, k)

Arguments

fasta_filename

character string naming the path to the reference genome FASTA file the sequencing data was aligned to.

mutation_ids

character vector containing the IDs of the loci whose surrounding bases are to be retrieved. The ID format is CHR:POS; IDs can be obtained by calling the function get_mut_id on the VCF file containing those loci.

k

integer giving the number of bases to retrieve on each side (left and right) of each locus.
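
To make the relationship between mutation_ids and k concrete, the following is a minimal sketch of how CHR:POS IDs could be expanded into [POS - k, POS + k] windows. The helper ids_to_regions is hypothetical and is not part of ideafix; the CHR:START-END output form and the clamping of the start coordinate to 1 are assumptions made for illustration only.

```r
# Hypothetical helper, not part of ideafix: expand "CHR:POS" ids into
# "CHR:START-END" strings covering [POS - k, POS + k].
ids_to_regions <- function(mutation_ids, k) {
  k <- as.integer(k)
  # Split at the LAST colon so contig names that contain ":" still work.
  chr <- sub(":[^:]*$", "", mutation_ids)
  pos <- as.integer(sub("^.*:", "", mutation_ids))
  # Assumption: clamp the window start at the first base of the contig.
  start <- pmax(pos - k, 1L)
  end <- pos + k
  sprintf("%s:%d-%d", chr, start, end)
}

ids_to_regions(c("chr1:100", "chrX:1"), k = 2)
# "chr1:98-102" "chrX:1-3"
```

The window end is not clamped here because the contig length is not known from the ID alone.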

Details

The sequence of the surrounding bases is retrieved from fasta_filename using the samtools faidx tool. In the process, get_surr_bases creates a temporary file that this tool needs in order to run.
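
The general pattern described here can be sketched as follows. This is not the ideafix implementation: the function names fetch_regions and parse_fasta_lines are hypothetical, the temporary file is assumed to be a region list passed to samtools faidx through its -r (region file) option, and regions are assumed to already be in CHR:START-END form. samtools must be available on the PATH for fetch_regions to work.

```r
# Hypothetical sketch, not the ideafix source.

# Collapse FASTA text (header lines start with ">") into one string per record.
parse_fasta_lines <- function(lines) {
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)            # record index for every line
  seqs <- tapply(lines[!is_header], record[!is_header],
                 paste, collapse = "")   # join wrapped sequence lines
  unname(as.character(seqs))
}

# Write the regions to a temporary file and query the FASTA with samtools.
fetch_regions <- function(fasta_filename, regions) {
  region_file <- tempfile(fileext = ".txt")
  on.exit(unlink(region_file), add = TRUE)  # remove the temp file on exit
  writeLines(regions, region_file)
  out <- system2("samtools",
                 c("faidx", "-r", shQuote(region_file),
                   shQuote(fasta_filename)),
                 stdout = TRUE)
  parse_fasta_lines(out)
}

# The parser on its own, using made-up FASTA text:
parse_fasta_lines(c(">chr1:98-102", "ACGTA", ">chr2:1-3", "GG", "T"))
# "ACGTA" "GGT"
```

Keeping the parsing step separate from the system call makes it possible to check the text handling without samtools installed.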

Value

Character vector with the genomic base sequence ranging from locus - k to locus + k for each locus in mutation_ids.
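
As a rough sanity check on the returned value, one element is expected per locus, and if the range is inclusive at both ends each string would have 2k + 1 characters. The sketch below assumes exactly that, and also assumes no window is cut short at the start or end of a contig; check_surr_bases is a hypothetical helper, not part of ideafix, and the sequences shown are made up.

```r
# Hypothetical checker, not part of ideafix.
# Returns one logical per locus: TRUE if the sequence spans 2k + 1 bases.
check_surr_bases <- function(seqs, mutation_ids, k) {
  stopifnot(is.character(seqs), length(seqs) == length(mutation_ids))
  expected <- 2L * as.integer(k) + 1L
  nchar(seqs) == expected
}

check_surr_bases(c("ACGTA", "GGT"), c("chr1:100", "chr2:2"), k = 2)
# TRUE FALSE
```

A FALSE entry does not necessarily indicate an error; it may simply mark a locus closer than k bases to a contig boundary.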


mmaitenat/ideafix documentation built on Sept. 18, 2021, 7:55 a.m.