get_gt_sequences: Get genome and transcriptome sequences

View source: R/blast_primers.R

Description

Get DNA sequences of all chromosomes and all annotated transcripts of a genome. This function is used to create the sequences in createBLASTDb.

Usage

get_gt_sequences(
  genome,
  annot = NULL,
  tx_id = "transcript_id",
  tx_name = "transcript_name",
  gene_name = "gene_name",
  gene_id = "gene_id",
  include_genome = TRUE,
  standard_chromosomes = TRUE
)

Arguments

genome

A BSgenome (or DNAStringSet) object containing the chromosome sequences from which genome and/or transcript sequences are obtained.

annot

A GRanges object containing all exons of the transcripts to be considered. If not specified, no transcript sequences will be included in the output.

tx_id, tx_name, gene_name, gene_id

(character) Column names in the annot metadata that contain the transcript id, transcript name, gene name and gene id. These columns are mandatory, but can contain internal names (e.g. "transcript-1" or "gene-1").

include_genome

(logical) Specifies whether the genome sequences should be included in the output.

standard_chromosomes

(logical) Specifies whether only standard chromosomes should be included in the output genome sequences (e.g. chr1-22, chrX, chrY, chrM for Homo sapiens).

compress

(logical) Create a gzipped output fasta file. This argument is not part of the get_gt_sequences usage shown above.

Value

A DNAStringSet object containing the genome and transcriptome sequences.
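
Examples

The following is a minimal sketch of how the inputs described under Arguments might be built and passed to get_gt_sequences. The toy genome, the exon coordinates and the identifiers ("transcript-1", "gene-1") are made up for illustration. The call assumes that an installed TAPseq version provides get_gt_sequences with the usage shown above, and it sets standard_chromosomes = FALSE because the toy genome does not belong to a real species. The output is not shown, since it depends on the installed version.

```r
library(Biostrings)
library(GenomicRanges)

# Toy "genome": a single 40 bp chromosome stored as a DNAStringSet
# (hypothetical sequence, for illustration only)
genome <- DNAStringSet(c(chr1 = paste(rep("ACGT", 10), collapse = "")))

# Toy annotation: two exons of one transcript. The four metadata columns
# use the default column names from the usage (tx_id, tx_name, gene_name,
# gene_id) and contain internal names.
annot <- GRanges(
  seqnames = rep("chr1", 2),
  ranges = IRanges(start = c(1, 21), end = c(10, 30)),
  strand = rep("+", 2),
  transcript_id = rep("transcript-1", 2),
  transcript_name = rep("transcript-1", 2),
  gene_name = rep("gene-1", 2),
  gene_id = rep("gene-1", 2)
)

# Only attempt the call if TAPseq is installed. getFromNamespace() is used
# so the lookup works whether or not the function is exported (assumption:
# the installed version contains get_gt_sequences).
if (requireNamespace("TAPseq", quietly = TRUE)) {
  get_seqs <- utils::getFromNamespace("get_gt_sequences", "TAPseq")
  seqs <- get_seqs(
    genome,
    annot = annot,
    include_genome = TRUE,
    standard_chromosomes = FALSE
  )
  print(seqs)
}
```

If the metadata columns of annot use different names, they can be passed via tx_id, tx_name, gene_name and gene_id instead of renaming the columns.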


argschwind/TAPseq documentation built on Feb. 9, 2024, 8:20 p.m.