get_pairs: Create csv and fasta files containing information about pairs...

View source: R/get_pairs.R

get_pairs R Documentation

Create csv and fasta files containing information about pairs of transcripts

Description

This function takes the input data and retrieves information from Ensembl and UniProt to build a data frame containing the gene names, transcript IDs, APPRIS annotations, and protein sequences for each pair of primary and alternative transcripts. It also writes a fasta file listing each transcript ID followed by its amino acid sequence, covering all input transcripts and their associated primary transcripts; the file is ordered so that all transcripts from the same gene are adjacent. Finally, it produces a table in csv form containing the gene names, transcript IDs, APPRIS annotations, and amino acid sequences for each transcript.

Usage

get_pairs(data_file, if_aa = FALSE, organism = "human", temp = FALSE)

Arguments

data_file

Path to the input file

if_aa

Boolean indicating whether the input file contains amino acid sequences: TRUE if sequences are present, FALSE if only IDs are present.

organism

String indicating whether the transcripts are from a human ("human") or a mouse ("mouse").

temp

Boolean indicating whether the fasta file should be deleted after the function finishes running. It is recommended to always set this to FALSE.

Value

A data frame containing the gene names, transcript IDs, APPRIS annotations, and protein sequences for each pair of primary and alternative transcripts.

Note

This function also creates a fasta file containing the transcript IDs and associated amino acid sequences in the root directory. In addition, a csv file containing the returned data frame is saved to the working directory.
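To inspect a fasta file of this kind from R, a small base-R parser may be enough. The sketch below is not part of surfaltr: read_fasta_df is a hypothetical helper, the IDs and sequences are made up, and it assumes the conventional fasta layout in which each record starts with a ">" header line followed by one or more sequence lines. The name of the file that get_pairs writes is not stated here, so the example builds its own temporary file.

```r
# Hypothetical helper (not part of surfaltr): parse a simple fasta file
# into a data frame with one row per record.
read_fasta_df <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]          # drop blank lines
  is_header <- startsWith(lines, ">")    # header lines begin with ">"
  record <- cumsum(is_header)            # record index for every line
  ids <- sub("^>", "", lines[is_header])
  # Concatenate the sequence lines that belong to each record
  seqs <- vapply(
    seq_along(ids),
    function(i) paste(lines[record == i & !is_header], collapse = ""),
    character(1)
  )
  data.frame(transcript_id = ids, sequence = seqs, stringsAsFactors = FALSE)
}

# Toy input with made-up IDs and sequences, written to a temporary file
fasta_path <- tempfile(fileext = ".fasta")
writeLines(
  c(">TX_PRIMARY_1", "MKTAYIAK", "QRQISFVK", ">TX_ALT_1", "MKTAYI"),
  fasta_path
)
fasta_df <- read_fasta_df(fasta_path)
print(fasta_df)
```

To use it on real output, replace fasta_path with the path of the file written by get_pairs, and check that the header lines in that file match the assumption above.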

Examples

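# Folder passed to check_tmhmm_install() to locate TMHMM
tmhmm_folder_name <- "~/TMHMM2.0c"
# Run the example only if check_tmhmm_install() returns TRUE
if (check_tmhmm_install(tmhmm_folder_name)) {
    # Save the working directory so it can be restored afterwards
    currwd <- getwd()
    AA_seq <- get_pairs(
        system.file("extdata", "crb1_example.csv", package = "surfaltr"),
        if_aa = TRUE, organism = "mouse", temp = TRUE
    )
    setwd(currwd)
}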

EliLillyCo/surfaltr documentation built on May 3, 2022, 10:12 a.m.