View source: R/get_tmhmm.R

get_tmhmm    R Documentation

Create a data frame with the membrane locations of each amino acid in a protein using TMHMM

Description

This function creates a data frame with columns containing transcript IDs and corresponding output from TMHMM. The TMHMM output includes a location for each amino acid, with O and o representing extracellular, M representing transmembrane, and i representing intracellular.
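As an illustration of the per-residue encoding described above, the following sketch collapses a topology string into contiguous segments using base R only. The string `topology` is a made-up example, not real TMHMM output, and the `labels` lookup and the `segments` data frame are names introduced here for illustration; they are not part of surfaltr.

```r
# Hypothetical per-residue topology string (illustrative, not real TMHMM output).
topology <- "iiiiiMMMMMMMMooooooOOOO"

# Map each TMHMM symbol to the location named in the description:
# O and o are extracellular, M is transmembrane, i is intracellular.
labels <- c(i = "intracellular", M = "transmembrane",
            o = "extracellular", O = "extracellular")

# Split the string into single residues and run-length encode the locations,
# so that adjacent residues with the same location form one segment.
residues <- strsplit(topology, "", fixed = TRUE)[[1]]
runs <- rle(unname(labels[residues]))

# Convert run lengths into 1-based start and end positions.
segments <- data.frame(
    location = runs$values,
    start = cumsum(runs$lengths) - runs$lengths + 1,
    end = cumsum(runs$lengths)
)
print(segments)
```

Because O and o map to the same label, neighbouring runs of the two symbols are merged into a single extracellular segment.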

Usage

get_tmhmm(fasta_file_name, tmhmm_folder_name)

Arguments

fasta_file_name

Name of .fasta file containing amino acid sequences

tmhmm_folder_name

Full path to folder containing installed TMHMM 2.0 software. This path should end in TMHMM2.0c

Value

A data frame containing each transcript ID and the corresponding membrane location for each amino acid in its sequence formatted as a string

Note

For this function to work, a .fasta file named "AA.fasta" containing the amino acid sequence of each transcript must be saved in a folder called output within the working directory. The function also saves a copy of the returned data frame in csv format to the output folder in the working directory.
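The file requirement in this note can be checked before calling get_tmhmm. The sketch below uses only base R; `has_aa_fasta` is a hypothetical helper written for this example and is not a surfaltr function. It is demonstrated on a temporary directory containing a made-up sequence so that it runs without TMHMM installed.

```r
# Hypothetical helper (not part of surfaltr): returns TRUE when
# output/AA.fasta exists under the given working directory.
has_aa_fasta <- function(work_dir = getwd()) {
    file.exists(file.path(work_dir, "output", "AA.fasta"))
}

# Demonstrate on a throwaway directory instead of the real working directory.
demo_dir <- tempfile("surfaltr_demo_")
dir.create(file.path(demo_dir, "output"), recursive = TRUE)

# No AA.fasta has been written yet.
before <- has_aa_fasta(demo_dir)

# Write a minimal, made-up FASTA record and check again.
writeLines(c(">tx1", "MKTLLV"), file.path(demo_dir, "output", "AA.fasta"))
after <- has_aa_fasta(demo_dir)

print(c(before = before, after = after))
```

In an actual analysis, calling `has_aa_fasta()` with no argument checks the current working directory, which is where the note says the output folder is expected.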

Examples

tmhmm_folder_name <- "~/TMHMM2.0c"
if (check_tmhmm_install(tmhmm_folder_name)) {
    AA_seq <- get_pairs(system.file("extdata", "crb1_example.csv",
        package = "surfaltr"
    ), TRUE, "mouse", TRUE)
    topo <- get_tmhmm("AA.fasta", tmhmm_folder_name)
}

EliLillyCo/surfaltr documentation built on May 3, 2022, 10:12 a.m.