clean_data: Retrieve, Clean, and Format Input Data
In EliLillyCo/surfaltr: Rapid Comparison of Surface Protein Isoform Membrane Topologies Through surfaltr

clean_data

R Documentation

Retrieve, Clean, and Format Input Data

Description

This function retrieves, cleans, and formats input data. Cleaning and formatting involves removing any non-protein-coding transcripts, removing any principal transcripts, and standardizing all column names. If the sequence is provided directly, the function also extracts the APPRIS annotation and UniProt IDs of each transcript from Ensembl. Input data can follow one of two formats: the first contains only transcript IDs and gene names, and the second contains a unique transcript identifier, gene names, and amino acid sequences. The function returns a data frame containing the transcript ID, gene name, and APPRIS annotation for each input transcript. If amino acid sequences are included in the input data, they are also included in the data frame. If only gene names and transcript IDs are provided, UniProt IDs are included in the data frame.
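The two input formats described above can be sketched as small toy files. This is only an illustration under stated assumptions: the CSV file type, the column names (`gene_name`, `transcript`, `protein_sequence`), and every value in the rows are hypothetical placeholders, not the headers or identifiers that surfaltr itself requires. Check the package vignette for the exact layout before preparing real input.

```r
# Toy rows only: the IDs, gene names, and sequence strings are placeholders,
# and the column names are assumptions, not surfaltr's documented headers.

# Format 1: transcript IDs and gene names only (paired with if_aa = FALSE).
ids_only <- data.frame(
  gene_name  = c("GENE_A", "GENE_B"),
  transcript = c("TRANSCRIPT_ID_1", "TRANSCRIPT_ID_2"),
  stringsAsFactors = FALSE
)

# Format 2: unique transcript identifier, gene name, and amino acid sequence
# (paired with if_aa = TRUE).
with_sequences <- data.frame(
  gene_name        = c("GENE_A", "GENE_B"),
  transcript       = c("isoform_1", "isoform_2"),
  protein_sequence = c("MKTAYIAKQR", "MSLLTEVETP"),
  stringsAsFactors = FALSE
)

# Write each format to a temporary file (CSV is an assumption here).
ids_path <- tempfile(fileext = ".csv")
aa_path  <- tempfile(fileext = ".csv")
write.csv(ids_only, ids_path, row.names = FALSE)
write.csv(with_sequences, aa_path, row.names = FALSE)

# Read the files back to confirm the layout survived the round trip.
ids_check <- read.csv(ids_path, stringsAsFactors = FALSE)
aa_check  <- read.csv(aa_path, stringsAsFactors = FALSE)
```

Either path could then be passed as `data_file`, with `if_aa` set to match the format of the file.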

Usage

clean_data(data_file, if_aa, organism)

Arguments

`data_file`	Path to the input file
`if_aa`	Logical value indicating whether the input file contains amino acid sequences: TRUE if sequences are present, FALSE if only IDs are present
`organism`	String indicating if the transcripts are from a human or a mouse
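Because the three arguments have different types, it can help to check them before calling the function. The helper below is a minimal sketch, not part of surfaltr: `check_clean_data_args` is a hypothetical name, and the accepted `organism` strings `"human"` and `"mouse"` are an assumption based on the argument description rather than a documented list.

```r
# Hypothetical helper (not part of surfaltr): validates the three arguments
# described above and returns them as a list.
check_clean_data_args <- function(data_file, if_aa, organism) {
  stopifnot(
    is.character(data_file), length(data_file) == 1,
    is.logical(if_aa), length(if_aa) == 1, !is.na(if_aa),
    is.character(organism), length(organism) == 1
  )
  if (!file.exists(data_file)) {
    stop("Input file not found: ", data_file)
  }
  # Assumed accepted values; the real function may expect other strings.
  organism <- tolower(organism)
  if (!organism %in% c("human", "mouse")) {
    stop("organism must be 'human' or 'mouse', got: ", organism)
  }
  list(data_file = data_file, if_aa = if_aa, organism = organism)
}

# Usage with a throwaway file that holds only a header row.
input <- tempfile(fileext = ".csv")
writeLines("gene_name,transcript", input)
args <- check_clean_data_args(input, if_aa = FALSE, organism = "Human")

# With surfaltr installed and network access to Ensembl available, the
# checked values could then be passed on, for example:
# cleaned <- surfaltr::clean_data(args$data_file, args$if_aa, args$organism)
```

The call to `clean_data` is left commented out because it needs the package installed and a real input file.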

Value

A data frame containing gene names, transcript IDs, and APPRIS annotations for the given data. If sequences were provided, the data frame also contains the amino acid sequences. If only IDs were provided, the data frame also contains the UniProt Swiss-Prot ID, UniProt Swiss-Prot isoform ID, and UniProt TrEMBL ID.

EliLillyCo/surfaltr documentation built on May 3, 2022, 10:12 a.m.
