View source: R/get_tissuegenes.R
get_tissuegenes | R Documentation |
Returns a series of gene lists for either all tissues in the GTEx transcriptomics dataset or for a specific tissue.
get_tissuegenes(
exprn_data,
in_genelist,
all_simple_paths,
in_tissue = NULL,
tpm_threshold = 1
)
exprn_data |
data frame. The GTEx TPM file, read in as a data frame. |
in_genelist |
data frame. A table of all genes in the pathway including a column with kegg_id (hsa:<ENTREZ>) and external_gene_name (gene name) as defined by BioMart |
all_simple_paths |
list. The list of all "simple" linear paths through a pathway to a chosen end-point. |
in_tissue |
character. Either NULL (to output a gene list for every tissue in GTEx) or a single specific tissue (the name must match the file names for the tissue in GTEx). |
tpm_threshold |
numeric. Threshold defining whether a gene is expressed or not. Default = 1 |
This is a higher-level function for creating one or more gene lists describing tissue-specific expression of genes. It is based on the GTEx Transcripts Per Million (TPM) data (which can be found here: https://storage.googleapis.com/gtex_analysis_v8/rna_seq_data/GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_tpm.gct.gz). From this data we define a threshold (default = 1) constituting "above basal expression" of a gene. Any gene above this threshold is considered to be "expressed" in the given tissue. The function then takes the linear paths provided by the smple_paths function and excludes any simple path containing one or more genes with an expression level below the defined threshold (under the assumption that, with one of the links in the chain missing, the path is no longer a viable route for pathway functionality). The output is therefore not simply the set of genes expressed in a specific tissue: in theory it also excludes genes that can influence the desired end-point only through genes defined as not expressed in that tissue.
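The filtering idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `filter_paths_by_tissue()` is a hypothetical helper, the toy table layout (one `gene` column plus one TPM column per tissue) is an assumption, and each path is assumed to be a character vector of gene names. A strict `>` comparison is used for "above the threshold"; how the real function treats values exactly equal to the threshold is not stated here.

```r
# Toy expression table (assumed layout): one row per gene, one TPM column per tissue.
toy_tpm <- data.frame(
  gene  = c("A", "B", "C", "D"),
  Liver = c(5.2, 0.3, 12.0, 2.1),
  Lung  = c(3.0, 4.5, 0.0, 1.5),
  stringsAsFactors = FALSE
)

# Toy simple paths to the end-point "D", each a character vector of gene names.
toy_paths <- list(
  c("A", "B", "D"),
  c("A", "C", "D"),
  c("C", "D")
)

# Hypothetical helper, not part of the package.
filter_paths_by_tissue <- function(tpm, paths, tissue, threshold = 1) {
  # Genes considered "expressed" in this tissue (strictly above the threshold).
  expressed <- tpm$gene[tpm[[tissue]] > threshold]
  # Keep a path only if every gene along it is expressed.
  keep <- vapply(paths, function(p) all(p %in% expressed), logical(1))
  viable <- paths[keep]
  list(paths = viable, genes = unique(unlist(viable)))
}

liver <- filter_paths_by_tissue(toy_tpm, toy_paths, "Liver")
lung  <- filter_paths_by_tissue(toy_tpm, toy_paths, "Lung")
```

In this toy data, gene B falls below the threshold in Liver, so the path through B is dropped there, while in Lung gene C is not expressed and only the path through B survives.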
This function requires several inputs: the GTEx TPM data; an input gene list containing a kegg_id column (this can be updated to be ENTREZ IDs) and an external_gene_name column (from BioMart); and the output from smple_paths(), under the assumption that it retains the connections between genes and not just the gene names.
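As a rough sketch of preparing the first two inputs: GCT files begin with a version line and a dimensions line before the column header, so `skip = 2` is needed when reading them. The example below reads a tiny in-memory stand-in rather than the real GTEx download; the gene identifiers, TPM values, and tissue-named columns are all made-up placeholders, and the exact column layout the function expects is an assumption.

```r
# A tiny stand-in for a GCT file: version line, dimensions line, then the table.
# All identifiers and values below are placeholders, not real GTEx data.
gct_text <- paste(
  "#1.2",
  "3\t2",
  "Name\tDescription\tLiver\tLung",
  "ENSG01\tGENEA\t5.2\t3.0",
  "ENSG02\tGENEB\t0.3\t4.5",
  "ENSG03\tGENEC\t12.0\t0.0",
  sep = "\n"
)

# skip = 2 drops the two GCT preamble lines; for the real file, pass the
# path to the downloaded .gct(.gz) file instead of `text = `.
exprn_data <- read.delim(
  text = gct_text,
  skip = 2,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Gene list in the documented format: kegg_id (hsa:<ENTREZ>) and
# external_gene_name columns. The IDs here are placeholders.
in_genelist <- data.frame(
  kegg_id            = c("hsa:1", "hsa:2", "hsa:3"),
  external_gene_name = c("GENEA", "GENEB", "GENEC"),
  stringsAsFactors   = FALSE
)
```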
By default, this function provides the gene list for all tissues in the GTEx TPM files; however, an individual tissue can be selected instead. There is currently no option for selecting multiple tissues or a list of tissues (this may be updated down the line).
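Until multiple-tissue selection is supported, one possible workaround is to call the function once per tissue and collect the results in a named list. The sketch below is an assumption about usage, not a package feature: `run_for_tissues()` is a hypothetical wrapper, and `fake_get_tissuegenes()` is a stand-in used only so the example runs without the package or the GTEx data.

```r
# Hypothetical wrapper: call `fun` once per tissue, passing each tissue name
# through the `in_tissue` argument, and return a list named by tissue.
run_for_tissues <- function(tissues, fun, ...) {
  args <- list(...)
  out <- lapply(tissues, function(tissue) {
    do.call(fun, c(args, list(in_tissue = tissue)))
  })
  names(out) <- tissues
  out
}

# Stand-in for get_tissuegenes(), for illustration only: returns the genes
# strictly above the threshold in one tissue column of a toy table.
fake_get_tissuegenes <- function(exprn_data, in_tissue, tpm_threshold = 1) {
  exprn_data$gene[exprn_data[[in_tissue]] > tpm_threshold]
}

toy_tpm <- data.frame(
  gene  = c("A", "B", "C", "D"),
  Liver = c(5.2, 0.3, 12.0, 2.1),
  Lung  = c(3.0, 4.5, 0.0, 1.5),
  stringsAsFactors = FALSE
)

per_tissue <- run_for_tissues(
  c("Liver", "Lung"),
  fake_get_tissuegenes,
  exprn_data = toy_tpm
)

# With the package loaded, the same wrapper could in principle be used as:
# run_for_tissues(my_tissues, get_tissuegenes,
#                 exprn_data = exprn_data,
#                 in_genelist = in_genelist,
#                 all_simple_paths = all_simple_paths)
```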