gimme_PSI_expr_corr: Calculate correlation of one AS event PSI vs many genes...

View source: R/vst_utils.R

gimme_PSI_expr_corr    R Documentation

Calculate correlation of one AS event PSI vs many genes expression levels (1 vs many).

Description

This function calculates the correlation between the Percentage of Sequence Inclusion (PSI) of one alternative splicing event and the expression levels (i.e. counts) of all genes present in the vast-tools expression table, after some basic filtering. The correlation method can be Spearman, Pearson, or Kendall. Extra parameters can be passed through ... to the cor() R function used to calculate the correlations. Mapping ENSEMBL gene IDs to gene names is available only when selecting the best correlating genes, which avoids querying the ENSEMBL BioMart servers with a massive gene list.
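As a rough illustration of the "1 vs many" idea, the sketch below correlates one PSI vector against every row of a small expression matrix using base R's cor(). The data and object names (psi, expr, rho) are made up for the example, and this is not necessarily how gimme_PSI_expr_corr() is implemented internally.

```r
# Toy data (hypothetical values): PSI of one event across 6 samples
psi <- c(10, 25, 40, 55, 70, 90)

# Toy expression matrix: genes in rows, the same 6 samples in columns
expr <- rbind(
  geneA = c(1, 2, 3, 4, 5, 6),        # increases with PSI
  geneB = c(60, 50, 40, 30, 20, 10),  # decreases with PSI
  geneC = c(5, 3, 8, 1, 9, 2)         # no clear trend
)

# cor() correlates x against the columns of y, so transpose to samples x genes
rho <- cor(psi, t(expr), method = "spearman")

# One correlation coefficient per gene
rho[1, ]
```

Here geneA and geneB give coefficients of 1 and -1 respectively, because their ranks follow the PSI ranks exactly (in the same or the opposite direction).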

Usage

gimme_PSI_expr_corr(
  inclusion_tbl,
  vst_id,
  quality_thrshld = "N",
  vst_expression_tbl,
  min_mean_count = 5,
  corr_method = c("spearman", "pearson", "kendall"),
  num_genes = NULL,
  map_ID_2_names = FALSE,
  species,
  verbose = FALSE,
  ...
)

Arguments

inclusion_tbl

Path to a vast-tools inclusion table that contains the vst_id event.

vst_id

vast-tools alternative splicing event to grep in the inclusion_tbl.

quality_thrshld

vast-tools event quantification quality score threshold. Must be one of "N", "VLOW", "LOW", "OK", "SOK". For more information, see the official vast-tools documentation under "Column 8, score 1".
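One way to picture such a threshold is to rank the five labels and keep the samples whose first quality score is at or above the chosen level. The sketch below assumes the ordering N < VLOW < LOW < OK < SOK, that the first comma-separated field of the quality string is "score 1", and that filtering is inclusive; the quality strings and object names are invented, and the package may apply the threshold differently.

```r
# Assumed ordering of the quality labels, from worst to best
quality_levels <- c("N", "VLOW", "LOW", "OK", "SOK")

# Hypothetical quality strings for four samples (values are made up)
q <- c("SOK,SOK,18=0=0,OK,S@0,18",
       "LOW,LOW,5=0=0,OK,S@0,5",
       "N,N,0=0=0,OK,S@0,0",
       "OK,OK,12=0=0,OK,S@0,12")

# Keep only the first comma-separated field (score 1)
score1 <- sub(",.*$", "", q)

# A sample passes if its score ranks at or above the threshold
quality_thrshld <- "LOW"
passes <- match(score1, quality_levels) >= match(quality_thrshld, quality_levels)
passes
```

With the default threshold "N", every sample would pass under this scheme, since "N" is the lowest level.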

vst_expression_tbl

Path to a vast-tools expression table (either cRPKM or TPM).

min_mean_count

Filter out lowly expressed genes in the table read from vst_expression_tbl. Defines the minimum row mean expression value across all samples that a gene must have to be selected.
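A minimal sketch of a row-mean filter of this kind, using base R's rowMeans() on a made-up matrix. Whether the package compares with >= or > is an assumption here, and the object names are illustrative only.

```r
# Toy expression matrix (hypothetical values): genes in rows, samples in columns
expr <- rbind(
  geneA = c(0, 1, 2, 1),     # mean 1
  geneB = c(10, 12, 8, 14),  # mean 11
  geneC = c(4, 6, 5, 5)      # mean 5
)

min_mean_count <- 5

# Keep genes whose mean expression across samples reaches the minimum
# (inclusive comparison is an assumption)
keep <- rowMeans(expr) >= min_mean_count
expr_filtered <- expr[keep, , drop = FALSE]

rownames(expr_filtered)
```

Using drop = FALSE keeps the result a matrix even when a single gene survives the filter.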

corr_method

One of "spearman", "pearson", or "kendall"; passed to the function cor().
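The default shown in Usage is a vector of all three choices, a pattern commonly resolved in R with match.arg(), which picks the first element when the argument is not supplied. The sketch below shows that idiom with a hypothetical helper pick_method(); it is an assumption that gimme_PSI_expr_corr() resolves corr_method this way.

```r
# Hypothetical helper illustrating the match.arg() idiom
pick_method <- function(corr_method = c("spearman", "pearson", "kendall")) {
  # Returns the first choice when the argument is missing,
  # and errors on anything outside the allowed set
  match.arg(corr_method)
}

default_method <- pick_method()           # first choice is used
chosen_method  <- pick_method("kendall")  # explicit choice is validated

default_method
chosen_method
```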

num_genes

Return only the top and bottom number of genes. Integer greater than or equal to 1. Default is NULL, which returns all genes.
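The sketch below shows one reading of "top and bottom": sort the correlation coefficients and take num_genes entries from each end. The coefficients and names are made up, and taking num_genes from each end (rather than num_genes in total) is an assumption.

```r
# Hypothetical correlation coefficients, one per gene
rho <- c(geneA = 0.9, geneB = -0.8, geneC = 0.1, geneD = 0.5, geneE = -0.3)

num_genes <- 2

# Sort from most positive to most negative correlation
ranked <- sort(rho, decreasing = TRUE)

# Take num_genes from the top and num_genes from the bottom
best <- c(head(ranked, num_genes), tail(ranked, num_genes))

names(best)
```

In this toy case the selection is geneA and geneD from the top, then geneE and geneB from the bottom.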

map_ID_2_names

Logical. Whether or not to map the ENSEMBL gene IDs to gene names. Can be used only if num_genes is specified and the table contains ENSEMBL gene IDs (checked automatically).

species

Species character to use to map the ENSEMBL gene IDs. Used by gimme_mart() to build a biomaRt object. Default is guessed from vst_id.

verbose

Logical. Whether to print out information.

...

Extra parameters passed to cor() like use = "complete.obs".
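To show why an argument such as use = "complete.obs" matters, the sketch below forwards ... to cor() from a hypothetical wrapper, corr_wrapper(), on toy data containing a missing PSI value. The wrapper is only an illustration of argument forwarding, not the package's code.

```r
# Hypothetical wrapper that forwards extra arguments to cor()
corr_wrapper <- function(x, y, corr_method = "spearman", ...) {
  cor(x, y, method = corr_method, ...)
}

# Toy data (made up): one sample has a missing PSI value
psi  <- c(10, NA, 40, 55, 70)
gene <- c(1, 2, 3, 4, 5)

# Without extra arguments, a single NA makes the result NA
r_default <- corr_wrapper(psi, gene)

# With use = "complete.obs", incomplete sample pairs are dropped first
r_complete <- corr_wrapper(psi, gene, use = "complete.obs")

r_default
r_complete
```

After dropping the incomplete pair, the remaining four samples are perfectly rank-correlated, so the second call returns 1 while the first returns NA.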

Value

A tibble

Examples

# Return the correlation of one PSI against all genes
gimme_PSI_expr_corr(inclusion_tbl = psi_path, vst_id = "HsaEX0000001",
                    vst_expression_tbl = expr_path, corr_method = "spearman",
                    use = "complete.obs", verbose = TRUE) -> corr_df
                    
# Return the correlation of one PSI against all genes, keep only the top and
# bottom correlating genes, and map the IDs to names.
gimme_PSI_expr_corr(inclusion_tbl = psi_path, vst_id = "HsaEX0000001",
                    quality_thrshld = "VLOW",
                    vst_expression_tbl = expr_path, min_mean_count = 100,
                    corr_method = "spearman", use = "complete.obs",
                    num_genes = 10, map_ID_2_names = TRUE, species = "hsapiens",
                    verbose = TRUE) -> best_corr_df

Ni-Ar/niar documentation built on Feb. 3, 2025, 9:25 a.m.