remove_redundancy: Drop redundant elements (e.g., samples) for which feature...

remove_redundancy    R Documentation

Drop redundant elements (e.g., samples) for which feature (e.g., transcript/gene) abundances are correlated

Description

remove_redundancy() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | for the correlation method, or as | <DIMENSION 1> | <DIMENSION 2> | <...> | for the reduced_dimensions method, and returns a 'tbl' from which the redundant elements (e.g., samples) have been dropped.
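As a rough illustration of the long input format described above, the following base R sketch builds a toy table with one row per sample/transcript pair. The column names, the values, and the use of a plain data.frame as a stand-in for a 'tbl' are all assumptions made for this example; none of them come from the package.

```r
# Toy long-format table: one row per (sample, transcript) pair.
# Names and values are illustrative assumptions, not package data.
toy_counts <- expand.grid(
  sample = c("s1", "s2", "s3"),
  transcript = c("geneA", "geneB"),
  stringsAsFactors = FALSE
)
toy_counts$count <- c(10, 12, 300, 5, 6, 90)

# Wide view (transcripts in rows, samples in columns), the shape in which
# a correlation between samples would typically be computed.
toy_wide <- xtabs(count ~ transcript + sample, data = toy_counts)
```

Here the sample column would play the role of .element, transcript the role of .feature, and count the role of .value.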

Usage

remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .value,
  method,
  of_samples = T,
  correlation_threshold = 0.9,
  log_transform = F,
  Dim_a_column,
  Dim_b_column
)

Arguments

.data

A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |

.element

The name of the element column (normally samples).

.feature

The name of the feature column (normally transcripts/genes)

.value

The name of the column containing the numerical value the redundancy estimation is based on (normally transcript abundance)

method

A character string. The redundancy estimation method to use, either "correlation" or "reduced_dimensions" (see Details).

of_samples

A boolean. If the input is a tidysc object, it indicates whether the element column is the sample column or the transcript column

correlation_threshold

A real number between 0 and 1. Used by the correlation-based calculation.

log_transform

A boolean, whether the value should be log-transformed (e.g., TRUE for RNA sequencing data)

Dim_a_column

A character string. Used by the reduced_dimensions-based calculation. The column of one principal component

Dim_b_column

A character string. Used by the reduced_dimensions-based calculation. The column of another principal component

Details

Lifecycle: experimental

This function removes redundant elements (e.g., samples or transcripts) from the original data set. This is useful, for example, when defining cell-type specific signatures with low sample redundancy. The function returns a tibble from which the redundant elements (e.g., samples) have been dropped. Two redundancy estimation approaches are supported: (i) removal of highly correlated clusters of elements (keeping a representative) with method="correlation"; (ii) removal of the most proximal element pairs in a reduced dimensional space with method="reduced_dimensions".
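The two approaches described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: drop_correlated() and drop_closest_pair() are hypothetical helpers, the greedy "keep the first of each correlated group" rule is one possible way of keeping a representative, and the second helper drops one member of only the single closest pair.

```r
# Hypothetical helper (not part of the package): greedy correlation filter.
# mat: numeric matrix, features in rows, elements (e.g., samples) in columns.
drop_correlated <- function(mat, threshold = 0.9, log_transform = FALSE) {
  if (log_transform) mat <- log1p(mat)
  cors <- cor(mat)  # element-by-element Pearson correlation
  keep <- character(0)
  for (el in colnames(mat)) {
    # Keep an element only if it is not highly correlated with one already kept
    if (length(keep) == 0 || all(cors[el, keep] <= threshold)) {
      keep <- c(keep, el)
    }
  }
  mat[, keep, drop = FALSE]
}

# Hypothetical helper (not part of the package): drop one member of the
# closest pair of elements in a two-dimensional reduced space.
# coords: numeric matrix, one row per element, two columns.
drop_closest_pair <- function(coords) {
  d <- as.matrix(dist(coords))  # Euclidean distances between elements
  diag(d) <- Inf                # ignore self-distances
  closest <- which(d == min(d), arr.ind = TRUE)[1, ]
  coords[-closest[["row"]], , drop = FALSE]
}

# Toy data: s2 is a near-duplicate of s1, s3 is unrelated.
mat <- cbind(
  s1 = c(1, 2, 3, 4, 5, 6),
  s2 = c(1.1, 1.9, 3.2, 3.8, 5.1, 6.0),
  s3 = c(3, 1, 4, 1, 5, 2)
)
kept_by_correlation <- colnames(drop_correlated(mat, threshold = 0.9))

# Toy coordinates: a and b sit almost on top of each other.
coords <- rbind(a = c(0, 0), b = c(0.1, 0), c = c(5, 5))
kept_by_distance <- rownames(drop_closest_pair(coords))
```

In this toy data, s2 is dropped by the correlation filter because it nearly duplicates s1, and b is dropped by the distance filter because it is the closest neighbour of a. Which member of a redundant group the real function keeps is not specified in this page.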

Value

A tbl object from which the redundant elements (e.g., samples) have been dropped.

Examples





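# Drop samples whose transcript abundances are highly correlated
counts %>%
    remove_redundancy(
        .element = sample,
        .feature = transcript,
        .value = count,
        method = "correlation"
    )

# Drop the most proximal samples in a reduced dimensional space
# (see Dim_a_column and Dim_b_column under Arguments)
counts %>%
    remove_redundancy(
        .element = sample,
        .feature = transcript,
        .value = count,
        method = "reduced_dimensions"
    )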



stemangiola/ttSc documentation built on Dec. 8, 2022, 2:37 a.m.