remove_redundancy-methods: Drop redundant elements (e.g., samples) for which feature...
In tidybulk: Brings transcriptomics to the tidyverse


remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column,
  Dim_b_column
)

## S4 method for signature 'spec_tbl_df'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'tbl_df'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'tidybulk'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'SummarizedExperiment'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
remove_redundancy(
  .data,
  .element = NULL,
  .feature = NULL,
  .abundance = NULL,
  method,
  of_samples = TRUE,
  correlation_threshold = 0.9,
  top = Inf,
  log_transform = FALSE,
  Dim_a_column = NULL,
  Dim_b_column = NULL
)

`.data`	A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |
`.element`	The name of the element column (normally samples).
`.feature`	The name of the feature column (normally transcripts/genes).
`.abundance`	The name of the column containing the numerical values the calculation is based on (normally transcript abundance).
`method`	A character string. The method used to detect redundancy; the examples below use "correlation" and "reduced_dimensions".
`of_samples`	A boolean. If the input is a tidybulk object, it indicates whether the element column is the sample column or the transcript column.
`correlation_threshold`	A real number between 0 and 1. The threshold used by the correlation-based method.
`top`	An integer. How many top genes to select for the correlation-based method.
`log_transform`	A boolean. Whether the value should be log-transformed (e.g., TRUE for RNA sequencing data).
`Dim_a_column`	A character string. For the reduced_dimensions-based method. The column holding one principal component.
`Dim_b_column`	A character string. For the reduced_dimensions-based method. The column holding another principal component.
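
To make the role of `correlation_threshold` more concrete, here is a minimal base R sketch of the general idea of dropping elements whose profiles are highly correlated with another element. It is an illustration only and is not tidybulk's implementation: the helper `drop_correlated()` and the toy matrix `counts` are hypothetical names made up for this example, and the sketch ignores `top` and `log_transform`.

```r
# Toy abundance matrix: rows are features (genes), columns are elements (samples).
set.seed(1)
base_profile <- rnorm(50)
counts <- cbind(
  s1 = base_profile + rnorm(50, sd = 0.05),
  s2 = base_profile + rnorm(50, sd = 0.05),  # near-duplicate of s1
  s3 = rnorm(50),
  s4 = rnorm(50)
)

# Hypothetical helper (NOT part of tidybulk): drop every column whose
# correlation with an earlier column exceeds `threshold`.
drop_correlated <- function(mat, threshold = 0.9) {
  cors <- cor(mat)
  # Keep only the upper triangle so each pair of columns is considered once
  # and a column is only compared with the columns that come before it.
  cors[lower.tri(cors, diag = TRUE)] <- NA
  is_redundant <- apply(cors, 2, function(x) any(x > threshold, na.rm = TRUE))
  mat[, !is_redundant, drop = FALSE]
}

kept <- drop_correlated(counts, threshold = 0.9)
colnames(kept)
```

In this toy data `s2` is almost a copy of `s1`, so it is the column that gets dropped, while the first member of the pair and the unrelated columns are kept. Lowering the threshold makes the filter more aggressive.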

A tbl object with redundant elements (e.g., samples) dropped.

A 'SummarizedExperiment' object

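# Load the package so that tidybulk(), identify_abundant() and %>% are available
library(tidybulk)

# Drop redundant samples based on the correlation of their transcript abundance
tidybulk::counts_mini %>%
  tidybulk(sample, transcript, count) %>%
  identify_abundant() %>%
  remove_redundancy(
    .element = sample,
    .feature = transcript,
    .abundance = count,
    method = "correlation"
  )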

# Compute MDS dimensions first, then drop samples that are redundant in that space
counts.MDS <-
  tidybulk::counts_mini %>%
  tidybulk(sample, transcript, count) %>%
  identify_abundant() %>%
  reduce_dimensions(method = "MDS", .dims = 3)

remove_redundancy(
  counts.MDS,
  Dim_a_column = `Dim1`,
  Dim_b_column = `Dim2`,
  .element = sample,
  method = "reduced_dimensions"
)