aggregate_duplicates-methods: Aggregates multiple counts from the same samples (e.g., from...

Description Usage Arguments Details Value Examples

Description

aggregate_duplicates() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | and returns a 'tbl' with aggregated transcripts that were duplicated.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

## S4 method for signature 'spec_tbl_df'
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

## S4 method for signature 'tbl_df'
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

## S4 method for signature 'tidybulk'
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

## S4 method for signature 'SummarizedExperiment'
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

## S4 method for signature 'RangedSummarizedExperiment'
aggregate_duplicates(
  .data,
  .sample = NULL,
  .transcript = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE
)

Arguments

.data

A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |

.sample

The name of the sample column

.transcript

The name of the transcript/gene column

.abundance

The name of the transcript/gene abundance column

aggregation_function

A function for counts aggregation (e.g., sum, median, or mean)

keep_integer

A boolean. Whether to force the aggregated counts to integer

Details

\lifecycle

maturing

This function aggregates duplicated transcripts (e.g., isoforms, ensembl). For example, we often have to convert ensembl symbols to gene/transcript symbol, but in doing so we have to deal with duplicates. 'aggregate_duplicates' takes a tibble and column names (as symbols; for 'sample', 'transcript' and 'count') as arguments and returns a tibble with aggregate transcript with the same name. All the rest of the column are appended, and factors and boolean are appended as characters.

Underlying custom method: data filter(n_aggr > 1) group_by(!!.sample,!!.transcript) dplyr::mutate(!!.abundance := !!.abundance

Value

A 'tbl' object with aggregated transcript abundance and annotation

A 'tbl' object with aggregated transcript abundance and annotation

A 'tbl' object with aggregated transcript abundance and annotation

A 'tbl' object with aggregated transcript abundance and annotation

A 'SummarizedExperiment' object

A 'SummarizedExperiment' object

Examples

1
2
3
4
5
6
7
    aggregate_duplicates(
    tidybulk::counts_mini,
    sample,
    transcript,
    `count`,
    aggregation_function = sum
    )

tidybulk documentation built on April 7, 2021, 6 p.m.