merge_proteomics_se: merge proteomics SE objects
In jmw86069/platjam: Platform Jam, biological platform importers.

merge_proteomics_se

R Documentation

merge proteomics SE objects

Description

Merge two proteomics `SummarizedExperiment` (SE) objects into a single output object.

Usage

merge_proteomics_se(
  SE1,
  SE2,
  rowname1 = "SYMBOL",
  rowname2 = "SYMBOL",
  rowData_colnames_intersect = TRUE,
  colData_colnames_intersect = TRUE,
  rowData_colnames_unique = c("percentCoverage", "numPepsUnique", "scoreUnique"),
  assay_names = NULL,
  se_names = c("A", "B"),
  startN = 2,
  verbose = TRUE,
  ...
)

Arguments

`SE1`, `SE2`	`SummarizedExperiment` objects to be merged into one output object.
`rowname1`, `rowname2`	`character` string that describes which `SummarizedExperiment::rowData()` annotation to use to create the rownames used for merging. This approach is useful when merging data by gene symbol instead of by protein accession or peptide sequence. The intent is to allow "equivalent" rows to be combined across `SE1` and `SE2`, while rows unique to `SE1` or `SE2` are each represented on their own row.

The default values assume each proteomics SE object contains a rowData column `"SYMBOL"` with the official gene symbol on each row. This column is appropriate when the proteomics data represents abundance measurements already aggregated to the protein level (i.e. gene locus level), so the data will be merged by gene symbol. If multiple rows represent the same gene symbol, they are renamed using `jamba::makeNames(..., renameFirst=FALSE)` so that the entries are merged in the order they appear in each dataset. If the input data instead contains peptide-level measurements, the column should contain the peptide sequence, so that the data is merged by equivalent peptide sequences.

If `rowname1` or `rowname2` contains multiple values, or the two are not equal to each other, a new column `"merge_key"` is created in both `SE1` and `SE2` and populated with the relevant values. When multiple columns are indicated, they are concatenated using `jamba::pasteByRow()` to fill the column `"merge_key"`. Both `rowname1` and `rowname2` are then redefined to `"merge_key"`. Note that any pre-existing `"merge_key"` column will be overwritten. A combination of `"rownames"` and `colnames(rowData())` can be used.

Each value should be one of:
* one of `colnames(rowData())` for the relevant object `SE1` or `SE2`, representing a row annotation to use as the merge key. Note that any empty values (`NA` or blank string `""`) will be replaced by existing `rownames()`.
* `"rownames"` to indicate that existing `rownames()` of the relevant object `SE1` or `SE2` should be used as the merge key. Note that if a column `"rownames"` already exists in `rowData()` it will be used as-is.
`rowData_colnames_intersect`, `colData_colnames_intersect`	`logical` indicating whether to retain only the intersection of `colnames(rowData())` and `colnames(colData())` in the output rowData and colData, respectively. `TRUE`: only the intersection is retained in the output data, default. `FALSE`: not yet implemented.
`rowData_colnames_unique`	`character` vector with optional `colnames(rowData())` which should be retained in a uniquely-named output column, to keep its values distinct between `SE1` and `SE2`. This argument is useful for something like `"score"` where independent datasets are expected to have unique values, and which may be important to compare. Note that columns not already being retained will be ignored.
`assay_names`	`character` vector with one or more specific assay names to retain in the output data. By default, all assay names are retained.
`se_names`	`character` vector of length 2 that defines the output labels used to indicate which rows and columns were present in `SE1` and `SE2`.
`startN`	`integer` number passed to `jamba::makeNames()` to define the suffix number for the first versioned output. Note that `renameFirst=FALSE` so the first occurrence of a `character` string will not be renamed. When `startN=2`, subsequent repeated entries will have suffix `"_v2"`, then `"_v3"` and so on.
`...`	additional arguments are passed to `jamba::makeNames()`.
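
The merge-key and versioning rules described for `rowname1`, `rowname2` and `startN` can be illustrated with a minimal base R sketch. This is not the package implementation: `version_names()` is a hypothetical stand-in for the behavior documented for `jamba::makeNames(..., renameFirst=FALSE)`, the toy `row_annot` table and its `ACCESSION` column are invented for illustration, and the `"_"` separator in the multi-column key is an assumption (`jamba::pasteByRow()` may treat blanks and delimiters differently).

```r
# Hypothetical stand-in for the documented versioning behavior:
# the first occurrence keeps its name, repeats get "_v<startN>", and so on.
version_names <- function(x, startN = 2) {
  # occurrence index of each value among its duplicates (1, 2, 3, ...)
  occurrence <- ave(seq_along(x), x, FUN = seq_along)
  ifelse(occurrence == 1, x, paste0(x, "_v", occurrence + startN - 2))
}

# Toy row annotation (invented values); rownames play the role of rownames(SE).
row_annot <- data.frame(
  SYMBOL = c("ACTB", "", "GAPDH", "ACTB"),
  ACCESSION = c("P60709", "Q00001", "P04406", "P60709"),
  row.names = c("r1", "r2", "r3", "r4"),
  stringsAsFactors = FALSE
)

# Single-column key: empty or NA values fall back to the existing rownames.
key <- row_annot$SYMBOL
is_empty <- is.na(key) | key == ""
key[is_empty] <- rownames(row_annot)[is_empty]

# Repeated keys are versioned so they can be matched in order of appearance.
single_key <- version_names(key, startN = 2)

# Multi-column key: concatenate columns row by row (separator is an assumption).
merge_key <- do.call(paste, c(row_annot[, c("SYMBOL", "ACCESSION")], sep = "_"))

print(single_key)
print(merge_key)
```

Versioning in order of appearance means the second `"ACTB"` row in one dataset is paired with the second `"ACTB"` row in the other, rather than all duplicates collapsing onto one row.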

Details

See the notes for each argument for a description of how data is merged with respect to rows and rowData(), and columns and colData().

The general strategy is to merge equivalent rows to integrate rows across SE1 and SE2, but to force columns (sample measurements) to be unique across SE1 and SE2.

This process is somewhat similar to calling cbind(), in that the sample columns are extended. However, the rows are merged where possible.

No assay measurement values are lost during this process.
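
The strategy above can be sketched with plain matrices in base R. This is only an illustration of the idea (union of rows, side-by-side sample columns), not the function's actual code: the toy matrices and sample names are invented, the sample names are assumed to be already distinct between the two inputs, and filling absent measurements with `NA` is an assumption about how rows missing from one input are represented.

```r
# Toy assay matrices: rows keyed by gene symbol, columns are samples.
# Sample names are assumed to be unique across the two inputs.
m1 <- matrix(1:6, nrow = 3,
  dimnames = list(c("ACTB", "GAPDH", "TP53"), c("A_s1", "A_s2")))
m2 <- matrix(11:14, nrow = 2,
  dimnames = list(c("GAPDH", "MYC"), c("B_s1", "B_s2")))

# Rows: union of merge keys, so shared rows are combined
# and rows unique to either input keep their own row.
all_rows <- union(rownames(m1), rownames(m2))

# Columns: extended side by side, similar to cbind().
all_cols <- c(colnames(m1), colnames(m2))

# Start with NA (assumed placeholder) and fill in each input's measurements.
merged <- matrix(NA_real_, nrow = length(all_rows), ncol = length(all_cols),
  dimnames = list(all_rows, all_cols))
merged[rownames(m1), colnames(m1)] <- m1
merged[rownames(m2), colnames(m2)] <- m2

print(merged)

# Every original measurement is still present in the merged matrix.
stopifnot(sum(!is.na(merged)) == length(m1) + length(m2))
```

Here `"GAPDH"` occupies a single row with values from all four samples, while `"MYC"` has values only in the `B_` columns.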
