import_omics_data: Import generic omics tsv or csv file
In jmw86069/platjam: Platform Jam, biological platform importers.

import_omics_data

R Documentation

Description

Imports generic omics data from a tab- or comma-delimited file, or from an equivalent `data.frame` or `matrix`, into a `SummarizedExperiment` object.

Usage

import_omics_data(
  x,
  assay_name = "data",
  row_identifier = NULL,
  row_annotation_columns = NULL,
  curation_txt = NULL,
  verbose = FALSE,
  ...
)

Arguments

`x`	one of the following input types: (1) `character` path to a data file suitable for import by `data.table::fread()`, where most delimiters are recognized automatically; (2) `data.frame` containing the equivalent data as would be read from a file; (3) `matrix` containing only the measured values, with rownames and colnames that hold the respective identifiers.
`assay_name`	`character` string which will define the assay name where data is stored in the resulting `SummarizedExperiment` object.
`row_identifier`	defines the column, columns, or rownames to use as row identifiers. These values become `rownames(se)` in the output object. One of the following: (1) `NULL` (default), which auto-detects an appropriate row identifier: if the first column is numeric, rownames are used, otherwise the first column is used, and any duplicated values are made unique with `jamba::makeNames()`; (2) `character` string with a colname in the imported data `x`; (3) `integer` column number in the imported data `x`; (4) the integer value `0` or `-1`, to indicate `rownames(x)`; (5) one of the strings `"rownames"`, `"rowname"`, or `"row.names"`, to indicate `rownames(x)`. In effect, the default uses the first column of the imported data `x` unless that column is numeric.
`row_annotation_columns`	defines columns to retain in `rowData()`, which hold non-measurement data associated with each row. One of the following: (1) `character` vector with one or more `colnames()`; (2) `integer` vector with one or more column numbers.
`curation_txt`	either a `data.frame` or a `character` path to a tab- or comma-delimited file. The first column should match the column headers of the imported data, `colnames(se)`. Subsequent columns contain the associated sample annotations. For Nanostring data, the Nanostring sample annotations are already associated with `colData(se)`, and `colnames(curation_df)` will overwrite any columns that already exist. Pro tip: the first column in `curation_txt` should use `'.'` in place of punctuation or whitespace, which improves pattern matching against filenames whose punctuation characters may have been modified during processing. Note also that when `curation_txt` is supplied, samples in `se` are subset to those that match `curation_txt`, in the order they appear in `curation_txt`. This behavior allows `curation_txt` to define the experimental ordering, which by default also defines the downstream factor levels for statistical contrasts. The first factor level is used as the control in those contrasts.
`verbose`	`logical` indicating whether to print verbose output.
`...`	additional arguments are passed to internal functions, for example `data.table::fread()`.
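
The auto-detection rules described for `row_identifier=NULL` can be sketched in base R. This is a minimal illustration of the documented behavior only, not the package's implementation: `detect_row_ids()` is a hypothetical helper, and `make.unique()` is used as a base R stand-in for `jamba::makeNames()`, whose suffix style may differ.

```r
# Hypothetical helper, NOT part of platjam: illustrates the documented rules only.
detect_row_ids <- function(df) {
  if (is.numeric(df[[1]])) {
    # first column is numeric, so fall back to the rownames
    ids <- rownames(df)
  } else {
    # otherwise the first column supplies the row identifiers
    ids <- as.character(df[[1]])
  }
  # base R stand-in for jamba::makeNames(); the suffix format may differ
  make.unique(ids)
}

# first column is character and contains a duplicated value
df_chr <- data.frame(gene=c("A", "B", "B"), s1=c(1, 2, 3), s2=c(4, 5, 6),
  stringsAsFactors=FALSE)
# first column is numeric, identifiers are stored as rownames
df_num <- data.frame(s1=c(1, 2, 3), s2=c(4, 5, 6),
  row.names=c("g1", "g2", "g3"))

ids_chr <- detect_row_ids(df_chr)
ids_num <- detect_row_ids(df_num)
```

In this sketch `ids_chr` is taken from the `gene` column with the duplicate made unique, while `ids_num` is taken from the rownames.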

Value

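`SummarizedExperiment` object, with the imported values stored in the assay named by `assay_name`, row annotations in `rowData()`, and sample annotations in `colData()`.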

Examples

# character matrix with rownames and colnames as identifiers
x <- matrix(letters[1:9], ncol=3);
rownames(x) <- LETTERS[1:3];
colnames(x) <- letters[1:3];
x
import_omics_data(x)

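# numeric matrix, with "_rep1" appended to each column name
x <- matrix(1:9, ncol=3);
rownames(x) <- LETTERS[1:3];
colnames(x) <- paste0(letters[1:3], "_rep1");
x

# curation table: the first column holds the patterns used to match samples
curation_txt <- data.frame(Pattern=LETTERS[1:3], Group="A", Batch="B")
se <- import_omics_data(x, curation_txt=curation_txt)
# inspect the assay values, sample annotations, and row annotations
SummarizedExperiment::assays(se)[[1]]
SummarizedExperiment::colData(se)
SummarizedExperiment::rowData(se)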

# reorder the curation rows; per the curation_txt argument, samples follow that order
se2 <- import_omics_data(x, curation_txt=curation_txt[c(2, 1, 3), ])
SummarizedExperiment::assays(se2)[[1]]
SummarizedExperiment::colData(se2)
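
The `curation_txt` argument describes two behaviors: `'.'` in the first column helps match sample names whose punctuation was changed, and the row order defines the sample order and the control factor level. The base R sketch below illustrates both ideas independently of platjam. It assumes the patterns are treated as regular expressions, which the documentation does not state explicitly, and every object name in it (`sample_names`, `curation_df`, `matched`) is hypothetical.

```r
# Illustrative only: base R, independent of platjam internals.
# Hypothetical column headers whose punctuation differs between samples.
sample_names <- c("KO rep1", "WT-rep1", "WT_rep2", "KO.rep2")

# Curation table with rows in the desired experimental order, control first.
curation_df <- data.frame(
  Pattern=c("WT.rep1", "WT.rep2", "KO.rep1", "KO.rep2"),
  Group=c("WT", "WT", "KO", "KO"),
  stringsAsFactors=FALSE)

# Assumption: patterns are regular expressions, where '.' matches any
# single character, including a space, '-', '_', or a literal '.'.
matched <- vapply(curation_df$Pattern, function(p) {
  grep(p, sample_names)[1]
}, integer(1))

# Samples are subset and ordered to follow the curation rows.
ordered_samples <- sample_names[matched]

# Factor levels in order of first appearance, not alphabetical order,
# so the first curation group becomes the control level.
group <- factor(curation_df$Group, levels=unique(curation_df$Group))
control_level <- levels(group)[1]
```

Here the samples are returned in curation order, and `control_level` is `"WT"` even though `"KO"` sorts first alphabetically.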

jmw86069/platjam documentation built on April 12, 2025, 1:41 p.m.