import_omics_data: Import generic omics tsv or csv file

import_omics_data R Documentation

Import generic omics tsv or csv file

Description

Import generic omics tab- or comma-delimited data file

Usage

import_omics_data(
  x,
  assay_name = "data",
  row_identifier = NULL,
  row_annotation_columns = NULL,
  curation_txt = NULL,
  verbose = FALSE,
  ...
)

Arguments

x

one of the following input types:

  • character path to a data file suitable for import by data.table::fread(), where most delimiters will be recognized automatically.

  • data.frame representing the equivalent data as from a file.

  • matrix representing only the measured values, with rownames and colnames that represent respective identifiers.

assay_name

character string which will define the assay name where data is stored in the resulting SummarizedExperiment object.

row_identifier

defines the column, columns, or rownames, to be used as row identifiers. These values become rownames(se) for the output object. One of the following:

  • NULL (default) which auto-detects an appropriate row identifier:

    • If the first column is numeric, it uses rownames.

    • Otherwise the first column is used. If there are duplicated values, they are made unique with jamba::makeNames().

  • character string of colname in the data x imported.

  • integer column number of data x imported.

  • The integer value 0 or -1, to indicate rownames(x).

  • One of these strings, to indicate rownames(x): "rownames", "rowname", or "row.names".

  • In effect, the default uses the first column in the imported data x, unless that column is numeric, in which case rownames(x) are used.
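The auto-detection rule described for row_identifier = NULL can be sketched in base R. This is only an illustration of the documented behavior, not the package's implementation: guess_row_ids() is a hypothetical helper, and make.unique() is used as a base R stand-in for jamba::makeNames(), whose suffix style may differ.

```r
# Hypothetical helper, not part of platjam: mimics the documented
# auto-detection rule for row_identifier = NULL.
guess_row_ids <- function(df) {
  first_col <- df[[1]]
  if (is.numeric(first_col)) {
    # The first column holds measurements, so fall back to rownames.
    return(rownames(df))
  }
  # Otherwise use the first column. make.unique() is a base R stand-in
  # for jamba::makeNames(); the real suffix style may differ.
  make.unique(as.character(first_col))
}

# First column is character with a duplicated value.
df_ids <- data.frame(gene = c("A", "B", "B"), s1 = 1:3, s2 = 4:6)
# First column is numeric, identifiers are stored as rownames.
df_num <- data.frame(s1 = 1:3, s2 = 4:6, row.names = c("A", "B", "C"))

ids_from_column <- guess_row_ids(df_ids)
ids_from_rownames <- guess_row_ids(df_num)
```

In this sketch the duplicated "B" is disambiguated in the first case, and the rownames are returned unchanged in the second.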

row_annotation_columns

defines columns to retain in rowData() which have non-measurement data associated with each row. One of the following:

  • character vector with one or more colnames()

  • integer vector with one or more column numbers
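As a rough illustration of what row_annotation_columns separates, the base R sketch below splits a data.frame into row annotations and a matrix of measured values. The helper split_columns() and the column names are hypothetical, and treating all remaining columns as measurements is an assumption made for this sketch, not a statement about how the package handles them.

```r
# Hypothetical helper, not part of platjam: separates annotation
# columns from measurement columns in an imported table.
split_columns <- function(df, id_col, annotation_cols) {
  # Accept either column names or integer column numbers.
  if (is.numeric(annotation_cols)) {
    annotation_cols <- colnames(df)[annotation_cols]
  }
  # Assumption for this sketch: every other column is a measurement.
  measure_cols <- setdiff(colnames(df), c(id_col, annotation_cols))
  values <- as.matrix(df[, measure_cols, drop = FALSE])
  rownames(values) <- as.character(df[[id_col]])
  list(
    values = values,
    row_annotation = df[, annotation_cols, drop = FALSE]
  )
}

df <- data.frame(
  gene = c("g1", "g2", "g3"),
  description = c("kinase", "transporter", "unknown"),
  s1 = c(1.2, 3.4, 5.6),
  s2 = c(2.1, 4.3, 6.5),
  stringsAsFactors = FALSE
)

by_name <- split_columns(df, id_col = "gene", annotation_cols = "description")
by_number <- split_columns(df, id_col = "gene", annotation_cols = 2L)
```

Both calls select the same annotation column, once by name and once by column number.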

curation_txt

either data.frame or character path to a tab- or comma-delimited file. The first column should match the column headers of the imported data, as stored in colData(se). Subsequent columns contain the associated sample annotations. For Nanostring data, the Nanostring sample annotations are already associated with colData(se), and colnames(curation_df) will overwrite any that already exist.

  • Pro tip: The first column in curation_txt should contain '.' instead of punctuation/whitespace, to improve pattern matching against filenames whose punctuation characters may have been modified during processing.

  • Note also that when curation_txt is supplied, samples in se will be subset to include only those samples that match curation_txt, and in the order they appear in the curation_txt file. This behavior allows the curation_txt to be used to define the appropriate experimental ordering, which by default also defines downstream control factor levels for statistical contrasts. The first factor level is used as the control value in those contrasts.
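The subsetting, ordering, and factor-level behavior described above can be sketched in base R. The sample names, the curation table, and the use of grep() are assumptions for illustration only; the package's own matching logic may differ.

```r
# Column headers as they might appear after import (hypothetical names).
sample_names <- c("treated-rep1", "ctrl rep1", "unused_rep1", "ctrl rep2")

# Hypothetical curation table. Patterns use '.' where the sample names
# contain whitespace or punctuation, since '.' matches any character.
curation_df <- data.frame(
  Pattern = c("ctrl.rep1", "ctrl.rep2", "treated.rep1"),
  Group = c("ctrl", "ctrl", "treated"),
  stringsAsFactors = FALSE
)

# For each pattern, in curation order, keep the first matching sample.
matched <- vapply(curation_df$Pattern, function(p) {
  hits <- grep(p, sample_names, value = TRUE)
  if (length(hits) > 0) hits[1] else NA_character_
}, character(1), USE.NAMES = FALSE)

# Samples without a curation match are dropped.
keep <- !is.na(matched)
ordered_samples <- matched[keep]

# Factor levels follow the order of appearance in the curation table,
# so the first curated group becomes the first (control) level.
groups <- curation_df$Group[keep]
group_factor <- factor(groups, levels = unique(groups))
```

Here "unused_rep1" is dropped because no pattern matches it, the remaining samples follow the curation order, and "ctrl" is the first factor level.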

verbose

logical indicating whether to print verbose output.

...

additional arguments are passed to internal functions, for example data.table::fread().

Value

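SummarizedExperiment object, with measured values stored in the assay named by assay_name, row annotations in rowData(), and sample annotations in colData().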

See Also

Other jam import functions: coverage_matrix2nmat(), deepTools_matrix2nmat(), frequency_matrix2nmat(), import_lipotype_csv(), import_metabolomics_niehs(), import_nanostring_csv(), import_nanostring_rcc(), import_nanostring_rlf(), import_proteomics_PD(), import_proteomics_mascot(), import_salmon_quant(), process_metab_compounds_file()

Examples

# Character matrix with row and column identifiers
x <- matrix(letters[1:9], ncol=3)
rownames(x) <- LETTERS[1:3]
colnames(x) <- letters[1:3]
x
import_omics_data(x)

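# Numeric matrix whose column names carry a "_rep1" suffix
x <- matrix(1:9, ncol=3)
rownames(x) <- LETTERS[1:3]
colnames(x) <- paste0(letters[1:3], "_rep1")
x

# Sample annotations; the first column is matched to the column headers
curation_txt <- data.frame(Pattern=LETTERS[1:3], Group="A", Batch="B")
se <- import_omics_data(x, curation_txt=curation_txt)
# Inspect the stored values, sample annotations, and row annotations
SummarizedExperiment::assays(se)[[1]]
SummarizedExperiment::colData(se)
SummarizedExperiment::rowData(se)

# Reordering the curation rows reorders the samples in the result
se2 <- import_omics_data(x, curation_txt=curation_txt[c(2, 1, 3), ])
SummarizedExperiment::assays(se2)[[1]]
SummarizedExperiment::colData(se2)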


jmw86069/platjam documentation built on Sept. 26, 2024, 3:31 p.m.