import_metabolomics_niehs | R Documentation |
Import metabolomics data from NIEHS file formats
import_metabolomics_niehs(
data_path,
shared_samples_only = TRUE,
filter_sample_type = NULL,
curation_txt = NULL,
drop_na_columns = TRUE,
drop_na_values = c(NA, "NA", "not applicable", "not specified", "none"),
simplify_singlet_columns = TRUE,
verbose = FALSE,
...
)
data_path |
|
shared_samples_only |
|
filter_sample_type |
|
curation_txt |
|
drop_na_columns |
Note that columns with only one non-na value in all fields will
be removed from the |
drop_na_values |
|
simplify_singlet_columns |
|
verbose |
|
... |
additional arguments are ignored. |
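As a rough illustration of how the default drop_na_values could be used to find uninformative annotation columns, here is a minimal sketch in base R. It assumes that a column is dropped when every value matches drop_na_values; the exact rule used by import_metabolomics_niehs() is not stated here, and sample_df is a hypothetical table invented for this example.

```r
# Hypothetical sample annotation table, for illustration only
sample_df <- data.frame(
   sample_id = c("S1", "S2", "S3"),
   treatment = c("control", "treated", "treated"),
   notes = c("NA", "not applicable", NA),
   stringsAsFactors = FALSE)

# Default values of the drop_na_values argument
drop_na_values <- c(NA, "NA", "not applicable", "not specified", "none")

# TRUE for each column whose values are all found in drop_na_values
# (assumed rule, the package may apply a different one)
is_na_column <- vapply(sample_df, function(x) {
   all(x %in% drop_na_values)
}, logical(1))

# Keep only columns with at least one informative value
sample_df_kept <- sample_df[, !is_na_column, drop = FALSE]
colnames(sample_df_kept)
```

In this sketch the "notes" column is removed because it contains only NA-like values, while "sample_id" and "treatment" are kept.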
This import function is specific to NIEHS file formats produced from their defined analysis workflow. The files typically include "df_pos" for positive ionization, and "df_neg" for negative ionization.
Optionally, when the full data processing file is present, it will be imported alongside the cleaned data. The full data processing file provides detailed compound measurement data, and is expected in one of two formats in the data_path folder:
Files "compounds_pos.txt" and/or "compounds_neg.txt", or
"1_DataProcessed.zip", which is expected to contain files "compounds_pos.txt" and/or "compounds_neg.txt" in the archive.
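The two accepted layouts for the compounds files can be checked before import. The following is a minimal sketch in base R; find_compounds_files() is a hypothetical helper written for this example and is not part of the package, and the importer itself may locate these files differently.

```r
# Hypothetical helper, not part of the package
find_compounds_files <- function(data_path) {
   compounds_files <- c("compounds_pos.txt", "compounds_neg.txt")
   # Format 1: plain text files directly in data_path
   direct_files <- file.path(data_path, compounds_files)
   found_direct <- direct_files[file.exists(direct_files)]
   if (length(found_direct) > 0) {
      return(list(source = "folder", files = found_direct))
   }
   # Format 2: files inside "1_DataProcessed.zip"
   zip_file <- file.path(data_path, "1_DataProcessed.zip")
   if (file.exists(zip_file)) {
      # List archive contents without extracting them
      zip_contents <- utils::unzip(zip_file, list = TRUE)$Name
      found_zip <- zip_contents[basename(zip_contents) %in% compounds_files]
      if (length(found_zip) > 0) {
         return(list(source = "zip", files = found_zip))
      }
   }
   list(source = "none", files = character(0))
}

# Usage with a temporary folder containing one empty compounds file
demo_path <- tempfile("niehs_demo_")
dir.create(demo_path)
file.create(file.path(demo_path, "compounds_pos.txt"))
compounds_found <- find_compounds_files(demo_path)
compounds_found$source
```

The sketch prefers loose text files over the zip archive; that ordering is an assumption made for the example.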
The "compounds" data includes important annotations for each measurement, specifically the type of numeric measurement that is supplied by the upstream software. These annotations include whether numeric values were imputed, or measured directly in each sample.
"[project_code]_NIEHS_MCF_metadata.txt": tab-delimited text file which contains sample annotations.
"df_pos.datamatrix.cleaned.txt": tab-delimited text file containing peak areas. Features are processed and cleaned by MCF for quality.
"df_pos.datamatrix.cleaned.log10.txt": as above, but log10 transformed.
"df_pos.datamatrix.cleaned.rowsum.txt": as above, but using row sum peak areas.
"df_pos.annotation.cleaned.txt": annotation of each measured metabolite.
"df_neg.datamatrix.cleaned.txt": tab-delimited text file containing peak areas. Features are processed and cleaned by MCF for quality.
"df_neg.datamatrix.cleaned.log10.txt": as above, but log10 transformed.
"df_neg.datamatrix.cleaned.rowsum.txt": as above, but using row sum peak areas.
"df_neg.annotation.cleaned.txt": annotation of each measured metabolite.
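The file names listed above follow a regular pattern of ionization prefix plus suffix, so a folder can be checked for them before calling the importer. This is a minimal sketch in base R; the object names are invented for the example, and tempdir() stands in for a real data_path.

```r
# Ionization prefixes and file suffixes taken from the file list above
ion_prefixes <- c("df_pos", "df_neg")
file_suffixes <- c(
   datamatrix = ".datamatrix.cleaned.txt",
   log10 = ".datamatrix.cleaned.log10.txt",
   rowsum = ".datamatrix.cleaned.rowsum.txt",
   annotation = ".annotation.cleaned.txt")

# Character matrix of expected file names, one row per ionization
expected_files <- outer(ion_prefixes, file_suffixes, paste0)
dimnames(expected_files) <- list(ion_prefixes, names(file_suffixes))

# Stand-in folder for illustration, replace with the real data_path
data_path <- tempdir()

# Logical matrix with the same layout, TRUE where the file exists
files_present <- array(
   file.exists(file.path(data_path, expected_files)),
   dim = dim(expected_files),
   dimnames = dimnames(expected_files))
files_present
```

A row of all FALSE values in files_present suggests that ionization mode is absent from the folder, in which case only one entry would be expected in the returned list.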
list of SummarizedExperiment objects, where the list names are defined by the type of ionization ("df_pos", "df_neg"), and the type of data ("cleaned") in the data filenames. Typically the result includes:
"df_pos_cleaned"
"df_neg_cleaned"
If only one ionization is provided, only one entry will be returned.
For each SummarizedExperiment object:
rowData represents metabolite annotations.
colData represents sample annotations, optionally including annotations via a data.frame supplied as curation_txt.
The slot "metadata" is a list with the following:
isample_use: the subset of colnames(se) for which sample metadata was found in the metadata file. Some control samples may not match the full metadata, and will be ignored when using this subset.
irows_use: all rownames(se) for all measured metabolites.
irows_clean: the rownames(se) for measurements with no annotation in the column "flag_guidance".
irows_flagged: the rownames(se) for measurements with some non-empty annotation in the column "flag_guidance".
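The split between irows_clean and irows_flagged can be pictured with a small base R sketch. It assumes that "no annotation" means NA or an empty string in "flag_guidance"; the row names and flag values below are hypothetical, and the package may define emptiness differently.

```r
# Hypothetical metabolite annotations, standing in for rowData(se)
row_annotations <- data.frame(
   row.names = c("met_1", "met_2", "met_3", "met_4"),
   flag_guidance = c("", "low signal", NA, "possible contaminant"),
   stringsAsFactors = FALSE)

# A row is flagged when flag_guidance is neither NA nor an empty string
# (assumed definition of a non-empty annotation)
flag_values <- row_annotations$flag_guidance
is_flagged <- !is.na(flag_values) & nzchar(flag_values)

# All rows, then the clean and flagged subsets
irows_use <- rownames(row_annotations)
irows_clean <- irows_use[!is_flagged]
irows_flagged <- irows_use[is_flagged]
```

With a real result, subsetting such as se[irows_clean, ] would then restrict a SummarizedExperiment object to the unflagged measurements.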
Other jam import functions: coverage_matrix2nmat(), deepTools_matrix2nmat(), frequency_matrix2nmat(), import_lipotype_csv(), import_nanostring_csv(), import_nanostring_rcc(), import_nanostring_rlf(), import_omics_data(), import_proteomics_PD(), import_proteomics_mascot(), import_salmon_quant(), process_metab_compounds_file()
data_path <- path.expand("~/Projects/Rider/metabolomics_jul2023/data");
se_list <- import_metabolomics_niehs(data_path);