Read in a Metabolomics Dataset of Standard Structure

Description

Read a metabolomics file that follows a specific layout. Columns designate samples and rows designate metabolites. The first n rows may contain any information, but row n+1 must be a header line with column labels. Every row after the header describes one metabolite. One column must contain the ID of each metabolite. Other columns can be included, but from some column onward, without interruption, each biological sample or pooled plasma sample must have its own column, sorted by injection order.

All pooled plasma columns must share a unique prefix that differentiates them from biological samples. Up to 2 types of pooled plasma samples can be included in the file, each with its own unique prefix. This may be useful when both a pooled plasma control generated from the biological samples and a commercially available pooled plasma standard are used. Biological samples may have their own designating prefix or may simply lack the pooled plasma prefixes. If no prefix designates the biological samples, a prefix of "X" will be used for them in subsequent analysis. Missing data must be coded as NA.
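To make the layout concrete, the following base R sketch writes a small toy file in the described structure and inspects its header line. The file contents, the column names (Method, S01, S02) and the variable names are hypothetical and chosen only for illustration; the sketch does not call read.met.

```r
# Hypothetical toy file: 2 free-form rows, header on row 3,
# metabolite ID in column 1, sample data starting at column 3.
lines <- c(
  "Example study,,,,,",
  "Generated for illustration,,,,,",
  "Metabolite,Method,PPP01,S01,BPP01,S02",
  "M1,C8-pos,1200,NA,1100,980",
  "M2,C8-pos,NA,450,470,NA"
)
toy_path <- tempfile(fileext = ".csv")
writeLines(lines, toy_path)

# Row 3 is the header line; every later row describes one metabolite.
header <- strsplit(readLines(toy_path)[3], ",", fixed = TRUE)[[1]]
n_metabolites <- length(lines) - 3

print(header)
print(n_metabolites)
```

For a file shaped like this one, the matching arguments would presumably be headrow = 3, metidcol = 1 and fvalue = 3, with the default ppkey and ippkey prefixes.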

Usage

read.met(data, headrow = 3, metidcol=1, fvalue=8, sep=",", ppkey='PPP',
ippkey = 'BPP', sidkey="none")

Arguments

data

The metabolomics dataset file. Columns designate samples and rows designate metabolites. The first n rows can contain any information, but row n+1 must be a header line with column labels. Every row after the header describes one metabolite. One column must contain the ID of each metabolite. Other columns can be included, but from some column onward, without interruption, each biological sample or pooled plasma sample must have its own column, sorted by injection order. All pooled plasma columns must share a unique prefix that differentiates them from biological samples. Up to 2 types of pooled plasma samples can be included in the file, each with its own unique prefix. Biological samples may have a designated prefix or may simply lack the prefixes that designate pooled plasma samples. If no prefix designates the biological samples, a prefix of "X" will be used for them in subsequent analysis. Missing data must be coded as NA. See the file sampledata for an example.

headrow

The row number that contains the header line. Default is 3.

metidcol

The column number that contains the metabolite ID. Default is 1.

fvalue

The column number where data begins. Default is 8.
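How headrow, metidcol and fvalue relate to the file can be sketched with base R alone. This is an illustration of the three positions under the stated layout, not the package's implementation; the toy file and all variable names are hypothetical.

```r
# Hypothetical toy file in the documented layout.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "free-form row 1,,,,,",
  "free-form row 2,,,,,",
  "Metabolite,Method,PPP01,S01,BPP01,S02",
  "M1,C8-pos,1200,NA,1100,980",
  "M2,C8-pos,NA,450,470,NA"
), toy_path)

headrow  <- 3  # row holding the column labels
metidcol <- 1  # column holding the metabolite IDs
fvalue   <- 3  # first column holding sample data

# Skip the free-form rows so that the header row is read as column names.
raw <- read.csv(toy_path, skip = headrow - 1, header = TRUE,
                check.names = FALSE, stringsAsFactors = FALSE)

# Keep the data columns only and label rows by metabolite ID.
mat <- as.matrix(raw[, fvalue:ncol(raw)])
rownames(mat) <- raw[[metidcol]]

print(dim(mat))
print(mat)
```

The resulting matrix has one row per metabolite and one column per sample or pooled plasma sample, with NA preserved as missing data.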

sep

File delimiter. Default is ",".

ppkey

The unique prefix of biological sample-based pooled plasma columns. Default is "PPP".

ippkey

The unique prefix of standard pooled plasma columns. Default is "BPP".

sidkey

The unique prefix of biological sample columns in the file. If the value is "none", any column that does not carry the ppkey or ippkey prefix is treated as a biological sample and given the prefix "X" for subsequent use. Default is "none".
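The prefix rules for ppkey, ippkey and sidkey can be illustrated with a small helper. classify_columns is a hypothetical function written for this sketch, not part of MetProc, and it assumes that "prefix" means the column name starts with the key.

```r
# Hypothetical helper: label each data column by its prefix.
# Assumption: a key matches when the column name starts with it.
classify_columns <- function(cols, ppkey = "PPP", ippkey = "BPP",
                             sidkey = "none") {
  type <- rep(NA_character_, length(cols))
  type[startsWith(cols, ppkey)]  <- "pooled"
  type[startsWith(cols, ippkey)] <- "standard_pooled"
  if (sidkey == "none") {
    # Everything that is not pooled plasma is a biological sample
    # and receives the "X" prefix.
    is_sample <- is.na(type)
    cols[is_sample] <- paste0("X", cols[is_sample])
  } else {
    is_sample <- is.na(type) & startsWith(cols, sidkey)
  }
  type[is_sample] <- "sample"
  data.frame(column = cols, type = type, stringsAsFactors = FALSE)
}

res <- classify_columns(c("PPP01", "S01", "BPP01", "S02"))
print(res)
```

With sidkey left at "none", the two non-pooled columns are renamed XS01 and XS02 in this sketch; passing sidkey = "S" would instead keep their original names.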

Value

A matrix containing the loaded metabolomics data. The number of rows equals the number of metabolites, and the number of columns equals the number of biological samples plus pooled plasma samples.

See Also

See MetProc-package for examples of running the full process.

Examples

library(MetProc)

# Read in the example metabolomics data shipped with the package
metdata <- read.met(system.file("extdata/sampledata.csv", package = "MetProc"),
                    headrow = 3, metidcol = 1, fvalue = 8, sep = ",",
                    ppkey = "PPP", ippkey = "BPP")