Compute Missing Rates of Biological Samples and Pooled Plasma Samples

Description

Computes two missing rates for each metabolite: (1) the missing rate across biological samples and (2) the missing rate across pooled plasma samples. Requires a metabolomics data matrix from the read.met function and the indices of the pooled plasma and biological samples from get_group. Returns a list containing the two missing rates for every metabolite.
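
To make the idea of a per-metabolite missing rate concrete, the following base R sketch computes the fraction of NA values in each row over two sets of columns. It does not call MetProc and is not claimed to be how get_missing is implemented; the toy data frame, the column names, and the variables pp_cols, samp_cols, pp_rate, and samp_rate are all hypothetical.

```r
# Toy data (hypothetical): 3 metabolites (rows) x 6 injections (columns).
# NA marks a missing measurement; "PPP" = pooled plasma, "X" = biological sample.
toy <- data.frame(
  PPP1 = c(1.0, NA, 3.0),
  X1   = c(NA, NA, 2.5),
  X2   = c(1.2, 0.8, NA),
  PPP2 = c(1.1, NA, 3.1),
  X3   = c(1.3, NA, 2.7),
  X4   = c(NA, 0.9, 2.6),
  row.names = c("met_a", "met_b", "met_c")
)

# Column indices of each group, found by prefix.
pp_cols   <- grep("^PPP", colnames(toy))
samp_cols <- grep("^X", colnames(toy))

# Missing rate per metabolite = share of NA values in that row within the group.
# drop = FALSE keeps a data frame even if a group has a single column.
pp_rate   <- rowMeans(is.na(toy[, pp_cols, drop = FALSE]))
samp_rate <- rowMeans(is.na(toy[, samp_cols, drop = FALSE]))

print(pp_rate)
print(samp_rate)
```

In this toy case met_b is missing in both pooled plasma columns while being measured in half of the biological samples, which is the kind of contrast the two separate rates are meant to expose.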

Usage

get_missing(df, ppind, sampind)

Arguments

df

The metabolomics dataset, ideally read from the read.met function. Each column represents a sample and each row represents a metabolite. Columns should be labeled with some unique prefix denoting whether the column is from a biological sample or pooled plasma sample. For example, all pooled plasma samples may have columns identified by the prefix “PPP” and all biological samples may have columns identified by the prefix “X”. Missing data must be coded as NA. Columns must be ordered by injection order.

ppind

The indices of the pooled plasma samples.

sampind

The indices of the biological samples.

Value

A list with the missing rates of the pooled plasma samples and biological samples for every metabolite in the data frame. The list elements are:

ppmiss

The pooled plasma missing rate for each metabolite

sampmiss

The biological sample missing rate for each metabolite
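
As a rough illustration of how the two elements might be used together, the sketch below builds a mock list with the same element names and flags metabolites whose missing rate exceeds a cutoff in either group. The mock values, the metabolite names, the assumption that each element is a plain numeric vector with one entry per metabolite, and the 0.5 cutoff are all illustrative choices, not package defaults.

```r
# Mock result (hypothetical values) with the element names documented above.
# Assumption: each element is a numeric vector with one entry per metabolite.
mock_missrate <- list(
  ppmiss   = c(met_a = 0.00, met_b = 1.00, met_c = 0.10),
  sampmiss = c(met_a = 0.50, met_b = 0.45, met_c = 0.75)
)

# Arbitrary cutoff chosen for illustration only.
cutoff <- 0.5

# Flag metabolites whose missing rate exceeds the cutoff in either group.
high_pp   <- mock_missrate[["ppmiss"]] > cutoff
high_samp <- mock_missrate[["sampmiss"]] > cutoff
flagged   <- names(which(high_pp | high_samp))

print(flagged)
```

With a real result from get_missing, the same indexing by ppmiss and sampmiss would apply, provided the elements have the shape assumed here.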

See Also

See MetProc-package for examples of running the full process.

Examples

library(MetProc)

# Read metabolomics data
metdata <- read.met(system.file("extdata/sampledata.csv", package = "MetProc"),
                    headrow = 3, metidcol = 1, fvalue = 8, sep = ",",
                    ppkey = "PPP", ippkey = "BPP")

# Get groups based on samples and pooled plasma
grps <- get_group(metdata, "PPP", "X")

# Get the missing rates of each category for all metabolites
missrate <- get_missing(metdata, grps[["pp"]], grps[["sid"]])