trotu: Classify OTU dataset


View source: R/trotu.R

Description

For each pair of target classes, split the given OTU dataset into a "trva" set (training plus validation) and a test set. The function then selects the hyperparameter lambda by cross-validation, estimates the covariance matrix from the "trva" set, and reports the accuracy obtained by classifying the test set.
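
The splitting step can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper name `split_trva_test` and the way folds are assigned are assumptions, and only the default values `test.per = 0.2` and `nfold = 6` are taken from the function signature.

```r
# Hypothetical helper (not part of clotu): split sample indices into a
# "trva" set and a test set, then assign each trva sample to a CV fold.
split_trva_test <- function(n, test.per = 0.2, nfold = 6) {
  n_test <- round(n * test.per)              # size of the held-out test set
  test_idx <- sample.int(n, n_test)          # randomly chosen test samples
  trva_idx <- setdiff(seq_len(n), test_idx)  # everything else is "trva"
  # Recycle fold labels 1..nfold over the trva samples, then shuffle them,
  # so fold sizes differ by at most one sample.
  fold <- sample(rep_len(seq_len(nfold), length(trva_idx)))
  list(trva = trva_idx, test = test_idx, fold = fold)
}

set.seed(1)
sp <- split_trva_test(50)
length(sp$test)   # number of test samples
length(sp$trva)   # number of training-plus-validation samples
table(sp$fold)    # fold sizes within the trva set
```

In a scheme like this, each fold in turn serves as the validation set for every candidate lambda, and the test set is touched only once, for the final accuracy.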

Usage

trotu(
  data,
  ...,
  thr = 0,
  target,
  pairs = FALSE,
  del.otu = TRUE,
  del.sam = TRUE,
  nvar = FALSE,
  lambda = seq(0.001, 0.3, by = 0.01),
  nsim = 25,
  seed = FALSE,
  nfold = 6,
  nsampling = 20,
  test.per = 0.2,
  norm.mode = 0,
  shrink.mode = 0,
  esti.mode = 0,
  cl.mode = 0
)

Arguments

data,

a list of 2 data frames: otu, in which each column is an observation and each row is a series of OTU counts across all observations; and meta, in which each row contains all information for one observation

...,

filter conditions selecting a subset of the given data

thr,

threshold for zombie OTU

target,

name for target variable

pairs,

character vector of length 2 naming the target pair to classify; if FALSE (default), all possible target pairs are classified

del.otu,

if TRUE (default), zombie OTUs are deleted

del.sam,

if TRUE (default), zombie samples are deleted

nvar,

number of OTUs after shrinking

lambda,

the range of lambda for cross validation, default value of seq(0.001,0.3,by=0.01)

nsim,

number of simulations, default value of 25

seed,

index of the seed; if FALSE (default), no seed is set

nfold,

number of folds for cross validation, default value of 6

nsampling,

number of iterations for cross validation, default value of 20

test.per,

proportion of the data held out as the test set, default value of 0.2

norm.mode,

digit selecting the normalization method: 0 for Kaul's method, 1 for Jun Li's

shrink.mode,

digit selecting the shrinking method: 0 for Kaul's method, 1 for Jun Li's

esti.mode,

digit selecting the covariance estimation method: 0 for Kaul's method, 1/2/3 for Jun Li's 1st/2nd/3rd method

cl.mode,

digit selecting the classification rule: 0 for the one in Kaul's software, 1 for the one in Kaul's paper
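
Some of these inputs are easier to see in code. The base R sketch below builds a toy list of the shape documented for `data`, inspects the default `lambda` grid, and enumerates target pairs. The OTU and sample names, the counts, the object name `toy_data`, and the use of `combn` to enumerate pairs are all assumptions made for illustration; the column name `Target` is borrowed from the Examples section.

```r
# Toy input of the documented shape: rows of `otu` are OTUs, columns are
# observations; `meta` has one row per observation. All names and counts
# here are made up.
otu <- data.frame(
  s1 = c(10, 0, 3),
  s2 = c(7, 2, 0),
  s3 = c(0, 5, 1),
  s4 = c(4, 0, 9),
  row.names = c("otu1", "otu2", "otu3")
)
meta <- data.frame(
  Target = c("target1", "target1", "target2", "target3"),
  row.names = colnames(otu),
  stringsAsFactors = FALSE
)
toy_data <- list(otu = otu, meta = meta)

# The default lambda grid starts at 0.001 and steps by 0.01, so it holds
# 30 values and ends at 0.291 rather than at 0.3.
lambda <- seq(0.001, 0.3, by = 0.01)
length(lambda)
range(lambda)

# One way to enumerate every unordered pair of targets (an assumption
# about what "all possible target pairs" means when pairs = FALSE).
all_pairs <- combn(unique(meta$Target), 2, simplify = FALSE)
all_pairs
```

Because the grid never reaches its nominal upper bound, a caller who wants 0.3 itself to be tried would need to pass a different sequence.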

Value

a list consisting of the following elements:

ana0, analysis result after deleting zombie OTUs: a list of the generated OTU table, the reference OTUs and the names of the OTUs

ta, the OTU table in "ana0" split by target

ta_norm, normalized OTU tables

targets, vector of target names

vs, pairs of targets

ana1, a list of the shrunken OTU table for each target, the estimated covariance matrix "Sigma", the estimated precision matrix "Omega", the cross-validation error, and the accuracies

References

A. Kaul, O. Davidov and S. D. Peddada, "Structural zeroes in high-dimensional data with applications to microbiome studies", Biostatistics, vol. 18, no. 3, pp. 422-433, 2017.

Jun Li, "Classification of microbiome data with structural zeroes and small samples", master's thesis, Linköping University, 2021.

Examples

da <- simotu.gaus(50, 700, 3, nref = 5, full.mean = 10000,
                  unif.min = 0, unif.max = 0.4, seed = 1234)
ha <- trotu(da, Target %in% c("target1", "target2", "target3"),
            thr = 0, target = "Target", pairs = c("target1", "target2"),
            del.otu = FALSE, del.sam = TRUE, nvar = 75,
            lambda = seq(0.001, 0.3, by = 0.01), nsim = 3, seed = FALSE,
            nfold = 5, nsampling = 1, test.per = 0.2,
            norm.mode = 1, shrink.mode = 1, esti.mode = 2, cl.mode = 0)

yewei369/clotu documentation built on Dec. 23, 2021, 7:19 p.m.