zeitzeigerBatch: Train and test a ZeitZeiger predictor, accounting for batch...

View source: R/zeitzeiger_predict.R

Description

zeitzeigerBatch trains and tests a predictor on multiple datasets independently, using ComBat to correct for batch effects prior to running zeitzeiger. This function requires the metapredict package.

Usage

zeitzeigerBatch(ematList, trainStudyNames, sampleMetadata, studyColname,
  batchColname, timeColname, fitMeanArgs = list(rparm = NA, nknots = 3),
  constVar = TRUE, fitVarArgs = list(rparm = NA), nTime = 10,
  useSpc = TRUE, sumabsv = 2, orth = TRUE, nSpc = 2, betaSv = FALSE,
  timeRange = seq(0, 1, 0.01), covariateName = NA, featuresExclude = NULL,
  dopar = TRUE)
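
A minimal sketch of inputs shaped for this signature, using made-up data. The study names, feature names, the `sample`, `study`, and `time` column names, and the `runBatch` wrapper are all hypothetical; scaling the periodic variable to the interval from 0 to 1 is an assumption based on the default `timeRange`. The wrapper is defined but not called, since running it requires the zeitzeiger, metapredict, and doParallel packages.

```r
# Toy inputs shaped for zeitzeigerBatch (all values are made up).
set.seed(1)
features <- paste0("gene", 1:5)

makeEmat <- function(studyName, nSamples) {
  # Rows are samples and columns are features, as described for ematList.
  matrix(rnorm(nSamples * length(features)), nrow = nSamples,
         dimnames = list(paste0(studyName, "_s", seq_len(nSamples)), features))
}

ematList <- list(studyA = makeEmat("studyA", 8),
                 studyB = makeEmat("studyB", 6),
                 studyC = makeEmat("studyC", 6))
trainStudyNames <- c("studyA", "studyB")  # studyC is left for testing

# One row of metadata per sample across all datasets.
sampleMetadata <- data.frame(
  sample = unlist(lapply(ematList, rownames), use.names = FALSE),
  study = rep(names(ematList), times = sapply(ematList, nrow)),
  stringsAsFactors = FALSE)
# Assumption: the periodic variable is scaled to lie between 0 and 1.
sampleMetadata$time <- runif(nrow(sampleMetadata))

# Basic consistency checks before calling the function.
stopifnot(all(trainStudyNames %in% names(ematList)),
          setequal(sampleMetadata$study, names(ematList)),
          all(sampleMetadata$time >= 0 & sampleMetadata$time < 1))

# Hypothetical wrapper: defined here, not executed.
runBatch <- function() {
  doParallel::registerDoParallel(cores = 2)  # backend used when dopar = TRUE
  zeitzeiger::zeitzeigerBatch(ematList, trainStudyNames, sampleMetadata,
                              studyColname = "study", batchColname = "study",
                              timeColname = "time", nSpc = 1:2, dopar = TRUE)
}
```

Here the same column serves as both studyColname and batchColname, which is the common case; a separate batch column could be supplied instead.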

Arguments

ematList

Named list of matrices of measurements, one for each dataset, some of which will be for training, others for testing. Each matrix should have rownames corresponding to sample names and colnames corresponding to feature names.

trainStudyNames

Character vector of names in ematList corresponding to datasets for training.

sampleMetadata

data.frame containing relevant information for each sample across all datasets.

studyColname

Name of column in sampleMetadata that contains information about which dataset each sample belongs to.

batchColname

Name of column in sampleMetadata that contains information about which dataset each sample belongs to. This should correspond to the names of ematList, and will often be the same as studyColname, but doesn't have to be.

timeColname

Name of column in sampleMetadata that contains the values of the periodic variable.

fitMeanArgs

List of arguments to pass to bigspline for fitting mean of each SPC.

constVar

Logical indicating whether to assume constant variance as a function of the periodic variable.

fitVarArgs

List of arguments to pass to bigspline for fitting variance of each SPC. Unused if constVar==TRUE.

nTime

Number of time-points by which to discretize the time-dependent behavior of each feature. Corresponds to the number of rows in the matrix for which the SPCs will be calculated.

useSpc

Logical indicating whether to use SPC (default) or svd.

sumabsv

L1-constraint on the SPCs, passed to SPC.

orth

Logical indicating whether to require left singular vectors be orthogonal to each other, passed to SPC.

nSpc

Vector of the number of SPCs to use for prediction. If NA, nSpc will become 1:K, where K is the number of SPCs in spcResult. Each value in nSpc will correspond to one prediction for each test observation. A value of 2 (the default shown under Usage) means that the prediction will be based on the first 2 SPCs.

betaSv

Logical indicating whether to use the singular values of the SPCs as weights in the likelihood calculation.

timeRange

Vector of values of the periodic variable at which to calculate likelihood. The time with the highest likelihood is used as the initial value for the MLE optimizer.

covariateName

Name of column(s) in sampleMetadata containing information about other covariates for ComBat, besides batchColname. If NA (default), then there are no other covariates.

featuresExclude

Named list of character vectors corresponding to features to exclude from being used for prediction for the respective test datasets.

dopar

Logical indicating whether to process the folds in parallel. Use registerDoParallel to register the parallel backend.
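
The two-step search described for timeRange (evaluate the likelihood on a grid, then start the optimizer from the best grid point) can be sketched in base R. This is an illustration only: the cosine-shaped toy likelihood and the true time of 0.37 are made up, and `optim` stands in for the mle2 fit that the package reports under Value.

```r
# Toy negative log-likelihood for one observation over a periodic time scale.
# Both the functional form and trueTime are assumptions for illustration.
trueTime <- 0.37
negLogLike <- function(time) -cos(2 * pi * (time - trueTime))

# Step 1: evaluate the likelihood at each value of timeRange.
timeRange <- seq(0, 1, 0.01)
gridLike <- -negLogLike(timeRange)
timeStart <- timeRange[which.max(gridLike)]

# Step 2: refine the estimate, starting from the grid maximum.
fit <- optim(par = timeStart, fn = negLogLike, method = "BFGS")
timeEstimate <- fit$par %% 1  # wrap back onto the periodic interval
```

Starting from the grid maximum keeps the local optimizer away from poor local optima, which matters because a likelihood over a periodic variable can have more than one peak. A finer timeRange costs more evaluations but gives a better starting point.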

Value

spcResultList

List of results from zeitzeigerSpc, one for each test dataset.

timeDepLike

3-D array of likelihood, with dimensions for each test observation (across all datasets), each element of nSpc, and each element of timeRange.

mleFit

List (for each element in nSpc) of lists (for each test observation) of mle2 objects.

timePred

Matrix of predicted times for test observations by values of nSpc.
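
Because the predicted variable is periodic, comparing timePred with the observed times calls for a circular difference. The sketch below uses a made-up 3-by-2 matrix in the layout described for timePred (rows are test observations, columns are values of nSpc). The `circDiff` helper, the column labels, and the period of 1 are assumptions for illustration and are not taken from the package.

```r
# Hypothetical helper: signed circular difference on a periodic scale.
circDiff <- function(pred, obs, period = 1) {
  d <- (pred - obs) %% period
  ifelse(d > period / 2, d - period, d)
}

# Made-up observed times and predictions for three test observations.
timeObs <- c(0.10, 0.50, 0.95)
timePred <- matrix(c(0.15, 0.45, 0.05,
                     0.12, 0.52, 0.98), nrow = 3,
                   dimnames = list(NULL, c("nSpc1", "nSpc2")))

# Error per observation (rows) and per value of nSpc (columns).
err <- circDiff(timePred, timeObs)
medianAbsErr <- apply(abs(err), 2, median)
```

The third observation shows why wrapping matters: a prediction of 0.05 for an observed time of 0.95 is off by 0.10 around the cycle, and a plain subtraction would overstate that error.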

See Also

zeitzeiger, metapredict, ComBat


jakejh/zeitzeiger documentation built on Nov. 22, 2017, 2:06 a.m.